Combining optical character recognition with paper ECG digitization

Academic Article


  • Objective: We propose a vendor-agnostic, MATLAB-based tool that converts electrocardiography (ECG) waveforms from paper-based ECG records into digitized ECG signals. The tool is packaged as an open-source, standalone graphical user interface (GUI) application. Methods and procedures: To reach this objective we: (1) preprocess the ECG records, which includes skew correction, background grid removal, and linear filtering; (2) segment the ECG signals using Connected Components Analysis (CCA); and (3) apply Optical Character Recognition (OCR) to remove overlapping ECG lead characters and to link patients' demographic information with their research records or their electronic medical record (EMR). The digitization results are validated through a reader study in which clinically salient features, such as the intervals of the QRST complex, are compared between the paper ECG records and the digitized ECG records. Results: Comparison of clinically important features between the paper-based ECG records and the digitized ECG signals reveals intra- and inter-observer correlations of 0.86-0.99 and 0.79-0.94, respectively. The kappa statistic averaged 0.86 and 0.72 for intra- and inter-observer agreement, respectively. Conclusion: The clinically salient features of the ECG waveforms, such as the intervals of the QRST complex, are preserved during the digitization procedure. Clinical and Healthcare Impact: This open-source digitization tool can be used as a research resource to digitize paper ECG records, thereby enabling the development of new prediction algorithms to risk-stratify individuals with cardiovascular disease and/or the development of ECG-based cardiovascular diagnoses that rely on automated digital algorithms.

    Author List

  • Ganesh S; Bhatti PT; Alkhalaf M; Gupta S; Shah AJ; Tridandapani S
  • Volume: 9
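
The abstract names Connected Components Analysis (CCA) as the segmentation step but gives no implementation detail. The following is a minimal Python sketch of the general technique, not the authors' MATLAB code. It assumes a binarized image (1 = ink, 0 = background) in which the ECG trace is the largest 8-connected component, so that smaller components such as stray lead-label fragments can be discarded. The function names `label_components` and `keep_largest` and the toy image are hypothetical, introduced only for illustration.

```python
from collections import deque


def label_components(image):
    """Label 8-connected foreground pixels (value 1) of a binary image.

    Returns a label grid (0 = background) and a dict mapping each
    label to its pixel count.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 1
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or labels[r][c] != 0:
                continue  # background, or already assigned to a component
            # Breadth-first flood fill from this unlabeled foreground pixel.
            labels[r][c] = next_label
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            sizes[next_label] = size
            next_label += 1
    return labels, sizes


def keep_largest(image):
    """Keep only the largest component (assumed here to be the ECG trace)."""
    labels, sizes = label_components(image)
    if not sizes:
        return [[0] * len(row) for row in image]
    largest = max(sizes, key=sizes.get)
    return [[1 if value == largest else 0 for value in row] for row in labels]


# Toy binarized strip: a zig-zag "trace" plus a small blob standing in
# for a fragment of a lead label in the top-right corner.
toy = [
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 1, 0, 0],
]

cleaned = keep_largest(toy)
for row in cleaned:
    print("".join("#" if value else "." for value in row))
```

In practice a size or shape threshold is often used instead of keeping only the single largest component, since a real trace may be broken into several pieces after grid removal; that choice is outside what the abstract specifies.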