WaveFormer_ECG_Slot13_Presentation FINAL.pptx

SaiKoundinya7 7 views 16 slides Aug 30, 2025

About This Presentation

Introduction

Accurate ECG interpretation from scanned strips is vital in resource-limited settings. Prior deep learning models often ignored hierarchical relationships among ECG abnormalities, limiting performance. We hypothesized that combining bimodal co-attention (across ECG images and continuous...


Slide Content

Slot No. 13
WaveFormer: Bimodal Co-Attention and Graph-Augmented Hierarchical ECG Strip Classification
Understanding AI-based ECG diagnosis from paper-like images using rhythm–morphology fusion and clinical label structure (PTB-XL dataset)

Disclosure of Financial Interests No relevant financial relationships or conflict of interests

Aims and Objectives ECGs are often printed or scanned in resource-limited settings. Digital waveform capture is not always available. Accurate interpretation remains challenging for AI models. Existing models ignore class hierarchy and ECG rhythm/morphology interplay. Goal: Build a clinically consistent, AI-powered diagnostic tool.

Related Work
Prior ECG AI approaches:
1D signal models (e.g., CNNs, LSTMs, ResNet) operate on waveform data (Chen et al., IEEE JBHI, 2022).
Image-based models convert signals into spectrograms or stylized ECGs, but rarely use paper-like ECG images.
Most models use flat multilabel heads or target small class subsets (e.g., 5–10 labels) (Siontis et al., Nat Rev Cardiol, 2023).
Common limitations:
Class hierarchy (e.g., MI → AMI/PMI) is ignored in label design and model logic.
Many models assume single-label outputs, not multilabel cardiac scenarios.
Few attempts exist to train on 17 PTB-XL classes, especially using realistic ECG image formats.
Our contributions:
First to classify all 17 PTB-XL diagnostic classes (5 superclasses, 11 subclasses, and NORM).
Uses both ECG image (morphology) and CWT (rhythm) features with bidirectional co-attention.
Enforces diagnostic consistency via a Graph Neural Network (GNN) encoding class relationships.
Enables soft multilabel prediction, supporting realistic clinical co-diagnoses.

Methodology: Dataset: PTB-XL Overview 17,441 ECG signals for training, 2,291 for testing. 26 total diagnostic labels: 5 superclasses, 21 subclasses. Multilabel format – more than one diagnosis per ECG is possible.

ECG Preprocessing Pipeline Raw voltages are plotted to mimic real ECG printouts. Gridlines are removed using the Hough transform. Waveforms are enhanced with adaptive thresholding and binarized. Each lead is isolated using bounding boxes. Continuous wavelet transform (CWT) maps are created from the final image using Morlet wavelets.
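The last preprocessing step can be illustrated with a minimal sketch of a Morlet CWT. This is not the presenters' code: it assumes a 1-D trace has already been recovered from an isolated lead, and the function names, the `w0=6.0` centre frequency, and the truncation of the wavelet at four scale widths are all illustrative choices. The small admissibility correction term of the Morlet wavelet is omitted.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet wavelet at time t (correction term omitted; w0 is an assumed default)."""
    return (math.pi ** -0.25) * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_morlet(signal, scales, dt=1.0, w0=6.0):
    """Return the scalogram |W(scale, time)| as a list of rows, one row per scale."""
    n = len(signal)
    scalogram = []
    for s in scales:
        # Truncate the wavelet support at +/- 4 scale widths (an illustrative cutoff).
        half = int(math.ceil(4.0 * s / dt))
        # Conjugated wavelet sampled at offsets k*dt, with 1/sqrt(s) normalisation.
        kernel = [morlet(k * dt / s, w0).conjugate() * dt / math.sqrt(s)
                  for k in range(-half, half + 1)]
        row = []
        for i in range(n):
            acc = 0j
            for k in range(-half, half + 1):
                j = i + k
                if 0 <= j < n:  # samples outside the signal are treated as zero
                    acc += signal[j] * kernel[k + half]
            row.append(abs(acc))
        scalogram.append(row)
    return scalogram
```

Each row of the result is the response at one scale, so stacking the rows gives the time-scale map that would be passed to the rhythm stream. A direct double loop like this is slow for long traces; it is meant only to show the computation.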

ECG Pre-Processing

AI Model: WaveFormer Two inputs: ECG image + corresponding CWT map. Separate CNN blocks extract features from each input. Both streams are fed into transformer models (ViT). Co-attention matches rhythm and morphology information. GNN-based label embeddings maintain clinical hierarchy.
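The bidirectional co-attention between the two streams can be sketched as two scaled dot-product cross-attention passes, one in each direction. This is a simplified illustration, not the WaveFormer implementation: the learned query/key/value projections and multiple heads of a real transformer are left out, and the function names and token shapes are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Each query token attends over the tokens of the other stream.

    Keys and values are the same vectors here because no learned
    projections are applied (a simplifying assumption).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)])
    return out


def co_attention(image_tokens, cwt_tokens):
    """Bidirectional co-attention: image attends to CWT, and CWT attends to image."""
    image_ctx = cross_attention(image_tokens, cwt_tokens)
    cwt_ctx = cross_attention(cwt_tokens, image_tokens)
    return image_ctx, cwt_ctx
```

Each output token is a weighted average of the other stream's tokens, so morphology tokens are enriched with rhythm context and vice versa before any later fusion step.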

WaveFormer

Stepwise Diagnosis Prediction
Stage 1: Normal vs. abnormal. If normal, stop here.
Stage 2: Detect the major abnormality type (superclass); a superclass is active if its sigmoid output exceeds 0.2.
Stage 3: Predict the specific disease (subclass) only if the relevant superclass is active.
Soft thresholds allow multiple diagnoses when each is sufficiently likely.
Final subclass prediction: sigmoid(subclass) × sigmoid(superclass) > 0.13.
Example: An ECG may show both conduction disorder and ischemia.
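The three stages can be written as a short decision routine. The sketch below uses the 0.2 and 0.13 thresholds from the slide; the Stage 1 cutoff of 0.5 is an assumption (the slide gives none), and the hierarchy dictionary is illustrative: only MI → AMI/PMI is named in the slides, and the other entries are placeholders.

```python
import math

SUPER_THRESHOLD = 0.2    # Stage 2 threshold, from the slide
JOINT_THRESHOLD = 0.13   # Stage 3 joint threshold, from the slide
NORMAL_THRESHOLD = 0.5   # assumption: the slide states no Stage 1 cutoff

# Illustrative hierarchy; only MI -> AMI/PMI appears in the slides.
HIERARCHY = {"MI": ["AMI", "PMI"], "CD": ["CRBBB"], "STTC": ["ISC_"]}


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(normal_logit, super_logits, sub_logits):
    """Return the list of predicted labels for one ECG."""
    # Stage 1: normal vs. abnormal; stop if normal.
    if sigmoid(normal_logit) > NORMAL_THRESHOLD:
        return ["NORM"]
    labels = []
    for sup, children in HIERARCHY.items():
        # Stage 2: a superclass is active if its probability exceeds 0.2.
        p_sup = sigmoid(super_logits[sup])
        if p_sup <= SUPER_THRESHOLD:
            continue
        labels.append(sup)
        # Stage 3: a subclass is kept if the joint score exceeds 0.13.
        for sub in children:
            if sigmoid(sub_logits[sub]) * p_sup > JOINT_THRESHOLD:
                labels.append(sub)
    return labels
```

Because a subclass score is multiplied by its parent's probability, a subclass can never be reported without its superclass, and several superclasses can be active at once, which is what allows co-diagnoses such as conduction disorder with ischemia.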

Training Summary & Results
Batch size: 128
Optimizer: Adam, learning rate 0.0001, weight decay 0.001
Scheduler: cosine annealing of the learning rate
Loss: weighted binary focal cross-entropy
Weighted F1 score: 0.84 | Macro AUC: 0.92 | Weighted accuracy: 88%
Strong performance across rare and common conditions.
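For reference, the loss and the learning-rate schedule named above have simple closed forms. The sketch below is a plain-Python illustration, not the training code: the focusing parameter `gamma=2.0`, the per-class weights, and the minimum learning rate of 0 are assumptions, since the slides do not state them. Only the base learning rate of 0.0001 comes from the slide.

```python
import math


def weighted_binary_focal_loss(probs, targets, class_weights, gamma=2.0, eps=1e-7):
    """Mean weighted binary focal loss over the labels of one sample.

    probs: predicted probabilities per label; targets: 0/1 per label;
    class_weights: per-label weights (hypothetical values supplied by the caller).
    """
    total = 0.0
    for p, y, w in zip(probs, targets, class_weights):
        p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p       # probability assigned to the true outcome
        # (1 - p_t)^gamma down-weights labels that are already predicted well.
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def cosine_annealing_lr(step, total_steps, base_lr=1e-4, min_lr=0.0):
    """Cosine-annealed learning rate, decaying from base_lr to min_lr over total_steps."""
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * step / total_steps))
```

With `gamma=0` and unit weights the loss reduces to ordinary binary cross-entropy; larger `gamma` shifts the training signal toward hard or rare labels, which fits the multilabel, class-imbalanced setting described in the slides.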

ECG Superclass Performance

ECG All Class Performance

Implications and Conclusion AI can interpret paper-like ECG images directly. Multilabel and hierarchy-aware predictions mirror how doctors think. Model can handle overlapping diagnoses robustly. Potential to support diagnostics in low-resource or digitization-limited areas. Designed for real-world use, not just ideal datasets.