Wearable electrocardiogram (ECG) devices such as smartwatches and ambulatory monitors generate large-scale continuous cardiac data suitable for arrhythmia detection in real-world settings. However, the development of supervised machine learning models is limited by the scarcity of expert-annotated ECG data, class imbalance due to rare arrhythmias, and privacy constraints that restrict data sharing. These challenges make it difficult for traditional deep learning approaches to scale effectively in clinical applications. This work proposes a self-supervised contrastive learning framework that leverages large volumes of unlabeled wearable ECG data to learn meaningful cardiac representations. Using ECG-specific data augmentations, the model is trained to maximize agreement between different views of the same signal while distinguishing between different segments. A deep encoder produces latent embeddings, which are optimized through a contrastive loss, and later adapted for arrhythmia classification using a lightweight classifier with minimal labeled data. The proposed approach reduces dependence on expert annotations, improves generalization across devices and populations, and supports privacy-preserving training. Overall, it offers a scalable and efficient pathway for wearable-based arrhythmia detection, potentially enabling earlier diagnosis and broader deployment of cardiac AI systems in resource-limited healthcare settings.
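The contrastive objective described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract names neither the loss nor the augmentations, so the NT-Xent (SimCLR-style) loss, the amplitude-scaling and Gaussian-noise augmentation, and the names `augment` and `nt_xent_loss` are all assumptions. The deep encoder is omitted; the loss operates on embeddings that an encoder would produce.

```python
import math
import random


def augment(segment, rng, scale_range=(0.8, 1.2), noise_std=0.01):
    """Hypothetical ECG augmentation: random amplitude scaling plus Gaussian noise."""
    scale = rng.uniform(*scale_range)
    return [scale * x + rng.gauss(0.0, noise_std) for x in segment]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.1):
    """NT-Xent loss for paired embeddings (assumed loss, not confirmed by the source).

    view_a[i] and view_b[i] are embeddings of two augmented views of the same
    ECG segment; every other embedding in the batch acts as a negative.
    """
    embeddings = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same segment
        sims = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Numerically stable log-sum-exp over all candidates except i itself.
        max_sim = max(sims.values())
        log_denominator = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        total += log_denominator - sims[positive]
    return total / (2 * n)
```

In this sketch the loss falls when the two views of each segment have similar embeddings and rises when a view is closer to a different segment, which is the agreement-versus-discrimination behavior the abstract describes.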
Chest X-ray report generation is time-consuming and contributes to radiologist workload and burnout, motivating AI systems that reduce cognitive burden while preserving clinical accuracy. Although encoder-decoder models can generate reports from images, they often hallucinate, describing findings that are not present or omitting real abnormalities, because their output is not explicitly grounded in evidence; this makes them unreliable for clinical use. To address this, we propose a cross-modal retrieval framework that builds reports by retrieving and assembling clinically validated sentences from existing radiology reports rather than generating text from scratch. The system uses contrastive learning to align chest X-ray image patches with report sentences in a shared embedding space, enabling retrieval of the most relevant clinical descriptions. A patch encoder extracts visual features, a sentence encoder represents report text, and a retrieval module identifies semantically matching sentences, which are then composed into a coherent final report. Because every output sentence is drawn from a real clinical report, the method is expected to substantially reduce hallucinations and to improve factual reliability and interpretability. This retrieval-based approach offers a scalable and safer alternative to generative models and can be evaluated on datasets such as MIMIC-CXR and CheXpert for clinical accuracy and retrieval performance.
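The retrieval step can be illustrated with a small sketch. It assumes the patch and sentence encoders have already mapped their inputs into the shared embedding space, and it uses cosine similarity with a top-k cutoff as the matching rule; the abstract does not specify the similarity measure or the composition strategy, so those choices, the function name `retrieve_sentences`, and its parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_sentences(patch_embeddings, sentence_bank, top_k=1, min_similarity=0.0):
    """Pick report sentences whose embeddings best match each image patch.

    sentence_bank is a list of (sentence, embedding) pairs taken from existing
    reports. Sentences are returned in patch order, without duplicates.
    """
    selected = []
    for patch in patch_embeddings:
        # Rank every bank sentence by similarity to this patch, best first.
        scored = sorted(
            ((cosine(patch, emb), text) for text, emb in sentence_bank),
            reverse=True,
        )
        for score, text in scored[:top_k]:
            if score >= min_similarity and text not in selected:
                selected.append(text)
    return selected
```

Because the function only ever returns strings that already exist in `sentence_bank`, it cannot emit a sentence that was not written in a source report, which is the property the retrieval-based design relies on.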
Long COVID (post-acute sequelae of SARS-CoV-2 infection, PASC) affects roughly 10–30% of COVID-19 survivors and is marked by persistent symptoms such as fatigue, cognitive dysfunction (“brain fog”), shortness of breath, loss of smell, and post-exertional malaise that can last for months or years. Its underlying biological mechanisms remain unclear, and validated diagnostic biomarkers are lacking. The condition is also highly heterogeneous: patients follow different recovery patterns, no clinical subtypes are clearly defined, and the scarcity of labeled datasets limits the use of supervised machine learning for phenotyping. To address this, we propose a self-supervised contrastive multi-view learning framework that integrates three temporal data modalities: pre-infection electronic health records, acute-phase clinical and biomarker data (e.g., CRP, ferritin, D-dimer, lymphocyte counts), and post-acute symptom trajectories. Each modality has its own encoder, and the encoders are aligned in a shared latent space through contrastive learning without requiring phenotype labels; unsupervised clustering of the resulting embeddings then identifies potential subtypes. By exploiting the natural temporal linkage within each patient and the contrasts across patients, this approach enables data-driven discovery of long COVID phenotypes, supports early prediction of subgroup membership, and may ultimately inform personalized treatment strategies, clinical trial design, and improved understanding of disease mechanisms.
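The final stage, clustering patients in the shared latent space, might look like the sketch below. It assumes the three view encoders are already trained and aligned, fuses each patient's view embeddings by averaging their L2-normalized vectors, and clusters with k-means using a deterministic farthest-point initialization. The abstract says only "unsupervised clustering", so the fusion rule, the choice of k-means, and the names `fuse_views` and `kmeans` are assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def fuse_views(views):
    """Average the L2-normalized embeddings of one patient's views."""
    normalized = [l2_normalize(v) for v in views]
    return [sum(col) / len(normalized) for col in zip(*normalized)]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iterations=20):
    """Plain k-means with farthest-point initialization (deterministic)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

A typical use would call `fuse_views` once per patient on the pre-infection, acute-phase, and post-acute embeddings, then pass the fused vectors to `kmeans`; the number of clusters `k` is a free parameter that the source does not fix.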
Emergency department chief complaints and triage notes are early indicators of health changes during infectious disease outbreaks. Because these records are written before confirmatory testing, they provide a presyndromic view of population health. Traditional syndromic surveillance relies on predefined syndrome categories, which may not match novel pathogens. Early outbreaks often present as sparse, ambiguous symptom clusters, leaving few labeled examples for automated detection. We propose a framework that combines contrastive learning with prototypical networks for few-shot detection of emerging infectious disease syndromes from free-text notes. The system has three components: a contrastively pre-trained encoder, a prototypical network, and a few-shot classifier. The encoder learns a robust clinical text embedding space from unlabeled historical notes, the prototypical network builds a prototype for each new syndrome from a few labeled examples, and the classifier flags early clusters by comparing incoming notes to these prototypes. The framework is designed for situations in which public health officials observe early suspect cases but lack mature labeled datasets, enabling proactive presyndromic surveillance and rapid adaptation during the early phase of an outbreak.
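The prototype-and-compare step can be sketched in a few lines. This follows the standard prototypical-network recipe (class prototype = mean of support embeddings, classification by softmax over negative squared Euclidean distances) and assumes the text encoder has already turned each note into a vector. The function names `build_prototypes` and `classify` and any syndrome labels passed to them are hypothetical.

```python
import math


def build_prototypes(support):
    """Compute one prototype per syndrome as the mean of its support embeddings.

    support maps a syndrome label to a list of embedding vectors obtained
    from the few labeled notes available for that syndrome.
    """
    return {
        label: [sum(col) / len(vectors) for col in zip(*vectors)]
        for label, vectors in support.items()
    }


def classify(query, prototypes):
    """Return (best_label, probabilities) for one incoming note embedding.

    Probabilities are a softmax over negative squared Euclidean distances
    between the query and each prototype.
    """
    neg_dist = {
        label: -sum((q - p) ** 2 for q, p in zip(query, proto))
        for label, proto in prototypes.items()
    }
    top = max(neg_dist.values())  # subtract the max for numerical stability
    exp = {label: math.exp(d - top) for label, d in neg_dist.items()}
    total = sum(exp.values())
    probs = {label: e / total for label, e in exp.items()}
    return max(probs, key=probs.get), probs
```

Adding a new syndrome in this sketch requires only a handful of labeled embeddings and a call to `build_prototypes`; the encoder itself is not retrained, which is what allows rapid adaptation when labeled data are scarce.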