Adverse drug reactions (ADRs) are a major global health issue, contributing to significant morbidity, mortality, and healthcare costs. Many ADRs are detected only after widespread drug use, reflecting limitations of pre-market trials in capturing real-world patient variability. Although electronic health records (EHRs) collected between 2017 and 2023 provide rich data for post-market surveillance, they remain underused for systematic ADR detection. Current pharmacovigilance methods rely heavily on spontaneous reporting systems, which suffer from underreporting and bias, while supervised machine learning approaches require labeled ADR data that are often unavailable for rare or novel events. This paper proposes a variational autoencoder (VAE)-based unsupervised framework to detect ADR signals from multimodal EHR data, including clinical notes and laboratory results. The model learns normal patient data distributions and identifies deviations as potential safety signals without requiring labeled ADR examples. A multimodal architecture combines natural language processing of clinical notes with structured laboratory encoders, forming a shared latent space for anomaly detection based on reconstruction error. The framework enables detection of unknown ADRs by flagging abnormal patterns in patient records across large datasets from 2017 to 2023. Its unsupervised nature makes it suitable for identifying rare or previously unrecognized drug safety issues. Overall, this approach offers a scalable, proactive pharmacovigilance strategy that shifts drug safety monitoring from reactive reporting to predictive detection using routine EHR data.
Healthcare billing fraud imposes major financial losses globally, costing public and private payers hundreds of billions annually. It exploits fragmented healthcare payment systems where multiple insurers process overlapping patient populations without coordination, creating blind spots that enable sophisticated cross-payer fraud schemes. Individual payers cannot detect patterns such as duplicate billing across Medicare and commercial insurers because current detection models operate within isolated organizational and regulatory boundaries. Strict privacy laws like HIPAA and GDPR further prevent sharing patient-level claims data, limiting centralized analytics. To address this, a federated anomaly detection framework is proposed in which autoencoders are trained locally at each payer without exchanging raw data. Each institution learns normal billing patterns through reconstruction-based unsupervised learning and identifies anomalies via reconstruction error. A central server aggregates encoder parameters using FedAvg, optionally with differential privacy, to build a globally informed model while preserving data locality. The resulting system enables detection of cross-payer fraud patterns, such as double billing and unbundling, that single-payer systems miss, while transmitting only model parameters through secure channels. This approach provides a privacy-preserving, scalable solution for multi-payer healthcare fraud detection under strict regulatory constraints.