Type 1 diabetes mellitus requires exogenous insulin, and accurate glucose forecasting is critical for closed-loop artificial pancreas systems. While continuous glucose monitoring provides real-time data, purely data-driven recurrent neural networks may produce physiologically implausible predictions, and purely mechanistic models cannot fully capture individual variability in insulin sensitivity, meal absorption, or exercise response. This framework proposes a physics-guided recurrent neural network that integrates insulin delivery records, carbohydrate intake, and physical activity data. It combines a mechanistic glucose–insulin compartmental model with a residual gated recurrent unit network that learns patient-specific deviations, supported by a physics-based loss function enforcing physiological constraints such as non-negativity and realistic glucose dynamics. By merging physiological modeling with deep learning, the system preserves biological plausibility while adapting to individual patient patterns. Incorporating multimodal wearable and device data enables more accurate, longer-horizon glucose predictions, supporting safer and more proactive insulin dosing in closed-loop diabetes management.
Type 1 diabetes affects millions of individuals worldwide, requiring lifelong exogenous insulin administration to manage blood glucose concentrations and prevent both acute complications such as hypoglycemia and long-term microvascular and macrovascular complications [1, 2]. The development of closed-loop artificial pancreas systems, which automatically adjust insulin delivery based on continuous glucose monitoring readings, has substantially improved glycemic outcomes, but these systems rely critically on accurate prediction of future glucose concentrations typically 30 to 60 minutes ahead [3, 4]. Accurate forecasting at this horizon enables proactive insulin adjustments that prevent glucose excursions rather than reacting to them after they occur, yet current prediction algorithms face fundamental limitations in capturing the complex, multi-factorial nature of glucose regulation in free-living conditions [5, 6].
Current approaches to glucose prediction using continuous glucose monitoring data alone treat glucose time series as purely statistical signals, ignoring the known physiological mechanisms that drive glucose appearance from meals, insulin-dependent glucose utilization, and endogenous glucose production [7, 8]. Machine learning models including support vector regression and artificial neural networks trained exclusively on historical glucose measurements achieve reasonable performance under controlled conditions but degrade substantially when faced with unannounced meals, varying insulin sensitivities, or physical activity [9, 10]. These purely data-driven methods fail to incorporate prior knowledge about glucose-insulin physiology, leading to predictions that may be mathematically optimal with respect to training data but physiologically implausible or clinically dangerous [11, 12].
The widespread adoption of continuous glucose monitors, insulin pumps with data logging capabilities, and consumer wearable devices has created unprecedented opportunities to integrate multiple data streams for glucose prediction [13, 14]. Modern insulin pumps record basal rates, bolus timing and doses, and derived features such as insulin-on-board; meal logging applications allow patients to record carbohydrate estimates; and wrist-worn fitness trackers provide heart rate, step count, and accelerometry data that serve as proxies for physical activity intensity and duration [15, 16]. Studies have demonstrated that incorporating meal and insulin information improves prediction accuracy, while physical activity data is particularly important because exercise induces prolonged increases in insulin sensitivity that can cause late-onset hypoglycemia hours after activity completion [17, 18].
This paper presents a conceptual framework for physics-guided recurrent neural network prediction of blood glucose in type 1 diabetes that explicitly integrates mechanistic physiological modeling with data-driven residual learning using insulin pump data, meal information, and physical activity measurements [19, 20].
Glucose homeostasis in humans is maintained through a complex interplay of insulin secretion from pancreatic beta cells in response to rising glucose concentrations, insulin-mediated glucose uptake by peripheral tissues (primarily skeletal muscle and adipose tissue), and endogenous glucose production from the liver [1, 2]. The minimal model developed by Bergman and colleagues describes this system through differential equations representing plasma glucose concentration, insulin action on glucose utilization, and glucose appearance from meals, providing a mathematically tractable framework for glucose prediction that has been extensively validated in clinical studies [3, 4]. In type 1 diabetes, endogenous insulin secretion is absent, so patients must administer exogenous insulin subcutaneously. This insulin enters the systemic circulation with substantial delays compared to physiological insulin release, fundamentally altering the dynamics of glucose regulation [5, 6].
Continuous glucose monitoring systems measure interstitial glucose concentrations via a subcutaneous electrochemical sensor, providing real-time readings typically every five minutes that reveal glucose trends and variability not captured by intermittent fingerstick measurements [7, 8]. The relationship between interstitial glucose and blood glucose is characterized by a physiological time lag of approximately five to fifteen minutes due to glucose diffusion from capillaries to interstitial fluid, and sensor measurements are subject to calibration error, signal noise, and dropout artifacts that complicate prediction [9, 10]. Despite these limitations, continuous glucose monitoring has become the standard of care for intensive diabetes management and serves as the primary input for closed-loop control algorithms, though historical glucose values alone provide an incomplete basis for accurate forecasting [11, 12].
Continuous subcutaneous insulin infusion through insulin pumps provides detailed records of insulin delivery including programmed basal rates that vary throughout the day, meal boluses delivered before eating, and correction boluses administered to address hyperglycemia [13, 14]. The temporal pattern and timing of insulin delivery relative to meals are critically important because subcutaneously administered insulin has a delayed onset of action of fifteen to thirty minutes and a peak effect at sixty to ninety minutes, meaning that bolus timing errors of even fifteen minutes substantially affect postprandial glucose excursions [15, 16]. Derived features such as insulin-on-board, which estimates the amount of previously delivered insulin still active in the subcutaneous depot and plasma compartment, capture the decaying effect of insulin over a three to five hour duration and are essential for preventing insulin stacking that causes delayed hypoglycemia [17, 18].
Physical activity induces multiple physiological changes that profoundly alter glucose dynamics in type 1 diabetes, including increased insulin-independent glucose uptake by contracting muscles, enhanced insulin sensitivity that persists for hours after exercise cessation, and altered counterregulatory hormone responses that affect hepatic glucose production [19, 20]. Moderate-intensity aerobic exercise can increase glucose utilization two to five fold during activity and elevate insulin sensitivity for up to twenty-four hours post-exercise, creating substantial risk for late-onset hypoglycemia that is poorly captured by models trained on sedentary data alone [21, 22]. Wearable devices measuring heart rate, heart rate variability, step count, and accelerometry provide real-time proxies for activity intensity and duration that have been shown to improve glucose prediction when integrated into machine learning models, though the delayed and prolonged nature of exercise effects requires careful temporal alignment in prediction architectures [23, 24].
The proposed physics-guided recurrent neural network framework operates through a two-stage prediction pipeline in which the mechanistic compartmental model first generates a baseline glucose forecast based on insulin delivery and meal carbohydrate inputs, and a residual gated recurrent unit network then learns to predict the discrepancy between this mechanistic forecast and observed glucose values [1, 2]. The final glucose prediction is computed as the sum of the mechanistic model output and the recurrent neural network residual prediction, ensuring that the overall forecast inherits the physical plausibility of the mechanistic model while gaining the ability to capture patient-specific patterns and unmodeled physiological processes [3, 4]. Input data including continuous glucose monitoring readings, insulin pump records, meal logs, and physical activity measurements from wearables are processed through a data fusion layer that aligns temporally disparate signals and extracts features relevant to glucose dynamics at multiple timescales [5, 6].
The framework assumes that patients use an insulin pump with data logging capabilities, a continuous glucose monitor providing readings at five-minute intervals, a method for recording meal carbohydrate estimates, and a wearable device capable of measuring heart rate and accelerometry data [7, 8]. Real-time access to all data streams is assumed, which is consistent with current commercial artificial pancreas systems that integrate continuous glucose monitors and insulin pumps, and with consumer wearables that transmit data to smartphones via Bluetooth [9, 10]. Patient-specific calibration is assumed to be possible through an initial period of data collection during which mechanistic model parameters such as insulin sensitivity factor and carbohydrate ratio are estimated from clinical data or optimized using patient history [11, 12].
Four design principles guide the framework architecture: physics-informed inductive bias ensures that predictions respect fundamental physiological constraints; patient-adaptive learning allows the model to personalize to individual glucose dynamics; multi-input data fusion systematically integrates heterogeneous data streams without assuming equal importance or temporal alignment; and uncertainty-aware prediction acknowledges that forecast confidence decreases with prediction horizon while enabling risk-sensitive decision-making [13, 14]. These principles are implemented through the mechanistic base model providing structure, the recurrent neural network providing flexibility, a physics-guided loss function enforcing constraints, and temporal attention mechanisms weighting recent versus historical information based on patient-specific patterns [15, 16]. The framework prioritizes parsimony in model complexity to facilitate deployment on embedded devices in artificial pancreas systems while maintaining sufficient capacity to capture clinically relevant glucose dynamics [17, 18].
Figure 1 illustrates the hierarchical conceptual architecture through which multimodal diabetes data are transformed into physiologically constrained and patient-adaptive glucose forecasts.

Figure 1. Conceptual architecture of a physics-guided recurrent neural network for physiologically constrained blood glucose prediction in type 1 diabetes.
Table 1 decomposes the proposed framework into distinct functional layers to clarify how multimodal inputs, mechanistic structure, residual learning, and constraint-based optimization contribute complementary rather than redundant predictive roles.
Table 1. Functional decomposition of the physics-guided recurrent neural network framework across modeling layers, data dependencies, physiological roles, and expected failure modes.
Framework layer | Primary inputs | Core computational role | Physiological or clinical function | What this layer adds beyond the others | Principal vulnerability if isolated
Continuous glucose monitoring stream | Interstitial glucose readings, trend history, short-term variability | Provides the observed glucose state and recent temporal trajectory | Anchors prediction to real-time glycemic status | Supplies the immediate state signal against which all other inputs are interpreted | Cannot explain impending changes caused by insulin, meals, or exercise when used alone |
Insulin delivery encoding | Basal rates, bolus timing, bolus dose, insulin-on-board | Represents exogenous insulin exposure over time | Captures delayed glucose-lowering effects central to pump therapy | Introduces actionable treatment information unavailable in glucose-only models | Susceptible to error if insulin absorption variability is ignored |
Meal information encoding | Carbohydrate amount, meal timing, pre-bolus interval, optional meal type | Represents exogenous glucose appearance | Explains postprandial excursions and timing mismatches between food and insulin | Adds anticipatory information for rising glucose not evident in current CGM alone | Performance degrades with inaccurate meal logging or missing meal announcements |
Physical activity encoding | Heart rate, HRV, step count, accelerometry intensity, lagged activity summaries | Captures acute and delayed exercise-related physiological perturbation | Represents shifts in glucose utilization and prolonged changes in insulin sensitivity | Extends predictive reach into delayed hypoglycemia and recovery periods | Can be noisy, indirect, and difficult to temporally align with glucose effects |
Temporal harmonization and feature fusion layer | All synchronized multimodal inputs | Aligns heterogeneous signals to a shared time base and feature structure | Makes physiologically meaningful cross-signal interpretation possible | Converts disparate device outputs into one coherent prediction substrate | Misalignment propagates downstream and distorts causal timing relationships |
Mechanistic compartmental base model | Insulin inputs, meal inputs, patient physiological parameters, current state estimates | Generates a baseline glucose trajectory from known physiological dynamics | Embeds inductive bias and preserves physical realism | Provides structured prior knowledge and stable extrapolation | Fixed parameters limit adaptation to day-level variability, stress, illness, or exercise response |
Residual GRU network | Mechanistic prediction residuals, multimodal history, encoded contextual features | Learns structured deviations between mechanistic estimates and observed glucose | Personalizes prediction to patient-specific and context-sensitive patterns | Captures nonlinear residual effects not represented in compartmental equations | Without guidance, may overfit or generate accurate but physiologically implausible outputs |
Physics-guided loss function | Final prediction, observed CGM, constraint terms | Penalizes non-physiological predictions during training | Enforces safety-relevant plausibility and bounded dynamics | Prevents the hybrid model from optimizing purely statistical fit at the expense of realism | Overly strong constraints may suppress useful adaptation; weak constraints may not prevent unsafe forecasts |
Hybrid forecast assembly | Mechanistic baseline plus learned residual correction | Produces the final glucose prediction | Integrates plausibility and personalization in a single output | Operationalizes the manuscript’s central hybrid modeling claim | Inherits errors if either the mechanistic baseline or residual correction is poorly calibrated |
Evaluation and deployment layer | Horizon-stratified predictions, clinical event labels, validation partitions | Assesses technical and clinical utility under realistic deployment conditions | Links model performance to closed-loop safety and decision support value | Distinguishes statistically good forecasts from clinically useful forecasts | Superficial evaluation can hide failure in high-risk scenarios such as exercise or postprandial periods |
The mechanistic base model consists of a compartmental differential equation system describing plasma glucose concentration, interstitial glucose concentration, plasma insulin concentration, and glucose appearance rate from meals, implemented as a discrete-time model with five-minute sampling to align with continuous glucose monitoring data [1, 2]. The glucose compartment dynamics are governed by insulin-dependent glucose utilization following a Michaelis-Menten kinetic formulation, insulin-independent glucose utilization representing basal brain glucose consumption, and endogenous glucose production that is suppressed by elevated insulin concentrations [3, 4]. The meal absorption submodel converts carbohydrate intake into a glucose appearance rate using a double-exponential function with parameters describing the rate of gastric emptying and intestinal absorption, both of which exhibit substantial inter-patient and intra-patient variability that mechanistic models cannot fully capture [5, 6].
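As a minimal sketch of how such a discrete-time glucose compartment could be stepped at five-minute resolution, the following Python function applies one forward-Euler update with a Michaelis-Menten utilization term, a constant insulin-independent term, and insulin-suppressed endogenous production. The function names, the dimensionless `insulin_effect` signal, and every parameter value are illustrative assumptions, not values taken from the framework or from clinical data.

```python
import math


def glucose_step(g_mgdl, insulin_effect, ra_meal, *, dt_min=5.0,
                 vmax=1.5, km=150.0, u_indep=1.0, egp_basal=1.0, k_supp=0.5):
    """Advance plasma glucose (mg/dL) by one forward-Euler step.

    All rates are in mg/dL/min. insulin_effect is a non-negative,
    dimensionless insulin-action signal (hypothetical scaling) and
    ra_meal is the meal glucose appearance rate. Parameter values are
    placeholders chosen only so that the example is balanced at rest.
    """
    # Insulin-dependent utilization saturates in glucose (Michaelis-Menten form).
    u_dep = vmax * insulin_effect * g_mgdl / (km + g_mgdl)
    # Endogenous glucose production is suppressed as insulin action rises.
    egp = egp_basal * math.exp(-k_supp * insulin_effect)
    dg = egp + ra_meal - u_indep - u_dep
    # Clamp at zero as a numerical guard against negative concentrations.
    return max(g_mgdl + dt_min * dg, 0.0)


def simulate(g0, insulin_effects, ra_meals, **params):
    """Roll the step function forward over aligned input sequences."""
    trajectory = [g0]
    for effect, ra in zip(insulin_effects, ra_meals):
        trajectory.append(glucose_step(trajectory[-1], effect, ra, **params))
    return trajectory
```

With the placeholder defaults, production and insulin-independent utilization cancel, so glucose stays flat when there is no insulin action and no meal; any positive `insulin_effect` lowers it and any positive `ra_meal` raises it.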
The mechanistic model requires specification of several patient-specific parameters including insulin sensitivity factor representing the glucose-lowering effect per unit of insulin, carbohydrate ratio determining insulin required per gram of carbohydrate, basal insulin requirements varying by time of day, and parameters describing the time course of subcutaneous insulin absorption [7, 8]. Initial parameter estimates can be obtained from clinical data such as total daily insulin dose and body weight using published formulas, followed by optimization using a limited amount of patient history data to refine parameters within physiologically plausible ranges [9, 10]. The inability of fixed-parameter mechanistic models to adapt to day-to-day variations in insulin sensitivity caused by physical activity, illness, or stress is precisely the limitation that the recurrent neural network component addresses by learning patient-specific residual patterns [11, 12].
Gated recurrent unit networks are selected as the recurrent neural network architecture for this framework due to their ability to capture long-range temporal dependencies in glucose time series while requiring fewer parameters than long short-term memory networks, reducing the risk of overfitting when training data is limited [1, 2]. Both gated recurrent unit and long short-term memory architectures have been extensively compared for glucose prediction tasks, with studies demonstrating that both achieve comparable performance when sufficient training data is available, but gated recurrent units train faster and generalize better in patient-specific models with limited historical data [3, 4]. The recurrent network processes sequences of variable length to accommodate missing data episodes common in real-world continuous glucose monitoring and handles irregular sampling intervals that occur when patients do not consistently wear devices or log meals [5, 6].
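The parameter saving can be made concrete with a short count. The sketch below assumes the textbook formulations, in which a gated recurrent unit layer has three weight blocks (update gate, reset gate, candidate state) and a long short-term memory layer has four (input, forget, and output gates plus the cell candidate), each with one bias vector. Some libraries keep separate input and recurrent bias vectors, which changes the totals slightly but not the three-to-four ratio. The layer sizes in the usage note are hypothetical.

```python
def recurrent_param_count(input_size, hidden_size, n_blocks):
    """Parameters in one recurrent layer with n_blocks gate/candidate blocks.

    Each block has an input weight matrix (hidden x input), a recurrent
    weight matrix (hidden x hidden), and one bias vector (hidden).
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return n_blocks * per_block


def gru_param_count(input_size, hidden_size):
    # Update gate, reset gate, candidate state.
    return recurrent_param_count(input_size, hidden_size, 3)


def lstm_param_count(input_size, hidden_size):
    # Input gate, forget gate, output gate, cell candidate.
    return recurrent_param_count(input_size, hidden_size, 4)
```

For a hypothetical layer with 10 input features and 32 hidden units, `gru_param_count(10, 32)` gives 4128 and `lstm_param_count(10, 32)` gives 5504, so the gated recurrent unit layer carries 25 percent fewer parameters at the same width.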
The feature vector input to the recurrent neural network at each time step includes the residual between current mechanistic model glucose prediction and observed continuous glucose monitoring value, the last three mechanistic model predictions to provide temporal context, recent insulin delivery history represented as both raw bolus events and derived insulin-on-board, and meal carbohydrate estimates with timing relative to current time [7, 8]. Physical activity features processed through a dedicated encoding layer include minute-by-minute heart rate, heart rate variability calculated from inter-beat intervals, step count aggregated over one-minute windows, and accelerometry-derived activity intensity classification for sedentary, light, moderate, and vigorous activity [9, 10]. Feature normalization is performed patient-specifically using running statistics to account for inter-individual differences in glucose ranges, insulin doses, and activity levels while preserving temporal dynamics relevant to prediction [11, 12].
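One way to implement patient-specific normalization with running statistics is Welford's online algorithm, which updates the mean and variance one sample at a time without storing the history. The class below is a sketch under that assumption; the class name and the choice to return zero before a variance estimate exists are illustrative, not part of the framework specification.

```python
import math


class RunningNormalizer:
    """Online z-score normalizer for one feature of one patient (Welford)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        """Fold a new observation into the running mean and variance."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def std(self):
        """Sample standard deviation, or 0.0 with fewer than two samples."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0

    def normalize(self, x):
        """Return the z-score of x, or 0.0 while the spread is undefined."""
        s = self.std()
        return (x - self.mean) / s if s > 0 else 0.0
```

A separate instance would be kept per patient and per feature (glucose, bolus size, heart rate), so that a reading of 140 mg/dL is scaled against that patient's own history rather than a population average.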
The residual learning paradigm trains the recurrent neural network to predict the difference between the mechanistic model output and the observed glucose value, rather than predicting absolute glucose directly, which simplifies the learning problem because the residual distribution typically has lower variance and is centered near zero compared to raw glucose measurements [13, 14]. By focusing on learning the correction term, the recurrent neural network does not need to rediscover fundamental glucose-insulin dynamics already encoded in the mechanistic model, and can instead allocate its representational capacity to capturing patient-specific phenomena including variations in insulin absorption rate, meal absorption variability, and the delayed effects of physical activity on insulin sensitivity [15, 16]. The residual predictor incorporates features extracted from the mechanistic model's internal states alongside raw input data, enabling the recurrent network to identify situations in which the mechanistic model systematically over-predicts or under-predicts glucose concentrations and adapt its correction accordingly [17, 18].
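The residual formulation can be sketched in a few lines. The sign convention here is an assumption chosen so that adding the learned residual to the mechanistic output recovers the observation: the training target is observed glucose minus mechanistic prediction. Function names are illustrative.

```python
def residual_targets(observed, mechanistic):
    """Training targets for the residual network: observed minus mechanistic."""
    if len(observed) != len(mechanistic):
        raise ValueError("sequences must be aligned to the same time grid")
    return [obs - mech for obs, mech in zip(observed, mechanistic)]


def hybrid_forecast(mechanistic, predicted_residual):
    """Final forecast: mechanistic baseline plus learned correction."""
    if len(mechanistic) != len(predicted_residual):
        raise ValueError("sequences must be aligned to the same time grid")
    return [mech + res for mech, res in zip(mechanistic, predicted_residual)]
```

If the mechanistic model predicts 110 and 135 mg/dL while the sensor reports 120 and 130 mg/dL, the targets are +10 and -5 mg/dL, which sit much closer to zero than the raw glucose values.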
Insulin pump data is encoded as a structured time series with events recorded at their precise administration times, then resampled to five-minute intervals aligned with continuous glucose monitoring readings using a zero-order hold for basal rates and impulse functions for bolus deliveries [1, 2]. Basal insulin delivery is represented as the total units delivered during each five-minute interval, while meal and correction boluses are encoded as both the bolus amount and a binary indicator of bolus occurrence to capture the discrete event nature of meal-time insulin administration [3, 4]. Derived features including insulin-on-board are computed using a linear decay model with patient-specific duration of insulin action typically set to four hours, providing a time-varying estimate of previously delivered insulin that remains physiologically active and influences ongoing glucose disposal [5, 6].
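The encoding steps above can be sketched as follows, assuming bolus events arrive as `(time_in_minutes, units)` pairs measured from the start of the window and basal rates are given in units per hour. The function names and event layout are hypothetical; the four-hour duration of insulin action is the typical value stated in the text.

```python
def basal_units_per_step(rate_u_per_h, step_min=5):
    """Zero-order hold: units delivered in one step at a constant basal rate."""
    return rate_u_per_h * step_min / 60.0


def resample_boluses(bolus_events, n_steps, step_min=5):
    """Place bolus impulses on the CGM grid.

    Returns (units_per_step, bolus_flag_per_step) for events given as
    (time_min, units) pairs; events outside the window are ignored.
    """
    units = [0.0] * n_steps
    flags = [0] * n_steps
    for t_min, dose in bolus_events:
        idx = int(t_min // step_min)
        if 0 <= idx < n_steps:
            units[idx] += dose
            flags[idx] = 1
    return units, flags


def insulin_on_board(bolus_events, t_now_min, dia_min=240.0):
    """Linear-decay insulin-on-board at time t_now_min.

    Each bolus contributes its full dose at delivery and decays linearly
    to zero at dia_min (duration of insulin action, assumed 4 hours).
    """
    total = 0.0
    for t_min, dose in bolus_events:
        age = t_now_min - t_min
        if 0 <= age < dia_min:
            total += dose * (1.0 - age / dia_min)
    return total
```

Under this linear model, a 4-unit bolus contributes 2 units of insulin-on-board two hours after delivery and nothing after four hours. Real pumps often use curvilinear decay, so the linear form should be read as the simplification the text describes.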
Meal information is encoded as the estimated carbohydrate content in grams, the reported meal time, and optionally a meal type classification distinguishing fast-absorbing meals such as liquids from slow-absorbing meals such as high-fat or high-protein foods that produce prolonged postprandial glucose elevations [7, 8]. The framework implements a meal pulse function that converts the carbohydrate amount into a glucose appearance rate using a two-compartment absorption model with population-average parameters for the time to peak absorption and total absorption duration, which the recurrent neural network later refines through residual learning [9, 10]. Pre-bolus timing, defined as the interval between insulin bolus administration and meal consumption, is encoded as a separate feature because pre-meal insulin delivery significantly reduces postprandial glucose excursions compared to simultaneous or post-meal bolusing, and failure to account for bolus timing errors is a major source of prediction inaccuracy [11, 12]. Nutritional factors including macronutrient composition beyond simple carbohydrate counting have been shown to affect glucose dynamics, and the framework can optionally incorporate fat and protein estimates when available [20].
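A minimal sketch of a two-compartment meal pulse is shown below, assuming a stomach compartment that empties into a gut compartment, which in turn releases carbohydrate at a first-order rate. The rate constants are hypothetical placeholders rather than the population-average values the framework would use, and the output is in grams of carbohydrate per minute; converting to a plasma glucose appearance rate would additionally need a bioavailability fraction and a distribution volume, which are omitted here.

```python
def meal_appearance(carbs_g, n_steps, *, dt_min=5.0,
                    k_empty=0.025, k_absorb=0.025):
    """Carbohydrate appearance rate (g/min) on a fixed time grid.

    Forward-Euler simulation of stomach -> gut -> appearance with
    first-order rates k_empty and k_absorb (1/min, placeholder values).
    Requires dt_min * k < 1 for both rates to remain stable.
    """
    q_stomach, q_gut = float(carbs_g), 0.0
    ra = []
    for _ in range(n_steps):
        ra_now = k_absorb * q_gut          # appearance from the gut compartment
        ra.append(ra_now)
        emptied = k_empty * q_stomach      # gastric emptying flux
        q_stomach -= dt_min * emptied
        q_gut += dt_min * (emptied - ra_now)
    return ra
```

The curve starts at zero, rises to a single peak, and decays, and the discrete update conserves mass: summing `ra * dt_min` over a long enough horizon returns the logged carbohydrate amount. The residual network can then correct for meals that are absorbed faster or slower than this fixed shape.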
Physical activity encoding transforms raw wearable sensor data into features relevant to glucose dynamics, including minute-by-minute heart rate, heart rate variability computed as the standard deviation of normal-to-normal intervals, step count, and accelerometer-derived activity intensity classified into sedentary, light, moderate, and vigorous categories using validated threshold algorithms [13, 14]. The framework incorporates lagged activity features extending up to six hours prior to the prediction time, addressing the well-documented phenomenon that exercise effects on insulin sensitivity persist for hours after activity completion and can cause late-onset hypoglycemia that pure short-term models miss [15, 16]. Activity features are normalized relative to each patient's resting heart rate and typical activity patterns to account for inter-individual differences in fitness and baseline activity levels, ensuring that absolute heart rate values are interpretable in context [17, 18]. Wristband accelerometer data collected in free-living conditions has been specifically validated for activity detection in type 1 diabetes populations, supporting the use of consumer wearables for this framework [13]. Physical activity and psychological stress detection further improve glucose prediction accuracy by capturing sympathetic nervous system effects on glucose mobilization, which the framework can incorporate through heart rate variability features [12].
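The lagged summaries and resting-relative normalization can be sketched as below. The input is assumed to be a per-step activity intensity score already aligned to the five-minute grid (for example 0 for sedentary up to 3 for vigorous); the window lengths, feature names, and the simple relative heart rate formula are illustrative assumptions, not the validated threshold algorithms the text refers to.

```python
def lagged_activity_features(intensity_per_step, now_idx, *, step_min=5,
                             windows_hours=(1, 3, 6)):
    """Mean activity intensity over trailing windows ending at now_idx."""
    features = {}
    for hours in windows_hours:
        n = int(hours * 60 / step_min)
        start = max(0, now_idx - n + 1)
        window = intensity_per_step[start:now_idx + 1]
        mean = sum(window) / len(window) if window else 0.0
        features[f"mean_intensity_{hours}h"] = mean
    return features


def relative_heart_rate(hr_bpm, resting_hr_bpm):
    """Heart rate expressed as fractional elevation above the patient's rest."""
    return (hr_bpm - resting_hr_bpm) / resting_hr_bpm
```

Keeping several window lengths lets the model see both that exercise is happening now and that a session ended some hours ago, which is the situation associated with late-onset hypoglycemia.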
The standard loss component of the framework is the mean squared error between the final glucose prediction and the observed continuous glucose monitoring value, computed across all time steps in the training sequence to optimize point prediction accuracy [1, 2]. Mean squared error is selected over mean absolute error because it more strongly penalizes large prediction errors that are clinically dangerous, such as failing to predict an impending hypoglycemic event or substantially overestimating glucose during hyperglycemia, while still providing differentiable gradients for neural network training [3, 4]. The prediction loss is computed on the final combined output of the mechanistic model plus recurrent neural network residual, ensuring that the entire framework optimizes end-to-end prediction performance rather than optimizing the mechanistic model and residual network separately [5, 6].
The physics constraint loss adds penalty terms that increase when predictions violate fundamental physiological principles, including negative glucose concentrations which are physically impossible, rates of glucose change exceeding maximum physiologically plausible limits of approximately plus-or-minus five milligrams per deciliter per minute, and patterns inconsistent with insulin action such as glucose decreasing when no insulin is present and no physical activity occurred [7, 8]. Additional constraint terms penalize predictions that imply completely inactive insulin action following a large bolus or that ignore the known saturable nature of glucose utilization, with each penalty weighted according to the severity of the physiological violation [9, 10]. These constraint penalties are designed as differentiable functions that can be incorporated directly into the loss function during training, encouraging the network to learn solutions that satisfy physical constraints rather than simply constraining outputs after training [11, 12]. Long-term glucose forecasting studies have demonstrated that deconvolution of the continuous glucose monitoring signal can improve physiological consistency, providing an alternative constraint mechanism that the framework could adopt [4].
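The total loss function is formulated as the prediction loss plus the severity-weighted physics constraint penalties:

$$\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{MSE}} + \sum_{k} \lambda_k \, \mathcal{L}_{\text{phys},k}$$

where $\mathcal{L}_{\text{MSE}}$ is the mean squared error between the final hybrid prediction and the observed continuous glucose monitoring value, $\mathcal{L}_{\text{phys},k}$ is the $k$-th differentiable constraint penalty (for example non-negativity or a bounded rate of glucose change), and $\lambda_k \geq 0$ weights that penalty according to the severity of the corresponding physiological violation.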
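A minimal sketch of how the prediction loss and two of the physics penalties could be combined is given below. It covers only non-negativity and the bounded rate of change; the squared-hinge penalty form, the weight names `w_neg` and `w_rate`, and their default values are assumptions made for illustration. In practice the same expressions would be written in an automatic-differentiation framework so that gradients flow to the network.

```python
def physics_guided_loss(pred, obs, *, dt_min=5.0, max_rate=5.0,
                        w_neg=1.0, w_rate=1.0):
    """Mean squared error plus weighted physics penalties (glucose in mg/dL).

    Penalties are squared hinges, so they are zero when the constraint
    holds and grow smoothly with the size of the violation.
    """
    n = len(pred)
    if n == 0 or n != len(obs):
        raise ValueError("pred and obs must be non-empty and aligned")
    mse = sum((p - o) ** 2 for p, o in zip(pred, obs)) / n
    # Non-negativity: penalize any predicted concentration below zero.
    neg = sum(max(0.0, -p) ** 2 for p in pred) / n
    # Rate limit: penalize step-to-step change beyond max_rate mg/dL/min.
    excess = [max(0.0, abs(b - a) / dt_min - max_rate)
              for a, b in zip(pred, pred[1:])]
    rate = sum(e ** 2 for e in excess) / max(n - 1, 1)
    return mse + w_neg * neg + w_rate * rate
```

A forecast that matches the sensor but jumps 50 mg/dL in one five-minute step has zero prediction error yet still incurs a rate penalty, which is the behavior the constraint terms are meant to produce.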
Short-horizon prediction of fifteen to thirty minutes is primarily used for hypoglycemia alerting and near-term insulin suspension decisions, and at this timescale the mechanistic model alone often achieves reasonable accuracy because glucose dynamics are dominated by recently administered insulin and absorbed meals with relatively little contribution from physical activity effects [1, 2]. The recurrent neural network residual component still provides benefit at short horizons by correcting for patient-specific deviations in insulin absorption rate and meal response, but the primary value of physics guidance is ensuring that predictions remain physically plausible during rapid glucose excursions [3, 4]. Short-horizon predictions can be generated with high confidence and low uncertainty, making them suitable for safety-critical applications such as predictive low glucose suspend systems that automatically halt insulin delivery when hypoglycemia is anticipated within thirty minutes [5, 6]. During physical activity, different learning techniques have been comparatively evaluated for short-horizon forecasting, with results indicating that recurrent architectures outperform static models during exercise periods [25].
Long-horizon prediction of forty-five to ninety minutes is required for proactive insulin dosing in fully closed-loop artificial pancreas systems, enabling the controller to increase or decrease insulin delivery before glucose excursions occur rather than reacting to them after onset [7, 8]. At this timescale, both the mechanistic model and a pure data-driven recurrent neural network degrade substantially, but the physics-guided framework maintains performance because the mechanistic model continues to provide a plausible baseline trajectory while the residual network captures the delayed effects of physical activity on insulin sensitivity and the prolonged time course of meal absorption [9, 10]. Physical activity features become increasingly important at longer horizons because exercise-induced changes in insulin sensitivity manifest over hours, and failure to incorporate activity data at these horizons leads to systematic overprediction of glucose following activity or underprediction during sedentary recovery periods [11, 12]. A hybrid CNN-GRU model has been proposed specifically for real-time glucose forecasting in IoT-based diabetes management, demonstrating the feasibility of convolutional-recurrent architectures for extended prediction horizons that the framework could adopt as an alternative to pure gated recurrent unit networks [26]. Long-term prediction using a CNN-LSTM-based deep neural network has shown particular promise for horizons beyond sixty minutes, supporting the framework's emphasis on residual learning for extended forecasts [17].
Standard regression metrics including root mean squared error and mean absolute error are computed between predicted and observed glucose values across all prediction horizons, with error reported both as absolute glucose values in milligrams per deciliter and as percentage error relative to observed glucose to facilitate comparison across studies with different glucose ranges [1, 2]. Prediction error growth with increasing horizon is characterized by fitting error-versus-time curves, with the rate of error increase serving as a key metric for comparing frameworks because slower error growth indicates better capture of long-term glucose dynamics [3, 4]. The percentage of predictions falling within clinically acceptable error zones defined by the consensus error grid for diabetes technology is reported, with Zone A representing predictions that would lead to clinically correct treatment decisions and Zone B representing benign errors that would not harm the patient [5, 6]. The glucose variability impact index and prediction consistency index offer alternative accuracy assessment methods that account for baseline variability, which the framework should incorporate as secondary metrics to avoid overestimating performance on high-variability patients [18].
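The regression metrics named above can be computed as in the following sketch, which reports root mean squared error and mean absolute error in mg/dL and the mean absolute percentage error relative to observed glucose. The function name and the returned dictionary keys are illustrative; observed values are assumed to be strictly positive.

```python
import math


def regression_metrics(pred, obs):
    """RMSE and MAE in mg/dL, plus mean absolute percentage error."""
    if not pred or len(pred) != len(obs):
        raise ValueError("pred and obs must be non-empty and aligned")
    errors = [p - o for p, o in zip(pred, obs)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Percentage error is taken relative to the observed value.
    mape = 100.0 * sum(abs(e) / o for e, o in zip(errors, obs)) / n
    return {"rmse": rmse, "mae": mae, "mape_percent": mape}
```

Computing these per prediction horizon (for example 15, 30, 60, and 90 minutes) yields the error-versus-horizon curve whose slope the text proposes as a comparison metric.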
Time-in-range metrics, defined as the percentage of prediction times at which the forecasted glucose falls within seventy to one hundred eighty milligrams per deciliter, are reported alongside time below range and time above range, providing clinically interpretable measures that reflect the actual treatment implications of prediction errors [7, 8]. Hypoglycemia detection performance is reported as the sensitivity and specificity of the framework for predicting glucose below seventy milligrams per deciliter and below fifty-four milligrams per deciliter at various prediction horizons, with particular emphasis on the false negative rate because missed hypoglycemia events represent the most dangerous failure mode [9, 10]. The time-to-event prediction accuracy for hypoglycemia and hyperglycemia crossings is evaluated using receiver operating characteristic curves, with the area under the curve quantifying the framework's ability to discriminate between impending dangerous glucose excursions and stable or safe conditions [11, 12]. Enhanced blood glucose prediction using smartwatch data has demonstrated improved clinical metrics compared to continuous glucose monitoring alone, supporting the framework's inclusion of wearable-derived physical activity features [24]. Quantifying the impact of physical activity on future glucose trends using machine learning has shown that clinical metrics improve substantially when activity is explicitly modeled, validating the framework's emphasis on activity encoding [27].
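The clinical metrics can be sketched as follows, assuming forecasts and observations are paired at the same target times. The thresholds follow the text (70 to 180 mg/dL for range, 70 or 54 mg/dL for hypoglycemia); the function names and the convention of returning `None` when a rate is undefined are illustrative choices.

```python
def time_in_range(values, low=70.0, high=180.0):
    """Percentage of values within [low, high] mg/dL."""
    if not values:
        raise ValueError("values must be non-empty")
    return 100.0 * sum(1 for v in values if low <= v <= high) / len(values)


def hypo_detection(pred, obs, threshold=70.0):
    """Sensitivity and specificity for predicting glucose below threshold.

    Returns None for a rate whose denominator is zero (no positive or
    no negative cases among the observations).
    """
    tp = fn = tn = fp = 0
    for p, o in zip(pred, obs):
        actual, flagged = o < threshold, p < threshold
        if actual and flagged:
            tp += 1
        elif actual:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return {"sensitivity": sensitivity, "specificity": specificity,
            "false_negatives": fn}
```

Calling `hypo_detection(pred, obs, threshold=54.0)` gives the same breakdown for the more severe threshold, and the `false_negatives` count surfaces the missed events that the text identifies as the most dangerous failure mode.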
Temporal validation splits the dataset into contiguous training, validation, and test periods, ensuring that the framework is evaluated on data collected after the training data to simulate real-world deployment where models cannot see future data [13, 14]. Cross-patient validation trains on data from a subset of patients and tests on held-out patients, assessing the framework's ability to generalize to new individuals without patient-specific retraining, while cross-dataset validation trains on one clinical study and tests on an independent study to evaluate robustness to differences in population characteristics and measurement devices [15, 16]. All validation protocols report prediction performance stratified by key clinical scenarios including postprandial periods, overnight fasting, and periods containing physical activity, because overall accuracy metrics can mask clinically important performance differences in high-risk situations [17, 18]. Different levels of data fusion for physical activity integration have been systematically compared, with results indicating that late fusion approaches maintain better generalization across datasets than early fusion, informing the framework's design choice to encode activity features separately before residual learning [28]. Interpreting machine learning models using SHAP analysis has revealed that physical activity features contribute most significantly at longer prediction horizons, providing empirical justification for the framework's horizon-dependent feature weighting [19]. Prediction of blood glucose levels using LSTM neural networks has been validated on multiple public datasets, establishing benchmarking protocols that the framework should adopt for fair comparison with existing methods [29].
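The temporal and cross-patient protocols can be sketched as follows. The record layout (dictionaries with hypothetical `"t"` and `"patient"` keys) and the default split fractions are assumptions made for illustration; the text does not prescribe specific proportions.

```python
def temporal_split(samples, train_frac=0.7, val_frac=0.15):
    """Split samples into contiguous train, validation, and test periods.

    Samples are ordered by timestamp so that every test sample is later
    than every training sample, mimicking deployment on unseen future data.
    """
    ordered = sorted(samples, key=lambda s: s["t"])
    n = len(ordered)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def cross_patient_split(samples, held_out_ids):
    """Train on all patients except those held out; test on the held-out ones."""
    held = set(held_out_ids)
    train = [s for s in samples if s["patient"] not in held]
    test = [s for s in samples if s["patient"] in held]
    return train, test
```

For the temporal protocol, the split would typically be applied per patient so that each individual's test period follows that individual's training period; cross-dataset validation follows the same pattern as `cross_patient_split`, with a study identifier in place of the patient identifier.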
Table 2 situates the proposed framework against pure data-driven and pure mechanistic alternatives to show why a physics-guided residual architecture is theoretically better aligned with the safety, adaptability, and horizon-length demands of type 1 diabetes prediction.
Table 2. Comparative analytical matrix contrasting pure data-driven, pure mechanistic, and physics-guided residual recurrent approaches for blood glucose prediction in type 1 diabetes.
Analytical dimension | Pure data-driven recurrent model | Pure mechanistic compartmental model | Physics-guided residual recurrent framework
Dominant modeling logic | Learns temporal regularities directly from observed data | Simulates glucose dynamics from predefined physiological equations | Uses mechanistic simulation as a baseline and learns structured residual corrections |
Relationship to physiology | Implicit and weak unless physiology is engineered into features | Explicit and central | Explicit in the base model and reinforced during training through constraint penalties |
Ability to represent patient-specific variability | Moderate to high when abundant individualized data are available | Limited unless parameters are repeatedly recalibrated | High, because recurrent residual learning adapts to person-specific deviations around a physiological baseline |
Dependence on large training datasets | High | Low to moderate | Moderate, because residual learning is easier than learning full glucose dynamics from scratch |
Behavior under unseen contexts | Often unstable, especially with unannounced meals or unusual activity | More stable structurally but less flexible behaviorally | More stable than pure neural models and more adaptive than pure mechanistic models |
Physiological plausibility of outputs | Not guaranteed | Usually strong within model assumptions | Stronger than pure data-driven approaches because plausibility is supported by both architecture and loss design |
Capacity to use physical activity information | Possible, but often weakly structured and horizon-dependent | Usually incomplete unless extended by additional physiological submodels | Strong, because activity features can modify learned residuals while the base model maintains trajectory structure |
Performance at short horizons | Often competitive | Reasonable to good | Strong, with added safety and correction capacity |
Performance at extended horizons | Frequently degrades as omitted physiology accumulates | Frequently degrades because fixed assumptions cannot track individual context shifts | Best positioned to maintain performance because delayed exercise and absorption effects can be learned as residual structure |
Interpretability | Limited and often post hoc | High at the system-structure level | Intermediate to high, because deviations can be interpreted relative to the mechanistic baseline |
Safety relevance for closed-loop insulin delivery | Concerning if implausible predictions drive dosing decisions | Safer structurally, but may miss individualized risk states | Strongest conceptual fit because it balances safety-oriented plausibility with adaptive responsiveness |
Computational burden for real-time deployment | Moderate | Low to moderate | Moderate, but still feasible if the recurrent module remains compact |
Sensitivity to missing meal or activity data | High | High when required inputs are absent | High but potentially more resilient if residual patterns learn partial compensation |
Main conceptual strength | Flexible pattern recognition | Physiological grounding | Integration of grounding, adaptability, and safety constraints |
Main conceptual limitation | May optimize fit without respecting biology | May respect biology without capturing real-world individuality | Requires multi-stream data quality, calibration, and careful balance between constraints and flexibility |
This paper has presented a conceptual framework for physics-guided recurrent neural network prediction of blood glucose in type 1 diabetes that systematically integrates mechanistic physiological modeling with data-driven residual learning using insulin pump records, meal carbohydrate estimates, and physical activity measurements from wearable devices. The framework addresses the fundamental limitations of purely data-driven approaches, which ignore known physiology and may produce physiologically implausible predictions, and of purely mechanistic approaches, which cannot capture the substantial patient-specific variability that characterizes real-world glucose dynamics. By combining a compartmental glucose-insulin model as an inductive bias with a recurrent network that learns patient-specific residuals and a physics-guided loss function that enforces physiological constraints, the framework maintains physiological plausibility while achieving the flexibility required for personalized glucose forecasting.
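A minimal sketch of how the residual combination and a physics-guided loss of this kind might be expressed is given below. It is illustrative only: the penalty forms, the weights `w_nonneg` and `w_rate`, the five-minute sampling interval `dt_min`, and the rate bound `max_rate` (in mg/dL per minute) are all assumptions rather than values specified by the framework.

```python
def combine(baseline, residual):
    """Final forecast = mechanistic baseline + learned residual correction."""
    return [b + r for b, r in zip(baseline, residual)]


def physics_guided_loss(predicted, observed, dt_min=5.0, max_rate=4.0,
                        w_nonneg=1.0, w_rate=1.0):
    """Data-fit term plus soft penalties for physiological violations.

    - Non-negativity: penalizes any predicted glucose below zero.
    - Realistic dynamics: penalizes rates of change whose magnitude
      exceeds max_rate (an assumed bound, in mg/dL per minute).
    """
    n = len(predicted)
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    nonneg = sum(max(0.0, -p) ** 2 for p in predicted) / n
    rates = [abs(b - a) / dt_min for a, b in zip(predicted, predicted[1:])]
    rate_pen = (sum(max(0.0, r - max_rate) ** 2 for r in rates) / len(rates)
                if rates else 0.0)
    return mse + w_nonneg * nonneg + w_rate * rate_pen
```

Because the constraints enter as penalties rather than hard projections, they discourage implausible outputs during training without strictly guaranteeing their absence; the balance between the penalty weights and the data-fit term would need to be tuned.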
The key advantages of this framework include physiologically plausible predictions, with violations of fundamental constraints penalized during training, extended prediction horizons of up to ninety minutes enabled by integration of physical activity data, and patient adaptivity that captures individual variations in insulin sensitivity, meal absorption, and exercise response. The framework can be implemented using data from commercially available devices including continuous glucose monitors, insulin pumps with data logging, and consumer wearables, making it suitable for deployment in existing artificial pancreas system architectures without requiring additional sensors or increasing patient burden. The residual learning paradigm ensures that the recurrent neural network component focuses its representational capacity on correcting systematic mechanistic model errors rather than learning glucose dynamics from scratch, improving data efficiency and reducing the amount of patient-specific training data required.
Several limitations must be acknowledged in this conceptual framework. The framework requires simultaneous access to multiple data streams, which may not be available for all patients or in all clinical settings, and missing data from any sensor degrades prediction performance. Patient-specific calibration of mechanistic model parameters requires an initial data collection period, and the optimal calibration duration may vary substantially across individuals depending on their glucose variability and activity patterns. Validation on diverse patient populations including children, older adults, and individuals with widely varying insulin sensitivity is necessary to establish generalizability, and the framework's performance under real-world conditions with sensor noise, calibration errors, and unannounced meals must be rigorously evaluated.
Implementation of this framework on public datasets including OhioT1DM and T1DMS is the immediate next step, enabling direct comparison with existing glucose prediction approaches and identification of remaining technical challenges before clinical deployment. Integration with artificial pancreas systems requires real-time inference at five-minute intervals with computational requirements compatible with embedded devices, which is feasible given the modest size of the proposed recurrent network and the ability to precompute mechanistic model states efficiently. Future extensions include probabilistic prediction outputs for risk-based insulin dosing, transfer learning methods to reduce patient-specific calibration requirements, and attention mechanisms that dynamically weight recent versus historical information based on detected changes in physiological state such as the onset of physical activity or illness.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.