Prostate cancer metastasis to bone and lymph nodes marks a critical transition to incurable disease, with five-year survival dropping dramatically compared to localized disease. Early identification of patients at high risk of metastasis enables timely intensification of treatment, including androgen deprivation therapy, salvage radiation, or systemic therapies. Current deep survival models that integrate serial PSA measurements and genomic risk scores achieve high predictive accuracy for time-to-metastasis but operate as black boxes, providing no explanation for why a particular patient is predicted to have early or late metastasis. Clinicians cannot trust or act upon predictions without understanding which PSA features or genomic markers drive the risk assessment. We present an explainable deep survival framework that combines a deep survival model for time-to-metastasis prediction with Integrated Gradients attribution, a method that distributes the model's hazard prediction among input features. The framework produces patient-specific explanations showing how each serial PSA value and each genomic score component contributes to the predicted metastasis hazard. The framework consists of three core components: (1) a deep survival model (DeepSurv architecture) with a PSA time-series encoder and genomic risk encoder, (2) Integrated Gradients attribution computed over the hazard function, and (3) visualization tools for individual and population-level interpretations. Integrated Gradients attributes the predicted hazard to individual PSA measurements across time and specific genomic markers, enabling clinicians to distinguish between risk driven by rapid PSA kinetics versus high genomic risk scores. This interpretability transforms a black-box survival prediction into an actionable clinical decision support tool.
Prostate cancer metastasis represents a decisive clinical turning point, as patients with metastatic disease face substantially reduced survival and require intensive systemic therapies rather than localized curative interventions. Sundararajan et al. established the axiomatic foundations for attributing neural network predictions to input features, a principle that can be extended to survival models for cancer prognosis [1]. Clinical management decisions following primary treatment—such as whether to initiate salvage radiation, add androgen deprivation therapy, or pursue active surveillance—depend critically on accurate assessment of individual metastasis risk [2].
Multiple clinical biomarkers have demonstrated prognostic value for prostate cancer outcomes, yet they are rarely integrated into a unified predictive framework. Serial PSA measurements, including PSA velocity (ng/mL/year), PSA doubling time, and absolute PSA values, provide dynamic information about disease activity [3]. Concurrently, genomic risk scores such as Decipher (22-gene panel), Oncotype DX Genomic Prostate Score (17-gene panel), Prolaris (cell cycle progression), and Polaris (homologous recombination deficiency) independently predict metastasis and prostate cancer-specific mortality [4]. However, clinical practice typically treats these as separate data streams rather than integrating them into a multimodal predictive model.
Deep survival models have demonstrated superior performance compared to traditional Cox proportional hazards models for time-to-event prediction with complex, high-dimensional inputs. Katzman et al. developed DeepSurv, a deep neural network that extends the Cox model by learning nonlinear hazard functions directly from patient data [5]. Lee et al. introduced DeepHit, which handles competing risks without the proportional hazards assumption and learns the joint distribution of event times [6]. Despite their predictive power, these models are not interpretable by design, which limits their adoption in clinical settings where understanding the rationale behind a risk prediction is essential for treatment decisions [7, 8].
Prostate cancer most commonly metastasizes to bone (spine, pelvis, ribs, long bones) and lymph nodes (pelvic, retroperitoneal), with visceral metastases to liver or lung occurring less frequently but indicating poor prognosis. Time to metastasis is a clinically meaningful endpoint that drives treatment intensification decisions, including the initiation of androgen deprivation therapy, docetaxel, or novel hormonal agents [8, 9]. Risk groups for metastasis are defined by clinical stage, Gleason score, and baseline PSA, but these categorical assignments fail to capture individual variability in disease trajectory.
Serial PSA measurements capture dynamic patterns of disease activity, with PSA velocity (rate of change in ng/mL per year) and PSA doubling time (time in months for PSA to double) showing strong associations with metastasis risk and prostate cancer-specific mortality. Baseline PSA prior to treatment, PSA nadir following definitive therapy, and post-treatment PSA kinetics each provide complementary prognostic information [10, 11]. The irregular timing and varying numbers of PSA measurements across patients pose challenges for standard time-series methods but can be addressed using sequence models with masking.
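As a minimal sketch of the two kinetic quantities named above, the following computes PSA velocity as the least-squares slope of PSA against time and PSA doubling time as ln(2) divided by the slope of ln(PSA) against time. The function names and the example patient are illustrative assumptions, not part of the framework, and the code assumes strictly positive PSA values and at least two distinct measurement times.

```python
import math

def linear_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx

def psa_velocity(months, psa):
    """PSA velocity in ng/mL per year (slope per month times 12)."""
    return linear_slope(months, psa) * 12.0

def psa_doubling_time(months, psa):
    """PSA doubling time in months: ln(2) / slope of ln(PSA) vs. months.

    Returns math.inf when PSA is flat or falling (no doubling).
    """
    slope = linear_slope(months, [math.log(v) for v in psa])
    return math.log(2) / slope if slope > 0 else math.inf

# Hypothetical patient whose PSA doubles every 6 months.
months = [0, 6, 12, 18]
psa = [0.2, 0.4, 0.8, 1.6]
print(psa_velocity(months, psa), psa_doubling_time(months, psa))
```

Fitting on the log scale makes the doubling time independent of the absolute PSA level, which is why the two quantities can disagree for the same patient.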
Genomic risk scores derived from tumor tissue provide molecular prognostic information independent of clinical variables. The Decipher score (22-gene panel) predicts metastasis and prostate cancer-specific mortality across multiple validation cohorts, with higher scores indicating greater genomic aggression [12, 13]. The Oncotype DX Genomic Prostate Score (17-gene panel) predicts adverse pathology and biochemical recurrence, while Prolaris (cell cycle progression score) and Polaris (homologous recombination deficiency score) offer additional but partially overlapping risk information.
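DeepSurv implements a Cox proportional hazards model within a deep neural network, learning a nonlinear log-risk function $f_\theta(x)$ of the patient inputs $x$ such that the hazard function takes the form
$$h(t \mid x) = h_0(t)\,\exp\big(f_\theta(x)\big),$$
where $h_0(t)$ is the baseline hazard shared by all patients.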
The framework accepts two input modalities: a sequence of serial PSA measurements with associated timestamps (irregularly sampled) and a genomic risk score vector (continuous values from one or more commercial assays such as Decipher or Oncotype DX). The deep survival model outputs a predicted hazard function over follow-up time, and Integrated Gradients computes per-input-feature attributions by integrating the gradient of the hazard output with respect to each input along a straight-line path from a baseline input to the actual input [17].
Figure 1 illustrates the proposed explainable deep survival framework linking multimodal prostate cancer inputs, DeepSurv-based hazard prediction, Integrated Gradients attribution, and clinically actionable metastasis-risk interpretation.

Figure 1. Explainable deep survival architecture for prostate cancer metastasis-risk prediction using serial PSA trajectories, genomic risk scores, and Integrated Gradients attribution.
The framework assumes that each patient has at least three PSA measurements recorded following primary treatment (surgery or radiation) and that a genomic risk score is available from biopsy or surgical specimen, which may be obtained at diagnosis or post-prostatectomy. Metastasis status (event or censored) is assumed to be reliably determined through imaging (bone scan, CT, PSMA-PET) or clinical documentation, with the time from primary treatment to metastasis or last follow-up recorded as the survival time [18].
Three design principles guide the framework: interpretability (every prediction must be accompanied by feature attributions that a clinician can understand), time-awareness (PSA values are attributed to specific time points, enabling explanations such as "PSA at month 18 contributed most to increased risk"), and clinical actionability (explanations should suggest specific interventions, such as shortening surveillance intervals or initiating systemic therapy). The framework must also handle right-censored data without bias, as patients who have not metastasized by the end of follow-up provide partial information [19].
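One common way to handle right-censoring in a Cox-type model is the partial likelihood, in which censored patients contribute only through the risk sets of patients who metastasized earlier. The sketch below is an illustrative, tie-naive version averaged over observed events; the function name is hypothetical, and a production implementation would use a vectorized, numerically stabilized form.

```python
import math

def cox_neg_partial_log_likelihood(log_risk, time, event):
    """Negative Cox partial log-likelihood, averaged over observed events.

    log_risk: model outputs f(x_i), one per patient
    time:     follow-up time (to metastasis or to last follow-up)
    event:    1 if metastasis was observed, 0 if right-censored
    """
    total = 0.0
    n_events = 0
    for i in range(len(time)):
        if event[i] != 1:
            continue  # censored patients add no term of their own
        # Risk set: everyone still under observation at time[i],
        # including patients who are censored later.
        risk_set = [j for j in range(len(time)) if time[j] >= time[i]]
        log_denominator = math.log(sum(math.exp(log_risk[j]) for j in risk_set))
        total += log_risk[i] - log_denominator
        n_events += 1
    return -total / max(n_events, 1)

# Hypothetical cohort: the second patient is censored at t = 2.
times = [1.0, 2.0, 3.0]
events = [1, 0, 1]
uninformative = cox_neg_partial_log_likelihood([0.0, 0.0, 0.0], times, events)
well_ordered = cox_neg_partial_log_likelihood([2.0, 0.0, -2.0], times, events)
print(uninformative, well_ordered)
```

The censored patient still appears in the denominator for the event at t = 1, which is how partial follow-up information enters the loss.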
Table 1 maps each component of the proposed framework to its technical function, clinical interpretation, explanation output, and decision-support relevance.
Table 1. Conceptual Mapping between Model Components, Clinical Meaning, and Explanation Outputs
Framework component | Technical role in the survival model | Clinical meaning | Explanation output generated | Decision-support value |
Serial PSA values | Provide longitudinal biomarker input to the time-series encoder | Captures post-treatment disease activity and biochemical recurrence patterns | Attribution assigned to each PSA measurement at each recorded time point | Identifies whether risk is driven by specific PSA elevations or sustained trajectory change |
PSA timestamps and intervals | Encode irregular measurement spacing and temporal context | Reflects surveillance timing and rate of PSA evolution | Attribution linked to clinically meaningful months after treatment | Helps clinicians locate the temporal window in which risk becomes most informative |
PSA velocity and doubling-time signals | Represent dynamic change in tumor activity over time | Indicates aggressive biochemical progression when PSA rises rapidly | Positive attribution to rapid PSA kinetics | Supports intensified monitoring, salvage therapy consideration, or systemic escalation |
Genomic risk score vector | Provides molecular prognostic input independent of PSA kinetics | Reflects intrinsic tumor aggressiveness and metastatic potential | Attribution assigned to Decipher, Oncotype DX GPS, Prolaris, or Polaris components | Distinguishes biology-driven risk from trajectory-driven risk |
PSA time-series encoder | Learns nonlinear temporal representation from irregular PSA sequences | Summarizes the patient’s evolving post-treatment disease course | Time-specific PSA attribution pattern | Explains whether early, mid, or late PSA measurements dominate predicted hazard |
Genomic MLP encoder | Learns interactions among genomic scores | Captures complementary molecular risk information across assays | Assay-specific attribution profile | Clarifies which molecular signal contributes most to predicted metastasis hazard |
DeepSurv hazard network | Estimates individualized time-to-metastasis hazard | Converts multimodal patient information into survival-risk estimates | Horizon-specific hazard attribution at 1, 2, and 5 years | Aligns predictions with clinically relevant treatment windows |
Integrated Gradients module | Decomposes hazard output into feature-level contributions | Makes model reasoning inspectable by clinicians | Signed positive or negative contribution for each PSA value and genomic feature | Enables verification, trust calibration, and clinician override |
Population-level attribution aggregation | Summarizes explanation patterns across cohorts | Reveals common prognostic drivers across patient subgroups | Group-level ranking of PSA windows and genomic markers | Supports simplified risk-score design and subgroup-specific clinical interpretation |
The PSA time-series encoder processes irregularly sampled measurements using a Long Short-Term Memory (LSTM) or transformer architecture with time point encodings that capture both PSA values and the intervals between measurements. Missing data between recorded measurements is handled through masking or imputation, while variable-length sequences are padded to a maximum length with attention masks to exclude padded positions from computation [20]. For each time point, the encoder outputs a hidden state that summarizes PSA history up to that measurement, which can be pooled or passed to subsequent layers.
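The padding and masking step can be sketched as follows, assuming each patient's record is a list of (month, PSA) pairs sorted by month. The function names, the two-column feature layout (PSA value, months since previous measurement), and the choice to keep the most recent measurements when truncating are illustrative assumptions rather than the framework's specification.

```python
def pad_psa_sequences(sequences, max_len=None, pad_value=0.0):
    """Pad variable-length PSA sequences and build a validity mask.

    sequences: per-patient lists of (month, psa) tuples, sorted by month.
    Returns (features, mask) where features[i][k] = [psa, delta_months]
    and mask[i][k] is 1 for a real measurement and 0 for padding.
    """
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)
    features, mask = [], []
    for seq in sequences:
        seq = seq[-max_len:]  # keep the most recent measurements
        rows, prev_month = [], None
        for month, psa in seq:
            delta = 0.0 if prev_month is None else float(month - prev_month)
            rows.append([float(psa), delta])
            prev_month = month
        n_real = len(rows)
        rows += [[pad_value, pad_value] for _ in range(max_len - n_real)]
        features.append(rows)
        mask.append([1] * n_real + [0] * (max_len - n_real))
    return features, mask

def masked_mean(values, mask_row):
    """Mean over real positions only, so padding cannot dilute the summary."""
    kept = [v for v, m in zip(values, mask_row) if m == 1]
    return sum(kept) / len(kept)

# Two hypothetical patients with different numbers of measurements.
sequences = [[(3, 0.1), (9, 0.3), (15, 0.9)], [(6, 0.2), (12, 0.2)]]
features, mask = pad_psa_sequences(sequences)
print(features, mask)
```

Carrying the interval as an explicit feature lets a sequence model distinguish a rise over three months from the same rise over twelve.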
Genomic risk scores are encoded through a small multilayer perceptron (MLP) with one or two hidden layers that maps the input vector (e.g., a single Decipher score value between 0 and 1, or a panel of multiple scores) into a latent representation. When multiple genomic assays are available, the encoder learns interactions among them, allowing the model to weight complementary information from Decipher, Oncotype DX, and other assays relative to their empirical prognostic value [21]. The encoded genomic representation is concatenated with the pooled PSA sequence representation to form the full input feature vector for the hazard prediction layer.
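Following the DeepSurv architecture, the model predicts the hazard function as
$$h(t \mid x) = h_0(t)\,\exp\big(f_\theta(x)\big),$$
where $x$ denotes the patient's inputs (the PSA sequence and the genomic score vector), $f_\theta(x)$ is the log-risk score produced by the hazard prediction layer from the concatenated encoder outputs, and $h_0(t)$ is the baseline hazard.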
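Integrated Gradients attributes the difference between the model's output at the actual input $x$ and its output at a baseline input $x'$ to each input feature $i$, computed as the integral of the gradient along the straight-line path from $x'$ to $x$:
$$\mathrm{IG}_i(x) = (x_i - x'_i)\int_0^1 \frac{\partial F\big(x' + \alpha\,(x - x')\big)}{\partial x_i}\,d\alpha,$$
where $F$ is the model output being explained. By the completeness property, the attributions sum to $F(x) - F(x')$.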
Attributions are computed separately for multiple clinically relevant time horizons (1 year, 2 years, and 5 years following primary treatment), enabling clinicians to understand how feature contributions evolve over time. Because the log-risk score of a proportional hazards model does not depend on time, attributions of the log-risk alone would be identical at every horizon; horizon-specific attributions therefore need to target a time-dependent output, such as the predicted probability of metastasis by each horizon. For each time horizon, the final interpretation presents the relative contribution of each PSA measurement (by time point) and each genomic score component. Standardization across patients allows comparison of attribution magnitudes, and the sign of each attribution indicates whether the feature increases (positive) or decreases (negative) the predicted hazard relative to baseline [24].
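The horizon-specific attribution can be illustrated with a small, self-contained sketch. Everything model-related here is a hypothetical stand-in: the three-feature log-risk function, its weights, and the baseline survival values are invented for illustration and are not the trained network. The output being explained is the predicted probability of metastasis by a given horizon under proportional hazards, and gradients are taken by central finite differences so that any callable works; a real deep model would use automatic differentiation.

```python
import math

def integrated_gradients(model, x, baseline, steps=64, eps=1e-5):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            up, down = point[:], point[:]
            up[i] += eps
            down[i] -= eps
            avg_grad[i] += (model(up) - model(down)) / (2 * eps) / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]

# Hypothetical toy model (an assumption for illustration only). Inputs:
# [latest PSA in ng/mL, PSA slope in ng/mL/year, genomic score in 0-1].
WEIGHTS = [0.4, 0.8, 1.5]
BASELINE_SURVIVAL = {1: 0.98, 2: 0.95, 5: 0.85}  # assumed S0(t) values

def log_risk(x):
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + 0.5 * x[1] * x[2]

def metastasis_probability(x, horizon_years):
    """1 - S0(t) ** exp(f(x)) under proportional hazards."""
    return 1.0 - BASELINE_SURVIVAL[horizon_years] ** math.exp(log_risk(x))

def explain(patient, reference, horizon_years):
    model = lambda x: metastasis_probability(x, horizon_years)
    attributions = integrated_gradients(model, patient, reference)
    gap = model(patient) - model(reference)
    return attributions, gap

patient = [2.1, 0.9, 0.65]
reference = [0.0, 0.0, 0.5]  # zero PSA, assumed median genomic score
for horizon in (1, 2, 5):
    attributions, gap = explain(patient, reference, horizon)
    print(horizon, [round(a, 3) for a in attributions], round(gap, 3))
```

Comparing the sum of the attributions with the gap between the patient's prediction and the reference prediction is a cheap check that the number of integration steps is adequate.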
For an individual patient, the framework produces a structured explanation stating how each PSA measurement and each genomic score component contributed to the predicted metastasis hazard at specified time horizons, such as "Your PSA value of 2.1 ng/mL at month 12 contributed +15% to your predicted 2-year metastasis risk, while your Decipher score of 0.65 contributed +30%." These attributions enable clinicians to identify whether hazard is driven by rapid PSA kinetics, high genomic risk, or unfavorable combinations thereof, and the baseline reference (zero PSA, median genomic score) makes the direction and magnitude of each contribution interpretable [25].
Across a patient cohort, the framework aggregates attribution scores to identify which PSA time points and which genomic markers are most informative for metastasis prediction on average. Such an analysis might reveal, for example, that PSA velocity during months 6 to 18 post-treatment is more prognostic than any single absolute PSA value. Population-level analysis can also compare attribution patterns between risk groups, for instance testing whether genomic scores dominate predictions for intermediate-risk patients while PSA kinetics dominate for high-risk patients, or vice versa [26]. These patterns can guide clinical understanding of how different prognostic factors operate across disease stages and inform the design of simplified risk scores for settings where full deep learning deployment is infeasible.
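One simple aggregation, shown here as a sketch, ranks features by their mean absolute attribution across patients. The feature names and attribution values are made up for illustration, and mean absolute value is only one possible summary; signed means or per-subgroup summaries may be preferable depending on the question.

```python
def rank_features_by_mean_abs_attribution(attributions, feature_names):
    """Rank features by mean |attribution| across a cohort.

    attributions: one row per patient, one column per feature.
    Returns (name, mean_abs) pairs sorted from most to least influential.
    """
    n_patients = len(attributions)
    mean_abs = [
        sum(abs(row[j]) for row in attributions) / n_patients
        for j in range(len(feature_names))
    ]
    return sorted(zip(feature_names, mean_abs), key=lambda p: p[1], reverse=True)

# Hypothetical attributions for three patients (illustrative values only).
feature_names = ["psa_month_6", "psa_month_12", "decipher"]
cohort_attributions = [
    [0.10, 0.30, -0.05],
    [0.20, -0.10, 0.40],
    [0.00, 0.20, 0.30],
]
ranking = rank_features_by_mean_abs_attribution(cohort_attributions, feature_names)
print(ranking)
```

Using absolute values prevents positive and negative contributions in different patients from cancelling, at the cost of hiding the direction of effect.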
Actionable explanations directly inform treatment decisions by identifying modifiable or monitorable drivers of risk; for example, if the framework attributes high 2-year metastasis risk primarily to a PSA doubling time of less than three months, the clinician may escalate from semi-annual to quarterly PSA monitoring, initiate salvage radiation, or add a brief course of androgen deprivation therapy. Conversely, if high risk is attributed almost entirely to an elevated genomic score with stable PSA kinetics, the focus may shift toward systemic therapy rather than local salvage interventions, as the genomic risk reflects intrinsic tumor biology less responsive to local treatment [27]. The explanation format is designed to be readable at the point of care, requiring no specialized data science training.
Clinician trust in the framework depends fundamentally on the plausibility and consistency of its explanations. When the model attributes high 2-year metastasis risk to rapidly rising PSA (PSA doubling time of less than three months) within the first year after radical prostatectomy, the explanation is consistent with established clinical knowledge that biochemical recurrence with a short doubling time strongly predicts subsequent metastatic progression. Conversely, if the model attributes high risk to a single low PSA value while ignoring a clearly rising trend over multiple measurements, the contradiction should prompt skepticism and potential clinician override. This creates a natural verification mechanism in which clinical expertise serves as a check on model behavior rather than deferring blindly to its outputs [28].
Beyond pointwise plausibility, clinicians require consistency across similar patients and stability over time: if two patients with nearly identical PSA trajectories and Decipher scores receive substantially different attributions, or if the same patient evaluated twice with minimally updated PSA values receives dramatically different explanations, trust erodes rapidly, as clinicians perceive the model as capricious rather than reliable. The framework should therefore report attribution confidence intervals or stability metrics alongside point estimates, and prospective user studies should measure how explanation consistency affects clinician willingness to modify treatment plans based on model recommendations.
Over time, consistent alignment between attributions and clinical expectations builds the trust necessary for routine adoption, while persistent contradictions signal the need for model retraining, architecture revision, or recalibration of the attribution baseline. A phased adoption strategy is recommended: initially, the framework serves as a silent decision-support tool whose explanations are reviewed retrospectively in tumor boards; after achieving high plausibility scores (e.g., >85% of explanations rated as clinically sensible by urologists), the framework can transition to prospective use with clinician override authority; and only after demonstrating improved clinical outcomes (e.g., reduced time to metastasis detection or more appropriate treatment intensification) should autonomous recommendation be considered [28]. This cautious pathway respects the primacy of clinical judgment while systematically building the evidence base for XAI in prostate cancer survival prediction.
Table 2 provides an evaluation matrix linking survival-model performance, explanation reliability, subgroup robustness, and clinical adoption readiness.
Table 2. Evaluation Matrix for Predictive Performance, Explanation Quality, and Clinical Adoption Readiness
Evaluation domain | Metric or assessment | What it tests | Minimum desirable interpretation | Why it matters for clinical deployment |
Survival discrimination | Concordance index | Whether higher-risk patients are correctly ranked before lower-risk patients | C-index above 0.7 acceptable; above 0.8 strong, depending on censoring and cohort characteristics | Determines whether the model meaningfully stratifies metastasis risk |
Time-specific discrimination | Time-dependent AUC at 1, 2, and 5 years | Whether the model separates patients who metastasize within clinically relevant windows | Strong performance should be consistent across short- and medium-term horizons | Supports treatment decisions tied to actionable follow-up periods |
Calibration | Brier score and calibration plots | Whether predicted survival probabilities match observed metastasis rates | Predicted risk should correspond closely to observed event frequency | Prevents overconfident or systematically biased treatment recommendations |
Explanation faithfulness | Feature perturbation or masking tests | Whether high-attribution features truly influence the model output | Removing highly attributed features should substantially change hazard estimates | Ensures explanations reflect model behavior rather than post-hoc artifacts |
Explanation stability | Small input-perturbation sensitivity analysis | Whether minor PSA measurement variation causes large attribution shifts | Clinically negligible PSA variation should not radically change explanations | Protects against unreliable explanations caused by measurement noise |
Clinical plausibility | Blinded urologist or radiation oncologist rating | Whether attributions align with expert clinical reasoning | Most explanations should receive high plausibility ratings across subgroups | Builds clinician trust and identifies implausible reasoning patterns |
Subgroup reliability | Stratified evaluation by risk group, age, race, treatment type, and genomic-score availability | Whether predictions and explanations remain valid across patient populations | No subgroup should show substantial degradation in accuracy or plausibility | Reduces risk of inequitable decision support |
Baseline comparison | Cox model, standard DeepSurv, random survival forest, and SHAP comparison | Whether the proposed framework improves prediction or explanation utility over alternatives | Should retain predictive performance while adding patient-specific interpretability | Justifies the added complexity of explainable deep survival modeling |
Deployment readiness | Prospective external validation and silent-mode tumor-board review | Whether the framework performs reliably outside retrospective development data | Strong external performance and high clinician plausibility before active use | Prevents premature clinical deployment of an unvalidated model |
Adoption safety | Clinician override and explanation review workflow | Whether clinicians can challenge or reject implausible recommendations | Override should remain explicit during early deployment | Maintains clinical accountability and prevents blind automation dependence |
Concordance index (C-index) measures the model's ability to order patients by risk, with values above 0.7 generally considered acceptable and above 0.8 indicating strong discriminative performance for time-to-metastasis prediction, though the baseline prevalence and censoring rate in the validation cohort must be reported to enable meaningful comparisons across studies. Time-dependent area under the ROC curve (tdAUC) evaluates discrimination at specific clinically relevant time horizons (1, 2, and 5 years), capturing whether the model distinguishes patients who metastasize within each window from those who do not, which is more directly interpretable for treatment decisions than a single global concordance measure. The Brier score assesses calibration by measuring the squared difference between predicted survival probabilities and observed outcomes, with lower values indicating better-calibrated predictions; a well-calibrated model's predicted 20% 5-year metastasis risk should correspond to approximately one in five patients actually metastasizing within five years [29]. These metrics collectively characterize predictive accuracy across different aspects of survival performance and should be reported alongside explanation quality metrics, as a model with excellent discrimination but poor calibration may produce systematically biased risk estimates, while a well-calibrated model with modest discrimination may still support clinical decision-making when combined with trustworthy explanations.
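The two headline metrics can be sketched in a few lines. The concordance index below follows Harrell's pairwise definition; the Brier score is deliberately simplified to patients whose status at the horizon is known and omits the inverse-probability-of-censoring weighting that a full evaluation would normally apply. Function names and example data are illustrative assumptions.

```python
def concordance_index(risk, time, event):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event
    and patient j was still event-free at a strictly later time.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(time)):
        if event[i] != 1:
            continue
        for j in range(len(time)):
            if time[j] > time[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

def naive_brier_score(pred_event_prob, time, event, horizon):
    """Mean squared error of predicted event probability at a horizon.

    Simplification: patients censored before the horizon are dropped
    instead of being reweighted (no IPCW), which can bias the estimate.
    """
    sq_errors = []
    for p, t, e in zip(pred_event_prob, time, event):
        if t <= horizon and e == 1:
            outcome = 1.0  # metastasis observed by the horizon
        elif t > horizon:
            outcome = 0.0  # known event-free at the horizon
        else:
            continue  # censored before the horizon: status unknown
        sq_errors.append((p - outcome) ** 2)
    return sum(sq_errors) / len(sq_errors)

# Hypothetical follow-up in years; patients 2 and 4 are censored.
time = [2, 4, 6, 8]
event = [1, 0, 1, 0]
print(concordance_index([0.9, 0.3, 0.1, 0.5], time, event))
print(naive_brier_score([0.8, 0.5, 0.3, 0.1], time, event, horizon=5))
```

Reporting both matters because the C-index depends only on the ordering of risks, whereas the Brier score also penalizes probabilities that are systematically too high or too low.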
Faithfulness measures whether features identified as important by Integrated Gradients actually affect the model's prediction when perturbed: if removing or masking a high-attribution PSA value (e.g., setting it to the baseline value of zero) substantially changes the predicted hazard while perturbing a low-attribution value produces minimal change, the explanations are faithful to the model's actual behavior. Conversely, if a feature receives high attribution but the predicted hazard remains unchanged after masking, the explanation is misleading and could cause clinicians to focus on irrelevant variables while overlooking true drivers of risk. Stability requires that small, clinically negligible perturbations to the input (e.g., adding ±0.1 ng/mL measurement error to PSA values) produce similar attribution scores. Highly unstable explanations undermine clinical confidence because clinicians cannot rely on attributions that change dramatically with routine measurement variability or with minor differences in how PSA values are recorded across laboratories.
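Both checks can be illustrated on a deliberately simple case. The sketch assumes a hypothetical linear log-risk model, for which Integrated Gradients has the exact closed form w_i (x_i - x'_i), so the expected results are known in advance; with a real network the same two functions would be applied to numerically computed attributions. All names and weights are illustrative.

```python
WEIGHTS = [0.4, 0.8, 1.5]  # hypothetical weights, for illustration only

def log_risk(x):
    """Toy linear log-risk model standing in for the trained network."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x))

def linear_ig(x, baseline):
    """Exact Integrated Gradients for a linear model: w_i * (x_i - x'_i)."""
    return [w * (xi - b) for w, xi, b in zip(WEIGHTS, x, baseline)]

def masking_effects(model, x, baseline):
    """Faithfulness probe: output change when one feature is reset to baseline."""
    full = model(x)
    effects = []
    for i in range(len(x)):
        masked = list(x)
        masked[i] = baseline[i]
        effects.append(full - model(masked))
    return effects

def max_attribution_shift(x, x_perturbed, baseline):
    """Stability probe: largest change in any attribution after perturbation."""
    before = linear_ig(x, baseline)
    after = linear_ig(x_perturbed, baseline)
    return max(abs(a - b) for a, b in zip(after, before))

patient = [2.1, 0.9, 0.65]   # [PSA ng/mL, PSA slope, genomic score]
baseline = [0.0, 0.0, 0.5]
attributions = linear_ig(patient, baseline)
effects = masking_effects(log_risk, patient, baseline)
shift = max_attribution_shift(patient, [2.2, 0.9, 0.65], baseline)
print(attributions, effects, shift)
```

For a nonlinear model the masking effects and the attributions will not match exactly, so agreement is usually summarized by a rank correlation instead of equality.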
Plausibility is assessed by having board-certified urologists or radiation oncologists review a blinded set of patient explanations (inputs, predictions, and attributions) and rate whether the attributed features align with clinical reasoning on a Likert scale from 1 (completely implausible) to 5 (highly plausible). High plausibility scores indicate that explanations are clinically sensible even if the underlying model remains a black box, and these scores should be disaggregated by patient subgroup (e.g., low-risk vs. high-risk, African American vs. white, younger vs. older) to identify populations where explanations systematically fail to align with clinical expectations. A minimum acceptable threshold might be that at least 80% of explanations receive plausibility scores of 4 or 5, with no subgroup falling below 70%, and qualitative feedback should be collected to identify recurring patterns of implausibility that could be addressed through baseline recalibration or model retraining.
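The suggested acceptance rule can be written down directly. The sketch below assumes one Likert rating and one subgroup label per reviewed explanation; the function name, the subgroup labels, and the ratings are hypothetical, and the default thresholds are the illustrative 80% and 70% values mentioned above.

```python
def plausibility_report(ratings, subgroups, overall_min=0.80, subgroup_min=0.70):
    """Share of explanations rated 4 or 5, overall and per subgroup.

    Returns (overall_share, shares_by_subgroup, passed).
    """
    def share_high(values):
        return sum(1 for r in values if r >= 4) / len(values)

    by_group = {}
    for rating, group in zip(ratings, subgroups):
        by_group.setdefault(group, []).append(rating)

    overall = share_high(ratings)
    group_shares = {g: share_high(v) for g, v in by_group.items()}
    passed = overall >= overall_min and all(
        s >= subgroup_min for s in group_shares.values()
    )
    return overall, group_shares, passed

# Hypothetical ratings for ten explanations in two risk groups.
ratings = [5, 4, 4, 3, 5, 4, 2, 4, 5, 4]
subgroups = ["low_risk"] * 5 + ["high_risk"] * 5
print(plausibility_report(ratings, subgroups))
```

Checking subgroups separately matters because a cohort can clear the overall threshold while one population receives mostly implausible explanations.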
The explainable framework should be compared against several baselines. A standard DeepSurv model without the attribution module serves as the predictive reference: because Integrated Gradients is applied post hoc and does not alter the model's predictions, this comparison verifies that any design choices made to support attribution (for example, the input representation or encoder architecture) do not degrade predictive performance. A traditional Cox proportional hazards model with the same inputs evaluates whether deep learning provides meaningful improvement over a well-established, interpretable baseline that already offers coefficient-based explanations. A random survival forest (an ensemble of survival trees) serves as a non-neural, non-proportional-hazards alternative that can handle nonlinearities and interactions while providing variable importance measures, though tree-based importance scores are global rather than patient-specific and cannot attribute risk to temporal sequences of PSA measurements in the same way Integrated Gradients can.
For explanation quality, Integrated Gradients should be compared with SHAP (SHapley Additive exPlanations) applied to the same deep survival model, evaluating both computational efficiency and clinically assessed plausibility using the urologist review protocol described above. Integrated Gradients requires one gradient evaluation per integration step (typically 50-100 steps), and each gradient evaluation yields the contributions of all input features at once. Exact Shapley values, by contrast, require evaluating a number of feature coalitions that grows exponentially with the number of features; sampling-based approximations such as KernelSHAP reduce this cost but may still be slower for high-dimensional time-series inputs. This cost may disadvantage SHAP in time-sensitive clinical applications where explanations must be generated within seconds during a patient visit, particularly when a PSA sequence contributes dozens or hundreds of temporal features, whereas the integration steps of Integrated Gradients are independent of one another and can be evaluated in parallel as a single batch.
Integrated Gradients requires an integral approximation along the path from baseline to input, which is computationally expensive when repeated for each patient and each time horizon; practical deployment may require reducing the number of integration steps or precomputing gradients for common input patterns. Baseline selection significantly affects attribution magnitudes: an alternative baseline, such as the population median PSA trajectory rather than zero PSA, would produce different numerical attributions, though the relative ordering of feature importance is often preserved [1]. Correlated features—such as closely spaced PSA values that rise together—pose a challenge because Integrated Gradients distributes credit among correlated inputs arbitrarily, potentially attributing risk to only one of several similarly informative measurements despite their joint contribution [23].
The framework provides correlational attributions describing which input features influenced the model's hazard prediction, not causal explanations of why metastasis will or will not occur; a high attribution to PSA velocity does not prove that PSA velocity causes metastasis—only that the model learned to associate it with metastasis risk. Prospective validation in external cohorts is essential before clinical deployment, as retrospective datasets may contain unmeasured confounding or selection bias that affects both the predictive model and its explanations [24]. Genomic score availability varies substantially across clinical settings, with community practices less likely to order Decipher or Oncotype DX testing than academic centers, potentially limiting the framework's applicability to patients without molecular profiling.
This manuscript has presented an explainable deep survival framework that integrates serial PSA measurements and genomic risk scores to predict time-to-metastasis in prostate cancer while providing feature-level attributions using Integrated Gradients. The framework transforms black-box hazard predictions into interpretable explanations that identify which specific PSA values at which time points and which genomic markers drive an individual patient's predicted metastasis risk, addressing a critical barrier to clinical adoption of deep learning for survival analysis.
The key advantages of this approach include the ability to attribute risk to clinically meaningful temporal patterns (PSA velocity, doubling time, absolute values) and molecular features (Decipher, Oncotype DX, Prolaris, Polaris scores) within a unified model, enabling clinicians to distinguish between risk driven by dynamic disease activity versus intrinsic tumor biology. Actionable explanations directly inform treatment decisions, from intensified surveillance to salvage therapy to systemic treatment, while providing a natural mechanism for clinician verification and override when explanations contradict established knowledge.
Several limitations must be acknowledged: Integrated Gradients imposes computational costs that may challenge real-time clinical use, the attribution baseline requires careful selection, correlated PSA measurements can produce unstable credit assignment, and explanations are correlational rather than causal. Prospective validation in external, multi-institutional cohorts is required before any clinical deployment, and the current framework does not address competing risks such as non-prostate-cancer mortality in older patients.
We call for implementation of this explainable survival framework on large-scale prostate cancer cohorts, including Surveillance, Epidemiology, and End Results (SEER) data linked to longitudinal PSA, Veterans Affairs (VA) corporate data warehouse with comprehensive genomic and outcomes data, CaPSURE (Cancer of the Prostate Strategic Urologic Research Endeavor) registry, and multi-institutional radical prostatectomy registries with complete follow-up. Such implementations will enable rigorous evaluation of explanation quality, clinical utility, and generalizability across diverse patient populations and healthcare settings.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.