Survival prediction for oncology patients in emergency care
Gradient boosting machine (GBM) model, selected with AutoML, that estimates survival in oncology patients seen in emergency care and supports time-sensitive clinical decisions.
Clinical context
The transition from curative care to palliative care remains a challenge in clinical practice, especially for cancer patients seen in emergency care. In these settings, decisions need to be made quickly, often using incomplete, fragmented or time-sensitive clinical information.
Problem addressed
Physicians may overestimate survival in severely ill oncology patients, making timely conversations about palliative care, treatment intensity and expectations harder. Traditional prognostic scales may also require laboratory, functional or subjective information that is not always available at emergency admission.
Project objective
Develop a predictive application capable of estimating survival for oncology patients in emergency care, supporting risk stratification and clinical discussion about prognosis, care and treatment priorities.
Situation
Oncology emergency services need to make fast decisions for high-risk patients, often with fragmented information and uncertainty about prognosis.
Task
Build a predictive tool to estimate survival and support discussions on care prioritization, treatment intensity and palliative care approach.
Action
The project structured real-world oncology emergency data, defined the target from palliative care physician assessments, prepared clinical, demographic and vital-sign features, trained a GBM model with AutoML and interpreted the results using SHAP.
Result
The model reached a ROC AUC of 0.912 and an AUPR of 0.875, and was deployed in an app that supports individual patient analysis, clinical interpretation and care discussion.
Key metrics
- ROC AUC: 0.912
- AUPR: 0.875
Data source
Data were extracted from the electronic health records of patient encounters in the emergency department of an oncology hospital. The cohort was limited to a closed-door oncology emergency service, which sees only patients previously linked to the hospital.
Model target
The target variable was built from historical records of assessments made by palliative care physicians in their routine work. These professionals classified patients using clinical experience and factors such as functional status, oncology diagnosis, comorbidities and treatment response. The model was framed as a classifier that distinguishes patients with short survival from those with a higher probability of longer survival.
Features used
Features included demographic information, clinical data, care history, vital signs and data from the last outpatient visit. Main feature groups:
- Age
- Sex
- Main diagnosis, ICD
- Recent hospitalization history
- Heart rate
- Respiratory rate
- Oxygen saturation
- Mean blood pressure
- Weight
- Height
- BMI
- ECOG
- Time between last outpatient visit and emergency admission
- Status, trend and clinical priority, when available
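Some of the features listed above are derived from raw measurements. The following is a minimal sketch of how three of them might be computed. The field names, the example values and the choice to derive mean blood pressure from systolic and diastolic readings are assumptions for illustration; the source does not describe how these features were actually calculated.

```python
from datetime import date


def body_mass_index(weight_kg, height_m):
    """BMI = weight in kilograms divided by height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def mean_arterial_pressure(systolic, diastolic):
    """Common bedside approximation: diastolic plus one third of pulse pressure."""
    return diastolic + (systolic - diastolic) / 3


def days_between(last_outpatient_visit, emergency_admission):
    """Whole days from the last outpatient visit to the emergency admission."""
    return (emergency_admission - last_outpatient_visit).days


# Hypothetical encounter; field names and values are illustrative only.
encounter = {
    "weight_kg": 70.0,
    "height_m": 1.75,
    "systolic_mmhg": 120.0,
    "diastolic_mmhg": 80.0,
    "last_outpatient_visit": date(2024, 1, 1),
    "emergency_admission": date(2024, 1, 31),
}

features = {
    "bmi": body_mass_index(encounter["weight_kg"], encounter["height_m"]),
    "mean_bp": mean_arterial_pressure(
        encounter["systolic_mmhg"], encounter["diastolic_mmhg"]
    ),
    "days_since_last_visit": days_between(
        encounter["last_outpatient_visit"], encounter["emergency_admission"]
    ),
}
print(features)
```

Keeping each derivation in its own small function makes it easier to check units and to handle missing inputs in one place.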
Methodological approach
The analysis followed a healthcare data project workflow:
- Understanding the clinical context
- Defining the care problem
- Organizing the target based on palliative care assessment
- Extracting and preparing electronic health record data
- Handling clinical, categorical and numerical variables
- Exploratory analysis of features
- Modeling with AutoML
- Selecting a GBM model
- Evaluating discrimination and error metrics
- Interpreting results with SHAP
- Deploying a demonstration app
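The step that handles categorical and numerical variables could look like the sketch below. It assumes median imputation for missing numeric values and one-hot encoding for categorical ones; the source does not say which techniques were used, and the column names are hypothetical. Many GBM implementations handle missing values natively, so part of this step may be unnecessary depending on the library.

```python
from statistics import median


def fit_preprocessor(rows, numeric_cols, categorical_cols):
    """Learn per-column medians and category levels from training rows."""
    medians = {
        col: median(r[col] for r in rows if r.get(col) is not None)
        for col in numeric_cols
    }
    levels = {
        col: sorted({r[col] for r in rows if r.get(col) is not None})
        for col in categorical_cols
    }
    return medians, levels


def transform(row, medians, levels):
    """Turn one encounter (a dict) into a flat list of floats."""
    vector = []
    for col, col_median in medians.items():
        value = row.get(col)
        # Missing numeric values are replaced by the training median.
        vector.append(float(col_median if value is None else value))
    for col, categories in levels.items():
        value = row.get(col)
        # One-hot encoding; missing or unseen categories become all zeros.
        vector.extend(1.0 if value == cat else 0.0 for cat in categories)
    return vector


# Hypothetical training rows; names and values are illustrative only.
train_rows = [
    {"age": 60, "heart_rate": 100, "sex": "F"},
    {"age": 70, "heart_rate": None, "sex": "M"},
    {"age": 80, "heart_rate": 120, "sex": "F"},
]
medians, levels = fit_preprocessor(train_rows, ["age", "heart_rate"], ["sex"])

new_row = {"age": None, "heart_rate": 90, "sex": "M"}
vector = transform(new_row, medians, levels)
print(vector)
```

Fitting the medians and category levels on training data only, then reusing them at prediction time, avoids leaking information from evaluation data into the features.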
Model
The final model was a GBM selected by AutoML. This approach was used because it handles clinical tabular data, nonlinear relationships, feature interactions and mixed predictor types. The application provides model interpretation and lets users inspect global performance, the ROC curve and feature importance.
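To illustrate why a GBM can capture nonlinear relationships in tabular data, here is a deliberately simplified sketch of gradient boosting for binary classification. It uses one-split trees (stumps), a fixed learning rate and plain log-loss gradients. It is not the AutoML-selected model, whose library and hyperparameters are not described in the source, and the toy data (ECOG and heart rate) is invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = sum((r - left_mean) ** 2 for r in left)
            sse += sum((r - right_mean) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("no valid split: all features are constant")
    return best[1:]


def fit_gbm(X, y, n_rounds=50, learning_rate=0.3):
    """Boost stumps on the negative gradient of the log loss."""
    prior = sum(y) / len(y)
    base = math.log(prior / (1.0 - prior))  # log-odds of the event rate
    scores = [base] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # For log loss, the negative gradient w.r.t. the score is (y - p).
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, left_value, right_value = fit_stump(X, residuals)
        stumps.append((j, t, left_value, right_value))
        scores = [
            s + learning_rate * (left_value if row[j] <= t else right_value)
            for s, row in zip(scores, X)
        ]
    return base, learning_rate, stumps


def predict_proba(model, row):
    """Predicted probability of the positive class for one row."""
    base, learning_rate, stumps = model
    score = base
    for j, t, left_value, right_value in stumps:
        score += learning_rate * (left_value if row[j] <= t else right_value)
    return sigmoid(score)


# Toy data: [ECOG, heart rate]; label 1 marks short survival. Invented values.
X = [[0, 80], [1, 85], [1, 110], [2, 90], [3, 115], [3, 120], [4, 100], [4, 125]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_gbm(X, y)
high_risk = predict_proba(model, [4, 120])
low_risk = predict_proba(model, [0, 80])
print(round(high_risk, 3), round(low_risk, 3))
```

Production GBM libraries add deeper trees, regularization, subsampling and second-order leaf updates on top of this basic loop.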
About the AUPR metric
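AUPR is the area under the Precision-Recall curve. It is useful when classes are unevenly distributed, because the score expected from a random classifier equals the prevalence of the positive class rather than 0.5, so it reflects how well the model identifies the less frequent outcome.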
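As a rough illustration of the metric, the sketch below computes the area under the Precision-Recall curve using the step-wise "average precision" estimator. This is one common estimator among several; the source does not state which one the project used. The labels and scores are invented, and the function assumes no tied scores.

```python
def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    # Rank patients from highest to lowest predicted score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            # Precision at the rank where recall increases.
            total += true_positives / rank
    return total / n_pos


# Invented example: 1 marks the positive class.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

aupr = average_precision(y_true, y_score)
prevalence = sum(y_true) / len(y_true)  # score expected from random ranking
print(round(aupr, 3), prevalence)
```

Comparing the result with the prevalence gives a sense of how much the ranking improves on chance for the positive class.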
Comparison with traditional scales
Traditional prognostic scales, such as the Palliative Prognostic (PaP) Score, perform well in clinical contexts but may be limited in emergency care because they depend on laboratory data, subjective assessments or extensive data collection. The technical material notes that AUC values between 0.75 and 0.85 are commonly reported for the PaP Score, while the developed model reached a ROC AUC of 0.912 in the app.
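One way to read the ROC AUC values quoted above: ROC AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition. The labels and scores are invented and are unrelated to the project data.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of positive-negative pairs ranked correctly."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5  # ties count as half
    return concordant / (len(positives) * len(negatives))


# Invented example: 1 marks the positive class.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = roc_auc(y_true, y_score)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; library implementations use sorting to get the same value more efficiently.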
SHAP interpretation
SHAP (SHapley Additive exPlanations) was used to show how each variable pushed the predicted probability up or down. Features such as ECOG, diagnosis/ICD, vital signs, recent hospitalization, age, BMI, time since last outpatient visit, status and clinical trend appeared as relevant factors in explaining predictions.
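The idea behind these per-feature contributions can be shown with exact Shapley values on a tiny example. The sketch below enumerates every feature subset against a single baseline patient, which is only feasible for a handful of features; SHAP libraries use faster algorithms and a background dataset instead. The `toy_risk` function, its coefficients and the patient values are invented for illustration and are not the project's model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance against one baseline row."""
    n = len(instance)
    features = range(n)

    def value(subset):
        # Features in `subset` take the instance value; others stay at baseline.
        mixed = [instance[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    contributions = []
    for i in features:
        others = [j for j in features if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                phi += weight * (with_i - without_i)
        contributions.append(phi)
    return contributions


def toy_risk(x):
    """Invented risk function over [ECOG, heart rate, age]; age is unused."""
    ecog, heart_rate, _age = x
    risk = 0.1 + 0.15 * ecog + 0.002 * (heart_rate - 80)
    if ecog >= 3 and heart_rate > 110:
        risk += 0.05  # interaction between poor status and tachycardia
    return risk


patient = [3, 120, 70]
reference = [1, 80, 60]

phi = shapley_values(toy_risk, patient, reference)
print([round(p, 3) for p in phi])
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the property that lets each value be read as a push up or down from a reference risk.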
How to use the output
The model should be used as a decision support tool, not as a replacement for clinical judgment. Its output helps structure discussions about prognosis, palliative care assessment, care intensity, communication with family and care priorities.
Limitations and caveats
- The model is retrospective and requires prospective validation before real clinical use.
- The dataset comes from a specific oncology hospital and closed-door emergency care context.
- Interpretation must consider data availability, quality and timing of registration.
- The output is probabilistic, not deterministic.
- Clinical use requires governance, local validation, workflow integration and continuous monitoring.
Deliverables
- Interactive web application
- GBM model via AutoML
- Clinical data preparation pipeline
- Global performance report
- ROC curve
- Error and discrimination metrics
- SHAP interpretation
- Organization of clinical, demographic and care-related features
- Demonstration interface for decision support
- Technical project presentation
Learnings
- Predictive healthcare projects need to start with the clinical problem, not the algorithm.
- Target definition may depend on expert knowledge, such as palliative care assessment.
- Survival models in emergency care need to balance performance, interpretability and applicability in real workflows.
- SHAP helps connect the model to clinical discussion.
- The value of the project lies in supporting difficult, time-sensitive and context-dependent decisions.