Personalized survival probabilities for SARS-CoV-2 positive patients by explainable machine learning

Research output: Contribution to journal › Journal article › Research › peer-review


Interpretable risk assessment of SARS-CoV-2 positive patients can help clinicians implement precision medicine. Here we trained a machine learning model to predict mortality within 12 weeks of a first positive SARS-CoV-2 test. Using data on 33,938 confirmed SARS-CoV-2 cases in eastern Denmark, we considered 2723 variables extracted from electronic health records (EHR), including demographics, diagnoses, medications, laboratory test results and vital parameters. A discrete-time framework for survival modelling enabled us to predict personalized survival curves and explain individual risk factors. On the test set, the model achieved a weighted concordance index of 0.95 and an area under the precision-recall curve of 0.71. Age, sex, number of medications, previous hospitalizations and lymphocyte counts were identified as top mortality risk factors. Our explainable survival model developed on EHR data also revealed temporal dynamics of the 22 selected risk factors. Upon further validation, this model may allow direct reporting of personalized survival probabilities in routine care.
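
As a rough illustration of how a discrete-time framework yields a survival curve, the sketch below turns per-interval hazards into cumulative survival probabilities using the standard relation S_k = (1 - h_1)(1 - h_2)...(1 - h_k). It is not the authors' implementation: the function name `survival_curve`, the weekly interval width and the hazard values are all assumptions made for illustration, and in practice the hazards would come from a fitted model rather than being typed in by hand.

```python
from typing import List


def survival_curve(hazards: List[float]) -> List[float]:
    """Convert discrete-time hazards into survival probabilities.

    hazards[k] is the conditional probability of death in interval k,
    given survival up to the start of that interval. The returned list
    holds the probability of still being alive at the end of each interval.
    """
    curve = []
    alive = 1.0  # everyone is alive at time zero
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        alive *= 1.0 - h  # survive this interval, given survival so far
        curve.append(alive)
    return curve


# Hypothetical weekly hazards for one patient over 12 weeks
# (illustrative values only, not taken from the study).
weekly_hazards = [0.02, 0.03, 0.02, 0.01, 0.01, 0.01,
                  0.005, 0.005, 0.005, 0.005, 0.005, 0.005]
curve = survival_curve(weekly_hazards)

for week, prob in enumerate(curve, start=1):
    print(f"week {week:2d}: survival probability {prob:.3f}")
```

Because each factor (1 - h_k) is at most 1, the resulting curve can never increase, which is the property that makes the per-interval predictions readable as a personalized survival curve.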

Original language: English
Article number: 13879
Journal: Scientific Reports
Publication status: Published - 2022

Bibliographical note

Publisher Copyright:
© 2022, The Author(s).
