An interpretable machine learning model for COVID-19 screening

Gustavo Carreiro Pinasco; Eduardo Moreno Júdice de Mattos Farina; Fabiano Novaes Barcellos Filho; Willer França Fiorotti; Matheus Coradini Mariano Ferreira; Sheila Cristina de Souza Cruz; Andre Louzada Colodette; Luciene Rossati Loureiro; Tatiane Comério; Dilzilene Cunha Sivirino Farias; Eliane de Fátima Almeida Lima; Katia Valéria Manhambusque

doi:10.36311/jhgd.v32.13324

Authors

Gustavo Carreiro Pinasco, Universidade Federal do Espírito Santo – UFES, Brazil
Eduardo Moreno Júdice de Mattos Farina, Universidade Federal de São Paulo – UNIFESP, Brazil
Fabiano Novaes Barcellos Filho, Escola Superior de Ciências da Santa Casa de Misericórdia de Vitória – EMESCAM, Brazil
Willer França Fiorotti, Escola Superior de Ciências da Santa Casa de Misericórdia de Vitória – EMESCAM, Brazil
Matheus Coradini Mariano Ferreira, Prefeitura Municipal de Vitória, Brazil
Sheila Cristina de Souza Cruz, Prefeitura Municipal de Vitória, Brazil
Andre Louzada Colodette, Escola Superior de Ciências da Santa Casa de Misericórdia de Vitória – EMESCAM, Brazil
Luciene Rossati Loureiro, Prefeitura Municipal de Vitória, Brazil
Tatiane Comério, Prefeitura Municipal de Vitória, Brazil
Dilzilene Cunha Sivirino Farias, Prefeitura Municipal de Vitória, Brazil
Eliane de Fátima Almeida Lima, Universidade Federal do Espírito Santo – UFES, Brazil
Katia Valéria Manhambusque, Universidade Federal do Espírito Santo – UFES, Brazil


Keywords:

COVID-19, machine learning, artificial intelligence, pandemic

Abstract

Introduction: Coronavirus disease 2019 (COVID-19) is a viral disease that the WHO has declared a pandemic. Diagnostic tests are expensive and not always available. Studies using machine learning (ML) approaches to diagnose SARS-CoV-2 infection have been proposed in the literature to reduce costs and allow better control of the pandemic.

Objective: to develop a machine learning model that predicts whether a patient has COVID-19 from epidemiological data and clinical features.

Methods: we trained six ML algorithms to screen for COVID-19 through diagnostic prediction and performed an interpretability analysis using SHAP and feature importances.
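To make the idea behind a SHAP analysis concrete, the sketch below computes exact Shapley values by brute force for a toy risk score. Everything in it is an assumption for illustration: `toy_score`, its weights, and the three features are invented and are not the study's model or data. The SHAP library computes attributions of this kind far more efficiently for tree models, so this version only illustrates the definition.

```python
from itertools import combinations
from math import factorial


def toy_score(features):
    """Hypothetical risk score; weights are invented, NOT taken from the study."""
    score = 0.10
    if features["fever"]:
        score += 0.30
    if features["cough"]:
        score += 0.20
    if features["fever"] and features["cough"]:
        score += 0.10  # interaction term shared between both features
    if features["nasal_congestion"]:
        score += 0.05
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: weighted average marginal contribution of each feature."""
    names = list(instance)
    n = len(names)
    values = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features outside the coalition are set to their baseline value.
                without = {m: instance[m] if m in subset else baseline[m] for m in names}
                with_feature = dict(without)
                with_feature[name] = instance[name]
                total += weight * (model(with_feature) - model(without))
        values[name] = total
    return values


# Hypothetical patient with all three symptoms, compared against a symptom-free baseline.
patient = {"fever": True, "cough": True, "nasal_congestion": True}
baseline = {"fever": False, "cough": False, "nasal_congestion": False}
phi = shapley_values(toy_score, patient, baseline)
print(phi)
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that lets per-feature contributions be read as a decomposition of a single prediction.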

Results: our best model was XGBoost (XGB), which obtained an area under the ROC curve of 0.752, a sensitivity of 90%, a specificity of 40%, a positive predictive value (PPV) of 42.16%, and a negative predictive value (NPV) of 91.0%. The best predictors were, in order, fever, cough, history of international travel less than 14 days ago, male gender, and nasal congestion.
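The four screening metrics reported above all derive from a confusion matrix. The sketch below shows the relationship using hypothetical counts, chosen only so that sensitivity and specificity come out at 90% and 40%. They are not the study's data. PPV and NPV also depend on disease prevalence in the sample, so they will not match the reported values.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of infected patients flagged positive
        "specificity": tn / (tn + fp),  # share of non-infected patients flagged negative
        "ppv": tp / (tp + fp),          # probability of infection given a positive result
        "npv": tn / (tn + fn),          # probability of no infection given a negative result
    }


# Hypothetical counts (assumption): 100 infected and 200 non-infected patients.
metrics = screening_metrics(tp=90, fp=120, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A high-sensitivity, low-specificity operating point like this one suits screening: few infected patients are missed, at the cost of many false positives that need confirmatory testing.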

Conclusion: ML is an important screening tool with high sensitivity compared with rapid tests and can be used to improve clinical precision in COVID-19, a disease whose symptoms are very nonspecific.


