Prediction of COVID-19 From Hemogram Results and Age Using Machine Learning

Elena Caires Silveira



Introduction: The rapid global spread of COVID-19 prompted major technological efforts to better understand and control the disease. In this context, machine learning has gained prominence, and its application to COVID-19 has been widely documented for pathophysiological, diagnostic, therapeutic, prognostic and monitoring purposes. The present study aimed to build a model that predicts the diagnosis of COVID-19 from patients' blood count results and age, and to identify the features that most influenced the algorithm's predictions.

Material and Methods: Anonymized data from 1157 patients, made available by the COVID-19 Data Sharing/BR repository, were used. The work was carried out in two stages: description and analysis of the data, followed by construction of the predictive model.

Results: All hematological parameters assessed showed a statistically significant association with COVID-19, except hemoglobin measurement, mean corpuscular volume, red cell distribution width, mean platelet volume and neutrophil-lymphocyte ratio. The predictive model, built with the XGBoost classifier, reached an accuracy of 80.0%, with a sensitivity of 75.6% and a specificity of 82.0%. The variables with the greatest influence on the predictions were basophil, eosinophil and leukocyte measurements. The present study confirms the potential of the blood count, a widely available and accessible test, for the diagnostic evaluation and pathophysiological investigation of COVID-19.
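
For readers less familiar with the metrics reported in the Results, the following minimal Python sketch shows how accuracy, sensitivity and specificity are computed from the four cells of a confusion matrix. The class name, function names and counts are hypothetical and chosen purely for illustration; they are not the study's data and do not reproduce its reported figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # COVID-19 positive, predicted positive
    fn: int  # COVID-19 positive, predicted negative
    tn: int  # COVID-19 negative, predicted negative
    fp: int  # COVID-19 negative, predicted positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true positives among all actual positives."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of true negatives among all actual negatives."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Illustrative counts only (assumption, not taken from the study).
example = ConfusionCounts(tp=30, fn=10, tn=80, fp=20)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"specificity = {specificity(example):.3f}")
    print(f"accuracy    = {accuracy(example):.3f}")
```

Because accuracy weights sensitivity and specificity by the share of positive and negative cases, it always lies between the two, as it does for the values reported above.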

Conclusion: This work highlights the relevance of the systematization and dissemination of data related to COVID-19 for use in new research.




