

Data Fuels Detection: How to Prevent Epidemics Using Data



Data collection for disease prevention and tracking should begin before an outbreak. The bottleneck in early outbreak detection is data. Data are collected at different points of care, aggregated, and then analyzed centrally to warn us about what is happening. However, data have not been used in a meaningful way for prevention and tracking during the current pandemic. We believe the prevention problem is a data problem, and that it must be addressed to prevent future pandemics effectively.

Dear Editor

The SARS-CoV-2 pandemic has raised many questions for the research community, including how this illness originated. A team of researchers at the University of Cambridge has used data to estimate that the virus jumped to humans between mid-September and early December 2019 [1]. Data have also proved valuable in other areas of medical research: they have been used to determine the safety and dosage of drugs and where to conduct clinical trials, among other data-informed decisions.

However, data have not been used in a meaningful way for prevention and tracking during the current pandemic. Data collection for disease prevention and tracking should begin before an outbreak. Barardi and colleagues noted that although viruses can be present in aquatic environments, environmental monitoring did not routinely look for virions in drinking water supplies [2]. Data suggest that patients with SARS-CoV-2 shed the virus in their fecal matter. Therefore, had monitoring systems been in place before the start of the pandemic, one could argue that the virus might have been detected in sewage before the first human diagnosis [3]. Many nations already monitor their environments for air and water pollution, and these systems could be adapted to monitor for other types of particles, such as virions. Researchers have developed techniques to monitor environmental samples for the presence of virions [4].

The spread of the SARS-CoV-2 virus from China to the rest of the world indicates a problem in the apparatus for gathering and/or sharing health security intelligence. One would think that with all the intelligence sharing and technology in place today, such a spread could be prevented. It is important to examine what went wrong in this instance in order to prevent the next pandemic. Events such as highly infectious diseases that can affect the entire population of the world must be monitored by a global task force. We need a strong worldwide information infrastructure, independent of any country's level of development, that can collect health data, including environmental, infectious disease, and patient data, at the point of care. Then, if something is missed at the country level for any reason, the global task force can notice it through an alarm system and establish just-in-time prevention strategies at the infection site, cluster, country, or region. Having a global system in addition to national systems can prevent deadly deception by nations that want to avoid blame because a disease originated within their borders.

The bottleneck in early outbreak detection is data. Data are collected at different points of care, aggregated, and then analyzed centrally to warn us about what is happening. The problem is the detection of patient zero. For example, physicians initially mistook SARS-CoV-2 infection for the flu, and in most flu cases they simply tell patients that they have a cold. The problem with the first cases of an epidemic is therefore a problem of data collection. People's first contact with the health system is through physician offices or clinics and, in very severe cases, hospital emergency departments. Leveraging big data and intelligent analytics for public health could improve pandemic response, because the world is more connected than ever before and many mathematical modeling techniques are available [5].

Moving to a more data-driven approach would likely require an education campaign for the general public and strict guidelines to ensure that privacy rights are not violated [5]. Another idea is a system that tracks disease data on a per capita basis, by dividing confirmed cases and deaths by national populations [6].
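The per capita idea can be illustrated with a minimal sketch. The region names and figures below are hypothetical and chosen only to show the arithmetic; the function name `per_capita_rate` and the scaling to 100,000 inhabitants are our assumptions, not details taken from [6].

```python
def per_capita_rate(count: int, population: int, per: int = 100_000) -> float:
    """Scale a raw count (cases or deaths) to a rate per `per` inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * per / population


# Hypothetical figures, for illustration only (not real surveillance data).
regions = {
    "Region A": {"cases": 50_000, "population": 100_000_000},
    "Region B": {"cases": 5_000, "population": 1_000_000},
}

# Confirmed cases per 100,000 inhabitants for each region.
rates = {
    name: per_capita_rate(data["cases"], data["population"])
    for name, data in regions.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} cases per 100,000")
```

In this made-up example, Region A reports ten times as many cases as Region B, yet its rate per 100,000 inhabitants is ten times lower, which is the kind of reversal the per capita view is meant to expose.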

Currently, there is no integrated system to collect data at the point of entry. We need a system that includes data from WHO and national health authorities and that, at a minimum, designates as notifiable a full range of respiratory and gastrointestinal symptoms that can be related to contagious and infectious pathogens, with or without a judgment about the disease behind the symptoms.

We need a web-based global system that is simple but equipped with real-time analytics for real-time warning, real-time reporting, and real-time action at local, national, and international levels. The focus must be on symptoms, as laboratory testing is not available for all of the cases presenting across the spectrum of healthcare systems worldwide. When some "managers" cannot collect, aggregate, and analyze their numbers correctly and in real time even after an epidemic, how can we expect to notice the early alerts needed to prevent one? Hashimoto et al. found that adjustments to specificity and sensitivity resulted in earlier detection and better management of epidemics [7]. This reinforces the statement that what gets measured gets managed, and the importance of measuring environmental changes for proper prevention and management of infectious diseases.
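To make the trade-off between sensitivity and specificity concrete, the following is a minimal sketch of a threshold-based alert evaluated against known epidemic weeks. It is not the procedure of Hashimoto et al.; the weekly counts, the epidemic labels, the threshold values, and the function name `alert_performance` are all hypothetical.

```python
from typing import Sequence, Tuple


def alert_performance(
    counts: Sequence[float], epidemic: Sequence[bool], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of the rule "alert if count >= threshold"."""
    if len(counts) != len(epidemic):
        raise ValueError("counts and epidemic must have the same length")
    tp = fn = fp = tn = 0
    for count, is_epidemic in zip(counts, epidemic):
        alert = count >= threshold
        if is_epidemic and alert:
            tp += 1  # epidemic week, alert raised
        elif is_epidemic:
            fn += 1  # epidemic week, alert missed
        elif alert:
            fp += 1  # non-epidemic week, false alarm
        else:
            tn += 1  # non-epidemic week, correctly silent
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one epidemic and one non-epidemic week")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical weekly symptom reports and retrospective epidemic labels.
weekly_counts = [2, 3, 5, 4, 9, 14, 20, 6]
epidemic_weeks = [False, False, False, False, True, True, True, False]

for threshold in (5, 10):
    sens, spec = alert_performance(weekly_counts, epidemic_weeks, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these invented numbers, the lower threshold catches every epidemic week, including the first one, at the cost of false alarms, while the higher threshold raises no false alarms but misses the first epidemic week. Choosing the threshold is therefore a choice about how early, and how noisily, the system should warn.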

Detection and prevention are often afterthoughts, with treatments and vaccines being the primary focus. Lockdowns on a grand scale paralyze the community. If imposed at all, they should be imposed in the early stage around the pathogen cluster, not in the middle or end stage of the spread. Waiting to implement a lockdown protocol until after the pathogen starts to spread could create additional problems for the health care system and the populations affected.


The authors agree on the final form of the manuscript and attest that all authors contributed to the final draft of the manuscript.


The authors declare no conflicts of interest regarding the publication of this study.


No financial interests related to the material of this manuscript have been declared.


1. Forster P, Forster L, Renfrew C, Forster M. Phylogenetic network analysis of SARS-CoV-2 genomes. Proceedings of the National Academy of Sciences. 2020;117(17):9241–3.
2. Barardi C, Viancelli A, Rigotto C, Correa A, Moresco V, Souza D, et al. Monitoring viruses in environmental samples. International Journal of Environmental Science and Engineering Research. 2012;3:62–79.
3. Medema G, Heijnen L, Elsinga G, Italiaander R, Brouwer A. Presence of SARS-Coronavirus-2 in sewage. medRxiv 2020;
4. Hatsuki R, Honda A, Kajitani M, Yamamoto T. Nonlinear electrical impedance spectroscopy of viruses using very high electric fields created by nanogap electrodes. Front Microbiol. 2015;6:940.
5. Ienca M, Vayena E. On the responsible use of digital data to tackle the COVID-19 pandemic. Nat Med. 2020;26(4):463–4.
6. de Mesnard L. Tracking COVID-19 pandemic: The per-capita approach changes the whole picture. Springer 2020;
7. Hashimoto S, Murakami Y, Taniguchi K, Nagai M. Detection of epidemics in their early stage through infectious disease surveillance. Int J Epidemiol. 2000;29(5):905–10.
