Using AI & ML to Fight Against Pandemics Like the Coronavirus (COVID-19)
This year, the world changed in the span of a few months, in unprecedented ways that surprised and overwhelmed every country on this planet.
The greatest global crisis since World War II and the largest global pandemic since the 1918–19 Spanish Flu fell upon us. Everybody spent the better part of their day watching the daily rise of the death toll and the rapid, exponential spread of SARS-CoV-2, the novel coronavirus that causes COVID-19.
Millions of people lost their jobs, unemployment rose through the roof, the global travel and hospitality industries were all but decimated, international relationships frayed, and healthcare systems were stressed to their limits.
Of course, the fight was not one-sided. Governments, private enterprises, community organizers, healthcare organizations, scientists and engineers, front-line workers, supply chain and logistics organizations – all pitched in to battle against the (still raging) tiny, invisible enemy.
This was, in many ways, the first truly global pandemic of the 21st century, which impacted the largest swath of global population and economies. Therefore, it is also the first time that the most modern and ambitious tools of our scientific and industrial might are being deployed to control and mitigate the impacts of a pandemic.
It is natural, then, to ask: How can the tools of Artificial Intelligence (AI) and Machine Learning (ML) help in the fight against the current and future pandemics?
After all, AI/ML are regularly hailed as the most transformative and promising technologies of the 21st century civilization, and are rightly expected to help humankind fight against future pandemics.
In this article, we discuss a few possible ideas in this regard including:
- Personalized Risk Assessment to Aid Epidemiological Models
- Vaccine Development with the help of Artificial Intelligence (AI)
- Protein Structure Prediction
- Risk Classification & Clustering for Better Contact Tracing
- Digital Surveillance of Epidemics
So, let’s take a look at how artificial intelligence and machine learning can be put to use in fighting against pandemics, now and in the years ahead.
Personalized Risk Assessment to Aid Epidemiological Models
AI and ML are already widely used in recommendation systems and business practices that personalize consumer choices for products and services. Amazon, Netflix, Facebook, and Twitter decide, on the basis of both our personal profile and macro-level data from other users, what to show us: books, movies, household products, friends’ comments, community messages, and so on.
Going forward, the same strategy could work for fighting against future pandemics. Using multiple sources of data, machine-learning models can be trained to model and predict the clinical risk (or at least a probability measure) of an individual suffering severe outcomes if infected with diseases like COVID-19. This can lead to the prediction of the probable usage of critical care resources in a given healthcare system to better allocate resources to those in greatest need.
The end goal determines the choice of model. If raw predictive performance matters most, powerful deep learning algorithms can be used. If explainability is a key requirement, simpler models with fewer parameters, such as decision trees or logistic regression, can be used in a supervised setting. Given a large dataset and careful feature engineering by domain experts and data scientists, even a classical ML algorithm can achieve high accuracy, sensitivity, and specificity.
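As a rough illustration of the explainable end of that spectrum, the sketch below fits a logistic regression by plain gradient descent on synthetic patient records. The two features (age and comorbidity count), the coefficients used to generate the toy labels, and all function names are assumptions made up for this example; none of them are clinical values.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_synthetic_patients(n, rng):
    """Hypothetical data: (scaled age, scaled comorbidity count) -> severe outcome flag."""
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 90)
        comorbidities = rng.randint(0, 4)
        # Assumed relationship, used only to generate toy labels
        true_logit = -6.0 + 0.06 * age + 0.7 * comorbidities
        label = 1 if rng.random() < sigmoid(true_logit) else 0
        rows.append(((age / 100.0, comorbidities / 4.0), label))
    return rows

def fit_logistic_regression(rows, epochs=300, lr=0.5):
    """Batch gradient descent on the log-loss."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def predict_risk(weights, bias, age, comorbidities):
    """Predicted probability of a severe outcome for one individual."""
    x = (age / 100.0, comorbidities / 4.0)
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

rng = random.Random(0)
weights, bias = fit_logistic_regression(make_synthetic_patients(1000, rng))
low = predict_risk(weights, bias, age=30, comorbidities=0)
high = predict_risk(weights, bias, age=80, comorbidities=3)
print(f"weights={weights}, bias={bias:.2f}")
print(f"risk(30y, 0 comorbidities)={low:.3f}  risk(80y, 3 comorbidities)={high:.3f}")
```

The appeal of such a model is that each learned weight can be read directly as the direction and strength of a feature's contribution to the predicted risk, which is much harder to do with a deep network.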
Many epidemiological models already work this way. However, they mostly focus on the disease-spreading dynamics and related mathematical parameters of the particular epidemic at hand, without pulling data from all possible sources. Large-scale ML systems are adept at handling disparate data sources at high velocity, so changing demographics and the overall health patterns of a large population can be integrated into existing epidemiological models through ML algorithms.
Standard epidemiological models operate on macro signals and do not always lead to resource optimization, whereas the characteristics of individuals can be important for estimating critical care requirements in a particular region. With the help of AI/ML, future pandemics could be fought with greater efficiency and less waste of resources.
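One way to picture this combination is a macro-level compartmental model whose output is weighted by individual-level risk estimates. The sketch below runs a simple discrete-time SIR model and converts daily infections into expected intensive-care admissions using risk strata. The strata, their ICU probabilities, and the values of beta and gamma are hypothetical, and the sketch assumes infections are spread across strata in proportion to their population share.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model; returns the number of new infections per day."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    new_infections_per_day = []
    for _ in range(days):
        new_inf = min(s, beta * s * i / population)
        new_rec = gamma * i
        s -= new_inf
        i += new_inf - new_rec
        new_infections_per_day.append(new_inf)
    return new_infections_per_day

def expected_icu_admissions(new_infections_per_day, strata):
    """Weight each day's infections by per-stratum population share and ICU risk."""
    icu_rate = sum(share * risk for share, risk in strata.values())
    return [n * icu_rate for n in new_infections_per_day]

# Hypothetical strata: (share of population, probability of needing ICU if infected).
# In practice the per-person risk would come from a trained ML risk model.
strata = {
    "low_risk": (0.70, 0.002),
    "medium_risk": (0.22, 0.02),
    "high_risk": (0.08, 0.12),
}

daily = simulate_sir(population=1_000_000, initial_infected=10,
                     beta=0.3, gamma=0.1, days=180)
icu = expected_icu_admissions(daily, strata)
peak_day = max(range(len(icu)), key=icu.__getitem__)
print(f"peak expected ICU admissions: {icu[peak_day]:.1f} per day on day {peak_day}")
```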
Vaccine Development with the help of Artificial Intelligence (AI)
The Harvard T.H. Chan School of Public Health and the Human Vaccines Project have announced the Human Immunomics Initiative, a joint effort that will use artificial intelligence models to accelerate vaccine development.
It will bring together experts in epidemiology, causal inference, immunology, statistical modeling, computer science, and computational/systems biology to develop AI-powered models of the human immune system and its response mechanisms, which can be used to accelerate the design and testing of vaccines and therapeutics for a wide range of diseases.
AI-powered models will allow researchers to virtually test potential vaccines, and predict what vaccines and therapies might work best across populations. This could massively speed up vaccine and drug development, and lower costs spent on testing and trials.
AI-powered models also lend themselves to stochastic scenario analysis, which is critical for such an enterprise: multiple vaccine trials may be underway at the same time, and healthcare and government authorities have to make speedy decisions about human trials and distribution by examining various scenarios and weighting them properly. An individual, anecdotal approach is likely to fail in such complex situations; large-scale data analytics is among the best tools we have for making sound decisions.
Although the focus is on a wide variety of diseases, these kinds of AI-based models will be most effective where the largest amount of raw data is available, and a global pandemic such as COVID-19 is a prolific data generator. Ambitious projects of this kind take time to develop robust models and safe drug-design mechanisms and cannot be readily applied to an ongoing pandemic, but they are the right kind of initiative for preparing human society to fight future pandemics.
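To make the idea of scenario analysis concrete, here is a toy Monte Carlo sketch that estimates the chance of at least one vaccine candidate succeeding within a planning horizon. The candidate names, success probabilities, and readout times are invented for illustration, and the trials are assumed to be independent; a real analysis would draw these inputs from data and model far more structure.

```python
import random

# Hypothetical candidates: (assumed probability of trial success, months until readout)
candidates = {
    "candidate_a": (0.30, 8),
    "candidate_b": (0.20, 6),
    "candidate_c": (0.45, 12),
}

def simulate_once(rng, horizon_months):
    """Return True if at least one candidate succeeds within the horizon."""
    for p_success, months in candidates.values():
        if months <= horizon_months and rng.random() < p_success:
            return True
    return False

def estimate_probability(horizon_months, n_runs=20_000, seed=0):
    """Monte Carlo estimate of the probability of at least one success."""
    rng = random.Random(seed)
    hits = sum(simulate_once(rng, horizon_months) for _ in range(n_runs))
    return hits / n_runs

for horizon in (6, 9, 12):
    print(horizon, round(estimate_probability(horizon), 3))
```

For independent trials this probability also has a closed form, so the simulation is overkill here; its value is that the same loop still works once correlated outcomes, delays, or supply constraints are added to each simulated scenario.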
Protein Structure Prediction
Global pandemics such as COVID-19 are most often caused by viruses. At the fundamental structural level, a virus consists mainly of genetic material, one or a few strands of DNA or RNA, enclosed in a protein coat. The genetic sequence tells us the order of amino acids in each viral protein; determining the 3D structure, i.e., how that chain of amino acids folds in space, is key to developing certain classes of vaccines, such as the subunit and nucleic acid types.
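The step from genetic data to an amino acid sequence is the easy part and can be sketched in a few lines: each group of three nucleotides (a codon) maps to one amino acid under the standard genetic code. The input string below is an arbitrary illustrative sequence, and the function name is made up for this example. Predicting how the resulting chain folds in 3D is the hard problem discussed next.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered by first, second, then third base over T, C, A, G.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(sequence):
    """Translate a DNA or RNA coding sequence into one-letter amino acid codes."""
    dna = sequence.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGUUUGUUUUUCUUGUUUUAUAA"))  # prints MFVFLVL
```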
This task is computationally infeasible (no matter how much hardware you throw at it) if attempted by exhaustively searching the possible conformations with conventional protein-folding algorithms. Artificial intelligence can play a significant role in solving this challenge with techniques such as deep learning, reinforcement learning, and Bayesian optimization.
On that cue, DeepMind, Google's well-known AI research lab, introduced AlphaFold, a deep learning system that predicts the 3D structure of a protein from its amino acid sequence. In early March, the system was put to the test on COVID-19: researchers at DeepMind released structural predictions of several under-studied proteins associated with SARS-CoV-2 to help the worldwide clinical and virology research community better understand the virus and its impact on human biology.
It is a strong testament to the generality of deep learning techniques that the lab behind AlphaGo (which beat world champion Lee Sedol at the classical game of Go) could bring related neural-network methods to protein structure prediction, given a suitable injection of domain knowledge.
Other research groups, including teams at UT Austin and the University of Washington, are trying to build 3D atomic-scale models of the SARS-CoV-2 spike protein, which attaches to human cells. They employ AI tools to search for the optimal structure among a host of candidate designs.
Risk Classification & Clustering for Better Contact Tracing
One lesson learned from COVID-19 is that forceful government interventions such as shelter-in-place orders are sustainable only up to a point, beyond which enormous economic burdens start to pile up. Widespread testing and contact tracing have therefore been acknowledged as among the best policies for mitigating the spread of a virus once the most critical phase of a pandemic has passed.
Traditional contact tracing techniques are dependent on isolated data chunks gathered from individual testing centers and government/health authorities. When tens of millions of data points start streaming in, conventional techniques can easily fail.
Drawing on the same idea as above, we can put AI and ML techniques to use for real-time classification and clustering of micro-populations that are at elevated risk of contracting or spreading the disease. This would be very helpful for isolation and contact tracing, even with limited resources.
Advanced clustering techniques such as DBSCAN, hierarchical agglomerative clustering, multi-exemplar affinity propagation (MEAP), graph-based multi-prototype competitive learning (GMPCL), and clustering based on geospatial regression techniques, can be brought to bear on this problem.
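As a small illustration of the first technique in that list, the sketch below is a minimal, unoptimized DBSCAN written from scratch and applied to made-up case locations. The coordinates, the eps radius, and the min_pts threshold are assumptions chosen for the example; a production system would use a tuned library implementation with a spatial index.

```python
import math

def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical (x, y) locations of positive cases, in kilometres
cases = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.15, 0.05),   # one neighbourhood
    (5.0, 5.0), (5.1, 5.2), (4.9, 5.1), (5.05, 4.95),   # another neighbourhood
    (10.0, 0.0),                                        # isolated case
]
print(dbscan(cases, eps=0.5, min_pts=3))  # prints [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Unlike k-means, this approach needs no preset number of clusters and leaves isolated cases unassigned, which suits outbreak data where the number of hotspots is unknown in advance.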
Several of these methods have variants designed for large-scale streaming data, which suits a scenario where ever-increasing testing and travel data constantly feed into the ML system.
On top of clustering, dimensionality reduction techniques can also be used on this kind of data to identify the key factors which are giving rise to such clusters. These factors can be communicated to the appropriate authorities for high-level policy decisions with regard to travel, testing, isolation, and other suitable community-based actions.
Digital Surveillance of Epidemics
Ever-growing amounts of data on social media, blogs, chat rooms, and local news reports give us daily clues about disease outbreaks. This trend will only grow as more people (particularly in countries like India, Brazil, South Africa, or China) go online to share their fears and symptoms, search for medicine or doctors, and discuss governmental and healthcare policies.
Digital surveillance is the next-generation AI-powered tool that promises to track these conversations, data streams, search patterns, and the associated digital demographics – at an exabyte scale – to model, predict, and warn healthcare systems and Governments about emerging epidemics throughout the world.
The efficacy of such digital tools has already been demonstrated. Nearly a week before the WHO first warned of a mysterious new respiratory disease in Wuhan, China, a team of global disease experts based in Boston captured digital clues about the outbreak from online press reports and released their findings in a real-time monitoring system called HealthMap.
The HealthMap website presents an interactive global map that is updated every hour.
This is being touted as Digital Epidemiology, where traditional mathematical models are being replaced with or complemented by machine learning and pattern-finding models, generated by Big Data technologies. The key advantages of this approach are, not surprisingly, speed and volume.
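A very simple version of such a pattern-finding model is an anomaly detector over daily mention counts. The sketch below flags days on which a keyword's count jumps well above its recent baseline. The counts, the seven-day window, and the threshold of three standard deviations are hypothetical choices for illustration; this is not how HealthMap or any specific system works.

```python
from statistics import mean, stdev

def flag_spikes(daily_counts, window=7, threshold=3.0):
    """Flag days whose count exceeds the trailing-window mean by `threshold` standard deviations."""
    flagged = []
    for day in range(window, len(daily_counts)):
        history = daily_counts[day - window:day]
        mu = mean(history)
        sigma = max(stdev(history), 1.0)  # floor so a flat baseline does not amplify tiny changes
        z = (daily_counts[day] - mu) / sigma
        if z > threshold:
            flagged.append(day)
    return flagged

# Hypothetical daily counts of posts mentioning "pneumonia" in one city
counts = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15, 41, 58, 90]
print(flag_spikes(counts))
```

Here the first flagged day is day 10, when the count jumps from the mid-teens to 41. A real system would also need to separate genuine signals from media-driven attention, which is where the data veracity issue comes in.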
The trustworthiness, or so-called ‘veracity’, of disparate data sources remains a pressing issue. That said, ML researchers have long preferred to work with diversified data sources, which can be readily ingested by ensemble models (e.g., gradient-boosted trees) so that predictive power is spread across many inputs and bias in the models is reduced.
Looking Into the Future of AI & ML Predicting Spread of Pathogens
In this article, we took a quick tour of the various promising technologies and initiatives that are using AI/ML tools and techniques for solving the great challenge of modeling, mitigating, and predicting the spread of infectious pathogens, which cause global pandemics.
As the world becomes more digitally connected, the healthcare systems and policy initiatives which embrace data-driven technologies (like artificial intelligence and machine learning) are likely to stay ahead in the battle against epidemics.