Glowing globe with lots of parallel lines and colorful dots pushing through

Simulation-based Forecasts Best, Study Finds

Contagions, severe weather, natural disasters, civil unrest – whatever data scientists are forecasting using network models, simulation-based methods are often the most effective, according to researchers at the University of Virginia Biocomplexity Institute and Princeton University, whose findings were published in the paper, “Fundamental Limitations on Efficiently Forecasting Certain Epidemic Measures in Network Models,” by PNAS (Proceedings of the National Academy of Sciences).

“Our results indicate that there are inherent limitations to purely analytic approaches for efficiently computing epidemic parameters when using network models,” said Daniel Rosenkrantz, Distinguished Institute Professor at the Biocomplexity Institute. The findings suggest that “detailed simulation-based approaches are the most computationally feasible methods of generating forecasts,” which leaders and policymakers rely on for making public health and safety decisions, he said.

Researchers at the Biocomplexity Institute initiated this study in 2015, which over the years, has been funded by UVA, the National Science Foundation, the National Institutes of Health, and the Defense Threat Reduction Agency. The UVA and Princeton collaboration is also part of recent NSF-funded Expeditions and PREPARE (Pandemic Research for Preparedness and Resilience) projects, headed by the Institute.

Contributors to the paper from UVA include: Madhav Marathe, S. S. Ravi, Rosenkrantz, Richard Stearns and Anil Vullikanti; and Simon Levin and H. Vincent Poor from Princeton. Marathe and Poor are the corresponding authors for the paper.

Impetus for the Study

A number of researchers at the Institute were developing simulation-based methods for forecasting the values of certain parameters that quantify the progression of an epidemic in a networked population, said Ravi, a Research Professor at the Institute.

“This work identified a number of practical difficulties – noisy data, the probabilistic nature of the epidemic process, the change in the value of model parameters over time – in developing reliable forecasts,” he said.

Forecasting epidemic dynamics is a very active research topic. The Centers for Disease Control (CDC) has established Flu Forecasting and COVID-19 forecasting hubs to support this effort. However, providing accurate and timely forecasts continues to be challenging due to a multitude of factors. This was evidenced by recent efforts in forecasting COVID-19 – researchers had a hard time precisely forecasting the dynamics due to rapidly changing circumstances. As a result, the CDC paused displaying COVID-19 forecasts after a year-long effort.

“With our research experience in the foundational aspects of computer science, we wanted to investigate whether one can devise efficient analytical techniques that can be used to compute the epidemic measures without using simulations,”

said Anil Vullikanti, a Professor at the Institute and a Professor of Computer Science in UVA’s School of Engineering and Applied Science. “Specifically, we wanted to understand whether there are fundamental difficulties above and beyond the practical difficulties, due to computational complexity that might limit our ability to compute epidemic parameters.”

Richard Stearns, a Distinguished Institute Professor at the Biocomplexity Institute and winner of the prestigious A. M. Turing Award from ACM (the Association for Computing Machinery) in 1993 for his work that established the foundations of computational complexity, said “viewing forecasting problems through the lens of computational complexity provides new insights into the potential challenges for producing accurate forecasts.”

The researchers developed rigorous formulations of short-term forecasting problems (i.e., forecasting problems where the time-horizon is small) and were able to prove that the resulting problems are computationally intractable, Ravi said. “Our main results show that the existence of efficient algorithms for short-term forecasting problems would violate a widely believed complexity theoretic hypothesis,” added Marathe, Division Director at the Biocomplexity Institute and Professor of Computer Science at UVA. “On the positive side, our work also identified restricted versions of forecasting problems which can be solved efficiently.”

Any surprises?

“We initially expected our results for forecasting problems to hold over a long time-horizon. So, we were surprised to see that the results hold even for a very short time-horizon, two or three time-steps into the future,” Ravi said. As the work progressed, the researchers were also surprised to see that the results are valid for many forecast parameters (e.g., peak value of the number of infections, percentage of the population infected by a certain time) and special classes of social networks (e.g., power-law networks, small-world networks).

Going forward

“Our main results bring out inherent computational limitations in forecasting epidemic dynamics even when one ignores the other sources of difficulty, such as noisy data,” Marathe said. “So, efficient analytical methods that work for all inputs are unlikely to exist.” Researchers interested in developing methods that work across a range of inputs should focus on exploiting special properties of inputs, Ravi and Rosenkrantz said. “Our positive results provide several examples of special classes of inputs for which forecasting problems can be solved efficiently,” Vullikanti noted. The UVA and Princeton researchers plan to continue their work. Next, they will explore whether efficient forecasting algorithms can be developed when the underlying social network has special structural properties.