# Epidemiology Models: A History

*The review was prepared by Liana Pankratova, Alla Loseva*

An epidemiological model is a mathematical way to predict the course of an epidemic. Models help evaluate the spread of infection, characteristics of vulnerable populations, the optimal age for vaccination, and other social and economic factors associated with the disease. The findings of such studies are used by public health organizations to successfully combat the spread of infection.

In previous posts of the epidemic modeling series, we discussed when simple and more complex models are used. Today’s review is devoted to the development of epidemiological modeling over time and different approaches to it.

For the review, we ran a systematic search in the scientific literature database Web of Science and built a map of publications based on their reference lists (see *Figure 1*). The map displays “citation trees” in which the earliest studies (at the top) are cited by later ones (further down). Thus, it reflects the dynamics of the research field. Publications that belong to the same cluster in this map cite the same “classic” research and continue one research tradition. The map was built with the CitNetExplorer software.

Epidemic modeling studies can be split into four clusters:

- yellow: how measles spreads,
- navy: ordinary differential equations and stochastic models,
- orange: models on networks,
- purple: models of new epidemics.

In the scientific field of epidemic modeling, the most cited classics are physicians of the early 20th century: Sir Ronald Ross, William Hamer, Anderson McKendrick, and William Kermack. They laid the foundations of a mathematical approach in epidemiology built on compartmental models (Brauer 2017).

The next stage in the development of epidemiological models began at the turn of the 1950s and 60s. The authors of that time most often cited by epidemiologists are tropical medicine professor George Macdonald, statisticians Norman Bailey and Maurice Bartlett, and mathematician Paul Erdős. Macdonald continued the research on malaria initiated by Ross and introduced the concept of the basic reproduction number. Bartlett developed a stochastic analogue of the Kermack–McKendrick model. Erdős appears on the map as a co-author of the random graph generation model, in connection with the establishment of a new approach to the study of epidemics: modeling on networks.

The main body of research in epidemiological modeling has appeared since the 1970s. It includes spatial and probabilistic models, as well as studies on the spread of new viruses in the context of globalization.

**Yellow cluster: how measles spreads**

This cluster begins with the papers that model measles spread. In later publications, previous models are improved through the inclusion of new factors and are applied to various diseases.

Hamer (1906) emphasized that the spread of infection is affected not only by the infectiousness of the pathogen. He hypothesized that the course of the epidemic depends on both the number of infected people and the number of susceptible people. This is analogous to the law of mass action, formulated in chemistry a few decades earlier, where the rate of a reaction depends on the concentrations of the reactants. This idea became the basis for compartmental models.
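In modern notation, the mass-action assumption is usually written as a product of the two group sizes. The symbols below are conventional textbook notation and an assumption of this sketch, not Hamer's own: $S$ is the number of susceptible people, $I$ the number of infected people, and $\beta$ a transmission coefficient.

$$
\text{new infections per unit time} = \beta \, S \, I
$$

Doubling either the number of susceptible people or the number of infected people doubles the rate of new infections, just as doubling the concentration of one reactant doubles the reaction rate.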

The main part of the cluster follows the studies of Hamer and Bartlett. Bartlett's research started the development of stochastic models of epidemic processes. Bartlett (1957, 1960) used a stochastic version of the Kermack–McKendrick compartmental model to find the critical community size, that is, the population size below which the infection dies out instead of persisting.
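To illustrate what "stochastic version of a compartmental model" means, here is a minimal sketch of a stochastic SIR epidemic simulated event by event (the Gillespie approach). It is not Bartlett's own formulation: it omits births, so it cannot reproduce the critical community size itself, but it shows the underlying effect that chance alone can extinguish an outbreak. The function names and the parameter values (`beta`, `gamma`, the 10% cut-off) are assumptions chosen for illustration.

```python
import random


def gillespie_sir(n, i0, beta, gamma, rng):
    """Run one stochastic SIR epidemic until no infectious people remain.

    Two event types occur at random times: infection at rate
    beta * s * i / n and recovery at rate gamma * i.
    Returns (duration, total number of people ever infected).
    """
    s, i, r = n - i0, i0, 0
    t = 0.0
    while i > 0:
        rate_infection = beta * s * i / n
        rate_recovery = gamma * i
        total_rate = rate_infection + rate_recovery
        t += rng.expovariate(total_rate)      # waiting time to the next event
        if rng.random() * total_rate < rate_infection:
            s, i = s - 1, i + 1               # one susceptible person is infected
        else:
            i, r = i - 1, r + 1               # one infectious person recovers
    return t, r


def minor_outbreak_fraction(n, runs, beta=0.5, gamma=0.25, seed=1):
    """Share of runs in which fewer than 10% of the population get infected."""
    rng = random.Random(seed)
    minor = sum(
        gillespie_sir(n, 1, beta, gamma, rng)[1] < 0.1 * n for _ in range(runs)
    )
    return minor / runs
```

Calling `minor_outbreak_fraction(200, 200)` runs 200 epidemics that each start from a single case. Even though these illustrative parameters lie above the deterministic threshold, a sizeable share of runs end after only a handful of infections, which a deterministic model cannot show.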

In the subsequent period, models began to include additional factors, such as seasonality, to assess fluctuations in large and small outbreaks of the disease (Aron and Schwartz 1984; Fine and Clarkson 1982; Schwartz 1985). Bolker and Grenfell (1995) included a spatial component in the measles epidemic model, which makes it possible to compare geographic regions and to establish a link between human mobility and outbreaks of infection.

Models then became more complex. Keeling and Rohani (2002) test a standard way to model the connection between regions against a model based on exact mobility data. Bjørnstad, Finkenstädt, and Grenfell (2002) include time series in a model that captures both endemic cycles and episodic measles outbreaks. Keeling and Grenfell (1997) further develop the idea of critical community size, explaining fluctuations in the number of people infected with measles.

On the left side of the cluster are papers citing Macdonald (1957). This group of publications is closer to the navy cluster, since Macdonald directly continued Ross’s research on malaria. It is in his work that the concept of the basic reproduction number was introduced, although the underlying threshold idea had already been used by Ross, Kermack, and McKendrick.

**Navy cluster: ordinary differential equations and stochastic models**

The navy cluster shows how research approaches evolved from ordinary differential equations to models that incorporate probabilities and nonlinear patterns of infection spread.

A classic of this cluster is the epidemiological model of malaria transmission (Ross 1911). In 1902, Ross received the Nobel Prize in medicine for demonstrating the dynamics of transmission of malaria between mosquito and human populations (Brauer 2017). Before this, it was believed that malaria could not be defeated unless all mosquitoes were exterminated. However, Ross used a simple compartmental model to show that lowering the number of mosquitoes below a critical level would be enough to stop the spread of the disease.

The publication of Kermack and McKendrick (1927) was the next step in the development of compartmental models. The researchers found that each combination of infectivity, recovery, and mortality rates has a threshold population density, and that the number of infected people increases only if this critical point is exceeded. The theory developed by the authors became the basis for SIR modeling.
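The threshold effect can be sketched with the simplest SIR model, in which people move from susceptible (S) to infected (I) to recovered (R). The code below is a minimal illustration under stated assumptions, not the original Kermack–McKendrick formulation: it works with population fractions, ignores mortality, integrates with a plain forward Euler step, and uses made-up values for the transmission rate `beta` and recovery rate `gamma`.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Integrate the SIR equations with a simple forward Euler scheme.

    s, i, r are fractions of the population; beta is the transmission
    rate and gamma the recovery rate (both per day).
    Returns the final (s, i, r) and the peak infected fraction.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    peak_i = i
    steps = int(round(days / dt))
    for _ in range(steps):
        new_infections = beta * s * i * dt    # mass-action incidence
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak_i = max(peak_i, i)
    return s, i, r, peak_i


def epidemic_grows(beta, gamma, s0):
    """Threshold condition: infections increase only if beta * s0 / gamma > 1."""
    return beta * s0 / gamma > 1.0
```

With `beta=0.5` and `gamma=0.25` the condition in `epidemic_grows` holds and the infected fraction rises to a peak before declining. With `beta=0.2` and the same `gamma` it does not hold, and the infected fraction only decreases from its initial value.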

The central studies in the cluster are devoted to the development and application of nonlinear models. Hethcote (1976, 1978) develops compartmental models, taking into account the spatial distribution of people in a population. Several papers analyze Hopf bifurcations: they find the critical parameter value (the bifurcation point) at which the behavior of the model changes qualitatively, for example from a stable equilibrium to sustained oscillations (Hethcote and Driessche 1991; Hethcote, Stech, and Driessche 1981; Huang, Cooke, and Castillo-Chavez 1992; Liu, Levin, and Iwasa 1986).

Some research is dedicated to disease vectors – insects and animals (Anderson et al. 1981; Anderson and May 1981, 1982; May and Anderson 1979; Murray, Stanley, and Brown 1986). Another important topic in this cluster is sexually transmitted infections (Dietz and Hadeler 1988; Hadeler and Castillo-Chavez 1995; Hyman and Stanley 1988; May and Anderson 1987).

More recent studies discuss the basic reproduction number R₀, methods for its calculation, and its use in epidemiological models (Driessche and Watmough 2002; Heesterbeek 2002; Heffernan, Smith, and Wahl 2005; Hethcote 2006). In addition, researchers derive threshold values for controlling the spread of diseases in the SEIRS model, which takes into account the incubation period of the disease (Cooke and Driessche 1996; Li et al. 1999).
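As a worked illustration, consider a generic textbook SEIRS model written in population fractions. This is a sketch under stated assumptions and not necessarily the formulation used in the papers cited above. All symbols are assumptions of the sketch: $\beta$ is the transmission rate, $\sigma$ the rate of leaving the incubation (exposed) stage, $\gamma$ the recovery rate, $\omega$ the rate of losing immunity, and $\mu$ the birth and death rate.

$$
\begin{aligned}
\frac{dS}{dt} &= \mu - \beta S I - \mu S + \omega R, \\
\frac{dE}{dt} &= \beta S I - (\sigma + \mu) E, \\
\frac{dI}{dt} &= \sigma E - (\gamma + \mu) I, \\
\frac{dR}{dt} &= \gamma I - (\omega + \mu) R.
\end{aligned}
$$

For this system, the basic reproduction number is

$$
R_0 = \frac{\beta \, \sigma}{(\sigma + \mu)(\gamma + \mu)} .
$$

The factor $\sigma / (\sigma + \mu)$ is the probability of surviving the incubation stage, and $\beta / (\gamma + \mu)$ is the number of new infections one infectious person causes in a fully susceptible population. The infection can invade only if $R_0 > 1$. Loss of immunity ($\omega$) does not enter $R_0$ in this formulation.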

**Orange cluster: models on networks**

The orange cluster is dedicated to the network approach to epidemic modeling. The population is represented as a network, where the nodes are people, and the connections between them indicate contacts. From an infected individual, the virus is transmitted to those who come into contact with them, and then further over the network. This cluster often refers to the random graph theory developed by Erdős and Rényi (1959, 1960, 1961).
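The network approach can be sketched in a few lines. The example below builds a random graph in which every pair of people is linked independently with probability `p`, then lets an infection pass along the links. It is a minimal illustration with assumed names and a simplified rule (each infected person gets one chance to infect each susceptible contact and then recovers), not a model taken from the cited papers.

```python
import random


def random_graph(n, p, rng):
    """Random graph on n nodes: each pair is linked with probability p."""
    neighbors = {node: set() for node in range(n)}
    for a in range(n):
        for b in range(a + 1, n):
            if rng.random() < p:
                neighbors[a].add(b)
                neighbors[b].add(a)
    return neighbors


def spread_on_network(neighbors, seed_node, transmit_prob, rng):
    """Discrete-time SIR on a network.

    In each round, every infected node tries once to infect each
    susceptible neighbor, then recovers. Returns the set of all nodes
    that were ever infected.
    """
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for node in infected:
            for contact in neighbors[node]:
                susceptible = (
                    contact not in infected
                    and contact not in recovered
                    and contact not in newly_infected
                )
                if susceptible and rng.random() < transmit_prob:
                    newly_infected.add(contact)
        recovered |= infected
        infected = newly_infected
    return recovered
```

For example, `spread_on_network(random_graph(500, 0.01, rng), 0, 0.3, rng)` with `rng = random.Random(7)` returns the set of people reached from person 0. Varying `p` and `transmit_prob` shows how the structure of contacts, and not only the infectiousness of the pathogen, limits the size of an outbreak.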

A large share of the cluster is dedicated to the transmission of HIV / AIDS through networks of social contacts (Gupta, Anderson, and May 1989; Jacquez et al. 1988; Klovdahl et al. 1994; May and Anderson 1988).

Some studies develop models that take into account spatial heterogeneity of the population (Ball 1983, 1986; Longini 1988; May and Anderson 1984).

More recent publications also use random graphs, and they consider not the appearance of the first infected people, but the first transmission of the virus to be the beginning of an epidemic (Callaway et al. 2000; Newman 2002; Newman, Strogatz, and Watts 2001). In other studies, the analysis is performed on scale-free networks (Dezső and Barabási 2002) and multilevel networks (Sahneh, Scoglio, and Mieghem 2013; Watts et al. 2005), or is generalized to different types of networks (Chakrabarti et al. 2008).

**Purple cluster: models of new epidemics**

The purple cluster brings together studies in which complex network models are used to simulate epidemics, including influenza pandemics in the era of globalization, when infections travel rapidly around the world due to long-distance journeys (see, e.g., Hufnagel, Brockmann, and Geisel 2004).

The cluster opens with papers on the 1968–1969 Hong Kong flu, its spread, and vaccination effectiveness (Elveback et al. 1976; Longini, Ackerman, and Elveback 1978; Longini et al. 1982).

Some publications on influenza pandemics use stochastic models to evaluate the effectiveness of various measures to control the epidemic, such as vaccination and social distancing (Ferguson et al. 2005; Longini 2004; Longini et al. 2005). Other studies include factors such as the mutation of the virus, its resistance to drugs, and travel as a catalyst for the spread of viruses (Ferguson, Galvani, and Bush 2003; Grais, Ellis, and Glass 2003; Stilianakis, Perelson, and Hayden 1998).

Models were also developed to simulate the spread and control of smallpox (Bauch, Galvani, and Earn 2003; Ferguson et al. 2003) and of severe acute respiratory syndrome (SARS), an epidemic that unfolded in those years (Chowell et al. 2003; Lipsitch et al. 2003).

Another important direction is the study of human mobility as one of the factors in the spread of viral infections using spatial models, including those on networks (Bajardi et al. 2011; Balcan et al. 2009; Gonzalez, Hidalgo, and Barabási 2008; Riley 2007).

Please proceed to *page 2* to see general reviews on the history of epidemic modeling and the description of our data.