The review was prepared by Katya Tulubenskaya, Alla Loseva
To predict the spread and duration of an epidemic, scientists model virus transmission. Models vary in their level of detail. Some describe only infection and recovery: if there are infection spreaders, then a certain share of people without immunity will become infected, and a share of the infected will recover. Other models take into account additional factors, such as immunity acquired through vaccination. Of course, this last adjustment can be included only if a vaccine against the virus exists. The level of detail in a model therefore depends directly on the virus whose spread it is meant to reflect.
This is our second post on epidemic modeling. In the first post, we looked at a simple model called SIR. It assumes that at any given time the population is divided into three groups, or classes, between which people move sequentially over the course of the epidemic. The model’s name is an abbreviation of the class names: S – susceptible, that is, without immunity to the disease; I – infected and spreading the virus; and R – recovered and immune. Because of this division into classes, the SIR model is called compartmental.
The SIR model is appropriate only if additional processes can be disregarded, such as immunity waning over time and allowing reinfection, or transmission of the virus not from person to person but through animal hosts or water. The simple SIR model therefore applies well to diseases that result in lifelong immunity: measles, rubella, and mumps.
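As a minimal sketch of the SIR dynamics described above, the following Python snippet integrates the standard SIR equations with forward Euler steps. The helper `simulate_sir`, the parameter names `beta` (transmission rate) and `gamma` (recovery rate), and all numeric values are illustrative assumptions, not taken from the studies reviewed here.

```python
def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    s0, i0, rec0 are the initial shares of susceptible, infected and
    recovered people; they are assumed to sum to 1.
    """
    s, i, r = s0, i0, rec0
    history = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i * dt   # S -> I
        new_recoveries = gamma * i * dt      # I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    run = simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, rec0=0.0, days=160)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak infected share {peak_infected:.3f} at day {peak_time:.1f}")
```

Because people only move from S to I and from I to R, the susceptible share can never grow in this sketch, which is why it suits diseases with lifelong immunity.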
Today we will look at compartmental models with additional classes that are designed to model other types of diseases. The most commonly used classes are:
- E – exposed, infected and in the incubation period, without spreading the virus. The SEIR model, accordingly, helps to model the spread of infections that do not manifest themselves immediately.
- C – carrier, recovered but continuing to spread the infection. The carrier state model is used for infections that can progress to a chronic stage in which the patient continues to infect others. This is the case, for example, with hepatitis B (Cao et al. 2014).
- D – dead from the disease. This class will be especially important in models for the spread of diseases with high mortality, such as Ebola.
- M – maternally derived immunity, immunity from birth. MSEIR models are sometimes very complex, because they take into account the process of gradual fading of immunity and, accordingly, the increasing likelihood of infection.
A model’s name lists its classes in the order in which people pass through them. For example, the SEIS model means that individuals are susceptible to the virus (S), then some of them become infected and enter the incubation period (E), after which these people begin to infect others (I), and after recovering they become susceptible again, because the disease does not result in immunity.
There are other ways of moving between model classes:
- SIRS: for diseases that leave temporary immunity, so that recovered individuals become vulnerable again only after some time;
- SIS: a simplified model for diseases that do not confer immunity – for example, respiratory infections;
- models with vaccination: in which some susceptibles move directly to class R, which now stands for “resistant”.
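The different orders of moving between classes can be sketched as lists of flows. The snippet below is a minimal illustration, assuming mass-action infection and constant per-day rates; the names `simulate`, `BETA`, `GAMMA`, `SIGMA`, `OMEGA` and their values are hypothetical.

```python
BETA, GAMMA, SIGMA, OMEGA = 0.3, 0.1, 0.2, 0.01  # illustrative rates per day


def infection(target):
    """Flow out of S caused by contact with infectious people."""
    return ("S", target, lambda x: BETA * x["S"] * x["I"])


# Each flow is (source class, target class, rate function).
SIS = [infection("I"), ("I", "S", lambda x: GAMMA * x["I"])]
SIRS = [infection("I"),
        ("I", "R", lambda x: GAMMA * x["I"]),
        ("R", "S", lambda x: OMEGA * x["R"])]   # immunity wanes
SEIS = [infection("E"),
        ("E", "I", lambda x: SIGMA * x["E"]),   # incubation ends
        ("I", "S", lambda x: GAMMA * x["I"])]


def simulate(flows, state, days, dt=0.1):
    """Advance a compartmental model with forward Euler steps."""
    state = dict(state)
    for _ in range(int(round(days / dt))):
        # Evaluate all flows on the same state before applying them.
        moved = [(src, dst, rate(state) * dt) for src, dst, rate in flows]
        for src, dst, amount in moved:
            state[src] -= amount
            state[dst] += amount
    return state


if __name__ == "__main__":
    print(simulate(SIS, {"S": 0.99, "I": 0.01}, days=500))
    print(simulate(SIRS, {"S": 0.99, "I": 0.01, "R": 0.0}, days=500))
    print(simulate(SEIS, {"S": 0.99, "E": 0.0, "I": 0.01}, days=500))
```

Keeping the flows as data makes it easy to see that the model names differ only in which classes exist and where people go after leaving class I.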
In addition, different sets of classes can be used in different types of models. Standard epidemiological models are based on ordinary differential equations that describe the share of people in each class at any given time. However, such models leave out important details, such as different degrees of susceptibility to the virus or patterns of local contacts between people (White, del Rey, and Sánchez 2007). Models are therefore being developed that include not only different classes of people but also the features of their interactions: for example, network models, cellular automata, agent-based models, and probabilistic, spatial, and demographic models.
Here we consider a map of publications that use variations of the SIR model. For the review, we performed a systematic search in the Scopus database of scientific literature and built a map of publications based on their reference lists (Figure 1). Proximity on the map and membership in the same cluster mean that papers cite the same publications and are therefore likely to address similar issues. The map was built with the VOSviewer software.
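The idea that papers are close when they cite the same publications can be illustrated by counting shared references for each pair of papers. This is only a sketch of the underlying similarity measure with hypothetical reference lists; it does not reproduce how VOSviewer lays out or clusters the map.

```python
from itertools import combinations


def coupling_strength(reference_lists):
    """Count the references shared by every pair of papers."""
    strengths = {}
    for a, b in combinations(sorted(reference_lists), 2):
        shared = len(set(reference_lists[a]) & set(reference_lists[b]))
        if shared:
            strengths[(a, b)] = shared
    return strengths


# Hypothetical papers and reference identifiers.
papers = {
    "paper_A": ["ref1", "ref2", "ref3"],
    "paper_B": ["ref2", "ref3", "ref4"],
    "paper_C": ["ref5"],
}
links = coupling_strength(papers)
print(links)  # paper_A and paper_B share two references; paper_C shares none
```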
Studies that use variations of the SIR model are divided into seven clusters:
- orange, top left, and blue, bottom left: age groups and vaccination,
- gray, left: spatial models,
- light blue, on the top: random processes,
- purple, center: generalized models,
- navy, bottom right: models on networks,
- yellow, bottom center: models on real data.
Cluster description
Orange and blue clusters: age groups and vaccination
Realistic epidemic models should take into account the age structure of the population, as emphasized by the author of the most notable work in this cluster and on the entire map (Hethcote 2000). One reason is that people of different ages interact in different ways: students interact daily with a large number of other students, while older adults communicate less regularly and with fewer people. The risk of getting infected and the chances of recovering also sometimes change with age. People get vaccinated at certain ages, too (Korobeinikov 2007). And if the model includes natural population growth and decline, then the corresponding rates will also depend on the age of individuals.
Age-stratified models are sometimes used to determine the optimal timing and strategy of vaccination. For example, a study by de Blasio, Iversen, and Tomba (2012) from the blue cluster shows that during the 2009 swine flu epidemic in Norway, vaccination should have started 6 weeks earlier for the best result. But if children had been vaccinated first, as the incidence among them was higher, then the same result could have been achieved by starting vaccination 5 weeks earlier. That is, vaccination aimed at risk groups is effective even with a slight delay.
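One common way to encode age-dependent mixing is a contact matrix. The sketch below computes the daily infection risk for each age group from such a matrix; the function name, the two age groups, and all numbers are hypothetical and serve only to show the mechanics.

```python
def force_of_infection(contacts, infected_share, transmission_prob):
    """Daily infection risk per age group.

    contacts[a][b] is the average number of daily contacts a person in
    group a has with people in group b; infected_share[b] is the share
    of group b that is currently infectious.
    """
    return [
        transmission_prob * sum(c * i for c, i in zip(row, infected_share))
        for row in contacts
    ]


# Hypothetical groups: index 0 = students, index 1 = older adults.
CONTACTS = [
    [10.0, 2.0],  # students meet many students and few older adults
    [2.0, 3.0],   # older adults have fewer contacts overall
]
risk = force_of_infection(CONTACTS, infected_share=[0.05, 0.01], transmission_prob=0.1)
print(risk)  # the first group (students) faces the higher risk
```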
In our first post on epidemiological modeling, we mentioned studies on pulse vaccination – they also appear on this map, such as the papers by Shulgin, Stone, and Agur (1998) and Stone, Shulgin, and Agur (2000) in the orange cluster.
Pulse vaccination means that members of a certain risk group, for example, children aged 5 to 16, get vaccinated during a campaign. After some time, members of the same age group are vaccinated again, and the cycle repeats at a fixed interval.
This strategy differs from general vaccination, which almost everyone receives at a fixed age, for example, at six. And while general vaccination helps to defeat the epidemic only if the vast majority of the population has been vaccinated (for measles, say, 95%), pulse vaccination is effective even with lower coverage (Gao et al. 2006).
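The 95% figure can be related to a standard textbook relation that the studies above do not state in this form. As a sketch, assuming a homogeneously mixing population and a vaccine that gives full protection, the critical coverage $p_c$ depends on the basic reproduction number $R_0$, the average number of people one infected person infects in a fully susceptible population:

```latex
p_c = 1 - \frac{1}{R_0}, \qquad p_c = 0.95 \iff R_0 = 20 .
```

Under these assumptions, a 95% threshold corresponds to each case infecting about 20 others, which shows why highly contagious diseases require near-universal coverage.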
Shulgin, Stone, and Agur (1998) study how pulse vaccination affects the spread of diseases that depend on the changing seasons, whether because of the weather or the start of the school year. The researchers conclude that this vaccination strategy can stop the spread of seasonal diseases. However, they recommend combining strategies: general vaccination with higher coverage reduces the number of susceptibles, and pulse vaccination with lower coverage and longer intervals between cycles (up to five years) stops intermittent viruses. Some papers in the blue cluster are devoted to mixed vaccination strategies in more complex models – for example, models that take population growth into account (de la Sen et al. 2010).
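The effect of repeated pulses can be sketched with an SIR model that includes births and deaths and removes a share of the susceptibles at regular intervals. This is a simplified illustration without seasonality, not the model of any cited paper; the function name and all parameter values (rates per year) are assumptions chosen for a highly contagious disease.

```python
def pulse_vaccination_run(coverage, period, years=40.0, dt=0.0005,
                          beta=1800.0, gamma=100.0, mu=0.02,
                          s0=0.1, i0=0.001):
    """SIR with births and deaths plus periodic vaccination pulses.

    Every `period` years a share `coverage` of the susceptibles becomes
    immune. Returns the cumulative share of the population infected.
    """
    s, i = s0, i0
    steps_per_pulse = int(round(period / dt))
    cumulative = 0.0
    for step in range(1, int(round(years / dt)) + 1):
        new_infections = beta * s * i * dt
        s += mu * (1.0 - s) * dt - new_infections  # births refill S, deaths drain it
        i += new_infections - (gamma + mu) * i * dt
        cumulative += new_infections
        if coverage > 0 and step % steps_per_pulse == 0:
            s *= 1.0 - coverage  # the pulse: vaccinated susceptibles leave S
    return cumulative


if __name__ == "__main__":
    print("no vaccination:", round(pulse_vaccination_run(0.0, 1.0), 3))
    print("50% pulse every year:", round(pulse_vaccination_run(0.5, 1.0), 3))
```

In this sketch the pulses keep the susceptible share low enough between campaigns that the infection cannot regrow after the initial outbreak, even though each pulse reaches only half of the susceptibles.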
Gray cluster: spatial models
Publications in this cluster examine how an infection is transmitted in space. Because a spatial dimension is introduced, only two classes are kept in the model itself for simplicity: infected (I) and susceptible (S). People move between the classes according to the SIS model, i.e., recovery does not confer immunity, and people become vulnerable to the virus again.
The space is assumed to be heterogeneous: there may be patches with higher and lower risk of infection, and individuals can move between them. The risk of getting infected is lower in patches with high rates of recovery or low rates of transmission – e.g., where physical distance is maintained (Sun et al. 2011). If such “safe zones” exist and individual movements are limited, then it becomes possible to stop the epidemic at least in these zones (Allen et al. 2008). To control the epidemic throughout the space, it is the movement of susceptibles, not of the infected, that should be limited, so that they have less contact with the infected (Peng 2009).
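A two-patch version of such a spatial SIS model can be sketched as follows. It assumes frequency-dependent transmission within each patch and symmetric movement between patches; the function name, the initial shares, and the rates are hypothetical and not taken from the cited papers.

```python
def two_patch_sis(beta, gamma, move_s, move_i, days, dt=0.05):
    """SIS dynamics on two patches linked by movement.

    beta, gamma: per-patch transmission and recovery rates (two values each).
    move_s, move_i: rates at which susceptible and infected people switch patch.
    Returns [(S, I), (S, I)] for the two patches after `days`.
    """
    s = [0.49, 0.49]
    i = [0.01, 0.01]
    for _ in range(int(round(days / dt))):
        new_s, new_i = [], []
        for j in (0, 1):
            k = 1 - j  # index of the other patch
            infections = beta[j] * s[j] * i[j] / (s[j] + i[j])
            recoveries = gamma[j] * i[j]
            new_s.append(s[j] + dt * (recoveries - infections + move_s * (s[k] - s[j])))
            new_i.append(i[j] + dt * (infections - recoveries + move_i * (i[k] - i[j])))
        s, i = new_s, new_i
    return list(zip(s, i))


if __name__ == "__main__":
    risky_and_safe = dict(beta=(0.3, 0.05), gamma=(0.1, 0.1), days=1000)
    print("no movement:", two_patch_sis(move_s=0.0, move_i=0.0, **risky_and_safe))
    print("free movement:", two_patch_sis(move_s=0.1, move_i=0.1, **risky_and_safe))
```

With no movement, the infection dies out in the low-risk patch and persists only in the high-risk one; once people move freely, infected arrivals keep the disease present in the low-risk patch as well.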
Light blue cluster: random processes
Virus transmission can be modeled as a deterministic process – a linear pattern where susceptible people become infected and infected ones recover, and the more people are infected and susceptible at some point, the more people become infected at the next step. Still, the epidemic can be viewed as a random process to some extent, and the number of new infections can be considered not directly proportional to the number of the infected and the susceptible. To take this into account, probabilities and the so-called noise (random factors) are also introduced into the model, and it turns from a deterministic one into a stochastic, that is, random one.
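One simple way to turn a deterministic model into a stochastic one is to draw the daily numbers of infections and recoveries at random. The sketch below uses a chain-binomial SIR model, which is an assumption made for illustration and differs from the noise-driven differential equations used in the cited papers; all names and values are hypothetical.

```python
import math
import random


def binomial(n, p, rng):
    """Number of successes in n independent trials with probability p."""
    return sum(rng.random() < p for _ in range(n))


def stochastic_sir(population, infected, beta, gamma, days, seed):
    """Chain-binomial SIR: daily infections and recoveries are random draws."""
    rng = random.Random(seed)
    s, i, r = population - infected, infected, 0
    trajectory = [(s, i, r)]
    for _ in range(days):
        p_infection = 1.0 - math.exp(-beta * i / population)
        p_recovery = 1.0 - math.exp(-gamma)
        new_infections = binomial(s, p_infection, rng)
        new_recoveries = binomial(i, p_recovery, rng)
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


if __name__ == "__main__":
    for seed in range(3):
        final = stochastic_sir(1000, 5, beta=0.3, gamma=0.1, days=200, seed=seed)[-1]
        print(f"seed {seed}: recovered at the end = {final[2]}")
```

Runs with different seeds generally end with different outbreak sizes, and with few initial cases the infection may die out by chance, which a deterministic model cannot show.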
This approach is used in the papers of the light blue cluster. The most cited papers study the conditions under which the infection in a stochastic model either stops spreading or persists in the population (Gray et al. 2011). The studies also consider situations where two infections that immunize against each other spread simultaneously (Meng et al. 2016), and develop stochastic models that take vaccination or treatment into account (Zhao, Jiang, and O’Regan 2013; Zhao and Jiang 2014).
Purple cluster: generalized models
This cluster contains papers in which existing models are generalized (Satsuma et al. 2004). For example, Feng, Xu, and Zhao (2007) note that models with an exponential increase in the number of infected people are not suitable for modeling quarantine and isolation, and they derive a more general class of models with a realistic distribution of individuals across classes.
The most notable publication in the cluster generalizes the existing models of the spread of infections and social influences such as rumors (Dodds and Watts 2005). In the resulting model, individuals have a memory of the influence, varying “dose” of exposure, as well as the degree of sensitivity to the influence. It turns out that the memory of the influence, that is, the ability to accumulate the “dose”, has the greatest effect on the shape of the epidemic.
Gomes, White, and Medley (2004) simulate a wider range of immunity types: temporary immunity (which wanes with time) and partial immunity (which reduces the risk of reinfection but does not fully protect against it). The researchers find that with temporary immunity, the gaps between epidemics are shorter and eradicating the infection is more difficult. From the model with partial immunity, they conclude that vaccines that induce stronger immunity than the disease itself help to reduce the incidence.
This cluster also discusses cellular automata as a method for modeling epidemics (White, del Rey, and Sánchez 2007). Imagine that a virus spreads in a two-dimensional space divided into identical cells that change their state based on simple rules. For instance, if one of the cells changes its state to “infected”, then all of its neighbors become “infected” in the next step. Such local events change the picture at the macro level, and sometimes distinct patterns emerge. Using a cellular automaton, Liu and Jin (2005) found that an epidemic spreads less readily in a segregated space.
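The example rule above can be sketched as a tiny cellular automaton. The grid size and the choice of four neighbors per cell (up, down, left, right) are assumptions made for illustration; published models use richer rules with recovery and probabilistic infection.

```python
def spread_step(grid):
    """One update: every neighbor of an infected cell becomes infected."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for y in range(rows):
        for x in range(cols):
            if grid[y][x]:
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        new_grid[ny][nx] = True
    return new_grid


# A 5x5 grid with a single infected cell in the center.
grid = [[False] * 5 for _ in range(5)]
grid[2][2] = True

counts = []
for _ in range(3):
    counts.append(sum(map(sum, grid)))  # infected cells before this step
    grid = spread_step(grid)

print(counts)  # [1, 5, 13]: the infected area grows as a diamond
```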
Besides, the cluster contains studies on how the virus is transmitted through the environment: indoors (Noakes et al. 2006), through water (Tien and Earn 2010) or, in the case of animals, through biological fluids and excrement (Bravo de Rueda et al. 2015).
Navy cluster: models on networks
To model epidemics, one can use different representations of social networks. These can be random networks, where every node (person) has some probability of being connected to every other node (Gleeson 2011; Parshani, Carmi, and Havlin 2010).
Another option is scale-free networks, where most nodes have few connections and only a few nodes have many. Scale-free networks are known to spread epidemics very quickly because people with many connections transmit the infection to a huge number of other people (Barthélemy et al. 2004). Whether the disease will turn into an epidemic depends also on the density of the network of contacts of the first infected (Moreno and Vázquez 2003).
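The difference between the two network types can be sketched by comparing their best-connected nodes. The snippet grows one network by preferential attachment, a common way to obtain a scale-free structure, and builds a random network with the same number of edges; the function names, network sizes, and seed are assumptions for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n, m, rng):
    """Grow a network in which newcomers prefer well-connected nodes."""
    edges = [(a, b) for a in range(m + 1) for b in range(a)]  # small complete seed
    endpoints = [node for edge in edges for node in edge]     # one entry per link end
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))  # chance proportional to degree
        for old in chosen:
            edges.append((new, old))
            endpoints += [new, old]
    return edges


def random_network(n, edge_count, rng):
    """Connect uniformly chosen pairs of nodes until edge_count edges exist."""
    edges = set()
    while len(edges) < edge_count:
        a, b = rng.sample(range(n), 2)
        edges.add((min(a, b), max(a, b)))
    return list(edges)


def max_degree(edges):
    """Largest number of connections held by a single node."""
    degree = Counter(node for edge in edges for node in edge)
    return max(degree.values())


if __name__ == "__main__":
    rng = random.Random(0)
    hubs = preferential_attachment(2000, 2, rng)
    flat = random_network(2000, len(hubs), rng)
    print("largest hub, preferential attachment:", max_degree(hubs))
    print("largest hub, random network:", max_degree(flat))
```

Both networks have the same average number of connections per node, but the preferential-attachment network contains a few hubs with far more links, and these are the nodes that can pass an infection to many others.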
However, it is not only the spatial structure of the network that matters. For example, Rocha, Liljeros, and Holme (2011) found that when modeling sexually transmitted infections, the temporal structure of interactions is also important.
Furthermore, networks can be represented as adaptive, that is, their structure will change in the course of the epidemic (Marceau et al. 2010). There are even more complex network models: several publications note that the N-intertwined model (Ferreira, Castellano, and Pastor-Satorras 2012) has advantages over models based on Markov chains (Van Mieghem 2010).
This cluster also contains publications that model the transmission of emotions (Hill et al. 2010), the spread of rumors (Trpevski, Tang, and Kocarev 2010; Zhao et al. 2012), and computer viruses (Yuan and Chen 2008).
Yellow cluster: models on real data
Here the publications are diverse. Some of them focus on influenza pandemics and include demographic and international traffic data (Chowell, Nishiura, and Bettencourt 2007; Grais, Hugh Ellis, and Glass 2003). For example, Ciofi degli Atti et al. (2008) use census data to model household-level mobility between home, school, and work. With this model, researchers demonstrate how influenza would spread throughout Italy during a pandemic, and how vaccination and restriction of social contacts would affect it.
Hansen and Day (2011) also model epidemic control strategies – importantly, with limited resources – and show at what point it is optimal to introduce different control measures or a combination of them.
Nowadays, researchers can collect very detailed data on contacts between people, with information about the time of each interaction. But modeling epidemics on a full dynamic network is not always convenient, and scientists are looking for simpler ways to feed such data into a model. For example, Stehlé et al. (2011) find that the dynamics of an epidemic on the full dynamic network are reproduced quite accurately by a network of contacts that takes into account their daily duration. Machens et al. (2013) obtain good results with a network that stores the probability distributions of contacts, whereas the average duration of contacts approximates the dynamics of the virus spread much worse.
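Collapsing time-stamped contacts into a static network weighted by daily contact duration can be sketched as follows. The record format `(person_a, person_b, start, end)` with times in seconds, the function name, and the sample data are hypothetical; the cited studies use their own data formats and weighting schemes.

```python
from collections import defaultdict


def average_daily_durations(contacts, days_observed):
    """Average contact time per day (in seconds) for every pair of people."""
    totals = defaultdict(float)
    for a, b, start, end in contacts:
        pair = tuple(sorted((a, b)))  # a contact has no direction
        totals[pair] += end - start
    return {pair: total / days_observed for pair, total in totals.items()}


# Hypothetical contact records collected over two days.
records = [
    ("ann", "bob", 0, 600),
    ("bob", "ann", 90000, 90300),
    ("ann", "cat", 100, 160),
]
weights = average_daily_durations(records, days_observed=2)
print(weights)  # {('ann', 'bob'): 450.0, ('ann', 'cat'): 30.0}
```

The resulting weights could then scale the transmission probability along each edge of a static network, which is far cheaper to simulate than replaying every individual contact.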
General reviews
- Gibson, Gavin J., George Streftaris, and David Thong. 2018. “Comparison and Assessment of Epidemic Models.” Statistical Science 33(1):19–33.
- Rahmandad, Hazhir, and John Sterman. 2008. “Heterogeneity and Network Structure in the Dynamics of Diffusion: Comparing Agent-Based and Differential Equation Models.” Management Science 54(5):998–1014.
- Walters, Caroline E., Margaux M. I. Meslé, and Ian M. Hall. 2018. “Modelling the Global Spread of Diseases: A Review of Current Practice and Capability.” Epidemics 25:1–8.
- White, S. Hoya, A. Martín del Rey, and G. Rodríguez Sánchez. 2007. “Modeling Epidemics Using Cellular Automata.” Applied Mathematics and Computation 186(1):193–202.
Data source: Scopus bibliographic database. We searched the titles, abstracts, and keywords of publications using the different names for SIR model variations. The search returned 3054 publications, excluding books.
Search query:
TITLE-ABS-KEY ( msir OR sis OR "carrier state" OR seir OR seis OR mseir OR mseirs OR siwr W/3 model* AND epidemi* OR pandemi* OR virus OR disease AND NOT "insulin sensitivity" ) OR TITLE-ABS-KEY ( {SIR model with} OR ( model AND {SIR with} ) OR {SEIR model with} OR ( model AND {SEIR with} ) ) OR
TITLE-ABS-KEY ( {SEIR model} OR {SIR model} AND extension OR extend* ) OR
TITLE-ABS-KEY ( sir W/3 model* AND vaccination OR mutation )