# How COVID-19 spreads

*Piet Van Mieghem*

**Virus spread on networks is scientifically understood. It is a beautiful example of Network Science. The relatively new field of Network Science is based upon the fundamental duality in any complex system: the interplay between “function” (process, service) and “structure” (graph, topology). If the dynamic transmission process and the contact graph are known, then the spread of an item can be computed, the impact of interventions can be assessed and future scenarios may be predicted.**

**Local rule-global emergence**

Let us explain the duality for the simple two-compartment SIS model, where each node in the network is either infected (I) or healthy, but susceptible (S). The SIS local function rule for each node is: if the node is infected, it can infect each of its healthy neighbours with some strength/rate and, simultaneously, the node can cure with some curing rate and afterwards be susceptible again. The SIS local rule is just an ‘if ..., then ... else ...’ statement that is easy to program. Nevertheless, the combined effect of all those local rules interacting via the contact graph is a fantastically complicated global behaviour. ‘Local rule-global emergence’ dynamics abound in nature: fireflies synchronize their flashes, swarms of fish or flocks of birds create artful movements, etc. Many ‘local rule-global emergence’ dynamics feature abrupt changes, called phase transitions.
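To show how little code the local rule needs, here is a minimal sketch in Python. It rests on assumptions that are not part of the model description above: time advances in discrete steps, so `beta` and `delta` are per-step probabilities rather than continuous-time rates, and the names `sis_step`, `simulate_sis` and `ring_graph` are purely illustrative.

```python
import random


def sis_step(adjacency, infected, beta, delta, rng):
    """Apply the SIS local rule to every node once (synchronous update).

    adjacency: dict mapping each node to the list of its neighbours
    infected:  set of nodes that are infected at the start of the step
    beta:      assumed per-step probability that one infected neighbour infects a node
    delta:     assumed per-step probability that an infected node cures
    """
    next_infected = set()
    for node in adjacency:
        if node in infected:
            # 'if infected': the node cures with probability delta, else stays infected.
            if rng.random() >= delta:
                next_infected.add(node)
        else:
            # 'else': every infected neighbour independently tries to infect the node.
            for neighbour in adjacency[node]:
                if neighbour in infected and rng.random() < beta:
                    next_infected.add(node)
                    break
    return next_infected


def simulate_sis(adjacency, seeds, beta, delta, steps, seed=0):
    """Return the number of infected nodes after each step (index 0 = start)."""
    rng = random.Random(seed)
    infected = set(seeds)
    history = [len(infected)]
    for _ in range(steps):
        infected = sis_step(adjacency, infected, beta, delta, rng)
        history.append(len(infected))
    return history


def ring_graph(n):
    """Toy contact graph: node i is linked to its two neighbours on a ring."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

Calling `simulate_sis(ring_graph(10), [0], beta=0.3, delta=0.2, steps=50)` returns the infected count per step for one random run. In the extreme case `beta=1.0, delta=0.0` the infection front simply advances by one node on each side of the ring per step; for intermediate values the global behaviour is no longer obvious from the local rule, which is the point of the paragraph above.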

**Epidemic threshold**

Among all those ‘local rule-global emergence’ processes, SIS virus spread on a network is by far the easiest, because mathematical computation is possible to some extent. Specifically, the probability of infection of each node can be determined. A crucial concept is that of a phase transition, called the epidemic threshold: if the viral infectiousness is above the epidemic threshold, then the virus will spread over the network and eventually infect a substantial fraction of the nodes. Below the epidemic threshold, the virus will die out and disappear. Above the epidemic threshold, the virus initially spreads and infects other nodes exponentially fast in time. In human language, an exponentially growing process is said to explode. Indeed, above the epidemic threshold, virus spread initially follows the same laws of a physical branching process as an uncontrolled nuclear reaction. That analogy explains why suppressing the virus spread is a matter of time and why waiting is disastrous, as we have experienced over the last few months.
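One way to make the threshold and the initial exponential growth concrete is a first-order mean-field approximation of SIS (commonly called NIMFA). This article does not write out the equations, so the notation below is an assumption for illustration: $v_i(t)$ is the probability that node $i$ is infected, $a_{ij}$ are the entries of the adjacency matrix $A$ of the contact graph, $\beta$ is the infection rate per link, $\delta$ the curing rate and $\tau = \beta/\delta$ the effective infection rate.

```latex
% Mean-field SIS equation for node i (illustrative notation, not from the article)
\begin{align}
\frac{dv_i(t)}{dt} &= \beta \bigl(1 - v_i(t)\bigr) \sum_{j=1}^{N} a_{ij}\, v_j(t) - \delta\, v_i(t) \\
% Linearisation around the healthy state v = 0
\frac{dv(t)}{dt} &\approx \bigl(\beta A - \delta I\bigr)\, v(t) \\
% Mean-field epidemic threshold, with \lambda_1(A) the largest eigenvalue of A
\tau_c^{(1)} &= \frac{1}{\lambda_1(A)}
\end{align}
```

The linearised system grows exponentially exactly when $\beta \lambda_1(A) > \delta$, that is, when $\tau > \tau_c^{(1)}$. Under this approximation the threshold depends only on the graph, and it is known to be a lower bound for the exact SIS threshold rather than its precise value.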

**1.5m distance rule**

On the positive side, the phase transition is close to a characteristic of the contact network and can be computed if the graph is known. By modifying the contact graph for the same virus, the epidemic threshold changes. Hence, if contacts are limited, then there are fewer links in the graph, the epidemic threshold increases and the virus spread decreases. This simple but powerful law motivates the 1.5m distance rule that, together with hygiene measures, has so far caused a decline in infection in most countries. However, it is still unclear whether the 1.5m social distancing shifts the epidemic threshold above COVID-19’s infection strength, a necessary condition to eradicate the coronavirus after a sufficiently long time. In most communications about the coronavirus, the basic reproduction number R_{0} is mentioned, which is related to the epidemic threshold. Although less prominent from a physics point of view, the basic reproduction number is easier to understand as the average number of susceptible neighbours in the contact network that a typical infected node can infect. An R_{0} > 1 is equivalent to an infectious spread above the epidemic threshold.
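The effect of removing links can be checked numerically. The sketch below assumes the mean-field estimate of the SIS threshold, 1/λ1, where λ1 is the largest eigenvalue of the adjacency matrix of the contact graph; that estimate is a standard approximation and is not stated in this article. The function names and the two toy graphs are hypothetical. The eigenvalue is found by power iteration on A + I, a shift that keeps the iteration convergent on connected graphs, including bipartite ones.

```python
def adjacency_matrix(n, links):
    """Symmetric 0/1 adjacency matrix of an undirected graph on n nodes."""
    a = [[0.0] * n for _ in range(n)]
    for i, j in links:
        a[i][j] = a[j][i] = 1.0
    return a


def largest_eigenvalue(a, iterations=2000, tol=1e-12):
    """Largest eigenvalue of a non-negative symmetric matrix (power iteration on A + I)."""
    n = len(a)
    x = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        # Multiply by (A + I); the shift by I is subtracted again at the end.
        y = [x[i] + sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = max(y)
        x = [v / norm for v in y]
        converged = abs(norm - estimate) < tol
        estimate = norm
        if converged:
            break
    return estimate - 1.0


def mean_field_threshold(a):
    """Assumed threshold estimate 1 / lambda_1; infinite for a graph without links."""
    lam = largest_eigenvalue(a)
    return float("inf") if lam <= 0.0 else 1.0 / lam


# Five people who all meet each other, versus the same five with contacts limited to a ring.
complete_links = [(i, j) for i in range(5) for j in range(i + 1, 5)]
ring_links = [(i, (i + 1) % 5) for i in range(5)]
before = mean_field_threshold(adjacency_matrix(5, complete_links))  # 0.25
after = mean_field_threshold(adjacency_matrix(5, ring_links))       # 0.5
```

In this toy case, cutting the contacts from ten links to five doubles the estimated threshold, so a virus whose effective infection strength lies between 0.25 and 0.5 would spread on the first graph but die out on the second.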

**Uncertainties**

While virus spread on networks is scientifically understood, we face uncertainties on both the structure (contact network) and the function (infection process) in practice. Obviously, the contact network is not precisely known. Moreover, the contact network is not static, but changes over time. So far, we have not precisely stated what a node in the network is. If a node is an individual person, it is clear that the contact network of all persons in a nation is unknown. At a slightly coarser level of granularity, a node may represent the group of all people in some region (e.g. city or province). The contact network then describes the interactions between the group of people in one region and those in another region. Within a group, all people are considered equal and indistinguishable. In spite of the loss in individual precision within a group, this level of granularity can be specified a bit better, because we now concentrate on the average interaction of a group rather than on each individual, and an average is nearly always easier to determine than the random variable itself. Nevertheless, even at a coarser level of granularity, the contact network can only be approximated.
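The step from individuals to regions can be pictured as a simple aggregation. The sketch below is only an illustration under assumed inputs: a list of observed person-to-person contacts and a mapping from each person to a region. The names and places are made up.

```python
from collections import Counter


def aggregate_contacts(contacts, region_of):
    """Count contacts between regions, forgetting who the individuals were.

    contacts:  iterable of (person_a, person_b) pairs
    region_of: dict mapping each person to a region label
    Returns a dict mapping an (alphabetically ordered) region pair to a contact count.
    """
    counts = Counter()
    for a, b in contacts:
        pair = tuple(sorted((region_of[a], region_of[b])))
        counts[pair] += 1
    return dict(counts)


# Hypothetical example data.
region_of = {"ann": "Delft", "bob": "Delft", "cat": "Leiden", "dan": "Leiden"}
contacts = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "dan")]
regional = aggregate_contacts(contacts, region_of)
```

Here four individual links collapse into three region-level weights: one contact within Delft, one within Leiden and two between the regions. The individual detail is lost, but the averages are far easier to estimate than the full person-level graph.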

**Asymptomatic**

The other fundamental dual is the dynamic spreading process of the coronavirus. The precise behaviour of COVID-19 is unknown, but worldwide we learn more every day. The coronavirus is expected to become the best studied virus ever. The dynamic parameters in the theory of virus spread specify the local rule, namely the infectious strength of the virus and the human resistance against COVID-19 infection. Apart from the strength, the distribution over time also plays a role: the infection time or virus generation time describes how long the host remains infected and can infect others, counted from the moment that COVID-19 viruses penetrate the host in sufficient numbers to reproduce. The uncertainties on the time properties are larger and depend, among other things, upon the age, health and immune system of the host. The virus itself is studied in virology, while its spread is studied in epidemiology. A final, troubling fact for COVID-19 is that a non-negligible fraction of the population is asymptomatic, which means that they are infected and infectious without knowing or clearly feeling that they are infected.

In the current COVID-19 situation, neither the parameters of the infection process nor sufficient details about the contact network are known. Consequently, neither the infection probabilities in regions of the contact network nor the effect of possible interventions can be computed. As long as COVID-19 is not eradicated or suppressed by vaccines, a simple conclusion is that we had better measure more, since "*measuring is knowing*".

**Measuring the characteristics of the spread of COVID-19**

Epidemic spreading is based on the duality between the virus spreading process, which only specifies the local rule in a node, and the network of contacts among nodes, which leads to the emergent and global effect of the pandemic. The minimal information required to specify the dynamics in any compartment model is the duplet consisting of a rate vector and a graph.

For example, the simple SIS epidemic model needs:

- the infection rate or infection probability between two nodes and the curing rate or probability of recovering from the disease
- the graph of the underlying contact network

There are two digital (i.e. non-medical) measurement techniques to determine who is or has been infected:

- by inference, based on estimates of the contact network and the likeliness of infectious encounters
- by directly measuring

**Measuring by inference**

By inference, we mean inverse computation to estimate the rate vector and the contact graph, using the history of the infection process up to the present in any node: see NIPA. A simplified version of inference that ignores the contact graph appears most often: the number of infected people reported on a country-wide level over time is fitted with a simple SIR epidemic model and the infection rate and curing rate are estimated, from which the basic reproduction number R_{0} then follows as the ratio of the infection rate to the curing rate.
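The simplified, graph-free variant can be sketched in a few lines. The assumptions here are strong and are not taken from this article: the SIR model runs in discrete time on population fractions, and all three fractions (susceptible, infected, recovered) are observed without noise, whereas real reports usually contain case counts only. Under those assumptions the two rates follow from ordinary least squares, and R_{0} is their ratio. The function names are hypothetical.

```python
def simulate_sir(beta, delta, infected0, steps):
    """Discrete-time SIR on population fractions; returns a list of (s, i, r)."""
    s, i, r = 1.0 - infected0, infected0, 0.0
    series = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i
        new_recoveries = delta * i
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        series.append((s, i, r))
    return series


def fit_sir(series):
    """Least-squares estimates of (beta, delta, R0) from a fully observed series.

    beta  minimises the squared error of  s[k] - s[k+1] = beta  * s[k] * i[k]
    delta minimises the squared error of  r[k+1] - r[k] = delta * i[k]
    """
    num_beta = den_beta = num_delta = den_delta = 0.0
    for (s0, i0, r0), (s1, _i1, r1) in zip(series, series[1:]):
        x = s0 * i0
        num_beta += (s0 - s1) * x
        den_beta += x * x
        num_delta += (r1 - r0) * i0
        den_delta += i0 * i0
    beta = num_beta / den_beta
    delta = num_delta / den_delta
    return beta, delta, beta / delta


# Synthetic check: generate data with known rates and recover them.
observed = simulate_sir(beta=0.3, delta=0.1, infected0=0.001, steps=60)
beta_hat, delta_hat, r0_hat = fit_sir(observed)
```

On this synthetic series the fit recovers the rates used to generate it, and hence R_{0} close to 3. With real, noisy and incomplete country-level data the estimates are far less certain, and the contact graph remains entirely outside the picture.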

Why doesn’t the inference technique solve the entire problem of determining the duplet? There is a fundamental limitation on the inference technique, even when the exact time series in each node are known. Although in that optimal case we can predict very precisely, we cannot tell much about the precise underlying graph. Exit strategies will typically change the rate vectors as well as the contact graph. Hence, if exit strategies need to be quantified, more precise information about the contact graph is needed.

**Direct measurements**

Direct measurements aim to collect all information needed to specify the contact graph and possibly also the rate vectors. The viral contact network is in some way related to human mobility patterns, which can be deduced from traffic measurements, mobile wireless networks, potential apps (Bluetooth range) or sensors. In simple terms, the infection probability between two nodes (regions) in a network is supposed to be proportional to the number of people who travel between them. The number of different transportation modes (train, airplane, ship, car, bike, etc.) clearly plays a role. It is precisely our globally interconnected world that has caused the rapid spread of COVID-19.
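The proportionality assumption translates directly into a matrix. In the sketch below, `trips[i][j]` is a hypothetical count of people travelling between regions i and j, and `scale` is an assumed proportionality constant that would have to be calibrated against infection data. Capping the result at 1 is an added assumption, made only so that the entries remain valid probabilities.

```python
def infection_matrix(trips, scale):
    """Infection probability between regions, proportional to travel volume.

    trips: square matrix (list of lists) of traveller counts between regions
    scale: assumed proportionality constant (probability per traveller)
    """
    n = len(trips)
    return [[min(1.0, scale * trips[i][j]) for j in range(n)] for i in range(n)]


# Hypothetical travel counts between three regions (no self-travel recorded).
trips = [
    [0, 200, 50],
    [200, 0, 10],
    [50, 10, 0],
]
B = infection_matrix(trips, scale=0.001)
```

With these made-up numbers the strongest coupling is between regions 0 and 1, which exchange the most travellers. A measure that halves travel between two regions would, under the same assumption, halve the corresponding matrix entries.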

The current scientific challenge lies in data fusion, the artful combination of several different measurement techniques to construct an accurate duplet (rate vectors and contact graph). Again, once the duplet is known, we can control the spread of COVID-19.

**Models to simulate exit strategies**

Exit strategies describe the relaxation from the current intelligent lockdown condition. Typically, each exit strategy will contain two aspects:

- the type of measure that is relaxed (e.g. public gatherings, school closures, shop closures, social distancing, mobility restrictions and employment restrictions)
- the extent to which people will comply with these measures. See for example the work of Niek Mouter.

Each exit strategy also has a temporal and a spatial dimension: in what time window is a specific measure operational and in what region?

**Trial and error**

Currently, optimal exit strategies are not known. Most governments take actions based on partial information and apply a trial-and-error approach: if some quarantine measure is relaxed, then the effect on the number of new infections is closely monitored. If some level is exceeded, the quarantine measure will be imposed again. At the moment, given the rather serious lack of measured data, trial and error seems to be the only way. To date, all exit strategies are also regionally based, with each country taking its own decisions.
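The trial-and-error loop described above behaves like a simple threshold controller. The sketch below is a toy illustration and not a model of any government's policy: the number of new cases is multiplied by an assumed growth factor per period, the measure is re-imposed when the cases exceed `upper_level`, and it is relaxed again once they fall below `lower_level`. The lower level is an added assumption, since the text only mentions the level that triggers re-imposing.

```python
def trial_and_error(initial_cases, growth_relaxed, growth_restricted,
                    upper_level, lower_level, periods):
    """Return a list of (measure_imposed, cases) per monitoring period."""
    cases = initial_cases
    restricted = False
    log = []
    for _ in range(periods):
        # Monitor first, then decide whether to re-impose or relax the measure.
        if cases > upper_level:
            restricted = True
        elif cases < lower_level:
            restricted = False
        cases *= growth_restricted if restricted else growth_relaxed
        log.append((restricted, cases))
    return log


# Hypothetical numbers: cases grow by 50% per period when relaxed, halve when restricted.
log = trial_and_error(initial_cases=100, growth_relaxed=1.5, growth_restricted=0.5,
                      upper_level=300, lower_level=100, periods=6)
```

With these illustrative numbers the measure stays relaxed for three periods, is imposed for two and is then relaxed again, so the case count oscillates between the two levels instead of settling. That oscillation is the price of steering on monitored outcomes alone, without a model of the contact network.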

Nevertheless, many groups are investigating exit strategies (example). A first step may consist of structuring or grouping relevant exit strategies. At TU Delft we are working on a taxonomy of exit strategies, including:

- regional measures
- periodic lockdown and partial working week
- age-based relaxation
- sector-based relaxation
- systematic virus extinction (longer quarantining and developing of vaccines).

**Evaluation and optimisation**

A second step will be to try to ‘epidemically’ evaluate the most promising exit strategies by computing the effect of an exit strategy on the infection probability of each node in the network. In particular, one row in the infection probability contact matrix corresponds to the infection probabilities of one region in its interactions with all other regions. If we change the entries of one row in the infection probability contact matrix due to a certain governmental measure in an exit strategy, we can compute the effect of that measure on the entire network. We believe that this network science approach is unique among all currently available models, which focus on one region only, without considering the interactions with neighbouring regions.
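A minimal sketch of such an evaluation is given below. It assumes a discrete-time SIR mean-field model over regions, in which `B[i][j]` is the infection probability exerted by infected people in region j on susceptible people in region i and `delta` is a common curing probability. This is an illustrative stand-in and not necessarily the model used by the author; all numbers and function names are hypothetical.

```python
def simulate_regions(B, delta, infected0, steps):
    """Return, per region, the fraction that has ever been infected after `steps`."""
    n = len(B)
    infected = list(infected0)
    recovered = [0.0] * n
    for _ in range(steps):
        next_infected, next_recovered = [], []
        for i in range(n):
            # Infection pressure on region i from all regions, capped at 1.
            pressure = min(1.0, sum(B[i][j] * infected[j] for j in range(n)))
            susceptible = 1.0 - infected[i] - recovered[i]
            next_infected.append((1.0 - delta) * infected[i] + susceptible * pressure)
            next_recovered.append(recovered[i] + delta * infected[i])
        infected, recovered = next_infected, next_recovered
    return [infected[i] + recovered[i] for i in range(n)]


def scale_row(B, row, factor):
    """Copy of B in which one region's row is multiplied by `factor` (a measure)."""
    return [[factor * b if i == row else b for b in r] for i, r in enumerate(B)]


# Three hypothetical regions in a line; the outbreak starts in region 0.
B = [
    [0.30, 0.05, 0.00],
    [0.05, 0.30, 0.05],
    [0.00, 0.05, 0.30],
]
start = [0.01, 0.0, 0.0]
baseline = simulate_regions(B, delta=0.2, infected0=start, steps=30)
with_measure = simulate_regions(scale_row(B, 0, 0.5), delta=0.2, infected0=start, steps=30)
```

Comparing `baseline` with `with_measure` entry by entry shows the network effect: halving the row of region 0 lowers the ever-infected fraction in region 0 and, because fewer infections are exported, in the other regions as well. A model of region 0 alone could not capture that second effect.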

The third step will be to choose the strategy that optimizes epidemic suppression together with other evaluation criteria (economic, social/psychological, enforcement, etc.).

## Prof. Piet Van Mieghem

Professor Piet Van Mieghem specialises in complex networks. Ten years ago, he started to study digital viruses that enter our networks via e-mails or system errors. Now Van Mieghem is shifting the focus from a digital to a biological virus, because the basic mathematical models that underlie them are actually the same. At the request of RIVM, he is conducting research into the dissemination of COVID-19 using the Network Inference-based Prediction Algorithm (NIPA). He is also contributing his knowledge of network recovery in the wake of earthquakes, floods and terrorist attacks to the development of a Dutch exit strategy.