COVID-19 Preprints: How the Topics Change (April Edition)

From virus binding to modeling exit from lockdown

When society faces a global emergency, researchers are all the more motivated to publish their findings quickly and open-access. One option is to publish a preprint, a paper that becomes available online quickly but has not yet been peer reviewed. In this post we show how the preprints on the novel coronavirus SARS-CoV-2 and COVID-19 (the disease it causes) split into topics, and how these topics changed between January and April 2020. See page 2 for our methods and data description.

How preprints cluster

From mid-January to mid-April, the issues discussed in the preprints consistently split into three main domains (see Figures 1–3). These are:

  • navy-green cluster: virology and molecular biology, discussing the virus itself;
  • orange cluster: clinical medicine, discussing the virus-related diseases and clinical characteristics;
  • purple-blue cluster: epidemiology and public health, discussing the virus transmission and containment measures.
Figure 1. Co-occurrence network of keywords used in the abstracts of preprints on the novel coronavirus published from January 19 to February 18, 2020
Nodes are colored according to the automatically identified clusters. Links indicate that the keywords have appeared together in the same preprint abstract(s). Node sizes correspond to the relevance of the keywords. Only keywords that occurred at least five times across the preprints of this period and are connected to at least five other keywords are shown.
Figure 2. Co-occurrence network of keywords used in the abstracts of preprints on the novel coronavirus published from February 19 to March 18, 2020
Nodes are colored according to the automatically identified clusters. Links indicate that the keywords have appeared together in the same preprint abstract(s). Node sizes correspond to the relevance of the keywords. Only keywords that occurred at least five times across the preprints of this period and are connected to at least five other keywords are shown.
Figure 3. Co-occurrence network of keywords used in the abstracts of preprints on the novel coronavirus published from March 19 to April 17, 2020
Nodes are colored according to the automatically identified clusters. Links indicate that the keywords have appeared together in the same preprint abstract(s). Node sizes correspond to the relevance of the keywords. Only keywords that occurred at least five times across the preprints of this period and are connected to at least ten other keywords are shown.
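
For readers who want to experiment with their own data, the filtering described in the figure captions can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the pipeline used to draw the figures: the function name and its parameters are hypothetical, a keyword is counted once per abstract, the degree filter is applied in a single pass, and the clustering and relevance scoring steps are left out.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(abstract_keywords, min_occurrences=5, min_degree=5):
    """Return (nodes, edges) for a keyword co-occurrence network.

    abstract_keywords: iterable of keyword collections, one per abstract.
    edges maps a sorted keyword pair to the number of abstracts containing both.
    """
    # Deduplicate keywords within each abstract so each abstract counts once
    # (an assumption; total mentions could be counted instead).
    keyword_sets = [set(keywords) for keywords in abstract_keywords]

    # Count in how many abstracts each keyword occurs and keep the frequent ones.
    occurrences = Counter(kw for kws in keyword_sets for kw in kws)
    frequent = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    # Link every pair of frequent keywords that appear in the same abstract.
    edges = Counter()
    for kws in keyword_sets:
        for pair in combinations(sorted(kws & frequent), 2):
            edges[pair] += 1

    # Degree = number of distinct keywords a keyword is linked to.
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1

    # Single-pass filter: keep well-connected keywords and the links between them.
    nodes = {kw for kw in frequent if degree[kw] >= min_degree}
    kept_edges = {
        pair: weight
        for pair, weight in edges.items()
        if pair[0] in nodes and pair[1] in nodes
    }
    return nodes, kept_edges


# Toy input with lowered thresholds so that something survives the filters.
toy_abstracts = [
    ["ace2", "spike protein", "binding"],
    ["ace2", "spike protein"],
    ["fever", "cough"],
]
toy_nodes, toy_edges = build_cooccurrence_network(
    toy_abstracts, min_occurrences=2, min_degree=1
)
print(sorted(toy_nodes), dict(toy_edges))
```

With the thresholds from the captions (five occurrences, and five or ten neighbours), rare or weakly connected keywords drop out, which is what keeps the maps readable.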

Topic dynamics

From the networks, we see that the main keywords in each cluster remain unchanged, while more specific ones are added each month. In what follows, we briefly describe the trends, citing systematic reviews and meta-analyses where possible. One caveat is worth making: preprints report research that has not been certified through peer review and thus should not be used to guide policy or practice.

Navy cluster: molecular biology, immunology, virology

Early preprints discuss the binding mechanisms of the SARS-CoV-2 virus: ACE2, one of the keywords, is the enzyme in some human cells to which the coronavirus attaches. By March, more preprints appear that describe changes in the immune system of infected people. For example, COVID-19 patients were found to have lower counts of CD4+ T-cells, a type of immune cell (e.g., Qi et al. 2020). More recently, studies have appeared that are devoted to prophylactic vaccine design and to repurposing already approved drugs to treat COVID-19 patients (for a systematic review, see Lem et al. 2020).

Orange cluster: clinical medicine

Early preprints are devoted to the symptoms that develop in infected patients. In early February, only the diagnosis, pneumonia, was reasonably clear. By early March, when systematic clinical accounts started to be published, manifestations such as fever, cough, and fatigue were being reported as the most important symptoms.

Hypertension is discussed as a characteristic of fatal cases, being a prevalent comorbid condition along with cardiovascular diseases and diabetes (Chen et al. 2020; for a systematic review, see Jain and Yuan 2020). Later, lymphocyte counts appear as a keyword in the network, because lower counts were found to be associated with greater disease severity (e.g., see a review by Brown et al. 2020).

Purple cluster: epidemiology

The purple, epidemiological cluster started with studies of the Chinese case and the transmission of the virus there. In March, the control measures implemented in China were also discussed, including how effectively they restricted the spread of the virus. By April, preprints start to address the American and European cases, to consider the impact of quarantine on the peaks of incidence and mortality, and to model exit scenarios from the lockdowns.

As regards the transmissibility of the virus, a paper synthesizing evidence from many countries concludes that, without control measures, the basic reproduction number R0 of the novel coronavirus SARS-CoV-2 is 4.5 (Katul et al. 2020), which is almost twice as high as earlier WHO estimates and other previously published estimates (e.g., Layne et al. 2020).
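
To give a feel for what such a difference means, here is a minimal Python sketch that is not taken from any of the cited papers. It assumes the simplest textbook picture, in which every case infects R0 others in the next generation and the pool of susceptible people never runs out, so it overstates real-world spread. The function name and the comparison value of 2.25 (simply half of 4.5) are illustrative assumptions, not figures from the preprints.

```python
def cases_per_generation(r0, generations):
    """Expected new cases in generations 0..generations, starting from one case.

    Assumes no control measures and an unlimited susceptible population.
    """
    return [r0 ** g for g in range(generations + 1)]


# 4.5 is the estimate reported above; 2.25 is a hypothetical value for comparison.
for r0 in (4.5, 2.25):
    print(r0, cases_per_generation(r0, 3))
```

Under these assumptions, doubling R0 multiplies the size of the n-th generation by 2^n, which is why a seemingly modest revision of the estimate matters for planning.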

Having modeled the exit scenarios, Goldsztejn, Schwartzman, and Nehorai (2020) argue for strict quarantine for vulnerable social groups and a gradual lifting of restrictions for the general population. Lopez and Rodo (2020) come to almost the same conclusions, modeling how the exit would work in the Spanish case, where the lockdown was announced on March 14 and an even stricter confinement on March 29. The authors predict that gradually reincorporating the confined population at a low rate is a better strategy than any other. For the best result, they argue, the reincorporation should start at least 45 days after the quarantine was imposed, to avoid a large second wave of the epidemic in 2020.

Please proceed to page 2 to learn about our data and methods.