Earlier this year, as the COVID-19 virus spread around the world, countries responded by imposing lockdown measures one by one. Scientists, seeing the subsequent satellite observations, rushed to publish papers about improved air quality, many of which appeared in short order on preprint servers. These preprints, which are often published in tandem with their submission to a peer-reviewed journal, spawned a host of press releases and news articles that spread on social media.
We’re living in a time when calamitous current events are escalating the need for more information. In response, both scientists and the media are making leaps between observation and conclusion. If we want to make sure that assertions are accurate before they are widely disseminated, our rigorous and necessary review systems must be modernized so they aren’t circumvented. We must also make sure that the move toward open data comes with the means for nonexperts to understand the context and limitations of that information.
The Leap to Conclusions
The rush to print by both scientists and the press can raise questions about the validity of the research conclusions. One study published on a preprint server in early April by Harvard University researchers [Wu et al., 2020] was widely circulated on social media. The authors, who simultaneously submitted it to the New England Journal of Medicine, claimed in their paper that an increase in long-term exposure to particle pollution of 1 microgram per cubic meter can lead to a 15% increase in mortality from COVID-19, leading to many news stories on the correlation between air pollution and death rates from the virus.
Two epidemiologists reviewed the preprint and concluded that its assertions were not robust. For example, to specify fine-particle (PM2.5) exposure levels, Wu et al. averaged particle concentration estimates across the United States from satellite observations and models covering a 17-year period at a spatial resolution of 0.01° × 0.01° (about 1 × 1 kilometer). They then mapped these results to county levels by spatial averaging. But assigning a single particle concentration value on the basis of a 17-year mean to a large region is problematic; such coarse representation does not capture the variability of human exposure to particle concentrations in space and time. Several weeks after the preprint was published, the researchers revised their mortality estimate of 15% down to 8%.
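To make the averaging concern concrete, here is a minimal sketch of what collapsing a gridded, multiyear record into a single county value does. It is not the method used by Wu et al.; the grid cells, the years, the equal-area weighting, and every number are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical annual-mean PM2.5 (micrograms per cubic meter) for four grid
# cells in one county. Rows are grid cells, columns are years.
# All values are made up for illustration.
cells = [
    [4.0, 5.0, 6.0],     # rural cell
    [5.0, 6.0, 7.0],     # suburban cell
    [11.0, 12.0, 13.0],  # urban cell
    [14.0, 16.0, 18.0],  # cell near a highway
]


def county_long_term_mean(grid):
    """Average over years for each cell, then over cells (equal weights assumed)."""
    return mean(mean(series) for series in grid)


def exposure_range(grid):
    """Smallest and largest single cell-year value hidden by the county mean."""
    values = [v for series in grid for v in series]
    return min(values), max(values)


county_value = county_long_term_mean(cells)
low, high = exposure_range(cells)
print(f"county value: {county_value:.2f}; cell-year range: {low} to {high}")
```

With these made-up numbers, every resident of the county would be assigned 9.75 micrograms per cubic meter, even though individual cell-year values span 4 to 18.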
The Tropospheric Monitoring Instrument (TROPOMI) is a sensor aboard the European Space Agency’s (ESA) Sentinel-5 Precursor satellite that collects atmospheric composition data at a relatively high spatial resolution (3.5 × 5.5 kilometers) at daily intervals, providing greater insights into urban-scale changes in air quality. Its data can be found on the mission’s data hub. TROPOMI observations of nitrogen dioxide (NO2) have been widely used by the media as indicators of urban- to regional-scale economic activity and how it is changing during the pandemic. In principle, NO2, a pollutant produced by high-temperature combustion and a precursor for photochemical smog, is ideal for this application because it has a short chemical lifetime and remains near its source.
However, satellite data often come with caveats, quality flags, and recommendations on how to use or not use the data (e.g., accounting for transport by winds or screening for clouds, if they are present). Scientists routinely access this information in users’ guides and algorithm theoretical basis documents, but nonexpert users either don’t know to look for these resources or may not have access to them. When the data are used without heeding these recommendations, the results can be misinterpreted. Researchers at the Copernicus Atmosphere Monitoring Service highlighted these issues after several news outlets used TROPOMI data to show reduced traffic and improved air quality during COVID-19 lockdowns.
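The effect of ignoring a quality flag can be sketched in a few lines. The example below is an illustration under stated assumptions, not any agency's processing code: the field names `no2` and `qa`, the pixel values, and the 0.75 cutoff are all hypothetical, and the product's own user guide remains the authority on how to screen the data.

```python
from statistics import mean

# Hypothetical pixels over one city: an NO2 column amount (arbitrary units)
# and a quality score between 0 and 1. Field names, values, and the 0.75
# cutoff are assumptions for illustration only.
pixels = [
    {"no2": 80.0, "qa": 0.95},
    {"no2": 75.0, "qa": 0.90},
    {"no2": 20.0, "qa": 0.40},  # e.g., a cloud-contaminated retrieval
    {"no2": 15.0, "qa": 0.30},
]


def screened_mean(data, min_qa=0.75):
    """Mean NO2 over pixels whose quality score meets the threshold."""
    good = [p["no2"] for p in data if p["qa"] >= min_qa]
    if not good:
        return None  # no usable pixels, so report nothing rather than a guess
    return mean(good)


naive = mean(p["no2"] for p in pixels)  # uses every pixel, flagged or not
screened = screened_mean(pixels)        # uses only pixels that pass the check
print(f"unscreened mean: {naive}; screened mean: {screened}")
```

In this made-up case the unscreened average is 47.5 and the screened average is 77.5, so a reader comparing two dates without applying the flag could mistake a cloudy day for a drop in pollution.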
Travails of the Peer Review Process
Scientists, of course, regularly present ongoing work and discuss it with peers at conferences. These presentations are opportunities to obtain feedback and revise work in preparation for publishing in a peer-reviewed journal. The peer review process has provided a successful mechanism for legitimizing research and informing the world of new scientific discoveries. Yet it is fraught with issues, including “peer review rings,” in which a feature that allows authors to suggest reviewers is misused to fast-track a paper into publication without real scrutiny. There’s also the argument that the peer review process itself remains unvalidated.
Finally, the sometimes exasperating delays associated with the peer review process have led to efforts to circumvent the process. Researchers risk losing their claim to a discovery if another paper reporting the same discovery is published by a speedier journal. Scientists have thus begun turning to other avenues to get eyes on their results. Publishing a preprint and sharing it on social media allow authors to get attention and feedback, as well as put a date stamp on their work. By the end of September, a search for “COVID-19 and air quality” called up 1,445 preprints on the medRxiv server; medRxiv (pronounced “med archive”) is a free online archive and distribution server for complete but unreviewed manuscripts in the medical, clinical, and related health sciences.
Along with preprint servers, predatory publishing outlets continue to operate around the world. They practice “pay to publish” under the guise of open access practices. Though the community continually works to identify and blacklist these fraudulent journals [Strinzel et al., 2019], threats of lawsuits often restrict sharing of those lists. As long as scientists’ recognition is tied to their number of publications, the race to publish—and the opportunities for those journals to take advantage of the system—will continue.
Rapid Dissemination but with Rigorous Scientific Analysis
What options are available to scientists who want to disseminate their findings quickly but still operate under the safeguards of rigorous review? In June, the MIT Press responded to the deluge of pandemic-related papers being published before peer review by launching Rapid Reviews: COVID-19, an open access journal that offers accelerated reviews of preprints.
Other journals are developing more open review processes. Authors who submit to AGU Advances are encouraged to publish in the Earth and Space Science Open Archive, ESSOAr, while undergoing peer review. The European Geosciences Union’s Copernicus journals publish manuscripts that pass a rapid peer review in open access discussion forums to solicit comments from the community; those comments are considered when the paper undergoes formal peer review.
Meanwhile, agencies that are responsible for freely available data can adapt in several ways. First, they can offer accessible documents and manuals with information on data quality and limitations. They can also ensure that the data are analysis ready and packaged for convenient use by nonexperts. And they can ensure that all these offerings are taken advantage of through regular training for media and other nonexpert users. ESA, for example, has developed a dedicated service to provide analysis-ready NO2 data sets from TROPOMI to the public. Space agencies are launching user-friendly dashboards with Earth-observing data and even coronavirus-specific data. Additionally, NASA, NOAA, and other institutions are developing training materials for media.
The COVID-19-related demand for environmental data caught most research institutions by surprise. As we continue to embrace FAIR (findable, accessible, interoperable, and reusable) data policies that offer free and open access to the observations, we must also embrace policies that encourage best practices for use of those data. We must also, as scientists, find better ways for our community to expedite the sharing of results while still ensuring proper scrutiny of our methods. By implementing these safeguards, we reduce the desire to circumvent the system completely and ensure that the public can quickly get much-needed scientific information that they can rely on.