Leaders at the U.S. Geological Survey (USGS) expect the proliferation of networked devices, inexpensive sensors, and drones to make an explosion of massive data sets available to Earth scientists. At the same time, advances in cloud computing and artificial intelligence will enable more powerful models for understanding these data and using them to project into the future. This is the outlook from the USGS 21st-Century Science Strategy 2020–2030. The report, released in January, describes USGS’s growth from its foundation in traditional observational science to a resource for predictive tools that can guide decision-makers in the management of natural resources and environmental hazards.
Experts have said that realizing this vision will require communication across disciplines and support for scientists who engage in interdisciplinary work. “In order to anticipate things that might happen in the short term and in the long term, we need to start looking at the Earth as a system of systems,” said Geoffrey Plumlee, chief scientist at USGS. Originally trained as a geologist, Plumlee has spent many years looking at the intersections between geology, environmental disasters, and human health.
“The single-discipline focus is still needed,” said Plumlee, “but we also need a lot more people that can do this cross-disciplinary work. From what I’ve seen, the younger generation is already getting into that, because they like the transdisciplinary idea of how Earth and space scientists can interact with human health scientists. Also, it’s pretty clear that a lot of our up-and-coming scientists are very adept with things like artificial intelligence.”
The Promise of Machine Learning in Earth Systems
Where others see geophysics problems, biomedical problems, and climate science problems, Karianne Bergen sees one problem: a data problem. Scientists go out into their respective fields and collect mounds of data. Somewhere in those data is a set of parameters that, combined in the right way, forms a model explaining not only the observations already made but also countless other possible observations. Finding that model is the data problem, and solving it provides insight into the future.
“Advances related to computing and data science can translate from one discipline to another,” said Bergen. “If someone finds a good strategy that works for geophysics, someone in climate science may be able to adopt it.” As an assistant professor of Earth, environmental, and planetary sciences and data science at Brown University, Bergen uses machine learning to look for these solutions.
Machine learning is a branch of artificial intelligence that uses optimization to build models from existing data; those models can then be used to make predictions. Instead of giving a computer an equation and asking it to find the solution, scientists give the computer a set of results and ask it to find the equation that best fits them. When applied to Earth systems, these models can be used to anticipate the effects of policy decisions and future changes in the environment.
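To make the idea of "finding the best equation" concrete, here is a minimal sketch in Python. It is not drawn from Bergen's work or from any USGS model. The function name `fit_line`, the learning rate, the step count, and the five observations are all hypothetical choices for illustration. The sketch starts from a guess and repeatedly adjusts a slope and an intercept to shrink the mean squared error between the model and the data, which is one simple form of the optimization described above.

```python
def fit_line(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0  # start from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Difference between the current model and each observation.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to each parameter.
        grad_slope = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = 2.0 / n * sum(errors)
        # Step each parameter in the direction that reduces the error.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Hypothetical observations that happen to follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)

# Use the learned equation to predict a value that was never observed.
prediction = slope * 10.0 + intercept
```

The computer is never told that the underlying rule is y = 2x + 1. It recovers a slope close to 2 and an intercept close to 1 from the results alone, then uses them to predict the value at x = 10. Real Earth system models have far more parameters, but the principle is the same.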
The potential of machine learning in the Earth sciences has been recognized for decades, but only recently have advances in computer science made these types of projects feasible. “Computer science researchers figured out how to train some of the more powerful deep neural network models using GPU [graphics processing unit] computing, which allows them to get these models up and running on a scale that they never could before,” said Bergen. “So the computing bottleneck has changed a lot in the last 10 years, making it easier for people to work with those models. People also have more data to feed into these models because there are more sensors out there and more people collecting data.”
Availability of data is important because of the way machine learning works, by optimizing models to fit data points. “You want something that will work broadly, across a wide range of data. If you have only a few data points, you might learn to match just the noise in those data points, rather than the general pattern,” said Bergen. “That’s why machine learning is so data-intensive.”
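The risk Bergen describes can be illustrated with a small sketch, again in Python and again using made-up numbers. It assumes a hypothetical true pattern y = x, observed at only five points with alternating noise of plus or minus 0.5. A flexible model (the degree-4 polynomial that passes exactly through all five points) is compared with a simple one (a least-squares straight line) at a sixth point that neither model has seen. The function names `lagrange_predict` and `line_predict` are illustrative, not part of any library.

```python
def lagrange_predict(xs, ys, x_new):
    """Evaluate the unique polynomial that passes through every (x, y) point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x_new - xj) / (xi - xj)
        total += term
    return total


def line_predict(xs, ys, x_new):
    """Evaluate the ordinary least-squares straight line at x_new."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return mean_y + slope * (x_new - mean_x)


# Hypothetical data: true pattern y = x, plus alternating noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5]
ys = [x + e for x, e in zip(xs, noise)]

# A new point that was not used to build either model.
x_new, y_true = 5.0, 5.0
flexible_error = abs(lagrange_predict(xs, ys, x_new) - y_true)
simple_error = abs(line_predict(xs, ys, x_new) - y_true)
```

The polynomial matches all five noisy observations perfectly, yet at the new point it misses the true value by 15.5, while the straight line misses by about 0.1. The flexible model has learned the noise rather than the pattern. With many more observations, noise of this kind tends to average out, which is one reason larger data sets make powerful models safer to use.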
Fostering a Connected Community
The availability and necessity of unprecedented data sets, the interconnected nature of natural systems, and the promise of new data tools add up to a call for more connections: among scientists in different disciplines, between academia and government agencies, and between researchers and decision-makers.
“The types of problems society is facing are multidisciplinary,” said Gary Rowe, EarthMAP (Earth Monitoring, Analyses, and Projections) Program management team leader at USGS. “They involve humans, they involve natural systems, they involve assumptions about how the future might evolve. So it’s a grand challenge for us and other science agencies to move forward in this together.”
—Matthew Stonecash (@MattStonecash), Science Writer