Editors’ Vox is a blog from AGU’s Publications Department.

An ever-increasing volume of data about planet Earth is being gathered. These data are “big” not only in size but also in their complexity, different formats, and varied scientific disciplines. New methods and platforms, such as the cloud, are being used to handle, analyze and share big data. A new book, Big Data Analytics in Earth, Atmospheric, and Ocean Sciences, recently published in AGU’s Special Publications series, explores new tools being used for the analysis and display of the rapidly increasing volume of data about the Earth. We asked the lead editors to explain more about big data in the context of Earth sciences and describe some of the advances and challenges in analyzing it.

How would you explain “big data” and “big Earth data” to a non-specialist?

“Big data” are data that can be described by the four Vs:

  • volume – the datasets are large
  • variety – the datasets may have different formats or types of data
  • velocity – the data arrive quickly
  • veracity – there may be uncertainty about the quality of the data or their availability

Big data are found in every academic field. In our field of science, “big Earth data” are data about the planet: the ocean, the land, the atmosphere, and climate.

Why do we need big Earth data?

Earth is a complex system and it is continually changing. For example, the latest Intergovernmental Panel on Climate Change (IPCC) report describes the unprecedented rate at which the global climate has warmed in the last 200 years, resulting in increasing ocean temperature, rising sea levels, intensifying rains and floods, new records for heatwaves and droughts, and ever-growing stress on freshwater availability.

Many scientists all over the world are studying these changes – making observations, analyzing data, and running models. In doing so, they are generating vast amounts of information.

Big Earth data analytics is the application of increasingly sophisticated tools for data analysis and display. This can enable researchers and decision-makers to quickly understand the current state of our changing climate and render actionable predictions to save lives and change the course of our deteriorating climate.

What types of big Earth data are being analyzed?

The data being analyzed range from satellite data to seismic data exploring the Earth’s structure. Analyses of these data borrow from both traditional scientific analyses and tools developed for business applications. These types of data analytics are developed by university and research teams, and they are becoming an area of interest to companies. From Google’s Earth Engine to Zuci System’s Top 10 Data Science Trends for 2022 to NOAA’s Open Data Dissemination Program, big data about the Earth and their analysis are receiving growing emphasis. Because the volume of the data and the computing resources needed go beyond local resources, these analyses increasingly rely on cloud-based storage and processing capabilities.

What types of tools or methods are being used for the analysis of big Earth data?

Analyzing the Earth system is a multifaceted and multivariate challenge. It calls for highly scalable data processing solutions that quickly transform raw telemetry from Earth-observing satellites into science-quality products, analysis-optimized and harmonization frameworks for collocated analysis, on-demand assimilation for advanced numerical models, and machine learning-based predictions.

Cloud computing and high-performance computing have become ubiquitous for tackling the big data challenge. Many traditional analysis methods have been adapted to take advantage of embarrassingly parallel, shared-memory, and multi-computing approaches. We are pleased to see the ongoing investments from agencies and universities in tackling various aspects of the Earth system by using a wide variety of Earth science data to demonstrate and validate their methods.
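
To make the term concrete, an embarrassingly parallel workload is one that splits into pieces that never need to communicate with each other. The short Python sketch below is only an illustration, not a method from the book: the tile values, the `tile_mean` and `parallel_tile_means` names, and the use of `None` for missing values are all assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


def tile_mean(tile):
    """Return the mean of one tile (a list of rows), skipping None as missing data."""
    values = [v for row in tile for v in row if v is not None]
    return fmean(values) if values else None


def parallel_tile_means(tiles, max_workers=4):
    """Apply tile_mean to every tile independently; no tile depends on another."""
    # Threads keep the sketch simple and portable. For CPU-bound work on real
    # data, a process pool or a cluster scheduler would be the usual choice.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(tile_mean, tiles))


# Hypothetical sea surface temperature tiles, in degrees Celsius.
tiles = [
    [[10.0, 12.0], [14.0, None]],
    [[20.0, 22.0], [24.0, 26.0]],
    [[None, None], [None, None]],
]
means = parallel_tile_means(tiles)
print(means)
```

Because each tile is processed on its own, the same pattern scales from a laptop to many cloud workers by changing only how the tiles are distributed.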

Open source science is vital in developing reproducible, sustainable, and community-validated big data analysis solutions. Much of the data, including satellite, airborne, in situ, seismic, and model data, are distributed through official archives run by organizations such as NASA, NOAA, USGS, ESA, and various academic institutions.

How has big Earth data analytics advanced in recent years?

As it has become possible to gather larger and more complex data and to run larger, more detailed models over longer timescales, big data analytics have responded with analyses designed for larger and more complex datasets. Many of these analyses take advantage of increased computing access via cloud services. There is also an increased ability to analyze data closer to the point of collection via the Internet of Things and concepts such as edge and fog computing.

What types of challenges are there when trying to fully implement big Earth data analytics?

New data streams from new platforms and sensors, such as the recently launched GOES-T, make it clear that the amount of big data about the Earth will only continue to increase.

Recent years have witnessed significant advancement in terms of big Earth data management and processing, but these ever-larger datasets still remain challenging to analyze. Ultimately, our ability to discover knowledge from this bounty of data will be directly dependent on our ability to develop and apply methods to analyze it.

Who will find your book useful?

Big Earth data are of use to a wide range of environmental scientists, from geographers to oceanographers to climate scientists.

Policy makers can use actionable data to make decisions and set policies. Our book can be used by geoinformatics professionals who work on providing big Earth data processing, and by scientists or engineers who need big Earth data processing to support their research and development.

It will also be helpful for non-scientists seeking an introduction to the topic, such as companies moving into big Earth data analysis.

Big Data Analytics in Earth, Atmospheric, and Ocean Sciences, 2022. ISBN: 978-1-119-46757-1. List price: $185.00 (print), $148.00 (e-book)

Chapter 1 is freely available. Visit the book’s page on Wiley.com and click on “Read an Excerpt” below the cover image.

—Tiffany C. Vance (tiffany.c.vance@noaa.gov; 0000-0001-5471-025X), NOAA US Integrated Ocean Observing System, USA; Thomas Huang (0000-0002-6010-5248), NASA Jet Propulsion Laboratory, USA; and Christopher Lynnes (0000-0001-6744-3349), NASA Goddard Space Flight Center (retired), USA

Editor’s Note: It is the policy of AGU Publications to invite the authors or editors of newly published books to write a summary for Eos Editors’ Vox.

Citation: Vance, T. C., T. Huang, and C. Lynnes (2022), Analyzing big Earth data: progress, challenges, opportunities, Eos, 103, https://doi.org/10.1029/2022EO225035. Published on 9 November 2022.
This article does not represent the opinion of AGU, Eos, or any of its affiliates. It is solely the opinion of the author(s).
Text © 2022. The authors. CC BY-NC-ND 3.0
Except where otherwise noted, images are subject to copyright. Any reuse without express permission from the copyright owner is prohibited.