Editors’ Vox is a blog from AGU’s Publications Department.
Data assimilation in geophysics is a method that combines numerical models with observational data to improve our understanding and predictions of Earth’s processes. This approach is essential because, due to theoretical limitations, no single model can capture all natural variations. Similarly, observational data, although detailed, often lacks complete coverage both spatially and temporally. By integrating models with real-world data, data assimilation enhances the accuracy and reliability of predictions across various geophysical domains, such as atmospheric flows, ocean currents, and land surface processes.
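To make the idea of combining a model with an observation concrete, here is a minimal sketch of the simplest possible case: one model value and one observation of the same quantity, each with a known error variance. The function name `assimilate` and the soil moisture numbers are hypothetical and chosen only for illustration; operational systems work with many variables at once and far more elaborate error models.

```python
def assimilate(model_value, model_var, obs_value, obs_var):
    """Blend a model forecast and an observation, weighting each by its uncertainty."""
    # Weight given to the observation: close to 1 when the model is the less certain of the two
    gain = model_var / (model_var + obs_var)
    # Nudge the model value toward the observation by that weight
    analysis = model_value + gain * (obs_value - model_value)
    # The blended estimate is less uncertain than the model forecast
    analysis_var = (1.0 - gain) * model_var
    return analysis, analysis_var


# Hypothetical soil moisture values (m^3/m^3): the model says 0.30 with a
# standard deviation of 0.04; a sensor reads 0.24 with a standard deviation of 0.02.
analysis, analysis_var = assimilate(0.30, 0.04**2, 0.24, 0.02**2)
print(f"analysis = {analysis:.3f}, variance = {analysis_var:.5f}")
```

Because the observation is assumed to be the more precise of the two, the blended estimate lands closer to it, and its variance is smaller than that of either input.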
A new article in Reviews of Geophysics explores the theory, methods, and applications of land data assimilation (LDA). We asked the authors to give an overview of how scientists use LDA, what major advances have been made, and where additional research is needed.
What insights can data assimilation provide when applied to the study of land surface processes?
Data assimilation serves dual purposes, enhancing scientific understanding and providing key engineering tools. It utilizes geophysical theories to interpret well-known processes and integrates observational data to address knowledge gaps in less understood areas. For instance, LDA refines hydrological models and improves accuracy in representing processes such as evapotranspiration.
On the engineering side, LDA underpins forecasting systems and historical data reconstruction essential for hydrological prediction. Additionally, it facilitates the development of Earth’s digital twins, allowing for more accurate environmental planning and scenario testing. This makes LDA invaluable both for advancing scientific inquiry and enhancing practical applications in Earth system sciences.

What are the benefits of using LDA compared to other techniques in land surface research?
LDA enhances the accuracy of land surface research models by integrating observations from diverse sources and scales. This integration not only improves model performance but also deepens understanding of complex land-surface processes. LDA refines uncertainty quantification by characterizing and reducing errors in model simulations and observations, thus providing optimized estimations of land surface states.
LDA also employs techniques such as Bayesian filters and nonlinear optimization to handle the nonlinear and non-Gaussian behavior of land surface processes.
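One Bayesian filter that copes with non-Gaussian behavior is the particle filter. The sketch below shows a single update step under simplifying assumptions: the state is one number, the observation error is assumed Gaussian, and resampling is plain multinomial. The function name `particle_filter_update` and all numbers are hypothetical; this illustrates the general technique, not the specific algorithms covered in the review.

```python
import math
import random


def particle_filter_update(particles, obs, obs_std, rng):
    """One update step: weight particles by the observation likelihood, then resample."""
    # Assumed Gaussian likelihood of the observation given each particle's state
    weights = [math.exp(-0.5 * ((obs - p) / obs_std) ** 2) for p in particles]
    # Multinomial resampling: particles are drawn in proportion to their weights
    return rng.choices(particles, weights=weights, k=len(particles))


rng = random.Random(0)
# Hypothetical bimodal prior: the model cannot tell whether the soil is dry or wet
prior = ([rng.gauss(0.15, 0.02) for _ in range(500)]
         + [rng.gauss(0.35, 0.02) for _ in range(500)])
posterior = particle_filter_update(prior, obs=0.33, obs_std=0.03, rng=rng)

prior_mean = sum(prior) / len(prior)
posterior_mean = sum(posterior) / len(posterior)
print(f"prior mean = {prior_mean:.3f}, posterior mean = {posterior_mean:.3f}")
```

A single Gaussian could not represent the two-peaked prior, but the particles can. After the update, nearly all of them sit in the wet peak, which is the one consistent with the observation.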
Looking forward, LDA is set to expand from purely geophysical applications to include coupled natural and social systems, facilitated by integrating with big data analytics. This expansion promises to revolutionize the field, enhancing the accuracy and scope of predictions in Earth system sciences.
What kinds of land observational data are integrated into land models?
Land models assimilate a variety of observational data to enhance the understanding and prediction of land states. These observations include measurements such as soil moisture, snow, groundwater, land surface temperature, evapotranspiration, and irrigation. The data is collected from multiple sources, including in situ observations, which are increasingly enhanced by Internet of Things (IoT) technologies for real-time, continuous monitoring.
Additionally, remote sensing observations using sensors across microwave, infrared, and visible bands are gathered from airborne or spaceborne platforms, providing extensive coverage and varied data types. Recently, more unconventional data sources including social media, social surveys, and surveillance cameras have also been integrated, enriching the data landscape for land data assimilation and offering novel perspectives on land dynamics.
What are some of the major developments that have advanced LDA over the past 3 decades?
Over the past three decades, LDA has undergone substantial evolution. Initially concentrating on refining theoretical foundations and methods, the field soon leveraged breakthroughs such as Bayesian filtering algorithms and hybrid methods to address the non-linearities and non-Gaussian behaviors in land surface systems. Innovation continued with the integration of machine learning, enhancing the diversity of handled data and revolutionizing uncertainty quantification.
Today, LDA incorporates robust, adaptive algorithms that efficiently integrate multivariate and multi-scale observational data, supporting applications from local to global scales. Notable breakthroughs include the development of coupled land-atmosphere data assimilation systems and the extension of LDA from purely geophysical systems to coupled natural and human systems.
On what geographic scales can LDA systems be used?
LDA systems are adaptable across various geographic scales:
- Local and catchment scales: Focuses on detailed, high-resolution data to closely monitor land and water dynamics within small areas and river basins.
- Regional scales: Expands to assimilate data over larger areas, crucial for understanding regional water cycles and energy budgets.
- Global scales: Integrates worldwide data to enhance global land process analysis and understanding of environmental changes.
What are some of the prospective benefits and challenges of incorporating big Earth data and machine learning into LDA?
Benefits:
- Improved data utilization: Big Earth data provides a wealth of information that enhances the quality and coverage of data assimilated into LDA models.
- Enhanced predictive capabilities: Machine learning algorithms can detect complex patterns in land surface processes from big Earth data and help pinpoint the variables that matter most for prediction.
- Real-time monitoring and forecasting: The automation of data processing and analysis through machine learning makes LDA more efficient and reduces the time required, enabling near real-time monitoring and forecasting.
Challenges:
- Data quality, consistency, and integration: Using big Earth data from diverse sources requires careful preprocessing and harmonization to avoid biases and inaccuracies.
- Model interpretability: Understanding the underlying physical processes using machine learning alone presents challenges.
- Computational demands: Processing large volumes of big Earth data and implementing complex machine learning algorithms require substantial computational resources.
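As a small illustration of the harmonization challenge, the sketch below rescales a series of observations so that its mean and spread match those of a model before the two are combined. This mean and standard deviation matching is one simple option among several, and it is offered here as an assumption rather than as the method the authors recommend. The function name `rescale_to_model` and the values are hypothetical.

```python
from statistics import mean, pstdev


def rescale_to_model(obs, model):
    """Linearly rescale observations so their mean and spread match the model's."""
    obs_mean, obs_std = mean(obs), pstdev(obs)
    model_mean, model_std = mean(model), pstdev(model)
    # Remove the observation mean, stretch to the model spread, add the model mean
    return [model_mean + (x - obs_mean) * model_std / obs_std for x in obs]


# Hypothetical soil moisture series (m^3/m^3): the satellite product reads
# systematically drier and more variable than the model.
satellite = [0.10, 0.14, 0.18, 0.22, 0.26]
model = [0.20, 0.23, 0.26, 0.29, 0.32]
rescaled = rescale_to_model(satellite, model)
print([round(x, 3) for x in rescaled])
```

Rescaling removes the systematic offset between the two sources while keeping the timing of wet and dry periods in the observations, which is the information the assimilation step is meant to use.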

What additional research efforts are needed to continue the advancement of LDA?
Advancing LDA will require targeted research efforts in several key areas. First, LDA systems could be refined by improving the quality and scope of observational data, including meteorological and land observations. Second, LDA could benefit from the development of comprehensive land reanalyses that integrate extensive historical data, which would enhance their accuracy and applicability. Third, development efforts should focus on creating fully operational systems for real-time application. Fourth, integrating LDA with studies of atmospheric processes and Earth’s critical zones will greatly broaden insights. Finally, incorporating advanced AI techniques and leveraging big data, such as social sensing, will enhance LDA’s utility. Together, these efforts will improve LDA’s effectiveness and broaden its application in Earth system sciences.
—Xin Li (0000-0003-2999-9818), Chinese Academy of Sciences, China; and Feng Liu (0000-0002-5872-3709), Chinese Academy of Sciences, China