Editors’ Vox is a blog from AGU’s Publications Department.
Data collected through observations is essential to science. Models also make a fundamental contribution to science by helping us to understand processes and forecast future change. However, both observations and modeling have certain limitations. “Reanalysis” is a way to combine the two and build a more complete and accurate picture of the phenomenon being studied. In Earth science this can be applied to the study of the atmosphere, the oceans, and the land surface. A recent article published in Reviews of Geophysics describes achievements in reanalyses across the Earth sciences and explores reanalysis for terrestrial ecosystems, which is a much newer application. Here, the authors explain reanalysis, its uses, and new developments in terrestrial ecosystem reanalysis.
How would you explain “reanalysis” in simple language?
A reanalysis in Earth system science takes three main ingredients: a numerical model, observational data, and an optimization scheme that merges the model forecast with the observations while the model is running. The result of optimally merging model forecasts and observations is called a “reanalysis.”
For example, the most recent global atmospheric reanalysis, ERA5, produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), provides a large number of atmospheric variables (such as temperature and precipitation) at a horizontal resolution of 30 kilometers, a vertical resolution of 137 levels, and an hourly time step from 1979 to present. This reanalysis draws on a very large number of in situ and remotely sensed observations. It provides a close-to-optimal reconstruction of the atmospheric states for the period since 1979, given our current models and model-data fusion methodology.
Recently, an increasing number of reanalysis data sets have been made available for terrestrial processes such as the physical land surface, the terrestrial carbon cycle, and the terrestrial hydrologic cycle.
How does reanalysis differ from other data products?
Reanalysis provides “continuously optimized” states (for example, soil moisture in a land surface model) and fluxes (for example, evapotranspiration in a land surface model) over a long time period. The optimization process is formally called data assimilation. Continuous optimization is usually not performed for mere data products or model output. “Continuously optimized” means that, at each individual time step, the data assimilation algorithm optimizes the states and fluxes given the available observations. In data assimilation, the numerical model is propagated until the time step of the first observations, the model is stopped, the optimization procedure is run, and the model is then propagated further until the next time step with observations. Data assimilation considers uncertainties in the observations and initial conditions, and may also consider uncertainty in the model structure, model parameters, and forcing conditions. Continuous optimization requires large amounts of computational resources because an ensemble of model runs is often used to produce the forecasts.
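The forecast, stop, optimize, and continue cycle described above can be illustrated with a toy example. The sketch below is not the scheme used in any operational reanalysis: it assumes a hypothetical one-bucket soil moisture model (`forecast_step`), a single directly observed state, and a stochastic ensemble Kalman update (`analysis_step`). All function names, parameter values, and the rainfall and observation inputs are invented for illustration.

```python
import random
import statistics


def forecast_step(soil_moisture, rain, rng, loss_rate=0.1, noise_sd=0.01):
    """Hypothetical bucket model: add rain, lose a fixed fraction, add model noise."""
    new_state = soil_moisture + rain - loss_rate * soil_moisture
    new_state += rng.gauss(0.0, noise_sd)  # crude stand-in for model uncertainty
    return min(max(new_state, 0.0), 1.0)  # keep the state between 0 and 1


def analysis_step(ensemble, obs, obs_sd, rng):
    """Stochastic ensemble Kalman update for a directly observed scalar state."""
    forecast_var = statistics.variance(ensemble)
    # The gain weighs forecast uncertainty against observation uncertainty.
    gain = forecast_var / (forecast_var + obs_sd ** 2)
    # Each member is nudged toward its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_sd) - x) for x in ensemble]


def run_assimilation(rain_series, observations, n_members=50, obs_sd=0.02, seed=0):
    """Propagate an ensemble and update it whenever an observation exists.

    `observations` maps a time step index to an observed soil moisture value.
    Returns the ensemble mean after each time step.
    """
    rng = random.Random(seed)
    # The spread of the initial ensemble represents uncertain initial conditions.
    ensemble = [rng.uniform(0.1, 0.5) for _ in range(n_members)]
    means = []
    for t, rain in enumerate(rain_series):
        ensemble = [forecast_step(x, rain, rng) for x in ensemble]
        if t in observations:  # stop the model and optimize against the data
            ensemble = analysis_step(ensemble, observations[t], obs_sd, rng)
        means.append(statistics.fmean(ensemble))
    return means


if __name__ == "__main__":
    rain = [0.0] * 10
    with_obs = run_assimilation(rain, {4: 0.30})
    without_obs = run_assimilation(rain, {})
    for t, (a, b) in enumerate(zip(with_obs, without_obs)):
        print(f"step {t}: assimilated={a:.3f} open loop={b:.3f}")
```

Running the script prints the ensemble mean with and without assimilation. The two runs are identical until step 4, where the single observation pulls the assimilated ensemble toward the observed value. Running many ensemble members through the model at every step is also why the computational cost grows so quickly for a full Earth system model.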

What are reanalyses used for?
Coarse-scale global atmospheric reanalyses inform regional high-resolution reanalyses by providing the boundary conditions. Atmospheric reanalyses are also used to inform oceanic and terrestrial simulations by providing atmospheric forcing conditions. The stand-alone simulation of oceanic or terrestrial processes forced by atmospheric reanalysis is termed offline coupling. These offline-coupled simulations provide finer spatio-temporal and process resolution than the atmospheric reanalysis itself.
For water management, the coupling of groundwater and surface water is of particular interest. Carbon cycle predictions focus on carbon, energy, and nutrients in ocean and terrestrial ecosystem models. In this way, ecosystem reanalysis feeds into management and policy, for example in building a blue growth strategy or in the latest IPCC Sixth Assessment Report (AR6).
Why is terrestrial ecosystem reanalysis emerging just now?
There are various reasons ecosystem reanalysis is only now becoming available. In the past, the focus in reanalysis was on physical (abiotic) and biogeochemical (biotic) systems and less on ecosystems (which encompass abiotic and biotic variables and their interactions). Biotic variables are often related to biodiversity properties such as genetic composition, species population, plant traits, and fauna density.
Key observations have advanced process understanding and improved models. These come from the Fluxnet data network (measuring, for example, exchanges of water, energy, and carbon between the land and the atmosphere), the Integrated Carbon Observation System (ICOS), and the MODIS vegetation indices, and, for hydrology, from the Global Runoff Data Centre (GRDC) and the GRACE mission. However, the data have only been available since 2000. These combined data sources allow models to couple the carbon and the water cycles, constrained by energy and nutrient availability. The SMAP mission, launched in 2015, provided data for the first coupled global terrestrial carbon-water reanalysis. In the future, more data will become available to constrain ecosystem models and ecosystem reanalysis.

Which recent developments will shape and benefit terrestrial ecosystem reanalyses?
In our review article we identified four steps within reach to further the goal of producing ecosystem reanalysis.
First, researchers will develop new empirical frameworks in the form of numerical models to link physical (abiotic) with biological (biotic) variables in integrated ecosystem models. Currently, the link between biotic and abiotic variables is rather poorly represented, even though biotic variables such as genetic composition and species populations contribute significantly to ecosystem functions, to biodiversity as a natural resource, and to ecosystem resilience as a whole.
Second, an agreement on essential ecosystem variables is needed. Essential variables provide the focus for the modeling and for the observational communities.
Third, remotely sensed trait observations are an area of research with high potential to better link biotic and abiotic processes at sufficient spatio-temporal coverage. This needs to go hand in hand with the provision of multivariate, long-term in situ observations, as done in the European Long Term Ecosystem Research Infrastructure.
Finally, the existing databases (GBIF, TRAITS) and forthcoming high-throughput biodiversity data provide further data to be integrated into ecosystem models and into ecosystem reanalysis. Integrating these new models and data into model-data fusion frameworks will advance our understanding through digital ecosystem twins and provide new means to manage terrestrial ecosystems.

—Roland Baatz ([email protected]; 0000-0001-5481-0904), Harrie-Jan Hendricks-Franssen (0000-0002-0004-8114), and Harry Vereecken (0000-0002-8051-8517), Institute of Bio- and Geosciences, Forschungszentrum Jülich and International Soil Modelling Consortium, Germany