Weather forecasting has evolved significantly from the late 1800s, when telegraph and telephone technology first allowed multiple weather stations to share observations and develop a synoptic view of weather systems as they moved across the country. In the late 20th century, satellites enabled another leap forward, providing an “eye in the sky” to monitor Earth system phenomena that include tropical storms evolving into hurricanes. Now, computer models integrate huge volumes of data, producing everything from simulations of long-term climate trends to nowcasts that predict small weather events just a few minutes or hours into the future.
And the progress continues: Nowcasting and weather forecasting are on the brink of a major paradigm shift. New approaches are required to enable full Earth system prediction and to make cost-effective use of the dramatic increase in volume, diversity, and capabilities of observations (particularly satellite observations) and environmental products. We predict that machine learning (ML) and other artificial intelligence techniques will have to supplement or supplant major components of operational systems. Fortunately, the fields of satellite remote sensing and numerical weather prediction (NWP) are poised to take advantage of recent years’ rapid advances in ML.
An explosion in data volume in numerous fields, including autonomous vehicles, facial and voice recognition, target tracking, finance, music composition, and bioinformatics, has driven interest in ML. Cost-effective hardware architectures such as graphics processing units readily accelerate ML methods, further increasing feasibility for large, time-constrained problems.
Identifying a shared set of fundamental needs can expedite the transfer of knowledge from these diverse fields to satellite remote sensing and NWP. A nonexhaustive list of these needs includes the following:
- efficient and intelligent signal and image processing
- quality control mechanisms
- pattern recognition
- data fusion (combining diverse streams of observations)
- data assimilation
- mapping (approximating functions efficiently)
- prediction capabilities
To adapt ML to weather-related applications, it is critical to meet all of these needs at multiple spatial and temporal scales for diverse geophysical domains that include the atmosphere, ocean, biosphere, hydrosphere, and near-space environment.
ML is a “learning from data” approach. A trained ML network estimates an output from a set of inputs. Like linear regression, ML fits a function to data, but it can fit virtually any function and easily represents the kinds of nonlinear effects common in geophysics. ML is capable of extracting information from large data sets and of establishing and approximating complex relationships between disparate data sets of different types (physical, chemical, and biological, to name a few).
Common ML architectures use connected layers: an input layer, an output layer, and one or more “hidden” intermediate layers, each comprising a set of nodes (see van Veen and Leijnen for schematics of numerous ML architectures). In the training stage, the strengths of linkages between the interconnected nodes are optimized to best fit the training set.
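To make the comparison with linear regression concrete, here is a minimal sketch of a network with a single hidden layer, trained by plain gradient descent. The sine-curve target, the 16 hidden nodes, the learning rate, and the step count are all illustrative assumptions rather than choices taken from any operational system.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy nonlinear target (an assumption for illustration only).
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

# Baseline: ordinary least-squares straight line, y ~ a*x + b.
A = np.hstack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_mse = float(np.mean((A @ coef - y) ** 2))

# Input layer -> one hidden layer of tanh nodes -> output layer.
# The weight arrays play the role of the "strengths of linkages".
n_hidden = 16
W1 = rng.normal(0.0, 1.0, (1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, (n_hidden, 1))
b2 = np.zeros(1)

def forward(inputs):
    """Return hidden-layer activations and the network output."""
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, hidden @ W2 + b2

learning_rate = 0.02
for _ in range(10000):
    hidden, pred = forward(x)
    err = pred - y
    # Backpropagate the gradient of half the mean squared error.
    grad_W2 = hidden.T @ err / len(x)
    grad_b2 = err.mean(axis=0)
    grad_hidden = (err @ W2.T) * (1.0 - hidden ** 2)
    grad_W1 = x.T @ grad_hidden / len(x)
    grad_b1 = grad_hidden.mean(axis=0)
    # Training stage: adjust the linkage strengths to better fit the data.
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1

network_mse = float(np.mean((forward(x)[1] - y) ** 2))
print(f"linear fit MSE: {linear_mse:.4f}, network MSE: {network_mse:.4f}")
```

On this toy problem, the straight line cannot follow the curvature of the target, whereas the hidden layer gives the network enough flexibility to do so.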
Numerical Weather Prediction
Let us first consider NWP. NWP models are computational representations of the fluid dynamics of the atmosphere of a rotating planet. NWP models enforce the basic conservation laws of mass, energy, and momentum while accounting for external sources and sinks, such as solar heating and surface drag. NWP models work well for large-scale processes, but typically, they cannot resolve important processes in the atmosphere, such as precipitation, that occur at small scales. NWP models parameterize such processes by efficiently estimating them using approximate relationships.
Today, weather forecasts of synoptic-scale (>1,000 kilometers) atmospheric motions are skillful out to about 1 week. However, forecasts of smaller-scale features, such as rainfall from severe storms and intense winds in the cores of hurricanes and tornadoes, are often still unreliable. Increases in computing power have allowed NWP models to use ever finer grids, which improves accuracy by better resolving small-scale dynamics, thereby reducing reliance on parameterizations to approximate phenomena smaller than the grid scale.
Despite these advances, it is still very difficult to predict the precise evolution of thunderstorms a few hours ahead or to predict hurricane intensity a few days ahead. No model is perfect: Initial conditions, finite differencing, or poorly approximated or missing physical processes can introduce errors, which accumulate and grow.
Compounding the difficulty is the practical demand that to be relevant, forecasts for smaller-scale phenomena must be made available to users quickly. A tornado forecast that takes even 1 hour to produce provides little practical value, whereas a forecast of the synoptic-scale weather 7 days ahead is useful even if it takes 12 hours to produce.
Parameterizations introduce errors by using simplified assumptions about physical processes and through approximations and even mistakes made as the parameterizations are implemented in models. For example, a parameterized submodel that simulates heating and cooling of the atmosphere by infrared and solar radiation can be very computationally demanding. Typically, the heating and cooling of the atmosphere are calculated for one particular time and then held fixed, in some cases for hours. Meanwhile, the other atmospheric processes are simulated with time steps of a few minutes. This procedure is used for operational NWP systems, even though the strong interaction between clouds and radiation is modulated as clouds move and evolve on scales of minutes.
To alleviate these problems, modelers are already using fast and accurate ML emulations of existing time-consuming parameterizations. For example, an ML emulation of the long-wave radiation parameterization [Chevallier et al., 2000] has long been used operationally, albeit in a limited way, at the European Centre for Medium-Range Weather Forecasts. An ML emulation of both the long- and short-wave radiation parameterizations [Krasnopolsky et al., 2010] is efficient enough to be run every model time step, and this method has been successfully tested in different models.
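The idea behind emulation can be sketched with a toy example. Everything in it is hypothetical: `slow_heating_rate` is an invented stand-in for an expensive radiation scheme, the cloud evolution is made up, and a degree-6 polynomial stands in for the neural network purely to keep the sketch short. It compares a heating rate computed once and held fixed for 3 hours with a cheap emulator that is evaluated at every 5-minute step.

```python
import numpy as np

def slow_heating_rate(cloud_fraction):
    """Stand-in for an expensive radiation scheme (hypothetical toy formula)."""
    return -2.0 + 1.5 * np.sin(np.pi * cloud_fraction) ** 2

# Offline: sample the slow scheme once to build a training set.
train_x = np.linspace(0.0, 1.0, 50)
train_y = slow_heating_rate(train_x)

# Fit a cheap surrogate. A polynomial stands in for the neural network here.
emulator = np.polynomial.Polynomial.fit(train_x, train_y, deg=6)

# Online: cloud fraction evolves over 36 steps of 5 minutes (3 hours).
steps = np.arange(36)
cloud = 0.5 + 0.4 * np.sin(2 * np.pi * steps / 36)
truth = slow_heating_rate(cloud)

# Option 1: compute the heating rate once, then hold it fixed.
held_fixed = np.full_like(truth, slow_heating_rate(cloud[0]))
# Option 2: evaluate the fast emulator at every step.
emulated = emulator(cloud)

held_fixed_error = float(np.max(np.abs(held_fixed - truth)))
emulated_error = float(np.max(np.abs(emulated - truth)))
print(f"max error, held fixed: {held_fixed_error:.3f}")
print(f"max error, emulated every step: {emulated_error:.3f}")
```

In this toy setting, the emulator tracks the evolving clouds while the held-fixed value drifts away from the truth. Real emulators have far more inputs and must be validated against the original parameterization.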
Taking this trend to the limit, Brenowitz and Bretherton trained ML algorithms on data simulated at resolutions significantly higher than those of global NWP models to accurately determine the heating, cooling, moistening, and drying of the atmosphere due to all physical processes simultaneously. ML solutions are not limited to atmospheric processes: For example, neural networks have been developed for nonlinear wave-wave interactions in a wind-wave model [Tolman et al., 2005].
Let us next consider NWP data assimilation (DA) systems and how these systems use observations, especially remotely sensed observations. DA aims to optimally combine information from observations and prior knowledge from a previous forecast to initialize NWP models. The statistics of the errors in the observations and forecasts are important quantities for DA, and these statistics are difficult to estimate.
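The role of the error statistics is easiest to see in the textbook scalar case, sketched below. It assumes unbiased, uncorrelated errors with known variances, and the numbers are hypothetical. Operational DA systems solve a vastly larger, multivariate version of this problem.

```python
def analysis_update(background, background_var, observation, observation_var):
    """Combine one forecast value and one observation, weighted by error variance."""
    # The gain is the fraction of the observation-minus-forecast difference to apply.
    gain = background_var / (background_var + observation_var)
    analysis = background + gain * (observation - background)
    # The analysis error variance is smaller than either input variance.
    analysis_var = (1.0 - gain) * background_var
    return analysis, analysis_var

# Hypothetical numbers: a forecast of 285.0 K with error variance 4.0 K^2
# and an observation of 287.0 K with error variance 1.0 K^2.
analysis, analysis_var = analysis_update(285.0, 4.0, 287.0, 1.0)
print(analysis, analysis_var)
```

Because the observation is assumed to be more accurate than the forecast, the analysis lands closer to the observation. If the variances are misestimated, the weighting is wrong, which is why these statistics matter so much.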
In practice, DA makes various approximations and assumptions (including Gaussian distributions and quasi-linear error dynamics), introduces numerous unknown “constants” that act as additional parameters to be tuned, aggressively thins or averages satellite data streams, and relies heavily on data preprocessing and postprocessing to eliminate biases in the data. As a result, only a fraction of the information in the observations makes it through the DA “filter.” Several steps in the DA system for processing, quality control, and retrieval (i.e., extraction) of geophysical information from satellite observations can be accelerated or improved with ML solutions [Gilbert et al., 2010; Atzberger, 2004].
In particular, both retrieval and DA systems assimilate satellite observations of radiances by iteratively simulating the observed radiances from the retrieved or assimilated geophysical quantities such as temperature and humidity. Fast ML emulations to simulate the radiances have been developed to significantly speed up these procedures [Taylor et al., 2016; Takenaka et al., 2011].
A Growing Need
Because of society’s increasing reliance on actionable environmental information and situational awareness, there is a pressing need to fully exploit current and future large and diverse sources of environmental data. The current volume of environmental data, especially from satellites, already presents a major challenge in terms of hardware, power, and latency constraints for DA and other real-time applications.
In part for these reasons, global weather forecasting uses only about 1%–3% of currently available satellite data, and the processing time for the traditional approaches is already crippling. Today, for many parameters of interest to operational forecasters, data streams arrive from multiple satellites, each with its own limitations and data artifacts that are difficult, if not impossible, for a forecaster to recall when faced with time-critical decisions on which lives depend. And this challenge will be exacerbated by the coming wave of new sources of environmental data. In this new era, small satellites may be launched in constellations of hundreds, if not thousands. The Internet of Things will create huge volumes of environmental data: Your phone may already be measuring temperature and pressure.
As the pool of available environmental observations grows in volume and diversity, new approaches, with higher efficiency and accuracy, will be required to fully exploit these resources, as either a complement or alternative to traditional systems. Trained ML systems based on advanced neural networks are computationally very efficient and easily implemented with modern scientific programming languages.
Training an ML model is a one-time cost, so it is possible to invest in very high quality physics- and chemistry-based simulations to produce highly accurate training data that span both common and rare or extreme events [Keller and Evans, 2019]. So far in early applications, ML is successfully addressing the demands put on environmental products for higher accuracy, higher spatial and temporal resolution, enhanced conventional forecasts, and better model output postprocessing [Rasp and Lerch, 2018; Campos et al., 2018]. ML is also addressing demands for outlooks and predictions on subseasonal to seasonal timescales [Fan et al., 2019] and for improvements in the process of issuing advisories and warnings, including those for severe weather and hurricanes [Shahroudi et al., 2019].
Wanted: New Approaches
Improved weather nowcasts and forecasts need new approaches to take full advantage of all traditional and novel observations. These approaches must allow more sources of observations to be fused and blended for nowcasting applications and ingested by increasingly accurate (and more computationally demanding) DA and forecast systems, and they must process and disseminate data within an ever-shrinking time window.
Although significant challenges remain to the broad implementation and acceptance of ML approaches, the steps we must take to mitigate these challenges are becoming clearer:
- Large and representative training data sets and ensembles of ML models will ensure that ML applications retain accuracy for rare and unusual cases.
- Explainable artificial intelligence and other developments will aid in relieving the ML “black box” stigma.
- ML approaches must satisfy appropriate physical constraints, such as conservation of mass.
- ML outputs should include quantitative uncertainty estimates.
- Adherence to modern computational science practices will make ML development reproducible.
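Two of these steps, uncertainty estimates and physical constraints, can be illustrated with a minimal sketch. The function names and numbers are hypothetical. Ensemble spread is only a crude proxy for uncertainty, and multiplicative rescaling is just one simple, after-the-fact way to enforce a conserved total; constraints can also be built into the network or its training.

```python
import numpy as np

def ensemble_summary(member_predictions):
    """Return the ensemble mean and spread (sample standard deviation) per output."""
    members = np.asarray(member_predictions, dtype=float)
    return members.mean(axis=0), members.std(axis=0, ddof=1)

def rescale_to_conserve(field, conserved_total):
    """Scale a non-negative field so that its sum equals a known conserved total."""
    field = np.asarray(field, dtype=float)
    total = field.sum()
    if total <= 0.0:
        raise ValueError("field must have a positive sum to be rescaled")
    return field * (conserved_total / total)

# Hypothetical outputs of three independently trained models for four grid cells.
members = [[1.0, 2.0, 0.5, 0.0],
           [1.2, 1.8, 0.4, 0.1],
           [0.8, 2.2, 0.6, 0.2]]
mean, spread = ensemble_summary(members)

# Suppose the total over the four cells is known to be 4.0 (an assumed budget).
constrained = rescale_to_conserve(mean, conserved_total=4.0)
print(mean, spread, constrained.sum())
```

Cells where the ensemble members disagree get a larger spread, flagging predictions that deserve less trust.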
If these challenges can be overcome, then we expect ML will become an important part of the solution for satellite remote sensing and NWP. We forecast more efficient use of human and computational resources; enhancements of existing tools for greater efficiency (and, in some cases, accuracy); and improved forecast skill in nowcasting, NWP, and other applications, including forecasts of extreme weather events.