Climate models are highly complex computer programs, and they inevitably contain mistakes. Credit: Casimiro – stock.adobe.com
Source: Earth’s Future

Global climate models are software behemoths, often containing more than a million lines of code.

Inevitably, such complex models will contain mistakes, or “bugs.” But because model outputs are widely used to inform climate policy, it’s important that they generate trustworthy results.

Proske and Melsen set out to understand how climate modelers think about, identify, and address bugs. They interviewed 11 scientists and scientific programmers from the Max-Planck-Institut für Meteorologie who work on the ICON climate model.

When new code is developed for ICON, it’s screened and tested to catch bugs before being integrated into the model itself, the interviewees said.

After code is integrated, however, such testing usually stops. The code is assumed to be bug free until the model behaves strangely or a programmer serendipitously discovers a bug while examining the code for other reasons. Even a model crash is not necessarily a sign that a bug needs fixing: Researchers are always trading off the model's speed against its stability, and sometimes they simply push the model beyond what it can handle given those constraints.

Tracking down bugs and fixing them can be time-consuming, so even if the team suspects the presence of a bug, they sometimes estimate its impact to be minor enough that it doesn’t warrant correction. When the researchers do decide to fix a bug, many view the process as an extension of climate science: They generate hypotheses about how the bug might cause the model to behave, then test those hypotheses to discern the exact nature of the bug and how to address it.

The best way to avoid bugs is to test code thoroughly before it’s integrated into the full model, many interviewees said. Tools exist to facilitate testing, such as Buildbot and the GitLab development platform, and the scientists said such tools could be leveraged more fully in ICON’s development process. However, they also said there are inherent limits to how thoroughly researchers can test climate models, because researchers don’t always know what a 100% accurate model output would look like. They therefore lack a definitive baseline against which to compare actual model output.
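Continuous-integration tools such as Buildbot automate the running of tests like these. One common pattern (not specific to ICON or described in the study) is a regression check: Because no one knows what a perfectly accurate output looks like, the test instead confirms that new output stays within a tolerance of a previously trusted baseline. The Python sketch below is purely illustrative; the data values, function name, and tolerance are hypothetical and are not drawn from ICON’s actual test suite.

```python
import numpy as np

# Hypothetical baseline output (e.g., a surface temperature field, in kelvins,
# saved from a previous, trusted model run). In practice it would be loaded from disk.
reference = np.array([288.1, 287.9, 288.4, 288.0])

# Hypothetical output from the current build of the model component under test.
candidate = np.array([288.1, 287.9, 288.4, 288.0001])

# With no "100% accurate" answer available, the test can only check that results
# stay within a chosen tolerance of the trusted baseline.
tolerance = 1e-3


def matches_baseline(candidate, reference, tolerance):
    """Return True if the new output agrees with the baseline within tolerance."""
    max_diff = np.max(np.abs(candidate - reference))
    print(f"Maximum deviation from baseline: {max_diff:.2e}")
    return max_diff <= tolerance


if __name__ == "__main__":
    assert matches_baseline(candidate, reference, tolerance), (
        "Output drifted from the trusted baseline; a bug may have been introduced."
    )
```

A check of this kind catches unintended changes in behavior, but, as the interviewees noted, it cannot say whether the baseline itself is correct.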

Though the interviewees acknowledged that ICON is imperfect, they also considered it to be “good enough” to forecast weather or to answer research questions such as how increased atmospheric carbon will affect global temperatures. The authors write that although “the principle of ‘good enoughness’” is pragmatic and understandable, it could also lead to misunderstandings if users don’t appreciate a model’s limits. (Earth’s Future, https://doi.org/10.1029/2025EF006318, 2025)

—Saima May Sidik (@saimamay.bsky.social), Science Writer

Citation: Sidik, S. M. (2025), When is a climate model “good enough”?, Eos, 106, https://doi.org/10.1029/2025EO250332. Published on 10 September 2025.
Text © 2025. AGU. CC BY-NC-ND 3.0
Except where otherwise noted, images are subject to copyright. Any reuse without express permission from the copyright owner is prohibited.