Numerical models outperform AI weather forecasts of record-breaking extremes

Published: arXiv: 2508.15724v1
Authors

Zhongwei Zhang, Erich Fischer, Jakob Zscheischler, Sebastian Engelke

Abstract

Artificial intelligence (AI)-based models are revolutionizing weather forecasting and have surpassed leading numerical weather prediction systems on various benchmark tasks. However, their ability to extrapolate and reliably forecast unprecedented extreme events remains unclear. Here, we show that for record-breaking weather extremes, the numerical model High RESolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts still consistently outperforms state-of-the-art AI models GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi. We demonstrate that forecast errors in AI models are consistently larger for record-breaking heat, cold, and wind than in HRES across nearly all lead times. We further find that the examined AI models tend to underestimate both the frequency and intensity of record-breaking events, and they underpredict hot records and overestimate cold records with growing errors for larger record exceedance. Our findings underscore the current limitations of AI weather models in extrapolating beyond their training domain and in forecasting the potentially most impactful record-breaking weather events that are particularly frequent in a rapidly warming climate. Further rigorous verification and model development is needed before these models can be solely relied upon for high-stakes applications such as early warning systems and disaster management.
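The two diagnostics mentioned in the abstract, signed error as a function of record exceedance and the frequency of forecast records, can be illustrated with a minimal sketch. This is not the authors' code: it assumes a hot record is any value above a previous record level `ref_max`, and the function names, bin edges, and toy numbers are all hypothetical.

```python
import numpy as np

def bias_by_exceedance(forecast, observed, ref_max, bin_edges):
    """Mean signed error (forecast - observed) on hot records, grouped by
    how far the observation exceeded the previous record (assumed definition)."""
    exceedance = observed - ref_max
    is_record = exceedance > 0
    errors = (forecast - observed)[is_record]
    # Bin index 1 covers [bin_edges[0], bin_edges[1]), index 2 the next range, etc.
    bins = np.digitize(exceedance[is_record], bin_edges)
    return {int(b): float(errors[bins == b].mean()) for b in np.unique(bins)}

def record_frequency_ratio(forecast, observed, ref_max):
    """Forecast record count divided by observed record count.
    A value below 1 means the forecast produces too few records."""
    n_observed = int((observed > ref_max).sum())
    n_forecast = int((forecast > ref_max).sum())
    return n_forecast / n_observed if n_observed else float("nan")

# Synthetic temperatures in degrees Celsius, made up purely for illustration.
ref_max = np.full(5, 30.0)                            # previous record level
observed = np.array([30.5, 31.0, 32.5, 33.0, 29.0])   # four records, one non-record
forecast = np.array([29.8, 30.6, 30.9, 31.0, 29.1])   # a forecast that stays near past values

bias = bias_by_exceedance(forecast, observed, ref_max, bin_edges=[0.0, 2.0])
freq_ratio = record_frequency_ratio(forecast, observed, ref_max)
print(bias, freq_ratio)
```

In this toy case the mean error is negative in both bins and more negative in the bin with larger exceedance, which mirrors the qualitative pattern the abstract reports for hot records; the numbers themselves carry no meaning.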

Paper Summary

Problem
Record-breaking weather extremes, such as heatwaves and winter storms, can cause significant damage and loss of life. AI models have surpassed leading numerical weather prediction systems on various benchmark tasks, but whether they can extrapolate to unprecedented events and forecast them reliably remains unclear.
Key Innovation
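The paper evaluates state-of-the-art AI weather models (GraphCast, GraphCast operational, Pangu-Weather, Pangu-Weather operational, and Fuxi) on record-breaking heat, cold, and wind events, and compares them with a traditional numerical weather prediction (NWP) system, the High RESolution forecast (HRES) from the European Centre for Medium-Range Weather Forecasts. HRES shows consistently smaller forecast errors on these records across nearly all lead times. The AI models also tend to underestimate both the frequency and the intensity of record-breaking events.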
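A comparison of this kind can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' evaluation pipeline: a record is assumed to be an observation that exceeds the maximum of a reference period at the same grid point, and the names `record_mask` and `rmse_on_records`, along with the synthetic arrays, are hypothetical.

```python
import numpy as np

def record_mask(reference, observed):
    """True where `observed` exceeds the reference-period maximum at that grid point.

    reference: array of shape (n_reference_times, n_points)
    observed:  array of shape (n_times, n_points)
    """
    ref_max = reference.max(axis=0)
    return observed > ref_max

def rmse_on_records(forecast, observed, mask):
    """Root-mean-square error restricted to record-breaking cases (NaN if none)."""
    if not mask.any():
        return float("nan")
    errors = forecast[mask] - observed[mask]
    return float(np.sqrt(np.mean(errors ** 2)))

# Synthetic data for two grid points, invented for illustration only.
reference = np.array([[30.0, 25.0], [32.0, 27.0], [31.0, 26.0]])
observed = np.array([[35.0, 26.0], [33.0, 29.0]])

# Hypothetical forecast A follows the observations beyond the old records;
# hypothetical forecast B never goes above the reference-period maximum.
forecast_a = np.array([[34.0, 26.0], [33.0, 28.0]])
forecast_b = np.array([[32.0, 26.0], [32.0, 27.0]])

mask = record_mask(reference, observed)
rmse_a = rmse_on_records(forecast_a, observed, mask)
rmse_b = rmse_on_records(forecast_b, observed, mask)
print(int(mask.sum()), rmse_a, rmse_b)
```

Restricting the error metric to the masked cases is the key design choice here: a forecast can score well on average over all days and still do poorly on the small subset of record-breaking events.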
Practical Impact
The findings have important implications for early warning systems and disaster management. AI models perform well on many benchmark tasks, but the results indicate that they are currently less reliable than HRES for record-breaking extremes, which are potentially the most impactful events. Emergency responders and policymakers should therefore not rely solely on AI models for high-stakes decisions. Until further verification and model development have been carried out, AI forecasts are best used alongside traditional NWP systems rather than in place of them.
Analogy / Intuitive Explanation
Imagine trying to predict the stock market with a model trained on past prices. Such a model can be very good at capturing general trends, yet struggle with a sudden crash unlike anything in its training data. Similarly, AI weather models may predict typical weather patterns well but struggle with record-breaking extremes, such as an unprecedented heatwave or cold spell, because these events lie beyond the range of conditions the models were trained on. In both cases, traditional models and human expertise remain essential for making accurate predictions.
Paper Information
Categories: physics.ao-ph, cs.AI, stat.AP
ACM classes: J.2; I.6.4