This is the first, and likely not the last, entry in this here ol’ BLOG devoted to the use of artificial intelligence (AI), including machine-learning output, in forecasting. If you want scientifically vetted, peer-reviewed documentation of some of what I’m talking about, check out these papers (a rough, illustrative sketch of the general idea follows the list):
* Random forests for short-fused applications
* Using random forests to derive day-2 probabilities from CAM ensembles
* Random forests at various severe-day periods
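To give a feel for the general approach in those papers—and only the general feel—here is a toy sketch of “train a random forest to turn ensemble output into severe-weather probabilities.” The predictor names, data, and numbers below are all made up for illustration; this is not anyone’s published methodology.

```python
# Illustrative sketch only: synthetic data, hypothetical predictors.
# The real systems use real CAM-ensemble fields, years of cases, and
# careful calibration; see the papers above for the actual methods.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
n = 5000  # synthetic "grid point / day" samples

# Hypothetical ensemble-derived predictors (all synthetic):
X = np.column_stack([
    rng.gamma(2.0, 20.0, n),   # mean 2-5 km updraft helicity (m^2/s^2)
    rng.uniform(20, 70, n),    # ensemble-max simulated reflectivity (dBZ)
    rng.uniform(0, 1, n),      # fraction of members exceeding a threshold
])

# Synthetic "severe report nearby" labels, loosely tied to the predictors
p_true = 1 / (1 + np.exp(-(0.02 * X[:, 0] + 3 * X[:, 2] - 4)))
y = rng.random(n) < p_true

rf = RandomForestClassifier(n_estimators=200, min_samples_leaf=20,
                            random_state=0)
rf.fit(X, y)

# The forest's vote fraction serves as the probability forecast.
probs = rf.predict_proba(X[:5])[:, 1]
print(np.round(probs, 2))
```

Again, that is a cartoon of the idea, not the real thing; the published work is where the reproducible details live.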
Both public- and private-sector AI usage and claims are covered in this evenhanded Washington Post article.
As someone who uses, and oversees the use of, a set of AI-informed tools as part of the daily forecast regimen, I’ll state this is not something to be afraid of, nor something to lean on lazily and mindlessly as a crutch.
To address the fear factor: a lot of it is rooted in the black-box aspect Schumacher has discussed. He and other university and lab researchers (such as Loken) have published peer-reviewed papers (see above) explaining, in reproducible ways, how their methodologies work. They’re in AMS journals, to which we in NWS have ready access. So don’t be afraid; be informed.
Admittedly, it’s harder to be informed about the private-sector black boxes, because they are unscientific in the sense of not being reproducible and peer reviewed. They’re hidden behind a “proprietary” shroud of secrecy, the very antithesis of science! All you see are results, and typically only the cherry-picked best, of course, with perhaps a token minor failure if the presenter is allowed to reveal one. But we don’t fully know where those tools are consistently weak, due to (again!) the lack of peer-reviewed documentation with reproducible methodologies, analyses, and results. Caveat emptor!
To the second point (tools, not crutches): this is another in a long line of beneficial advancements that improve forecasts overall. They add confidence, for example, to SPC day-3+ areas, but the forecaster does not just mindlessly “draw around them”. Meteorological reasoning still must be used, and discussed, by the outlook forecaster. The more uncommon a scenario, the less the machine learning has been trained on situations like it, and the more it will miss, even in an ensemble sense. Bulk statistical verification of AI-based forecasts may look great, but it smooths over the outliers, which tend to be disproportionately destructive and important. And there’s still the “garbage in, garbage out” factor, mainly due to the lack of high-density upper-air obs here, and even surface obs in much of the world, which introduces uncertainty and output spread into ensembles right from the start.
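To see why bulk verification can flatter a forecast system that whiffs on the outliers, here’s a toy example using the Brier score. The numbers are synthetic and have nothing to do with any real verification dataset; they just make the arithmetic plain.

```python
# Toy illustration: how a bulk score hides rare, high-impact misses.
import numpy as np

n = 10_000
obs = np.zeros(n)
obs[:10] = 1            # 10 rare, high-impact events in 10,000 forecasts

# Forecast A: always the climatological near-zero probability,
# completely missing every rare event.
fcst_climo = np.full(n, 0.001)

# Forecast B: identical, except it nails the 10 rare events.
fcst_sharp = fcst_climo.copy()
fcst_sharp[:10] = 0.9

def brier(f, o):
    """Mean squared error of probability forecasts (lower is better)."""
    return np.mean((f - o) ** 2)

print(f"misses every event: {brier(fcst_climo, obs):.5f}")  # ~0.00100
print(f"catches the events: {brier(fcst_sharp, obs):.5f}")  # ~0.00001
```

The forecast that misses every single event still posts a bulk score that looks nearly perfect, because ten outliers barely dent an average over thousands of mostly null cases. That’s exactly why outlier-heavy hazards demand more than bulk statistics when judging these tools.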
AI/ML-informed models are imperfect, especially for the rare events that depend most on our own physical understanding as meteorologists to tease out. Lose that understanding, and we might as well be automatons regurgitating models. That might be “good enough for government work” most of the time, maybe even accidentally excellent sometimes.
Yet for anyone tempted to steer in that direction, whether in a personal or policy sense, I offer these sobering and disturbing words: