My answer is: both — and as much of each as we choose to make it. If you have patience for a somewhat rambling but most assuredly sincere discussion on the subject, read on. This is a compilation of both new thoughts and some musings I’ve made on this topic in private correspondence with other scientists over the last few years.
The Coming Era Has Come
Much weeping and gnashing of teeth has taken place over the past few decades in the weather prediction community about automated forecasts, and the increasing accuracy and efficiency of computer models in generating them. As mainframe computing power accelerates, so does the angst in many human forecasters.
Military forecasters at the Joint Typhoon Warning Center (JTWC) have been relying heavily on such automated guidance for a few years, with apparently good success, using an ensemble approach called the Systematic and Integrated Approach as a Forecast Aid (SAFA). Forecasters are allowed to toss away outliers at their judgment, but only if the outliers follow pre-identified model error patterns from a checklist. This leaves behind a numerical forecast called the Selective Consensus (SCON) that the forecaster must use. In the long haul, it is almost impossible to beat the average track errors from these ensemble techniques.
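The SCON idea (keep only the members that pass the error-pattern checklist, then average the rest) boils down to a few lines of arithmetic. The track values and the outlier rule below are invented stand-ins for illustration, not JTWC's actual criteria:

```python
# Illustrative sketch of a SAFA-style selective consensus (SCON).
# Member "tracks" here are just forecast latitudes at one lead time;
# the error-pattern check is a stand-in for JTWC's real checklist.

def selective_consensus(members, is_known_error):
    """Average the members that do not match a known error pattern."""
    kept = [m for m in members if not is_known_error(m)]
    if not kept:          # if everything is flagged, fall back
        kept = members    # to the plain consensus
    return sum(kept) / len(kept)

# Hypothetical 72-h forecast latitudes from five models (degrees N):
tracks = [24.1, 24.6, 24.3, 24.4, 29.0]

# Stand-in rule: flag any member more than 3 degrees from the median.
median = sorted(tracks)[len(tracks) // 2]
scon = selective_consensus(tracks, lambda m: abs(m - median) > 3.0)

print(round(scon, 2))   # consensus of the four clustered members: 24.35
```

The key design point is that the forecaster's discretion lives entirely in the checklist function; once a member survives the check, its weight in the consensus is fixed.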
NHC is relying more heavily than ever on composite (ensemble) track guidance, and with good reason; it works pretty darn well, averaged over a season. [2005 was their best season ever for overall track forecast error through 72 hours, partly because of use of ensemble composite guidance and partly because so many storms were “well behaved” compared to climatology and persistence (CLIPER).] SAFA-style forecast regulation may be the future in TC prediction stateside, after some years of ultimately futile inertial/political resistance.
Severe storms forecasters (me included) have been using Short Range Ensemble Forecasting (SREF) guidance more and more during the last few years. Though some deterministic (single-solution) modeling produces high-resolution output that looks like realistic radar echoes of supercells and squall lines, it’s still wrong quite often. Further, such output still has a long way to go before being robust and timely enough to integrate fully in SREFs that give reliable output by the time forecasts are due. But the writing is on the wall!
Are human forecasters being made obsolete by machines? After all, a truly “accurate” forecast (say, a better prediction of tomorrow’s high temperature, wind speed or cloud cover) probably is, or soon will be, more useful to more customers than one of lesser accuracy that has a “human touch.” By and large, in a fast-paced and market-driven economy, results matter far more than the means by which those results are achieved.
If the slick-haired, smooth-talking dude on “4-Warn” is telling Aunt Mildred there could be a hard freeze tomorrow night, and it happens, she’ll be glad she responded appropriately by moving in her potted plants. She probably won’t give a flip whether the forecast ultimately originated from a computer or a person. It was a good forecast and that’s all that matters.
Let’s briefly consider the powerful input-output problem: that forecasts are only as good as the input observational data. I totally agree! Ensembles don’t solve that problem; otherwise they would tightly cluster about a specific solution no matter the density or quality of input. They do, however, mitigate this factor somewhat, given surface and upper air data densities we now have over the U.S., by integrating numerous possible solutions (given the same input or even multiple sets of simultaneous input) and providing ranges of possibilities for forecasters to choose from.
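Those “ranges of possibilities” are simple arithmetic over the member solutions. A minimal sketch, with invented 24-h high-temperature forecasts standing in for real ensemble output:

```python
# Sketch: turning ensemble members into a range for the forecaster.
# Values are invented 24-h high-temperature forecasts (deg F).

members = [61, 63, 64, 64, 65, 66, 68, 69]

lo, hi = min(members), max(members)
mean = sum(members) / len(members)

print(f"range {lo}-{hi} F, mean {mean:.1f} F")   # range 61-69 F, mean 65.0 F
```

Real SREF products add climatological calibration and probability thresholds on top of this, but the raw spread of the members is the starting point.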
We’re sliding into ensemble land and there is no going back. That much we know. Thus, could it get to the point where we’re required to “adhere to rules” to a great degree in probability forecasting because we will be shown to be outperformed otherwise? If so, when? If not, why?
Consensus Forecasting vs. Extreme Event “Outliers”
In ensemble-based forecasting, most human prognosticators will go for the most common solutions and throw out the extremes (especially if required to by rule, as at JTWC). But the wise forecaster, who is allowed to do so, also considers the extremes for what they are — low-probability possibilities — and examines them further for conceptual validity before heaving them into digital Hades. After all, the model “outlier” makes its forecast for a reason…something was there which caused it to integrate such a seemingly whacked-out solution, and there is a (low but not ignorable) probability that it may sometimes be right! What happens to the ensemble consensus approach every 5 or 10 or 20 years when a major landfalling typhoon or hurricane just kicks the stuffing out of those ensemble means and does something from way out in left field of the ensemble ballpark?
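To see the danger in blindly trusting consensus numbers, consider a toy ensemble in which one member catches the extreme event. The numbers below are invented; they exist only to show the arithmetic:

```python
# Toy illustration: consensus statistics can mask a rare extreme member.
# Numbers are invented (think "tornado count" forecasts), not real guidance.

members = [2, 3, 2, 4, 3, 2, 3, 140]   # one member sees the big outbreak

consensus = sum(members) / len(members)
trimmed = sorted(members)[:-1]          # toss the single largest member
trimmed_mean = sum(trimmed) / len(trimmed)

print(consensus)       # 19.875 -- pulled upward, but nothing like 140
print(trimmed_mean)    # ~2.71  -- the "safe" consensus most days
# Neither summary number warns you that one member forecast a
# 140-tornado outbreak; only inspecting individual members does.
```

Whether you average everything or trim the outlier, the extreme solution vanishes from the final number, which is exactly the failure mode this section worries about.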
Pretend for a minute that ensembles of 20 years from now will provide a range of forecasts for specific supercells and tornadoes. The next 3 April 1974 type outbreak is forecast by a gross outlier…and happens! If one of the lowest-probability and most ridiculous looking ensemble extremes for that day had been a forecast of over 140 tornadoes, it would have been highly unwise to discard it!
JTWC forecasts often were hugely in error before SAFA techniques were put into place. Now they’re not, the vast majority of the time. But what of the inevitable “fluke” event that defies SAFA error rules — thereby defying the expungement of outliers? Forecasters truly earn their pay by nailing the most deadly and dangerous: quite often, the rare extreme that comes “out of nowhere” to cause great damage and potential harm to people and the economy.
I dearly hope rigid rules for forecasting don’t allow the next 3 April 1974, or the next Labor-Day-1935 hurricane, to go badly forecast just because it is so extreme as to seem in error.
Yes, the human must have a sound physical and conceptual understanding of the atmosphere and of the processes relevant to his realm of forecasting (in my case, severe storms) to have any idea when one of those low-probability outliers actually could strike!
Those humans who simply regurgitate the ensemble mean, or any of the individual high-probability solutions, will watch the most important forecast of their careers roll around the bowl and down the hole, at the cost of great embarrassment (minimum consequence) and massive loss of human life (maximum). That’s the price any forecaster risks paying for overdependence on computer generated ensemble consensus, over his/her own understanding of the atmosphere.
Over the Next Several Years
This asymmetric penalty function actually can be made to work to the benefit of the human in the forecast process — at least, for the humans who scientifically educate themselves enough to keep up with understanding all possible aspects of atmospheric extremes that lie just beyond the reach of ensemble consensus forecasts.
In the years and maybe decades until the “Obsolescence Point” (more below) is reached for extreme-event forecasting (such as hurricanes, tornadoes, highly anomalous winter weather and so forth), I have no doubt at all that we will have important roles in:
- “Filling the gap” in numerical prognostic capabilities, where they still lag the human; and
- While still allowed, sniffing out exceptional events which defy MOS-like climatologically adjusted probabilities, or which verify out nearer to the fringe members of the ensemble instead of the means.
In the TC realm, the temporary reprieve from the prospect of utter model dominance over humans is in intensity forecasting (where models and humans both suck…humans just suck less) and in TC rainfall (likewise). Perhaps those will become the emphases of the human TC predictor as the models take over track prediction (and the latter becomes more of a zero sum game for the human versus the ensemble). [According to Chris Landsea (personal communication), the state of TC intensity forecasting is about where track prediction was in 1980.]
I can see this happening with some aspects of severe local storms (SLS) forecasting as well — say, overall outlook areas may be lost first, followed by tornado, then hail, then wind probabilities, then the mesoscale areas, then watch probabilities for each event type.
Each automation step takes several years and progressively more near-term superiority in accuracy — perhaps an asymptotically more difficult climb for the models to make to match the human. So maybe…just maybe…there still would be something left for me and my colleagues of same and lesser age to forecast when we are nearing retirement age.
So if you want to stay relevant as a forecaster in the next decade or so, get good at predicting extremes and rare anomalies. Tropical cyclones (TCs) and severe local storms (SLS) are relatively safe havens in forecasting for this reason; daily highs and lows are not!
But for how long?
The Obsolescence Point
At what point do the “painful exceptions” become so few that the bureaucrats grow confident enough to risk them and put humans out of the daily forecast business, deeming the cost (salaries, benefits, etc.) of human forecasters expendable even if we are still a little better than some “SREF-severe-probability-MOS”? Eventually even SLS and TC prediction increasingly will be “taken over” by automation. The machine may not forecast a few extremes well now and then, but also doesn’t keel over from avian flu. Supercomputers don’t get messy divorces, file union grievances, go home and go to sleep, nor take annual leave to go storm chasing.
I think the Obsolescence Point for humans in forecasting, to some extent, is coming in my natural lifespan. Chuck Doswell has touched on the latter across many of his writings; now a manifestation of the phenomenon looms.
So where does this leave us? Even in that world, I have reason for optimism — pragmatically too, not in a Pollyanna fantasyland of wishes and magical fairy dust.
The Great Hope (Scientific Understanding and Communication)
The prospect of model ensembles taking over some forecast functions is not as scary once we set aside our innate ego and territoriality, and take a more pragmatic look at where we may be more useful in the future.
What ultimately matters in a forecast is its closeness to reality (the results), often expressed as ranges or probabilities to reflect uncertainty. There will be no useful purpose, therefore, in sentimentally clinging to such a (by then) outmoded value system as “the human touch.” Our focus may have to shift to being “interpreters” rather than “predictors.”
Therein lies the importance of keeping up with scientific advances in understanding related to what we do (at least). Even when we reach the point at which we no longer predict much of it better than the machine, we will need physical understanding of the numerically superior predicted phenomenon, and practical understanding of its impacts, to translate it to publicly and industrially palatable forms.
A fair question Erik Rasmussen once asked me is this: “Is it good for society, for the human spirit, for mankind, to have our labors replaced by machines?”
No! At least, not unless we adjust in ways that give us reasons as scientists and as humans — intellectually and (glad he mentioned this oft-neglected aspect!) spiritually — to stay motivated and stimulated. It sure helps to recognize one’s limitations and adapt accordingly. In both TC and SLS forecasting, maintaining optimal scientific understanding will help…I hope.
Science-minded operational meteorologists will be doing something 20 years from now. It just may not be forecasting in any form resembling today’s. Savvy forecasters might aim toward new niches in applied forecasting, where we guide the customers of those forecasts in responding appropriately, or translate a high-confidence numerical forecast to give emergency managers (for example) the likeliest time window of a tornado hitting Dallas this afternoon. As both human and mechanized forecasts get more accurate, the issue of communicating and interpreting them will become critically important.
What to Do?
My message to fellow forecasters is this: Be prepared, not scared. Stand tall in the face of these challenges. Understand the science of extreme and deadly weather, to the greatest possible extent. Make yourself too valuable, too knowledgeable, to be eliminated. Instead of “fight” or “flight” (the two most natural and instinctive responses), “adaptation” could be the best way to respond most of the time.
Still, we must not sell out, turn to inauthenticity or otherwise compromise personal and scientific integrity. Expect to experience subplots of “fight” (against pseudoscientific crap or overselling of technology) or sometimes even “flight” (i.e., from Sisyphus-like futile struggles against superior mechanized forecasting on the longer timescales). As an old gamblin’ song once advised, “Know when to hold ’em, know when to fold ’em!”
So maybe I’ll eventually go for my CCM certificate after all, in case my unnamed workplace is rendered wholly robotic by 2020, or deemed too cost-inefficient by a scientifically obtuse bureaucratic hierarchy to keep as a distinct entity sooner than that. Then I can be ready to earn my keep in translating the new stuff into forms folks can use while remaining plugged in to the science.
In the meantime, and for as long as possible, I will continue to strive to maximize the tax dollars that pay my salary by outperforming the automatons on exceptional and deadly events — beginning where all great human forecasts do: with colored pencils applied to surface and upper air charts!