An “Excuse to Reduce” NWS Forecasters?

I’ve been pondering a new AMS paper by Baars and Mass — available for free as an abstract and for pay as a full PDF (which I have). Their study shows that computer-generated “consensus model output statistics” (CMOS), and weighted CMOS (WMOS), each outperform the NWS human forecast except on day1 and during extreme thermal excursions from climatology.
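For readers without the full paper, the basic idea behind CMOS and WMOS is simple enough to sketch. The snippet below is a minimal illustration, not the authors’ actual method: it assumes, hypothetically, that the weighted consensus uses each model’s recent mean absolute error to set its weight. The numbers are invented.

```python
# Sketch of plain vs. weighted consensus of MOS temperature forecasts.
# Hypothetical weighting scheme; Baars and Mass's actual method may differ.

def consensus(forecasts):
    """Plain average of the individual MOS forecasts (CMOS)."""
    return sum(forecasts) / len(forecasts)

def weighted_consensus(forecasts, recent_mae):
    """Weight each model inversely by its recent mean absolute error (WMOS)."""
    weights = [1.0 / m for m in recent_mae]
    total = sum(weights)
    return sum(w * f for w, f in zip(weights, forecasts)) / total

# Three models' high-temperature forecasts (deg F) and their recent MAE:
fcsts = [72.0, 75.0, 78.0]
maes = [2.0, 3.0, 6.0]

print(consensus(fcsts))                    # → 75.0
print(weighted_consensus(fcsts, maes))     # → 74.0 (best recent model pulls hardest)
```

The point of the weighting is just that a model with a poor recent track record contributes less to the blend.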

I’m going to exercise academic license and reprint the most salient excerpts here (italics added by me for emphasis):

An essential finding of this paper is that it is getting increasingly difficult for human forecasters to improve upon MOS, a simple statistical postprocessing of ever-improving model output. Humans cannot consistently beat MOS precipitation forecasts for virtually all of the locations and forecast projections examined in this study, and are only superior to MOS for short-term temperature forecasts during large excursions from climatology. These results are consistent with the recent results of Dallavalle and Dagostaro (2004), who showed that during the past 2 yr, human and MOS skill in predicting short-term (24 and 48 h) probability of precipitation and minimum temperatures have become virtually equivalent, with only maximum temperature providing an arena in which human forecasts are marginally better (0.3° to 0.5°F).

Followed soon thereafter by…

An implication of the transition to human–MOS equivalence in prediction skill for precipitation and temperature at 12 h and beyond is that humans should spend most of their time on the short-term (0–12 h) forecasting problem, where the combination of superior graphical interpretation and physical understanding, coupled with the ability to communicate with the user communities, will allow profound improvements in the accuracy and usability of forecast information. Thus, this paper should not be seen as an excuse to reduce the number and responsibilities of forecasters, but rather as an indication that they should shift their efforts to important short-term forecasting problems and user interactions that are inadequately served today.

Note that this applies to only two variables (temperature and precip amount) out of the many that are predicted, and most certainly does not apply to such deeply complex and multivariate dependencies as severe thunderstorm probabilities or hurricane intensity guidance. I’m troubled, in a scientific sense, that the authors generalized their conclusions to forecasting as a whole, based on only two variables over which MOS has optimal mastery (or is it MOStery?). Despite their “excuse to reduce” caveat, bureaucrats and popular science media — each eager to gloss over inconvenient details and to run with the attention-grabbing headline — easily could misconstrue this as license to hand all forecasting over to the automatons, ASAP or perhaps sooner.

The results for WMOS and CMOS are no big surprise, but it’s probably more of a catch-22 than the authors assert. The forecasters’ poor performance relative to MOS on these variables comes not only from improved MOS skill, but also precisely because the humans are stretched too thin by dumb policy decisions to concentrate with full analytic and prognostic effort — especially from day2 and day3 onward.

[To be fair, I’m sure some of it also comes from a lack of diagnostic education and ability on many forecasters’ parts, especially the younger ones, but that’s harder to assess. Thorough techniques in subjective hand analysis and use of diagnostic tools simply aren’t commonly taught or emphasized in many universities, which arrogantly see themselves as theoretical rather than operational training grounds. And even where well mastered, such skills are hard to put to use under ever-worsening time constraints.]

It would be interesting, and worthwhile, to perform an experiment where a set of forecasters concentrates *only* on, say, day2 or day3 precip and temperature, nothing else — no distracting phone calls, no tie-wearing boss lurking overhead, no AWIPS glitches to debug, no warnings and TAFs to issue — very controlled conditions, with those results in turn compared to the MOS ensembles. I hypothesize they would again outperform MOS, but not by much. Alas, field forecasters are not permitted to apply the concentration needed to produce superior forecasts in most time bins. Policy (made, of course, by detached and unfamiliar non-forecasters in Washington) has forced procedure to become so cumbersome and time-consuming as to inevitably yield this result — or at least, hasten it by several years.
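If such an experiment were ever run, the scoring side of it would be straightforward. Here is a minimal sketch of the comparison, using mean absolute error and entirely invented day2 high-temperature numbers (real verification would use much larger samples and probably additional skill scores):

```python
# Hypothetical verification for the proposed experiment: compare mean absolute
# error (MAE) of human vs. MOS day2 temperature forecasts. All data invented.

def mae(forecasts, observed):
    """Mean absolute error of a forecast series against observations."""
    return sum(abs(f - o) for f, o in zip(forecasts, observed)) / len(forecasts)

obs   = [61, 55, 70, 48, 52]   # verifying highs (deg F)
human = [60, 57, 68, 50, 53]   # forecaster, under controlled conditions
mos   = [62, 56, 67, 51, 54]   # consensus MOS guidance

print(f"human MAE: {mae(human, obs):.2f}")   # → human MAE: 1.60
print(f"MOS MAE:   {mae(mos, obs):.2f}")     # → MOS MAE:   2.00
```

The invented numbers match my hypothesis above: the focused human wins, but not by much.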

When you short-staff a forecast office, add a large pile of digital/graphical gruntwork duties, take away almost all time for thorough 4D diagnosis, and then multiply it by 122 offices, results like theirs are inevitable and should shock no one. Chuck Doswell, and Allan Murphy and Len Snellman before him, foresaw this very occurrence! The paper by Baars and Mass was inevitable, the only question being whose names would appear on the article and when.

The damage has been done: precip and temperature efforts beyond day1 largely are a waste of resources, given the procedural constraints imposed on field forecasters; and now, the authors’ suggestions probably should be implemented. Automate beyond 12h for temp/POP forecasts in the field, and set the saved time aside for concentrating on day1. Instead, I fear the second part of that recommendation will be ignored.

I fully agree with the authors that this is a direct reflection of the IFPS crank-and-pull process. NWS management knew this when implementing egregiously time-consuming methods for producing extended forecasts. This was a deliberate means to a desired end: staffing reduction — a prospect that is now knocking on the doorstep and has been impeded only by my union’s objection.

Is NWSEO merely a deer in front of the big truck in this regard? As an NWSEO union steward, I certainly hope not. I strongly encourage my union to work smartly with NWS hierarchy in returning more time to forecasters for hand analysis and thorough real time diagnostics (using satellite, radar, soundings, profilers…the full suite of observational data) — and less time wasted on thousands of mouse clicks just to move numbers and colors hither and yon. Yes, the amount of forecast data has jumped tremendously, but at what cost to quality and understanding? More does not mean better!

To their credit, Baars and Mass do assert that “this paper should not be seen as an excuse to reduce the number and responsibilities of forecasters,” but that could get lost in the message. NWS management has a knack for selectively ignoring whatever doesn’t favor the prevailing agenda. Watch and see what actually happens.

The paper itself formally illustrates the problem field forecasters have been complaining about since IFPS began — that procedure has taken over at the expense of meteorology. If we as a predictive science allow much more of that foolishness, we do so at our own peril.

Thanks to JEvans and others for some earlier, stimulating discussion on this offline, as well as to Chuck, who has foreseen and discussed this very set of events with me and anyone else who will listen, since well before I became a professional meteorologist.

If you’re interested in more discussion on these matters, check out a lengthy diatribe I wrote a couple of months ago on consensus forecasting and the future of humans (written before reading the Baars and Mass paper, but including links to Chuck’s still earlier discussions).


One Response to “An “Excuse to Reduce” NWS Forecasters?”

  1. Gilbert Sebenste on February 17th, 2006 11:02 pm

    Since I have this bad cold and I feel like I’m about to get hit by a train, I guess I’ll get clotheslined by Roger instead after he reads this post. 🙂

    I will summarize your wonderful rant as follows: information, no matter how abundant, does NOT equal wisdom. The explosion of information over the last several decades has produced wonderful data, but few who can interpret it properly to make sense of it all, apply it accordingly, and transfer that to others. This paper makes a dangerous assertion: that human forecasters can’t beat MOS temperature and precipitation forecasts most of the time. Do you know when they can? Exactly…when it’s NEEDED THE MOST! When it impacts thousands or millions of people. When it’s below the resolution of the model. When it’s an extreme event. And, maybe just as importantly, when the model is wrong.

    Yes, I must concur that most of the time, MOS temp and precip amounts do as well as forecasters, in the time they are allotted to do their job properly, for those two parameters. But what about:

    1. Lightning. I forecast for outdoor venues in my job. Models can’t tell me if there will be cloud-to-ground lightning. Everyone from golf courses to the local little league game needs to know that. The model got the precipitation right. Good! That won’t do squat for the guy on the soccer field or golf course who planned his/her outing two days ago and didn’t cancel because there was just “a chance of rain” with 5,000 J/kg of forecast CAPE, as I hear on some radio stations now — and who is too proud to get to shelter before he/she is zapped. Or maybe it’s the “bolt from the blue” or the first one…or the approaching thunder isn’t heard at a carnival…

    2. The models can’t tell me where a hurricane has the highest risk of going. The model runs for Katrina were surprisingly tight, but NHC nailed the track by experience. Overlooked in all the problems of the aftermath.

    3. Wind? One unnamed forecaster went with MOS one day last week, calling for 10–15 MPH winds. There were gusts up to 40 MPH at times that day. Because the forecasts called for light winds, there were no trailer bans…and there were high-profile vehicles having difficulties on a main highway. Drivers (truck/car), utility companies, recreational interests, people putting stuff outside like patio furniture, etc., need to know when it’s going to be windy. Blow a forecast of 50 or 60 MPH winds, and there will be damage, injuries and possibly loss of life as a result…

    4. Precipitation type? Snow/ice/freezing rain. Models can’t handle that well…the impacts are easy to imagine. Enough said.

    5. Tornadoes, large hail, damaging wind potential, 1 to 5 days out. Good luck. And that goes for 0–24 hours, too.

    In summary, this artificial intelligence is no match for the natural stupidity also being imposed on NWS forecasters. I know budgets are getting tighter, and I fear this and other articles will only propel us closer to the day when human intervention in a forecast will depend on what one guy on one shift does. It’s the same mentality thrust upon us by the “ASOS can replace humans at major airports” folks, and we all know how well THAT went. Funny, the quality of observations is lower, in general, than before ASOS went live and humans were canned or turned into babysitters rather than weather observers. Haven’t we learned from that? Guess not. Haven’t we learned that when we depend on technology or “somebody else” to save us, we get screwed (à la Katrina)? The pride to think that computers are as good as us at forecasting (read: analysis + computer assistance) is simply galling.

    By the way, for the last big system that went through here on February 16, 2006…MOS was forecasting 8″+ of snow for my location, and almost everybody went for a substantial accumulation except me. I went for 1″, locally up to 3″. Most areas got 1″ or less. And yes, I was given the time to forecast. And that is not an anomaly to me, or to those who don’t think “MOS is boss.”

    Let’s see how this passes for a Nyquil buzz…
