The most reliable and accurate forecast for hurricane track and intensity in the Atlantic Ocean is the official forecast produced by the National Hurricane Center. It consistently outperforms individual computer models, and that makes sense because those models are tools the NHC uses to make its forecast.

The official NHC forecast should be familiar to anyone who has ever monitored the progress of a tropical cyclone (the term meteorologists use to describe weather phenomena like hurricanes). It is the forecast that features a forecast track surrounded by a “cone of uncertainty.”

NHC Forecast Track for Florence at 11 am on September 13

Official forecast track of Hurricane Florence produced by the National Hurricane Center as of 11am on September 13th, 2018.

As Florence approaches the Southeast coast, a lot of attention has been directed at the so-called European (or Euro) model because of its dire predictions of a storm that slowly meanders down the S.C. coast, blasting Charleston with hurricane-force winds.

Comparison of Euro and GFS models

Comparison of the ECMWF and GFS models for Saturday evening as of Thursday morning.

It’s an alarming scenario, but how seriously should we take it? How should we even think about these models at all?

Forecast models are computer simulations of the atmosphere. Meteorologists record measurements of the atmosphere as it currently is and use those measurements to simulate what could happen in the future according to the laws of physics. But all models are flawed because of imprecise input and limited computing resources.

Models are run on some of the planet's most powerful computers, and they still require hours to complete.

The European model is produced by the European Centre for Medium-Range Weather Forecasts. It is one of several models run by various organizations around the globe that meteorologists consider reliable and useful in forecasting tropical cyclones.

In recent years, the ECMWF model has earned a reputation for being more advanced than other models. It is often the best-performing model. This is why so many forecasters were alarmed when it began to forecast a southern shift for Florence. But in the past two years, other global models have closed the gap, and model performance always varies from storm to storm. The European model may have the best reputation, but it is not always the best model.

Every year, the NHC compares its official forecasts with what actually happened to see how good those forecasts were. It also compares the accuracy of the models with reality. Over time, these data present a clear picture: no model consistently outperforms the official forecast.

48 hour track error for various forecast models vs official forecast

The black line (“OFCL”) represents the error in the official NHC forecast track. In 2017, that forecast tended to be off by about 50 nautical miles (about 58 statute miles). Seven top models are shown by colored dots. The European model is the blue dot labeled EMXI. GFSI, GFDI, GFNI, and NGPI are American models. HWFI is a research model specialized in forecasting hurricane intensity. UKMI is a British model. The other items are averages of models or less commonly cited models.

On social media, GIFs showing individual model runs often go viral. The images are very detailed, offering precise predictions. But precision is not the same thing as accuracy, and it is important not to focus on individual models. No model can show you what will happen, only a reasonable guess at what could happen.

The evidence is clear that the best strategy for using models is to average the different models together into one. Combining different predictions in this way tends to correct the errors present in each one.
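For readers who want to see the arithmetic, here is a minimal Python sketch of that averaging idea. Everything in it is an assumption made for illustration: the model names, the forecast positions and the observed position are invented, and they were chosen so that the individual errors fall on different sides of the storm's actual location. Real model consensus products are more involved, and an average will not beat every model in every storm.

```python
import math

def distance_nm(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in nautical miles."""
    earth_radius_nm = 3440.065  # mean Earth radius expressed in nautical miles
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_nm * math.asin(math.sqrt(a))

def consensus(positions):
    """Simple average of (lat, lon) forecasts; adequate when the points are close together."""
    lats = [lat for lat, _ in positions]
    lons = [lon for _, lon in positions]
    return sum(lats) / len(lats), sum(lons) / len(lons)

# Hypothetical 48-hour forecast positions (latitude, longitude) from three made-up models.
forecasts = {
    "model_a": (33.0, -78.5),
    "model_b": (34.6, -77.0),
    "model_c": (34.4, -79.0),
}
# Hypothetical position where the storm actually ended up.
observed = (34.0, -78.0)

consensus_position = consensus(list(forecasts.values()))
errors = {name: distance_nm(*pos, *observed) for name, pos in forecasts.items()}
consensus_error = distance_nm(*consensus_position, *observed)

for name, err in errors.items():
    print(f"{name}: {err:.0f} nautical miles off")
print(f"consensus: {consensus_error:.0f} nautical miles off")
```

In this invented example each model misses in a different direction, so the misses partly cancel and the average lands closer to the observed position than any single model does.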

Strategies like this have proven successful in other fields involving uncertainty and predictions. Consider electoral polling. Different polling organizations have better or worse track records, but sites like FiveThirtyEight have been able to consistently outperform individual polls by combining all of them into one average.

The NHC produces its forecasts using the data generated by models, and when it does, it is able to consider how each model has performed in the past. This is why its forecasts are, overall, better than individual models.
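One simple way to picture "considering past performance" is to give more weight to models that have had smaller errors. The Python sketch below does that with inverse-error weights. This is an illustrative assumption, not the NHC's actual procedure, which relies on forecasters' judgment. The model names, past error figures and forecast positions are all invented.

```python
def inverse_error_weights(past_errors_nm):
    """Weight each model by 1 / (its past average error), scaled so the weights sum to 1."""
    inverse = {name: 1.0 / err for name, err in past_errors_nm.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}

def weighted_position(positions, weights):
    """Weighted average of (lat, lon) forecasts using the supplied weights."""
    lat = sum(weights[name] * positions[name][0] for name in positions)
    lon = sum(weights[name] * positions[name][1] for name in positions)
    return lat, lon

# Hypothetical average 48-hour track errors from past seasons, in nautical miles.
past_errors_nm = {"model_a": 80.0, "model_b": 60.0, "model_c": 50.0}

# Hypothetical current forecast positions (latitude, longitude) from the same made-up models.
positions = {
    "model_a": (33.0, -78.5),
    "model_b": (34.6, -77.0),
    "model_c": (34.4, -79.0),
}

weights = inverse_error_weights(past_errors_nm)
blended = weighted_position(positions, weights)

for name, weight in weights.items():
    print(f"{name}: weight {weight:.2f}")
print(f"blended forecast position: {blended[0]:.2f}, {blended[1]:.2f}")
```

The model with the best hypothetical track record (model_c) pulls the blended position toward its own forecast, while the weaker models still contribute.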

Follow J. Emory Parker on Twitter @jaspar.