Are seasonal forecasts for Europe skilful?

Computer-generated weather forecasts exist for seasons ahead.  There are several sources for these forecasts, with those from the European Centre for Medium-Range Weather Forecasts (ECMWF) thought to be the most skilful single model.  Skill varies across the globe and between parameters.  Forecast skill over Europe is not high, as shown in figure 1 below; indeed, for parts of Europe one is better off using the seasonal normal than the forecast.

2m temperature anomaly correlation for world

Figure 1. Anomaly correlation for December–January–February 2 m temperature predictions initialised on 1 November for the ECMWF SEAS5 seasonal forecast system, for the ensemble mean. An anomaly correlation of 1 indicates a perfect forecast and 0 indicates no skill; where the score is negative, the seasonal normal is a better forecast. Source: ECMWF Newsletter #154.
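
To make the score in figure 1 concrete, here is a minimal sketch of an anomaly correlation for a single location, computed over a set of past years. It assumes the centred (Pearson) form and a single climatological value; the exact formulation ECMWF uses for the map is not given here, and the function name and the numbers are ours, for illustration only.

```python
import math

def anomaly_correlation(forecasts, observations, climatology):
    """Centred correlation between forecast and observed anomalies.

    forecasts, observations: one value per year (e.g. ensemble-mean DJF 2 m
    temperature and the matching observation).
    climatology: the seasonal normal (assumed here to be a single value).
    """
    f_anom = [f - climatology for f in forecasts]
    o_anom = [o - climatology for o in observations]
    f_mean = sum(f_anom) / len(f_anom)
    o_mean = sum(o_anom) / len(o_anom)
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(f_anom, o_anom))
    f_var = sum((f - f_mean) ** 2 for f in f_anom)
    o_var = sum((o - o_mean) ** 2 for o in o_anom)
    return cov / math.sqrt(f_var * o_var)

# Made-up values (°C), not real forecast data.
observed = [4.0, 6.0, 5.5]
print(anomaly_correlation(observed, observed, 5.0))         # forecasts identical to observations
print(anomaly_correlation([6.0, 4.0, 4.5], observed, 5.0))  # anomalies of the wrong sign every year
```

A forecast whose anomalies always have the wrong sign scores below zero, which is the case where the seasonal normal would have been the better guide.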

Much of the discussion of skill is for 3 month periods.  However, decisions made using individual monthly forecasts are fairly common.  So how does the forecast for a single month, e.g. May, vary as we get closer to May, and does the skill improve?

During a meeting at ECMWF earlier this year, there was an opportunity to provide feedback on the forecasts.  Lake Street Consulting spoke of the ‘jumps’ that we’d seen in the seasonal forecasts from one month to the next – e.g. shifts in the forecast for May between that initialised in April and that initialised in May.  However, because our terminology differed from that used within ECMWF, it was a challenge to get our point across.  So we went away and produced the following plots to argue our point in a common language.  We share the plots below.

Data used

  • ECMWF seasonal (SEAS5) 50-member ensemble forecasts
  • Forecasts initialised from May 2017 onwards, valid for November 2017 through May 2018. (A small dataset.)
  • French average temperature
  • Climatology, or seasonal normal, is the average of observations for 1993-2016.


  • Plot design is that of Linus Magnusson, ECMWF. Each plot is for a number of forecasts all valid for the same time.  Below in figure 2 is a plot for forecasts valid for February 2018, initialised on the first of each month from August 2017 through February 2018.
  • Boxes span the 25th to 50th and the 50th to 75th percentiles of the ensemble, and the whiskers mark the ensemble maximum and minimum.
  • The red dot is the verification – what actually happened.
  • Each plot is for one verification date, with box & whisker plots for different initialisation dates (oldest to left), and climatology on the right.
ECMWF seasonal forecast

Figure 2. ECMWF SEAS5 seasonal forecasts for February 2018, starting from different initialisation dates (oldest on the left) for French temperature. Each forecast is an ensemble of 50 members.  Boxes span the 25th to 75th percentiles, and whiskers show the minimum and maximum. On the right is the climatology from 1993-2016, and the ‘verification’ for this year (red dot). Source: Lake Street Consulting Ltd.
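
For readers who want to reproduce this kind of summary, the sketch below reduces one ensemble to the five numbers drawn in each box and whisker. It is an illustration under our own assumptions: the function name is hypothetical, the percentile interpolation (Python's "inclusive" method) may differ from the one used for the plots, and the example values are made up.

```python
import statistics

def box_whisker_summary(members):
    """Reduce ensemble member values to the five box-and-whisker numbers."""
    # quantiles(n=4) returns the three quartile cut points: 25%, 50%, 75%.
    q25, median, q75 = statistics.quantiles(members, n=4, method="inclusive")
    return {
        "min": min(members),   # lower whisker
        "q25": q25,            # bottom of the box
        "median": median,      # line dividing the two box halves
        "q75": q75,            # top of the box
        "max": max(members),   # upper whisker
    }

# Five made-up member values (°C); a real SEAS5 ensemble would have 50.
print(box_whisker_summary([3.0, 1.0, 5.0, 2.0, 4.0]))
```

Running the same function on the 1993-2016 observations for the month would give the climatology box on the right of each plot.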

  • Below in figure 3 we show these groupings of forecasts for Nov 2017 (upper) to May 2018 (lower), offset so that the verification months line up vertically.
  • All plots have a range of 14°C on the y-axis, so spread is comparable between plots.
ECMWF seasonal forecast

Figure 3. As figure 2, but for November 2017 through May 2018. Source: Lake Street Consulting Ltd.

What’s surprising about these plots?

Ideally, both box and whisker spreads would vary according to the skill in the ensemble forecast, with the overall trend being towards lower spread and higher skill as the lead time of the forecast decreases.  Because the sensitivity of the atmosphere varies between weather patterns, the decrease in spread will not necessarily be smooth.

What we notice is that, quite often, the forecast spread narrows significantly between lead times of 2 months and 1 month (for example, between the forecast initialised in Jan 2018 and the forecast initialised in Feb 2018, both valid for Feb 2018, as in figure 2).  The mean of the distribution also shows a significant ‘jump’.

The forecast series valid for Nov 2017, Feb 2018 and May 2018 all have a ‘jump’ between the forecasts at lead times of 2 months and 1 month that is so large that the 25-75% ranges at the two lead times do not overlap.  On a positive note, this ‘jump’ seems to be in the direction of the actual verification.  Skill seems to appear in the seasonal forecast for the front month, but is lacking for lead times of 2 months or more.  To confirm this, we need to calculate the skill scores over a larger data set – a potential student project.
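
The two checks described above (do the 25-75% ranges overlap, and did the forecast move towards the verification?) are simple to automate over a larger data set. The following is a rough sketch only: the function names are hypothetical and the numbers are invented for illustration, not read off the plots.

```python
def ranges_overlap(range_a, range_b):
    """True if two (low, high) ranges share at least one value."""
    low_a, high_a = range_a
    low_b, high_b = range_b
    return low_a <= high_b and low_b <= high_a

def moved_towards(previous_centre, latest_centre, verification):
    """True if the latest forecast centre is closer to what actually happened."""
    return abs(latest_centre - verification) < abs(previous_centre - verification)

# Invented 25-75% ranges and verification (°C), for illustration only.
two_month_lead = (4.0, 5.5)
one_month_lead = (1.5, 3.0)
verification = 2.0

jump = not ranges_overlap(two_month_lead, one_month_lead)
helpful = moved_towards(sum(two_month_lead) / 2, sum(one_month_lead) / 2, verification)
print(f"jump between lead times: {jump}, towards verification: {helpful}")
```

Here the midpoint of each 25-75% range stands in for the centre of the distribution; the ensemble mean or median could be used instead.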

The results here are for French average temperature.  Given this type of insight, there are a number of ways in which we adapt the forecasts we provide to our clients.