RECAPITULATION ON ENSEMBLE PREDICTION SYSTEMS (EPS)
(A. Garcia Mendez, ECMWF)
EPS, an Overview:
The normal loss of consistency of a NWP deterministic forecast is around 4 to 5 days. The
uncertainty in the analysis initial condition result in small un
certainties growing quickly to large
errors within unstable flows. So small
scale errors will affect the large scale through non
dynamics. Generally speaking the error growth is flow dependant.
The lack of a perfect forecast is due mainly to:
e actual state of the atmosphere is known only approximately
Model errors due to limited resolution and/or parametrization of physical processes.
The ECMWF EPS as a complement to the deterministic forecast, has been part of the Operational
e 1992. The EPS simulates possible initial uncertainties by adding to the
unperturbed analysis small perturbations within the limits of uncertainty of the analysis. From these
a number of T159L40 forecasts are produced.
A complete description of the weath
er prediction problem can be formulated in terms of the time
evolution of an appropriate probability density function (PDF) in the atmosphere’s phase space.
Although the problem can be formulated exactly through the continuity equation for probability
ouville equation), ensemble prediction appears to be the only feasible method to predict the PDF
beyond the range of linear error growth.
In ensemble prediction the PDF at initial time is represented through a finite sample of possible
The sample of initial states must give a realistic estimate of the PDF of analysis
The sensitive areas of the atmosphere are targeted using the so
called singular vectors.
By definition the singular vectors depend on:
The initial and final m
etrics used to measure perturbations amplitude
The basic state characteristics
The optimization time interval (OTI)
The ECMWF EPS is based on an ensemble of 50 members plus the unperturbed Control non
linear time integration. The initial conditions (IC) o
f the 50 members are created by adding a
perturbation to the control ICs. The perturbations are defined using the fastest growing singular
vectors (SVs) of a linear approximation of the ECMWF non
The main steps of the EPS are:
of the first 25 SV
Generation of 25 perturbations by a phase
space rotation of the SV
The 25 perturbations are mirrored by changing its sign and then 50 IC are built by adding
50 + 1 non
linear integration run for 10 days.
At the present
the SV resolution is T42L31
The SV are computed in the following way:
If x is a state vector of the system: and
is the solution of the tangent version of the model equations then the perturbation total energy
norm can be computed as:
where L* is the adjoint of the tangent propagator L defined in (1), <...;...
> is the Euclidean scalar
product, and E is the energy weight matrix.
The SVs are the directions in the phase space of the system that maximizes the energy growth
defined by (3). They are the eigenvectors of the operator L*EL with the largest eigenvalues
SVs are a function of the optimization time interval t and of the norm definition.
The EPS sampling strategy consists of:
Identify the phase space directions along which analysis error grows fastest in the linear
regime (Toti ~ 48 hours). The gro
wth is measured using the total energy norm.
These directions are called
of the tangent forward propagator. They are
the eigenvectors of an operator, which include the tangent forward, and adjoint operators.
They depend on the optimizat
ion time interval (Toti) and on the norm.
They identify regions where perturbations can grow by barotropic and baroclinic energy
Comparison of unperturbed analysis (Control) with EPS member 35 (Initial conditions).
Comparison of unperturbed analysis (Control) with EPS member 36 (35 mirrored member).
5/October/2000; 12 UTC
Same as figure 1 for the D+5 forecast (member 35)
Same as figure 2 for the D+5 forecast (member 36)
The main problem for the EPS to be operational is to design suitable products to provide potential
users with valuable information. The daily output of the EPS is basically a collection of forecasts,
which are different possible realizations of
the meteorological future. Relevant information can be
found at different levels:
The ensemble mean, centroid of the distribution represents the first level of information.
The ensemble spread represents a second level of information, as it should meas
uncertainty of the forecast or the amplitude of possible fluctuations around the centroid.
A third level of information is the distribution of the spread. Small spread regions should indicate
ensemble modalities where the likelihood is larger tha
n in large spread regions.
The last level of information is reached when the whole distribution is regarded as informative, e.g.
when weather parameter probabilities are directly generated from the ensemble.
Fig 5: Ensemble mean, unperturbed forecast (Co
ntrol) and EPS variance. D+5 forecast based on
5/October/2000; 12 UTC.
Different products have been designed to make easier the EPS interpretation. Some products
allow visualizing specific aspects of the distribution (e.g. plumes or spaghetti charts); ot
based on an objective classification of ensemble members (e.g. clustering).
Hierarchical agglomerative algorithms
The principle of these algorithms is to merge at each stage the two nearest clusters into a new
uster. The number of clusters varies from 51 at the first stage (each member being a single
member cluster) to 1 at the last stage. Several options allow deciding when the partition is to be
When the number of clusters exceeds a value (e.g. 6 c
When the internal variance of a cluster exceeds a value. It may be an absolute value (e.g.
std smaller than 50 m) corresponding to a certain level of smoothing or sharpness of the
cluster or a relative value (e.g. std smaller than 50% of the to
When the part of explained variance by cluster means (centroids) variance drops below a
threshold (e.g. 50%).
When the number of members in a cluster exceeds a threshold, e.g. 17 members
corresponding to a majoritary ensemble alternative.
This algorithm is used operationally at the ECMWF. It tends to assign a member to a small
cluster in order to balance the variance. It produces several separate clusters of
comparable size (or at least of comparable internal variance).
It shows the best results in
terms of explained variance. This algorithm has the disadvantage of mixing
members presenting strong differences. It could be said that Ward's algorithm highlights
the ensemble multimodality.
The ECMWF operational cl
ustering is based on the RMS differences between the 500 hPa
geopotential forecasts averaged from D+5 to D+7 forecasts. It is always the same
members, which make up the contents of each cluster. There are occasions when two
members in the same cluster ca
n be rather different at the beginning or end of the period,
but sufficiently similar during the rest of the time interval to be placed in the same cluster.
On the other hand two members being similar during a part of the period may be placed in
clusters if they are sufficiently different during most of the period. The clustering is
performed separately for the whole of Europe plus 4 European sub
Operational clustering for Europe. D+6 based on 5/October/2000; 12 UTC. Explained
variance 60 %. Cluster 1 std=42 m, Cluster 2 std=42 m, Cluster 3 std=37 m, Cluster 4 std=37 m,
Cluster 5 std=30 m.
Ward clustering for Europe. D+6 based on 5/October/2000; 12 UTC. Proportion of
explained variance 50% resulting in 4 clusters.
Explained variance curve. Same as for figures 6 and 7
Several ECMWF Member States compute their own tailored clustering.
This algorithm tends to group a large majority of members in a single cluster (chaining or
ect) and the rest in small clusters of 2 or 3 members. A significant number of
members are not grouped and remain as single clusters representing the extreme situations that
were lost in the Ward's algorithm. It could be said that this algorithm highligh
This algorithm is designed to balance the effects of the two previous ones. It tends to minimize the
internal cluster variance and simultaneously to maximize the variance inter clusters. The result is
ly a large cluster and several small clusters. This algorithm has been regarded as the best
procedure to achieve a climatological classification of synoptic situations.
This algorithm is used in the Was
hington NCEP ensemble operational clustering. The members
are grouped in a cluster when their distance (in terms of anomaly correlation instead of RMS) is
below a given threshold. The clustering starts by finding the two less similar members closest to
he threshold each of them becoming the seed of a cluster. The remaining members are grouped
around these two according to the threshold (when a member may be clustered in both clusters is
assigned to the closest one. In a second stage the closest pair of
similar members (according to
the threshold) is found and members are clustered around them. The process iterates until there is
no remaining pair of similar members in the ensemble.
This clustering is based on the assumption that the range of alternati
ves sampled by the ensemble
could be simply summarized by the interval between the two less similar forecasts. The main
advantage is that it allows defining (by the threshold choice) exactly what the user regards as
similar and different forecasts. On th
e other hand you cannot expect a large explained variance.
It groups the members around the ensemble mean regarded as the center of the members
distribution assuming the ensemble to be rather monomodal around its mean as it seems to
nerally be the case.
The size of the central cluster can be given by:
Its internal variance corresponding to a given level of smoothing or sharpness of the cluster mean
The number of members included in the central cluster.
Once obtained the ce
ntral cluster the remaining members are grouped following a basic
hierarchical algorithm. The clusters growth is limited by their internal variance, which cannot
exceed the central cluster variance (in order to keep the same resolution). Unlike in hierar
algorithms the clustering doesn't stop when one cluster exceeds the conditions, its growth stops
but other clusters can merge. The size of the remaining clusters is generally small compared with
the central cluster since they represent the edges of
the distribution. This algorithm highlights the
ensemble monomodality even more than Centroid hierarchical algorithm.
Based on the assumption that high
resolution models are more reliable than EPS members and
that T159 Control is more
reliable than perturbed forecasts. The idea is to use these 'better'
forecasts as seeds or references around which perturbed members are grouped. The members
are grouped around the closest reference found (distances in terms of RMS differences).
The principle of tubing lies on three main assumptions
The ensemble distribution is generally monomodal
The verification is more likely to be found in the vicinity of the ensemble mean.
The previous assumption is true even in the case of multimo
The principle of tubing is to get a classification of ensemble members suitable with monomodal
distributions, giving more emphasis to the central part of the distribution and disregarding possible
modalities. Unlike any clustering method the tubi
ng does not try to group the members around
hypothetical centroids but to organize them along axes coming from the EM and reaching the
extremes of the distribution. These axes should represent the directions in which the verification is
likely to deviate
from the EM.
The algorithm works in the following way:
The central cluster is obtained by grouping members around the EM. The number of
members depends on a specific condition (see below).
The member who is the most remote from the EM is located. It b
ecomes the first extreme
and defines a tube grouping those members lying into the cylinder starting from the
boundaries of the central cluster and finishing on the extreme. The radius of each cylinder
is the same as for the central cluster (the distance b
etween the EM and the last accepted
member in the central cluster).
The process is then iterated with the following rule: a member belonging to a tube can be
still classified in another tube but can't become an extreme.
The tubing ends when all members a
The main parameter of the tubing is the condition limiting the size of the central cluster. This
condition may limit the number of members included in the central cluster, the spread of the central
cluster or directly its radius. Two main
configurations can be distinguished:
The size of the central cluster can be dependant on the spread of the day (e.g. the radius is
limited to the ensemble std or the number of members is limited to 2/3 of the distribution or
the variance of the central c
luster is limited to 50% of the total ensemble variance). For this
configuration the number of tubes varies little from day to day and is not dependant on the
forecast uncertainty but on the ensemble multimodality. The proportion of information
from the ensemble is always the same. On the other hand the resolution of the
tubing given by its radius varies from day to day which may be a problem for operational
The size of the central cluster can be independent on the spread of the day
in order to keep
the same resolution everyday; either the radius or the std of the central cluster is limited to
a certain threshold. Since the spread/uncertainty varies largely along the year a seasonal
adaptation is necessary, for instance by using the
mean forecast error as a threshold. For
this configuration the number of tubes varies from day to day according to the
spread/uncertainty of the day.
The previous configurations are similar for hierarchical agglomerative algorithms as the Ward's.
irst configuration corresponds to the case when the clustering process stops when the
proportion of variance explained by the centroids drops below a given threshold (typically 50%) or
when the internal variance of one cluster exceeds a certain proportion
of the total ensemble
The second configuration corresponds to a fixed limitation of the clusters std, generally seasonal
dependant as in the ECMWF operational Ward clustering configuration.
Tubing. D+6 forecast based on 5/October/2000; 1
39 members in central cluster.
Central cluster std=57 m, prop of total variance 49 %, radius=70 m
Tube 1: 4 members, extreme at 128 m from EM
Tube 2: 5 members, extreme at 119 m from EM
Tube 3: 3 members, extreme at 86 m from EM
Display Products Generated
Spaghetti plots. D+4 and D+7 forecasts based on 5/Oct/2000; 12 UTC showing a
large spread in Europe.
Plumes for T850, precipitation and Z 500. 9/Oct/2000; 12 UTC. Madrid
Time series spread
ill. Clusters and Tubes.
EPS precipitation probabilities based on 5/October/2000; 12 UTC. Accumulated
precipitation from D+5 to D+7
Multidimensional scaling to the inter
distances between EPS members
The multidimensional scaling (mds)
is a technique that allows the representation of a configuration
of n p
dimensional points in a k
dimensional Euclidean space (k << p) taking information only from
the distances. Such distances don't need to be exclusively Euclidean but could be any simi
measure between different elements (e.g. correlation coefficient etc...).
Given an Euclidean distance matrix D, i.e. a symmetric matrix whose elements are such that dij >
0, dii = 0, dij+djk > dik; it can be shown that the k
dimensional subspace is
chosen so that the new
δij minimize the function:
This minimization leads to a set of new coordinates x'ij for the original points that could be
where the rhs of the equation represents the eigenvalues and the correspon
ding eigenvectors of
the centered matrix B with elements
The mds is also called principal component analysis and if D is the Euclidean distance matrix of a
group of points in an n
dimensional space (matrix X) it can be sho
wn that their k principal
coordinates correspond to the first k
centered principal component of the matrix X .
The spectrum of normalized eigenvalues for a Z500 D+6 forecast the first three components are
able to give account for 50 to 80 % of the origina
l point distribution variability. For this reason it is
reasonable to represent the original EPS inter
distances configuration in 2 or 3 dimensions.
A qualitative analysis shows that an increase of the spread considered at different days at the
time seems to be followed by a sharpening of the eigenvalues spectrum. Generally the
higher the spread the higher the variability taken into account by the first eigenvalues.
In this application the Euclidean distance
and the cor
(calculated using either the climatological mean or the mean obtained by averaging over all grid
points of the chosen field) have been used. In order to transform the latter into a distance the
ation has been used
Normalized eigenvalues. D+6 fcst based on 5/Oct/2000; 12 UTC
Projection of the EPS plus T319 (O), UKMO (U), DWD (D) models plus verification (V)
on the first two eigenvectors. D+6 fcst based on
5/Oct/2000; 12 UTC
Barkmeijer J., Buizza R. and Palmer T.N., 1999. 3D
Var Hessian singular vectors and their
potential use in the ECMWF Ensemble Prediction System. Q.J.R. Met. Soc. 125, 2333
Buizza R. and Palmer T.N
., 1995. The singular vector structure of the atmospheric general
circulation. J. Atmos. Sci., 52, 9, 1434
Buizza R., Petroliagis T., Palmer T.N., Barkmeijer J., Hamrud M., Hollingsworth A., Simmons A.
and Weldi N., 1998. Impact of model resolution
and ensemble size on the performance of an
Ensemble Prediction System. Q. J. R. Met. Soc. 124, 1935
Buizza R., Miller M. and Palmer T.N., 1999. Stochastic representation of model uncertainties in the
ECMWF Ensemble Prediction System. Q. J. R. Met. S
oc., 125, 2887
Atger F., 1997. The tubing, an alternative to the clustering for EPS classification. OD Memo,