# Lecture-1 - WMO

Nov 25, 2013

RECAPITULATION ON ENSEMBLE PREDICTION SYSTEMS (EPS)

(A. Garcia Mendez, ECMWF)

1. EPS, an Overview

The normal loss of consistency of an NWP deterministic forecast occurs at around 4 to 5 days. Uncertainties in the analysis initial conditions result in small uncertainties growing quickly into large errors within unstable flows, so small-scale errors will affect the large scale through non-linear dynamics. Generally speaking, the error growth is flow dependent.

The lack of a perfect forecast is due mainly to:

The actual state of the atmosphere being known only approximately.

Model errors due to limited resolution and/or parametrization of physical processes.

The ECMWF EPS, as a complement to the deterministic forecast, has been part of the operational production since 1992. The EPS simulates possible initial uncertainties by adding to the unperturbed analysis small perturbations within the limits of uncertainty of the analysis. From these, a number of T159L40 forecasts are produced.

A complete description of the weather prediction problem can be formulated in terms of the time evolution of an appropriate probability density function (PDF) in the atmosphere's phase space.

Although the problem can be formulated exactly through the continuity equation for probability (the Liouville equation), ensemble prediction appears to be the only feasible method to predict the PDF beyond the range of linear error growth.

In ensemble prediction the PDF at initial time is represented through a finite sample of possible initial conditions. The sample of initial states must give a realistic estimate of the PDF of analysis errors. The sensitive areas of the atmosphere are targeted using the so-called singular vectors.

By definition the singular vectors depend on:

The initial and final metrics used to measure perturbation amplitudes.

The basic state characteristics.

The optimization time interval (OTI).

The ECMWF EPS is based on an ensemble of 50 members plus the unperturbed Control non-linear time integration. The initial conditions (IC) of the 50 members are created by adding a perturbation to the Control ICs. The perturbations are defined using the fastest growing singular vectors (SVs) of a linear approximation of the ECMWF non-linear model.

The main steps of the EPS are:

Computation of the first 25 SVs.

Generation of 25 perturbations by a phase-space rotation of the SVs.

The 25 perturbations are mirrored by changing their sign, and 50 ICs are then built by adding the perturbations.

50 + 1 non-linear integrations run for 10 days.

At present the SV resolution is T42L31.
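The mirroring step above can be sketched in a few lines of numpy. The dimensions and perturbation amplitudes below are toy stand-ins, not the operational T42/T159 fields:

```python
import numpy as np

rng = np.random.default_rng(0)

n_state = 100  # toy state dimension (operationally a T159L40 model state)
n_sv = 25      # number of rotated singular-vector perturbations

control_ic = rng.normal(size=n_state)                    # stands in for the unperturbed analysis
perturbations = 0.01 * rng.normal(size=(n_sv, n_state))  # stands in for the rotated SVs

# Mirror each perturbation by flipping its sign: 25 -> 50 perturbations
mirrored = np.concatenate([perturbations, -perturbations])

# 50 perturbed initial conditions, plus the unperturbed Control: 51 integrations
members = control_ic + mirrored
ensemble_ics = np.vstack([control_ic, members])

assert ensemble_ics.shape == (51, n_state)
# A member and its mirror average back to the Control analysis
assert np.allclose((members[0] + members[n_sv]) / 2, control_ic)
```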


The SVs are computed in the following way. If x is a state vector of the system, then

$$x(t) = L\,x_{0} \qquad (1)$$

is the solution of the tangent version of the model equations

$$\frac{\partial x}{\partial t} = A\,x \qquad (2)$$

and the perturbation total energy norm can be computed as

$$\|x(t)\|_{E}^{2} = \langle L x_{0};\, E\,L x_{0}\rangle = \langle x_{0};\, L^{*}E\,L x_{0}\rangle \qquad (3)$$

where L* is the adjoint of the tangent propagator L defined in (1), ⟨·;·⟩ is the Euclidean scalar product, and E is the energy weight matrix.

The SVs are the directions in the phase space of the system that maximize the energy growth defined by (3). They are the eigenvectors of the operator L*EL with the largest eigenvalues. The SVs are a function of the optimization time interval t and of the norm definition.
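As a toy illustration of this property, the sketch below uses random matrices standing in for the tangent propagator L and the energy weight matrix E (both hypothetical), computes the eigenvectors of L*EL, and checks that the leading one indeed maximizes the energy growth:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6

L = rng.normal(size=(n, n))            # stands in for the tangent propagator
E = np.diag(rng.uniform(0.5, 2.0, n))  # energy weight matrix (SPD, diagonal here)

# The singular vectors are the eigenvectors of L* E L
M = L.T @ E @ L
eigvals, eigvecs = np.linalg.eigh(M)   # symmetric operator -> real spectrum
order = np.argsort(eigvals)[::-1]
leading_sv = eigvecs[:, order[0]]

def growth(x):
    """Energy growth <Lx; E Lx> relative to the Euclidean norm of x at
    initial time (a simplification of the initial metric)."""
    return (L @ x) @ E @ (L @ x) / (x @ x)

# The leading SV attains the largest eigenvalue and beats random directions
random_dirs = rng.normal(size=(1000, n))
assert np.isclose(growth(leading_sv), eigvals[order[0]])
assert all(growth(x) <= growth(leading_sv) + 1e-9 for x in random_dirs)
```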

The EPS sampling strategy consists of:

Identifying the phase-space directions along which analysis errors grow fastest in the linear regime (Toti ~ 48 hours). The growth is measured using the total energy norm.

These directions are called singular vectors of the tangent forward propagator. They are the eigenvectors of an operator which includes the tangent forward and adjoint operators. They depend on the optimization time interval (Toti) and on the norm.

They identify regions where perturbations can grow by barotropic and baroclinic energy conversion.

Fig 1: Comparison of unperturbed analysis (Control) with EPS member 35 (initial conditions). 5/October/2000; 12 UTC

Fig 2: Comparison of unperturbed analysis (Control) with EPS member 36 (mirror of member 35). 5/October/2000; 12 UTC

Fig 3: Same as figure 1 for the D+5 forecast (member 35)

Fig 4: Same as figure 2 for the D+5 forecast (member 36)

2. Clustering Techniques

The main problem for the EPS to be operational is to design suitable products that provide potential users with valuable information. The daily output of the EPS is basically a collection of forecasts, which are different possible realizations of the meteorological future. Relevant information can be found at different levels:

The ensemble mean, centroid of the distribution, represents the first level of information.


The ensemble spread represents a second level of information, as it should measure the uncertainty of the forecast or the amplitude of possible fluctuations around the centroid.

A third level of information is the distribution of the spread. Regions of small spread should indicate ensemble modalities where the likelihood is larger.

The last level of information is reached when the whole distribution is regarded as informative, e.g. when weather parameter probabilities are directly generated from the ensemble.

Fig 5: Ensemble mean, unperturbed forecast (Control) and EPS variance. D+5 forecast based on 5/October/2000; 12 UTC.

Different products have been designed to make the EPS interpretation easier. Some products allow visualizing specific aspects of the distribution (e.g. plumes or spaghetti charts); others are based on an objective classification of ensemble members (e.g. clustering).

2.1 Clustering Algorithms

2.1.1 Hierarchical agglomerative algorithms

The principle of these algorithms is to merge at each stage the two nearest clusters into a new cluster. The number of clusters varies from 51 at the first stage (each member being a single-member cluster) to 1 at the last stage. Several options allow deciding when the partition is to be stopped:

When the number of clusters reaches a given value (e.g. 6 clusters).

When the internal variance of a cluster exceeds a value. It may be an absolute value (e.g. std smaller than 50 m), corresponding to a certain level of smoothing or sharpness of the cluster, or a relative value (e.g. std smaller than 50% of the total variance).

When the proportion of variance explained by the cluster means (centroids) drops below a threshold (e.g. 50%).

When the number of members in a cluster exceeds a threshold, e.g. 17 members, corresponding to a majority ensemble alternative.

Ward Algorithm

This algorithm is used operationally at ECMWF. It tends to assign a member to a small cluster in order to balance the variance. It produces several separate clusters of comparable size (or at least of comparable internal variance). It shows the best results in terms of explained variance. This algorithm has the disadvantage of mixing up some members presenting strong differences. It could be said that Ward's algorithm highlights the ensemble multimodality.
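A minimal sketch of Ward agglomerative clustering of ensemble members, using scipy on a synthetic three-regime ensemble. The field dimension, the regime structure and the fixed-cluster-count stopping rule are illustrative assumptions, not the operational configuration:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(2)

# Toy "ensemble": 51 members, each a flattened field, drawn from 3 regimes
centers = rng.normal(scale=5.0, size=(3, 40))
members = np.vstack([c + rng.normal(size=(17, 40)) for c in centers])

# Ward linkage merges, at each stage, the pair of clusters that gives the
# smallest increase in total within-cluster variance
Z = linkage(members, method="ward")

# One possible stopping rule: a fixed maximum number of clusters (e.g. 6)
labels = fcluster(Z, t=6, criterion="maxclust")

# Explained variance of the partition: 1 - within-cluster / total variance
total = ((members - members.mean(axis=0)) ** 2).sum()
within = sum(((members[labels == k] - members[labels == k].mean(axis=0)) ** 2).sum()
             for k in set(labels))
explained = 1.0 - within / total
print(f"{len(set(labels))} clusters, explained variance {explained:.0%}")
```

With well-separated regimes the partition explains most of the ensemble variance, which is the behaviour the text attributes to Ward's algorithm.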

The ECMWF operational clustering is based on the RMS differences between the 500 hPa geopotential forecasts averaged from D+5 to D+7. The same members make up the contents of each cluster throughout the period. There are occasions when two members in the same cluster can be rather different at the beginning or end of the period, but sufficiently similar during the rest of the time interval to be placed in the same cluster. On the other hand, two members that are similar during part of the period may be placed in different clusters if they are sufficiently different during most of the period. The clustering is performed separately for the whole of Europe plus 4 European sub-domains.


Fig 6: Operational clustering for Europe. D+6 based on 5/October/2000; 12 UTC. Explained variance 60 %. Cluster 1 std=42 m, Cluster 2 std=42 m, Cluster 3 std=37 m, Cluster 4 std=37 m, Cluster 5 std=30 m.

Fig 7: Ward clustering for Europe. D+6 based on 5/October/2000; 12 UTC. Proportion of explained variance 50%, resulting in 4 clusters.

Fig 8: Explained variance curve. Same as for figures 6 and 7.

Several ECMWF Member States compute their own tailored clustering.

Centroid Algorithm

This algorithm tends to group a large majority of members in a single cluster (chaining or snowballing effect) and the rest in small clusters of 2 or 3 members. A significant number of members are not grouped and remain as single clusters, representing the extreme situations that were lost in Ward's algorithm. It could be said that this algorithm highlights the ensemble monomodality.

This algorithm is designed to balance the effects of the two previous ones. It tends to minimize the internal cluster variance and simultaneously to maximize the inter-cluster variance. The result is generally a large cluster and several small clusters. This algorithm has been regarded as the best procedure to achieve a climatological classification of synoptic situations.

2.1.2 Non-hierarchical Algorithms

ACC Clustering

This algorithm is used in the Washington NCEP ensemble operational clustering. The members are grouped in a cluster when their distance (in terms of anomaly correlation instead of RMS) is below a given threshold. The clustering starts by finding the two least similar members closest to the threshold, each of them becoming the seed of a cluster. The remaining members are grouped around these two according to the threshold (when a member could be clustered in both clusters, it is assigned to the closest one). In a second stage the closest pair of similar members (according to the threshold) is found, and members are clustered around them. The process iterates until there is no remaining pair of similar members in the ensemble.

This clustering is based on the assumption that the range of alternatives sampled by the ensemble can be simply summarized by the interval between the two least similar forecasts. The main advantage is that it allows defining exactly (by the threshold choice) what the user regards as similar and different forecasts. On the other hand, a large explained variance cannot be expected.
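A much simplified sketch of this kind of threshold clustering: a single seeding stage, with the ensemble mean standing in for the climatology when computing anomalies (both simplifying assumptions; the operational NCEP scheme iterates over further seed pairs):

```python
import numpy as np

def acc_threshold_clustering(members, threshold=0.6):
    """Simplified sketch: the two least similar members seed two clusters;
    every other member joins the seed it correlates with best, provided the
    correlation exceeds the threshold, and is left single otherwise."""
    # Anomaly correlation, with the ensemble mean standing in for climatology
    anom = members - members.mean(axis=0)
    corr = np.corrcoef(anom)

    i, j = np.unravel_index(np.argmin(corr), corr.shape)  # least similar pair
    clusters = {i: [i], j: [j]}
    singles = []
    for m in range(len(members)):
        if m in (i, j):
            continue
        seed = i if corr[m, i] >= corr[m, j] else j
        if corr[m, seed] >= threshold:
            clusters[seed].append(m)
        else:
            singles.append(m)  # would be re-examined in later seeding stages
    return clusters, singles

# Toy two-regime ensemble of 20 members
rng = np.random.default_rng(3)
base_a, base_b = rng.normal(size=(2, 30))
members = np.vstack([base_a + 0.3 * rng.normal(size=(10, 30)),
                     base_b + 0.3 * rng.normal(size=(10, 30))])
clusters, singles = acc_threshold_clustering(members)
assert len(clusters) == 2
assert sum(len(c) for c in clusters.values()) + len(singles) == 20
```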

Central Clustering

It groups the members around the ensemble mean, regarded as the center of the member distribution, assuming the ensemble to be rather monomodal around its mean, as generally seems to be the case.

The size of the central cluster can be given by:

Its internal variance, corresponding to a given level of smoothing or sharpness of the cluster mean field.

The number of members included in the central cluster.


Once the central cluster is obtained, the remaining members are grouped following a basic hierarchical algorithm. The growth of the clusters is limited by their internal variance, which cannot exceed the central cluster variance (in order to keep the same resolution). Unlike in hierarchical algorithms, the clustering doesn't stop when one cluster exceeds the conditions; that cluster's growth stops, but other clusters can still merge. The size of the remaining clusters is generally small compared with the central cluster, since they represent the edges of the distribution. This algorithm highlights the ensemble monomodality even more than the Centroid hierarchical algorithm.

Clustering Models

This approach is based on the assumption that high-resolution models are more reliable than EPS members, and that the T159 Control is more reliable than the perturbed forecasts. The idea is to use these 'better' forecasts as seeds or references around which perturbed members are grouped. The members are grouped around the closest reference found (distances in terms of RMS differences).

2.1.3 Tubing

The principle of tubing rests on three main assumptions:

The ensemble distribution is generally monomodal.

The verification is more likely to be found in the vicinity of the ensemble mean.

The previous assumption holds even in the case of multimodality.

The principle of tubing is to get a classification of ensemble members suitable for monomodal distributions, giving more emphasis to the central part of the distribution and disregarding possible modalities. Unlike any clustering method, tubing does not try to group the members around hypothetical centroids but to organize them along axes starting from the ensemble mean (EM) and reaching the extremes of the distribution. These axes should represent the directions in which the verification is likely to deviate from the EM.

The algorithm works in the following way:

The central cluster is obtained by grouping members around the EM. The number of members depends on a specific condition (see below).

The member that is the most remote from the EM is located. It becomes the first extreme and defines a tube grouping those members lying inside the cylinder starting from the boundaries of the central cluster and finishing at the extreme. The radius of each cylinder is the same as for the central cluster (the distance between the EM and the last accepted member in the central cluster).

The process is then iterated with the following rule: a member belonging to a tube can still be classified in another tube but can't become an extreme.

The tubing ends when all members are classified.
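The steps above can be sketched as follows, under stated assumptions: Euclidean distance as the RMS measure, a central cluster fixed at 2/3 of the ensemble (one possible configuration), and a half-cylinder test on the extreme's side of the EM:

```python
import numpy as np

def tubing(members, central_fraction=2/3):
    """Sketch of the tubing classification (simplifying assumptions:
    Euclidean distance, central cluster as a fixed ensemble fraction)."""
    em = members.mean(axis=0)                 # ensemble mean (EM)
    dist = np.linalg.norm(members - em, axis=1)
    order = np.argsort(dist)

    n_central = int(round(len(members) * central_fraction))
    central = list(order[:n_central])
    radius = dist[order[n_central - 1]]       # last accepted central member
    outside = list(order[n_central:])

    tubes, unclassified = [], set(outside)
    while unclassified:
        # The unclassified member most remote from the EM is the next extreme
        extreme = max(unclassified, key=lambda m: dist[m])
        axis = (members[extreme] - em) / dist[extreme]
        tube = []
        for m in outside:                     # cylinder test around the axis
            v = members[m] - em
            off_axis = np.linalg.norm(v - (v @ axis) * axis)
            if off_axis <= radius and (v @ axis) > 0:
                tube.append(m)                # members may join several tubes
        unclassified -= set(tube) | {extreme}  # but cannot become extremes
        tubes.append((extreme, tube))
    return central, radius, tubes

rng = np.random.default_rng(4)
members = rng.normal(size=(51, 20))
central, radius, tubes = tubing(members)
assert len(central) == 34
# every member ends up in the central cluster or in at least one tube
assert all(m in central or any(m == e or m in t for e, t in tubes)
           for m in range(51))
```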

The main parameter of the tubing is the condition limiting the size of the central cluster. This condition may limit the number of members included in the central cluster, the spread of the central cluster, or directly its radius. Two main configurations can be distinguished:

The size of the central cluster can be dependent on the spread of the day (e.g. the radius is limited to the ensemble std, or the number of members is limited to 2/3 of the distribution, or the variance of the central cluster is limited to 50% of the total ensemble variance). For this configuration the number of tubes varies little from day to day and does not depend on the forecast uncertainty but on the ensemble multimodality. The proportion of information extracted from the ensemble is always the same. On the other hand, the resolution of the tubing, given by its radius, varies from day to day, which may be a problem for operational purposes.

The size of the central cluster can be independent of the spread of the day in order to keep the same resolution every day; either the radius or the std of the central cluster is limited to a certain threshold. Since the spread/uncertainty varies largely along the year, a seasonal adaptation is necessary, for instance by using the mean forecast error as a threshold. For this configuration the number of tubes varies from day to day according to the forecast uncertainty.

The previous configurations are similar for hierarchical agglomerative algorithms such as Ward's.

The first configuration corresponds to the case when the clustering process stops when the proportion of variance explained by the centroids drops below a given threshold (typically 50%), or when the internal variance of one cluster exceeds a certain proportion of the total ensemble variance.

The second configuration corresponds to a fixed limitation of the cluster std, generally seasonally dependent, as in the ECMWF operational Ward clustering configuration.

Fig 9: Tubing. D+6 forecast based on 5/October/2000; 12 UTC. 39 members in central cluster.

Central cluster std=57 m, proportion of total variance 49 %, radius=70 m.

Tube 1: 4 members, extreme at 128 m from EM.

Tube 2: 5 members, extreme at 119 m from EM.

Tube 3: 3 members, extreme at 86 m from EM.

3. Examples of Display Products Generated

Fig 10/11: Spaghetti plots. D+4 and D+7 forecasts based on 5/Oct/2000; 12 UTC.

Fig 12: Plumes for T850, precipitation and Z500. 9/Oct/2000; 12 UTC. Madrid, Spain.

Fig 13: Skill. Clusters and Tubes.

Fig 14: EPS precipitation probabilities based on 5/October/2000; 12 UTC. Accumulated precipitation from D+5 to D+7.


Appendix

Multidimensional scaling applied to the inter-distances between EPS members

Multidimensional scaling (mds) is a technique that allows the representation of a configuration of n p-dimensional points in a k-dimensional Euclidean space (k << p), taking information only from the distances. Such distances don't need to be exclusively Euclidean but could be any similarity measure between different elements (e.g. the correlation coefficient, etc.).

Given a Euclidean distance matrix D, i.e. a symmetric matrix whose elements are such that d_ij > 0, d_ii = 0, d_ij + d_jk > d_ik, it can be shown that the k-dimensional subspace is chosen so that the new distances δ_ij minimize the function:

$$\sum_{i=1}^{n}\sum_{j=1}^{n}\left(d_{ij}^{2}-\delta_{ij}^{2}\right)$$
This minimization leads to a set of new coordinates x'_ij for the original points that can be expressed as

$$x'_{ij} = \lambda_{j}^{1/2}\, v_{ij}$$

where the rhs of the equation represents the eigenvalues λ and the corresponding eigenvectors v of the centered matrix B with elements

)
d
2
+
d
j
2
-
d
i
2
-
d
ij
2
(
2
1
-
=
b
ij

where:

The mds is also called principal coordinates analysis, and if D is the Euclidean distance matrix of a group of points in an n-dimensional space (matrix X), it can be shown that their k principal coordinates correspond to the first k centered principal components of the matrix X.

In the spectrum of normalized eigenvalues for a Z500 D+6 forecast, the first three components are able to account for 50 to 80 % of the original point distribution variability. For this reason it is reasonable to represent the original EPS inter-distance configuration in 2 or 3 dimensions.

A qualitative analysis shows that an increase of the spread, considered at different days, seems to be followed by a sharpening of the eigenvalue spectrum. Generally, the higher the spread, the higher the variability taken into account by the first eigenvalues.

In this application the Euclidean distance

$$d_{ij} = \left[\sum_{k=1}^{n}\left(x_{ik}-x_{jk}\right)^{2}\right]^{1/2}$$


and the correlation coefficient

$$\rho_{ij} = \frac{\sum_{k=1}^{n}\left(x_{ik}-\bar{x}_{i}\right)\left(x_{jk}-\bar{x}_{j}\right)}{\sigma_{i}\,\sigma_{j}}$$

(calculated using either the climatological mean or the mean obtained by averaging over all grid points of the chosen field) have been used. In order to transform the latter into a distance, the following modification has been used:

$$d_{ij} = \left[\frac{\rho_{ii}-2\rho_{ij}+\rho_{jj}}{2}\right]^{1/2}$$

Fig 15: Normalized eigenvalues. D+6 fcst based on 5/Oct/2000; 12 UTC

Fig 16: Projection of the EPS plus the T319 (O), UKMO (U) and DWD (D) models, plus the verification (V), on the first two eigenvectors. D+6 fcst based on 5/Oct/2000; 12 UTC

_______________

References:

Barkmeijer, J., Buizza, R. and Palmer, T.N., 1999. 3D-Var Hessian singular vectors and their potential use in the ECMWF Ensemble Prediction System. Q. J. R. Met. Soc., 125, 2333-2351.

Buizza, R. and Palmer, T.N., 1995. The singular vector structure of the atmospheric general circulation. J. Atmos. Sci., 52, 1434-1456.

Buizza, R., Petroliagis, T., Palmer, T.N., Barkmeijer, J., Hamrud, M., Hollingsworth, A., Simmons, A. and Wedi, N., 1998. Impact of model resolution and ensemble size on the performance of an Ensemble Prediction System. Q. J. R. Met. Soc., 124, 1935-1960.

Buizza, R., Miller, M. and Palmer, T.N., 1999. Stochastic representation of model uncertainties in the ECMWF Ensemble Prediction System. Q. J. R. Met. Soc., 125, 2887-2908.

Atger, F., 1997. The tubing, an alternative to the clustering for EPS classification. OD Memo, ECMWF.