Consequences of the ergodic theorems for
classical test theory, factor analysis, and the
analysis of developmental processes
Peter C.M. Molenaar
The Pennsylvania State University
1. Introduction
The currently dominant approach to statistical analysis in psychology and biomedicine is based on analysis of inter-individual variation. Differences between subjects, drawn from a population of subjects, provide the information for making inferences about states of affairs at the population level (e.g., mean and/or covariance structure). This approach underlies all standard statistical analysis techniques such as analysis of variance, regression analysis, path analysis, factor analysis, cluster analysis, and multilevel modeling techniques. Whether the data are obtained in cross-sectional or longitudinal designs (or more elaborate designs such as sequential designs), the statistical analysis always is focused on the structure of inter-individual variation. Parameters and statistics of interest are estimated by pooling across subjects, where these subjects are assumed to be homogeneous in all relevant respects. This is the hallmark of analysis of inter-individual variation: the sums defining the estimators in statistical analysis are taken over different subjects randomly drawn from a population of presumably homogeneous subjects. In mixed modeling the population is considered to be composed of different subpopulations, but within each subpopulation subjects again are assumed to be homogeneous.
In the next section definitions will be given of inter-individual variation and homogeneity of a population of subjects, but the intuitive content of these concepts is clear. These intuitions would seem to imply that inferences about states of affairs at the population level obtained by pooling across subjects constitute general findings that apply to each subject in the homogeneous population. Yet in general this is not the case. That is, in general it is not true that inferences about states of affairs at the population level based on analysis of inter-individual variation apply to any of the individual subjects making up the population. This negative result is a direct implication of a set of mathematical-statistical theorems: the so-called classical ergodic theorems (cf. Molenaar, 2004).
A concise heuristic description of the classical ergodic theorems will be given below. The main focus of this chapter, however, will be on some of the implications of these theorems. For instance, it will be shown that classical test theory is based on assumptions that violate the classical ergodic theorems, and hence, in a precise sense to be defined later on, the results of classical test theory do not apply in individual assessments. This, of course, is a serious shortcoming of classical test theory, because many psychological tests have been constructed and standardized according to classical test theory and are applied in the assessment of individual subjects.
Special emphasis will be given to the fact that developmental systems constitute prime examples of non-ergodic systems having age-dependent statistical characteristics (mean trends and sequential dependencies). Therefore the statistical analysis of developmental processes has to be based not on inter-individual variation, as now is the standard approach, but on intra-individual variation (where the latter type of variation will be defined in the next section). It will be indicated that the insistence that developmental processes should be studied at the individual level has a long history in theoretical developmental psychology. The classical ergodic theorems provide a definite vindication of this theoretical line of thought.
At the close of this chapter a new statistical modeling technique will be presented with which it is possible to analyze developmental processes with age-dependent statistical characteristics at the required intra-individual level. This modeling technique is based on advanced engineering methods for the analysis of complex dynamic systems. It will be shown that the new modeling technique allows for the optimal guidance of ongoing developmental processes at the intra-individual level. Evidently, this opens up entirely new possibilities for applied developmental psychological science.
2. Preliminaries
In this section definitions will be given of the main concepts used in this chapter. The given definition of (non-)ergodicity is heuristic; selected references will be given to the vast literature on ergodic theory for more formal elaborations.
2.1 Unit of analysis. Each actually existing human being can be conceived of as a high-dimensional integrated system whose behavior evolves as a function of place and time. In psychology one usually does not consider place, leaving time as the dimension of main interest. The system includes important functional subsystems such as the perceptual, emotional, cognitive and physiological systems, as well as their dynamic interrelationships. The complete set of measurable time-dependent variables characterizing the system’s behavior can be represented as the coordinates of a high-dimensional space (cf. Nayfeh & Balachandran, 1993, Ch. 1), which will be called the behavior space. According to De Groot (1954), the behavior space contains all the scientifically relevant information about a person.
The realized values of all measurable variables for a particular individual at consecutive time points constitute a trajectory (life history) in behavior space. This trajectory in behavior space is our basic unit of analysis. Accordingly, the complete set of life histories of a population of human subjects can be represented as an ensemble of trajectories in the same behavior space.
2.2 Inter- and intra-individual variation. A standard dictionary definition of variation is: “The degree to which something differs, for example, from a former state or value, from others of the same type, or from a standard”. The degree to which something differs implies a comparison, either between different replicates of the same type of entity (inter-individual variation) or else between consecutive temporal states of the same individual entity (intra-individual variation). Based on this dictionary definition and using the construct of an ensemble of life trajectories defined in the previous section, it is possible to give appropriate definitions of inter- and intra-individual variation. The following definitions are inspired by Cattell’s (1952) notion of the Data Box.
With respect to an ensemble of trajectories in behavior space, inter-individual variation is defined as follows: (i) select a fixed subset of variables; (ii) select one or more fixed time points as measurement occasions; (iii) determine the variation of the scores on the selected variables at the selected time points by pooling across subjects. Analysis of inter-individual variation thus defined is called R-technique by Cattell (1952). In contrast, intra-individual variation is defined as follows: (i) select a fixed subset of variables; (ii) select a fixed subject; (iii) determine the variation of the scores of the single subject on the selected variables by pooling across time points. Analysis of intra-individual variation thus defined is called P-technique by Cattell (1952).
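The two definitions can be made concrete with a small simulation. The following is only a sketch under stated assumptions: the ensemble, the subject means, the noise level and all function names are hypothetical and chosen for illustration; a single variable is selected, and variation is summarized by the variance.

```python
import random
import statistics

random.seed(0)

# Hypothetical ensemble for one selected variable: scores[i][t] is the score
# of subject i at measurement occasion t. All numbers are illustrative.
subject_means = [10.0, 15.0, 20.0, 25.0, 30.0]
n_occasions = 200
scores = [
    [random.gauss(mean, 1.0) for _ in range(n_occasions)]
    for mean in subject_means
]


def inter_individual_variance(scores, occasion):
    """R-technique view: fix one occasion, pool across subjects."""
    return statistics.pvariance([trajectory[occasion] for trajectory in scores])


def intra_individual_variance(scores, subject):
    """P-technique view: fix one subject, pool across occasions."""
    return statistics.pvariance(scores[subject])


r_variance = inter_individual_variance(scores, occasion=0)
p_variance = intra_individual_variance(scores, subject=0)
print(f"inter-individual variance at occasion 0: {r_variance:.2f}")
print(f"intra-individual variance of subject 0:  {p_variance:.2f}")
```

Because the simulated subjects differ in mean level, pooling across subjects and pooling across time points give very different answers for the same variable.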
2.3 Ergodicity. We now can present a heuristic definition of ergodicity in terms of the concepts defined in the previous sections. Ergodicity addresses the following foundational question: Given the same set of selected variables (of Cattell’s Data Box), under which conditions will an analysis of inter-individual variation yield the same results as an analysis of intra-individual variation? To illustrate this question: under which conditions will factor analysis of inter-individual covariation yield a factor solution that is equal to that of factor analysis of intra-individual covariation? The latter illustration can be rephrased in terms of Cattell’s Data Box in the following way: Under which conditions will R-technique factor analysis of inter-individual covariation yield a solution that equals the analogous P-technique factor solution of intra-individual covariation?
The general answer to this question is provided by the classical ergodic theorems (cf. Molenaar, 2004; Molenaar, 2003, chapter 3). The answer is: Only if the ensemble of time-dependent trajectories in behavior space obeys two rigorous conditions will an analysis of inter-individual variation yield the same results as an analysis of intra-individual variation. The two conditions concerned are the following. Firstly, the trajectory of each subject in the ensemble has to obey exactly the same dynamical laws (homogeneity of the ensemble). Secondly, each trajectory should have constant statistical characteristics in time (stationarity, i.e., constant mean level and serial dependencies). In case either one (or both) of these two conditions is not met, the psychological process concerned is non-ergodic, i.e., its structure of inter-individual variation will differ from its structure of intra-individual variation. For a non-ergodic process, the results obtained in standard analysis of inter-individual variation do not apply at the individual level of intra-individual variation. The meaning of the homogeneity and stationarity assumptions will be elaborated more fully in later sections, starting with the section on classical test theory below.
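The role of the second condition can be illustrated with a simulated ensemble that is homogeneous but not stationary. This is a minimal sketch, not a formal test of ergodicity: the linear trend, its slope, the noise level and the variable names are all assumptions made for the example.

```python
import random
import statistics

random.seed(1)

n_subjects, n_occasions = 200, 200
slope = 0.05  # hypothetical common growth rate per occasion

# Every subject obeys exactly the same law (homogeneity holds), but the mean
# level rises with time, so each trajectory is non-stationary.
ensemble = [
    [slope * t + random.gauss(0.0, 1.0) for t in range(n_occasions)]
    for _ in range(n_subjects)
]

# Inter-individual variance: fix one occasion, pool across subjects.
inter_var = statistics.pvariance([trajectory[100] for trajectory in ensemble])

# Intra-individual variance: fix one subject, pool across occasions.
intra_var = statistics.pvariance(ensemble[0])

print(f"inter-individual variance at occasion 100: {inter_var:.2f}")
print(f"intra-individual variance of subject 0:    {intra_var:.2f}")
```

Although all simulated subjects follow the same law, the trend inflates the variance obtained by pooling across time, so the two analyses disagree even in a homogeneous ensemble.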
The requirement that each subject in the ensemble should obey the same dynamical laws is expressed in the language of ergodic theory, which has its roots in the theoretical foundations of statistical mechanics. Statistical mechanics arose as the attempt by Boltzmann to explain the equilibrium characteristics of a homogeneous gas kept under constant pressure and temperature in a container, where the atoms of the homogeneous gas each obey Newton’s laws of motion. Nowadays ergodic theory is an independent mathematical discipline; standard introductions are Petersen (1983) and Walters (1982). An excellent recent monograph is Choe (2005). The theorem which for the ensuing discussion is the most important one in the set of classical ergodic theorems has been proven by Birkhoff (1931).
3. The non-ergodicity of classical test theory
Many of the psychological tests currently in use have been constructed according to the principles of classical test theory. The basic concept in classical test theory is the concept of true score: each observed score is conceived of as a linear combination of a true score and an error score. In their authoritative book on classical test theory, Lord & Novick (1968) define the concept of true score as follows. They consider a fixed person P, i.e., P is not randomly drawn from some population but is the given person for which the true score is to be defined. The true score of P is defined as the expected value of the propensity distribution of P’s observed scores. The propensity distribution is characterized as a “... distribution function defined over repeated statistically independent measurements on the same person” (Lord & Novick, 1968, p. 30). The concept of error score then follows straightforwardly: the error score is the difference between the observed score and the true score.
Several aspects of this definition of true score are noteworthy. The definition is based on the intra-individual variation characterizing a fixed person P. Repeated administration of the same test to P yields a time series of scores of P, the mean level of which is defined to be P’s true score. Hence this definition of true score does not involve any comparison with other persons and therefore is not at all dependent on inter-individual variation. The single-subject repeated measures design used to obtain P’s time series of observed scores is akin to standard psychophysical measurement designs (e.g., Gescheider, 1997). Lord & Novick (1968) require that the repeated measurements are independent. This implies that the time series of P’s scores should lack any sequential dependencies (autocorrelation). At the close of this section we will further discuss the requirement that repeated measurements have to be independent.
Lord & Novick (1968, p. 30) do not further elaborate their original definition of true score in the context of intra-individual variation because: “… it is not possible in psychology to obtain more than a few independent observations”. Instead of considering an arbitrarily large number of replicated measurements of a single fixed person P, Lord & Novick (1968, p. 32) shift attention to an alternative scheme in which an arbitrarily large number of persons is measured at a single fixed time: “Primarily, test theory treats individual differences or, equivalently, the distribution of measurements over people”. Apparently it is expected that using an individual differences approach, valid information can be obtained about the distinct propensity distributions underlying individual true scores. We will see shortly that this expectation is unwarranted.
Before focusing in the remainder of their book solely on the latter definition of true score based on inter-individual variation, Lord & Novick (1968, p. 32) make the following interesting comment about their initial definition of true score based on intra-individual variation: “The true and error scores defined above [based on intra-individual variation; PM] are not those primarily considered in test theory … They are, however, those that would be of interest to a theory that deals with individuals, rather than with groups (counseling rather than selection)”. This is a remarkable, though somewhat oblique statement. What is clear is that Lord & Novick consider a test theory based on their initial concept of true score, defined as the mean of the intra-individual variation characterizing a fixed person P, to be “… of interest to a theory that deals with individuals …”. That is, they consider such a test theory based on intra-individual variation to be important in the context of individual assessment. But what is not clear is whether they also consider the alternative concept of true score based on inter-individual variation (individual differences) to be not of interest to a theory that deals with individuals. That is, do they imply that classical test theory as we know it is only appropriate for the assessment of groups and not for individuals? It will be shown that classical test theory indeed is inappropriate for individual assessment.
To summarize the discussion thus far: Lord & Novick (1968) define the concept of true score as the expected value of the propensity distribution of the observed scores of a given individual person P. This definition of true score based on intra-individual variation then is used in an inter-individual context focused on individual differences, i.e., classical test theory as we know it. This raises the all-important question whether the information provided by individual differences (inter-individual variation) is able to determine the individual propensity distributions to a degree which is sufficient to apply the concept of true score based on intra-individual variation. It is noted that this is exactly the question concerning the ergodicity of the psychological process concerned: for a given test, will an analysis of inter-individual variation of test scores yield the same results as an analysis of intra-individual variation of test scores? To answer this question it has to be established whether the psychological process presumed by classical test theory to underlie the generation of test scores obeys the two criteria for ergodicity.
The psychological process which according to classical test theory underlies the generation of test scores is very simple. It is implicit in the definition of true score given by Lord & Novick (1968). Each individual person P is assumed to generate a time series of independent scores in response to repeated administration of the same test. Each observed score of P’s time series constitutes an independent random sample drawn from P’s propensity distribution. Hence there exists a one-to-one relationship between the time series of P’s observed test scores and P’s propensity distribution. The psychological process underlying P’s time series of observed scores therefore is characterized, according to classical test theory, by P’s propensity distribution. Statistical analysis of P’s intra-individual variation boils down to statistical analysis based on P’s propensity distribution. Classical test theory only considers the first two central moments of P’s propensity distribution (its mean and its variance).
According to classical test theory the propensity distributions of different persons have different means and different variances. The true score of person $P_1$ (i.e., the mean of the propensity distribution of $P_1$) will in general differ from the true score of person $P_2$. Also the variance of $P_1$’s observed scores will in general differ from the variance of $P_2$’s observed scores. Hence, given the one-to-one correspondence between individual time series and individual propensity distributions noted above, the ensemble involving persons $P_i$, $i = 1, 2, \ldots$, is populated by time series (propensity distributions) which have different mean levels (means of the propensity distributions) and different variances.
Clearly such an ensemble is entirely heterogeneous: the psychological process according to which $P_i$’s time series of observed scores is generated is different from the psychological process according to which $P_k$’s time series of observed scores is generated because, for $i \neq k$, the underlying propensity distribution of $P_i$ has mean and variance different from $P_k$’s propensity distribution. Consequently the ensemble of time series (propensity distributions) violates at least one of the two criteria for ergodicity: the trajectories (time series) in the ensemble do not obey the homogeneity criterion for ergodicity, i.e., trajectories associated with different persons do not obey exactly the same dynamical laws. Stated more specifically, the random motion characterizing time series of observed scores in the ensemble has different mean level and variance for different persons.
Consequently, the psychological process which according to classical test theory underlies the generation of test scores is non-ergodic. That is, it follows from the classical ergodic theorems that results obtained in an analysis of inter-individual variation (individual differences) of test scores based on classical test theory do not apply at the individual level of intra-individual variation. In short, the results obtained with classical test theory do not apply in the context of individual assessment.
3.1 Some formal elaborations
We will now present some simple formal elaborations showing the invalidity of classical test theory for individual assessment. In particular we will focus on the concept of reliability as defined in classical test theory, show how estimation of an individual’s true score in classical test theory depends upon the reliability of the test, and indicate why this leads to invalid inferences. In what follows expressions related to classical test theory are based on Lord & Novick (1968).
Consider first the situation with respect to the definition of true score based on intra-individual variation. A particular test has been selected (it will be understood in the rest of this section that the same test is being considered). Also a particular person P is given. Let $y(P,t)$, $t = 1, 2, \ldots$ denote the time series of P’s scores obtained by repeatedly administering the test. The number of repeated measurements is left undefined: it is understood that this number can be taken to be arbitrarily large. Then the true score of P, $\tau(P)$, is defined as the expected value (mean) of $y(P,t)$ across all repeated measurements t. Notice that $\tau(P)$ is a constant.
The variance of $y(P,t)$ across all repeated measurements is denoted by $\sigma^2(P)$. The variance $\sigma^2(P)$ is a measure of the reliability of a single score $y(P,t=T)$ which is obtained at the T-th repeated measurement (T arbitrary), conceived of as an indicator of P’s true score $\tau(P)$. If $\sigma^2(P)$ is large, $y(P,t=T)$ can be very different from $\tau(P)$, whereas if $\sigma^2(P)$ is small its value will be close to $\tau(P)$.
To reiterate, in classical test theory one does not consider an arbitrarily large number of repeated measurements of a single person P, but instead one considers an arbitrarily large number of persons measured at a single time T. This is the shift from an intra-individual variation perspective underlying the concept of true score to an inter-individual variation perspective underlying classical test theory as we know it. Accordingly we consider an ensemble of time series of test scores associated with different persons $P_i$, $i = 1, 2, \ldots$, where the number of persons can be taken arbitrarily large. Associated with each distinct person $P_i$ is a distinct propensity distribution which has, as explained above, a one-to-one relationship with the psychological process according to which $P_i$ generates his/her time series of observed test scores. The mean (true score) of the propensity distribution of $P_i$ is $\tau(P_i)$ and the observed score of $P_i$ is $y(P_i, t=T)$, where T is arbitrary but fixed. To ease the presentation we will denote $\tau(P_i)$ as $\tau_i$ and $y(P_i, t=T)$ as $y_i$. The error score associated with $y(P_i, t=T) = y_i$ is $\varepsilon(P_i, t=T)$ and will be denoted as $\varepsilon(P_i, t=T) = \varepsilon_i$.
We now are ready to express the basic relationships of classical test theory:
(1a) $y_i = \tau_i + \varepsilon_i$, $i = 1, 2, \ldots$
(1b) $\mathrm{var}[y_i] = \mathrm{var}[\tau_i] + \mathrm{var}[\varepsilon_i]$.
According to (1a) the observed score $y_i$ of a randomly selected person $P_i$ is a linear combination of the true score $\tau_i$ and the error score $\varepsilon_i$ of $P_i$. According to (1b) the variance of observed scores across persons consists of a linear combination of the variance of the true scores across persons and the variance of the error scores across persons. The reliability $\rho$ of the test then is defined as:
(1c) $\rho = \mathrm{var}[\tau_i] / \{\mathrm{var}[\tau_i] + \mathrm{var}[\varepsilon_i]\}$.
Hence the reliability $\rho$ is the proportion of true score variance across persons in the total variance of observed scores across persons.
Now suppose that the reliability $\rho$ of our test is given and that the observed score $y_i$ of person $P_i$ is also given. Then the following so-called Kelley estimator of the true score $\tau_i$ of $P_i$ can be defined (cf. Lord & Novick, 1968, p. 65, formula 3.7.2a):
(2a) $\mathrm{est}[\tau_i \mid y_i] = \rho\, y_i + (1 - \rho)\,\mu$
where $\mu$ is the mean of observed scores across persons. The error variance associated with the Kelley estimator (2a) is (Lord & Novick, 1968, p. 68, formula 3.8.4a):
(2b) $\mathrm{var}\{\mathrm{est}[\tau_i \mid y_i]\} = \mathrm{var}[y_i]\,(1 - \rho)$.
Expressions (2a) and (2b) show that the estimate and associated standard error of a person’s true score in classical test theory are a direct function of the test reliability $\rho$. The reliability itself is according to (1c) a direct function of the variance of error scores $\mathrm{var}[\varepsilon_i]$ across persons. Hence the Kelley estimate (2a) of a person’s true score is a direct function of the error variance $\mathrm{var}[\varepsilon_i]$ across persons.
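Expressions (1c), (2a) and (2b) can be traced numerically. The sketch below simply evaluates them as they are stated in this section; the function names and the numerical values (true score variance 225, error variance 25, group mean 100, observed score 130) are hypothetical and serve only as an illustration.

```python
import math


def reliability(true_score_var, error_var):
    """Expression (1c): share of true score variance in observed variance."""
    return true_score_var / (true_score_var + error_var)


def kelley_estimate(observed, rho, group_mean):
    """Expression (2a): shrink the observed score toward the group mean."""
    return rho * observed + (1.0 - rho) * group_mean


def kelley_error_variance(observed_var, rho):
    """Expression (2b), evaluated as it is stated in the text."""
    return observed_var * (1.0 - rho)


# Hypothetical inter-individual quantities (variances taken across persons).
true_score_var = 225.0
error_var = 25.0
group_mean = 100.0

rho = reliability(true_score_var, error_var)
estimate = kelley_estimate(observed=130.0, rho=rho, group_mean=group_mean)
standard_error = math.sqrt(kelley_error_variance(true_score_var + error_var, rho))

print(f"reliability: {rho:.2f}")
print(f"Kelley estimate for an observed score of 130: {estimate:.1f}")
print(f"standard error: {standard_error:.1f}")
```

Every input to the estimate and its standard error is a quantity defined across persons; nothing specific to the assessed person enters apart from the single observed score.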
We have reached the conclusion that in classical test theory based on analysis of inter-individual variation (individual differences), the estimate of a person’s true score as well as the standard error of this estimated true score depend directly upon the reliability $\rho$ of the test. In contrast, it was indicated at the beginning of this section that the variance $\sigma^2(P)$ of the propensity distribution describing P’s intra-individual variation is a measure of the reliability of a single score $y(P,t=T)$ estimating P’s true score $\tau(P)$. Hence we have two different concepts of reliability: an intra-individual definition in which the reliability is given by $\sigma^2(P)$ and an inter-individual definition in which the reliability is a direct function of $\mathrm{var}[\varepsilon_i]$.
Given that the definition of true score as the mean of a person P’s propensity distribution is the starting point of both concepts of reliability, the definition of reliability in terms of the intra-individual variance $\sigma^2(P)$ is basic. The question then arises whether the classical test theoretical definition of reliability in terms of the inter-individual error variance $\mathrm{var}[\varepsilon_i]$ is a good approximation of $\sigma^2(P)$. The answer to this question is given by the following expression (Lord & Novick, 1968, p. 35, formula 2.6.4):
(3) $\mathrm{var}[\varepsilon_i] = E_i[\sigma^2(P_i)]$
where $E_i$ denotes the expectation taken over persons $P_i$, $i = 1, 2, \ldots$. Expression (3) states that the inter-individual error variance $\mathrm{var}[\varepsilon_i]$ is the mean of the intra-individual variances of individual propensity distributions across persons $P_i$, $i = 1, 2, \ldots$.
So, coming to our final verdict, how good an approximation is (3) for each of the individual variances $\sigma^2(P_i)$, $i = 1, 2, \ldots$? Given that the number of persons in the ensemble is taken to be arbitrarily large, and given that the $\sigma^2(P_i)$, $i = 1, 2, \ldots$ can differ arbitrarily according to classical test theory, it is immediately clear that in general (3) bears no relationship to any of the variances of the individual propensity distributions. Hence (3) is a poor approximation to the variances $\sigma^2(P_i)$ of the individual propensity distributions. Suppose that (3) is small, which implies that the (inter-individual) reliability $\rho$ is high. This leaves entirely open the possibility that the variance $\sigma^2(P)$ of a given person P’s propensity distribution is arbitrarily large (the psychological process generating test scores is heterogeneous, hence non-ergodic). Estimation of P’s true score by means of the Kelley estimator (2a) then will yield a severely biased result. Also the standard error (2b) of this estimate will be severely biased, suggesting an illusory high precision of the Kelley estimate.
Only the actual value of $\sigma^2(P)$ will provide the correct precision of taking P’s observed score as an estimate of P’s true score. The true value of $\sigma^2(P)$ only can be estimated in an analysis of P’s intra-individual variation. That is, the test should be repeatedly administered to P, yielding a time series of P’s observed scores. The mean of P’s time series of observed scores constitutes an unbiased estimate of P’s true score, and the standard deviation of P’s time series of observed scores provides an unbiased estimate of the precision of P’s estimated true score.
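The gap between expression (3) and an individual variance can be shown with a small numerical example. This is an illustrative sketch only: the ensemble of 100 persons, the chosen variances, the true score of 50 and the 500 independent repeated measurements are all assumptions, not values taken from the literature.

```python
import random
import statistics

random.seed(2)

# Hypothetical intra-individual variances: 99 persons respond very
# consistently, whereas one person P (the last entry) does not.
intra_variances = [1.0] * 99 + [400.0]

# Expression (3): the inter-individual error variance is the mean of the
# intra-individual variances across persons.
pooled_error_variance = statistics.fmean(intra_variances)

# Intra-individual analysis of P: simulate independent repeated
# administrations of the test and summarize P's own time series.
true_score_p = 50.0
series_p = [random.gauss(true_score_p, 20.0) for _ in range(500)]
estimated_true_score = statistics.fmean(series_p)
estimated_sd = statistics.stdev(series_p)

print(f"pooled error variance from (3): {pooled_error_variance:.2f}")
print(f"P's estimated true score:       {estimated_true_score:.1f}")
print(f"P's estimated variance:         {estimated_sd ** 2:.1f}")
```

The pooled error variance stays small because most simulated persons are consistent, while P’s own variance, recovered only from P’s time series, is far larger.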
3.2 Fundamental reasons or contingent circumstances
This section presents a critical discussion of the reasons why Lord & Novick (1968), after having defined the concept of true score in terms of intra-individual variation, do not further pursue an intra-individual foundation for test theory and turn instead to an inter-individual perspective. It will be argued that their reasons for doing so are not fundamental, but pertain to contingent circumstances that can be dealt with by means of appropriate statistical-methodological techniques.
The key remark leading up to the rejection of the possibility of a test theory based on intra-individual variation is the following: Characterizing the propensity distribution associated with the time series of a given person P’s observed test scores, Lord & Novick (1968, p. 30) require that the “... distribution function [is] defined over repeated statistically independent measurements on the same person”. The important qualification is that the repeated measurements should be statistically independent. This implies the requirement that P’s time series of observed test scores should lack sequential dependencies (e.g., autocorrelation). After having postulated the requirement of obtaining statistically independent observed scores, Lord & Novick (1968, p. 30) conclude: “… it is not possible in psychology to obtain more than a few independent observations”. This is the reason why they do not consider the possibility of a test theory based on intra-individual variation to be feasible. In general test scores obtained in a single-subject time series design will be sequentially dependent, i.e., have significant autocorrelation. Moreover, the statistical properties of the psychological process according to which test scores are generated may change in time. For instance, the process concerned may be vulnerable to learning and habituation influences which induce time-dependent changes in the way test scores are being generated.
Before scrutinizing the details of Lord & Novick’s (1968) requirement that repeated measurements of the same person P should be statistically independent, we first consider their reason not to pursue a test theory based on intra-individual variation. Because the basic concept underlying classical test theory, the concept of true score, is defined at the level of intra-individual variation, one would expect that the reason to leave that level and move to a different level of inter-individual variation would have to be a fundamental reason. One would expect to be given an argument involving issues of logical necessity or impossibility. Yet the actual argument given by Lord & Novick (1968) concerns more an issue of contingent character: repeated measurement of the same person P yields test scores that are in general not statistically independent. Indeed, all psychometricians will agree. But the statistical analysis techniques used to determine P’s propensity distribution can accommodate the presence of sequential dependencies, and then we still can have a test theory which is directly based on the concept of true score as defined by Lord & Novick (1968). That is, a test theory based on intra-individual variation which would be of interest for individual assessment and counseling. The reason which Lord & Novick (1968) give for not further pursuing such a test theory is not fundamental and does not prove the impossibility of such a theory.
We now turn to a discussion of the requirement that repeatedly measuring the same person P should yield a time series of statistically independent scores. To reiterate, no psychometrician will expect this to occur: repeated measurement of the same person generally will yield a time series of sequentially dependent scores. But is this problematic? The time series of scores provides the information to determine the propensity distribution characterizing person P. In particular, the mean and variance of P’s propensity distribution have to be determined. This is a standard problem in the statistical analysis of time series that has been completely solved in case the time series is stationary (cf. Anderson, 1971). Hence the important requirement is not that P’s time series should consist of statistically independent scores, but that the time series is stationary. Stationarity of a time series implies that the series has constant mean level and that its autocorrelation only depends upon the relative distance (lag) between measurement occasions.
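That sequential dependence by itself does not prevent estimation of a constant mean and variance can be illustrated with a simulated stationary series. The sketch below assumes a first-order autoregressive process as a stand-in for P’s time series; the process, its parameters and the helper names are hypothetical choices made for the example, not a model proposed in this chapter.

```python
import random
import statistics

random.seed(3)


def simulate_ar1(mean, phi, n, noise_sd=1.0):
    """AR(1) series: y_t - mean = phi * (y_{t-1} - mean) + e_t.

    The series is started at its mean, so the first few values form a short
    transient before the process settles into its stationary regime.
    """
    series = []
    deviation = 0.0
    for _ in range(n):
        deviation = phi * deviation + random.gauss(0.0, noise_sd)
        series.append(mean + deviation)
    return series


def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag."""
    center = statistics.fmean(series)
    deviations = [y - center for y in series]
    denominator = sum(d * d for d in deviations)
    numerator = sum(
        deviations[t] * deviations[t + lag] for t in range(len(series) - lag)
    )
    return numerator / denominator


series = simulate_ar1(mean=50.0, phi=0.6, n=5000)
estimated_mean = statistics.fmean(series)
estimated_variance = statistics.pvariance(series)
lag1 = autocorrelation(series, 1)

print(f"estimated mean level:   {estimated_mean:.2f}")
print(f"estimated variance:     {estimated_variance:.2f}")
print(f"lag-1 autocorrelation:  {lag1:.2f}")
```

The simulated scores are clearly autocorrelated, yet the constant mean level and variance of the series are recovered from the single time series because the process is stationary.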
The alternative requirement that a time series has to be stationary can be tested for in several ways (cf. Priestley, 1988). In case such tests indicate that the series is non-stationary, it can be analyzed by means of special techniques such as evolutionary spectrum analysis (Priestley, 1988) or wavelet analysis (e.g., Hogan & Lakey, 2005; Houtveen & Molenaar, 2001). At the close of this chapter a new modeling technique for multivariate non-stationary time series will be presented. Hence from a statistical analytic point of view non-stationary time series can be handled satisfactorily. Yet from the point of view of a test theory based on intra-individual variation, a person P’s time series of test scores should be stationary in order to allow estimation of the constant mean and constant variance of P’s time-invariant propensity distribution. In case P’s time series of test scores is non-stationary, the mean and/or variance of the series will in general be time-varying. Lord & Novick’s (1968) definition of true score, however, does not pertain to time-varying propensity distributions with time-varying means and/or variances.
Hence either methodological or statistical techniques have to be invoked in order to guarantee that P’s time series of test scores is stationary. Only then can the (constant) mean and variance of P’s time series be used as estimates of the mean and variance of P’s propensity distribution. Methodological techniques can be used to guarantee that non-stationarity due to learning and habituation is avoided. For instance, following a common approach in reaction time research, registration of P’s time series of test scores should begin only after P has reached a steady state following an initial transient due to novelty effects. This will require the availability of a pool of many parallel test items in order to avoid learning effects. Statistical techniques can be used a posteriori to remove transient effects due to habituation and learning from P’s time series of test scores (e.g., Molenaar & Roelofs, 1987). Almost certainly new methodological and statistical techniques will have to be developed in order to accommodate the intricacies due to non-stationarity and to fully exploit the possibilities of a test theory based on intra-individual variation. Until now these possibilities have not been pursued systematically, and for the wrong reasons, as has been argued in this section. Given that the psychological process underlying the generation of test scores is non-ergodic according to classical test theory based on analysis of inter-individual variation, psychometricians will have to seriously reconsider their reasons for not pursuing a test theory based on intra-individual variation.
One promising psychological paradigm which allows for straightforward determination of person-specific propensity distributions is mental chronometry. In his excellent monograph on mental chronometry, Jensen (2006, p. 96) states: “The main reasons for the usefulness of chronometry are not only the advantages of its absolute scale properties, but also its sensitivity and precision for measuring small changes in cognitive functioning, the unlimited repeatability of measurements under identical procedures, the adaptability of chronometric techniques for measuring a variety of cognitive processes, and the possibility of obtaining the same measurements with consistently identical tasks and procedures over an extremely wide age range” (italics added). The possibility of obtaining unlimited repeated measurements under identical procedures will allow for the determination of person-specific reaction time propensity distributions with arbitrary precision. Jensen presents impressive empirical results showing the importance of not only the intra-individual means of person-specific reaction time distributions, but also their intra-individual variances in assessing cognitive status and development (e.g., in the context of the so-called neural noise hypothesis; Jensen, 2006, p. 122 ff.). Consequently, I conjecture that mental chronometry provides a very interesting approach to pursue a test theory based on intra-individual variation.
3.3 Additional thoughts
The impact of the fact that the ensemble of time series underlying classical test theory is non-ergodic is enormous. Psychological tests are applied for individual assessment in all kinds of settings. Using the population average expressed by formula (3) as an estimate of the intra-individual variance $\sigma^2(P)$ of a given person P can lead to entirely erroneous conclusions. To give an arbitrary example: suppose that the norm of a test is $\mu = 100$, that the inter-individual reliability of the test is $\rho = 0.9$, and that the between-subjects variance of test scores is $\mathrm{var}[y_i] = 25$. Suppose also that a true score larger than $y_C = 120$ is considered reason for special treatment (clinical, educational, or otherwise). Finally, suppose that person P has observed score $y_P = y(P, t=T) = 126$. Then the Kelley estimate (2a) of P’s true score $\tau_P$ is:

$\mathrm{est}[\tau_P \mid y_P] = 0.9 \times 126 + (1 - 0.9) \times 100 = 123.4.$

According to (2b) the error variance of this estimated true score is:

$\mathrm{var}\{\mathrm{est}[\tau_P \mid y_P]\} = 25 \times (1 - 0.9) \times 0.9 = 2.25.$

Hence the standard error is 1.5 and a commonly used confidence interval about the estimated true score is $123.4 \pm 2 \times 1.5$, yielding $120.4 < \mathrm{est}[\tau_P \mid y_P] < 126.4$. This confidence interval is located entirely above the criterion score $y_C = 120$, hence it is concluded that P needs special treatment. Suppose, however, that the intra-individual variance of P’s propensity distribution is $\sigma^2(P) = 36$. Then the difference between P’s observed score, $y_P = 126$, and the criterion score for special treatment, $y_C = 120$, is only 1 standard deviation, which according to standard statistical criteria would not indicate that P needs special treatment.
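The arithmetic of this example can be retraced with a short sketch. All numerical values are those given in the text; the variable names and the two-standard-deviation decision rule used for the intra-individual comparison are assumptions introduced here for illustration.

```python
import math

# Values taken from the worked example in the text.
norm_mean = 100.0     # norm of the test
reliability = 0.9     # inter-individual reliability
between_var = 25.0    # between-subjects variance of test scores
criterion = 120.0     # criterion score y_C for special treatment
observed = 126.0      # P's observed score y_P
intra_var = 36.0      # supposed intra-individual variance of P

# Kelley estimate of P's true score and its error variance, following the
# arithmetic shown for formulas (2a) and (2b).
kelley = reliability * observed + (1.0 - reliability) * norm_mean
kelley_err_var = between_var * (1.0 - reliability) * reliability
kelley_se = math.sqrt(kelley_err_var)
interval = (kelley - 2.0 * kelley_se, kelley + 2.0 * kelley_se)

# Distance between observed score and criterion, expressed in units of
# P's own intra-individual standard deviation.
z_intra = (observed - criterion) / math.sqrt(intra_var)

# Decision based on the population-based interval versus a decision based on
# P's own variability (the 2-SD threshold is an assumed convention).
decision_inter = interval[0] > criterion
decision_intra = z_intra >= 2.0
```

The two decisions disagree: the population-based interval lies above the criterion, whereas the observed score is only one intra-individual standard deviation above it.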
Numerical exercises such as the one given above can be carried out in a variety of formats, using Monte Carlo simulation techniques and alternative settings. We intend to report the results of one such simulation study in a separate publication. But the overall message should be clear: using the (inter-individual) population value of the error variance (based on the inter-individual reliability) as an approximation for the intra-individual variance of a person P’s propensity distribution is liable to lead to erroneous conclusions about P’s true score and, consequently, to erroneous decisions about the necessity of applying special treatment to P. The fundamental reason for the invalidity of (3) as an approximation for $\sigma^2(P)$ is that the ensemble of time series of observed scores is non-ergodic.
4. Hidden heterogeneity
In the previous section we discussed heterogeneity with respect to the means and variances of the propensity distributions underlying classical test theory. That kind of heterogeneity can be considered to be a special instance of a much wider class of heterogeneous phenomena, including also qualitative heterogeneity. An important example of qualitative heterogeneity concerns individual differences in the loadings in a factor model. The standard factor model of inter-individual covariation is (using bold face lower case letters for vectors and bold face upper case letters for matrices):

(4) $\mathbf{y}_i = \boldsymbol{\Lambda}\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i, \quad i = 1, 2, \ldots$

where $\mathbf{y}_i = [y_{1i}, y_{2i}, \ldots, y_{pi}]'$ is the p-variate vector of observed scores of a randomly drawn subject i (the apostrophe denotes transposition); $\boldsymbol{\eta}_i = [\eta_{1i}, \eta_{2i}, \ldots, \eta_{qi}]'$ is the q-variate vector of factor scores of subject i; $\boldsymbol{\varepsilon}_i = [\varepsilon_{1i}, \varepsilon_{2i}, \ldots, \varepsilon_{pi}]'$ is the p-variate vector of measurement errors for subject i; and $\boldsymbol{\Lambda}$ is the (p,q)-dimensional matrix of factor loadings.

The factor model of inter-individual covariation not only underlies classical test theory, but is also of central importance in much of psychology. The factor model can be heuristically characterized as follows. In the context of the behavior space introduced in section 2.1, choose a fixed time and select a set of p variables $\mathbf{y}$ which are considered to be indicators of a q-variate latent factor $\boldsymbol{\eta}$. Then the factor loadings $\boldsymbol{\Lambda}$ represent the regression coefficients in the linear relationships between the p indicators and the q-variate latent factor $\boldsymbol{\eta}$. It is an essential assumption underlying the factor model that the factor loadings $\boldsymbol{\Lambda}$ are invariant across subjects. That is, $\boldsymbol{\Lambda}$ does not depend upon i, where the subscript i stands for subject i in the population; i = 1,2,… . Hence the assumption is that each individual person i in the population has a person-specific q-variate factor score $\boldsymbol{\eta}_i$ and a person-specific p-variate error score $\boldsymbol{\varepsilon}_i$, but the factor model for each person in the population has the same (p,q)-dimensional matrix of factor loadings $\boldsymbol{\Lambda}$.
Suppose now that we carry out a simulation experiment in which each person not only has a person-specific q-variate factor score and p-variate error score, but also a person-specific set of values for the factor loadings $\boldsymbol{\Lambda}_i$, i = 1,2,… . Hence each person has a person-specific factor model:

(5) $\mathbf{y}_i = \boldsymbol{\Lambda}_i\boldsymbol{\eta}_i + \boldsymbol{\varepsilon}_i, \quad i = 1, 2, \ldots$

This heterogeneity of factor loadings $\boldsymbol{\Lambda}_i$, i = 1,2,…, constitutes a severe violation of an important assumption underlying the standard factor model, namely the assumption that the matrix of factor loadings should be invariant (fixed) across subjects. The fact that the matrix of factor loadings in (5) is subject-specific implies that the way in which factors are expressed in the observed scores is qualitatively different for different subjects. These inter-individual differences in the values of factor loadings are called qualitative because the substantive interpretation (semantic labeling) of factors is based on these loading values.
Despite the fact that (5) involves a severe violation of the qualitative homogeneity assumption (invariance of factor loadings across subjects) underlying the standard factor model (4), a number of simulation studies have shown that factor analysis of inter-individual covariation appears to be insensitive to this violation. The typical set-up of these simulation studies was to generate data according to the person-specific (qualitatively heterogeneous) factor model (5), and then fit the standard factor model (4) to the simulated data. Although one would expect the fit of model (4) to be poor because the simulated data violate the assumption of qualitative homogeneity underlying model (4), it turns out that this is not at all the case. The general finding in these simulation studies is that (variants of) factor model (4) provide satisfactory fits to data generated according to (variants of) factor model (5). Satisfactory fits, that is, according to all usual criteria of goodness-of-fit, such as the chi-squared likelihood ratio test, standardized root mean square residual, and root mean square error of approximation (cf. Brown, 2006, for definitions and discussion of these criteria). Nowhere in the obtained (maximum likelihood) solutions is a flag waving to indicate that something is fundamentally wrong. These simulation studies were based on the cross-sectional factor model (Molenaar, 1997), the longitudinal factor model (Molenaar, 1999) and the behavior genetical factor model for multivariate phenotypes of MZ and DZ twins (Molenaar et al., 2003). A mathematical-statistical proof of the insensitivity of the factor model of inter-individual covariation to the qualitative heterogeneity of the factor loadings is given in Kelderman & Molenaar (2006).
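One way to see how heterogeneous loadings can stay hidden in pooled data is the following minimal sketch. It is not the design of the cited simulation studies. It assumes, purely for illustration, a one-factor model (q = 1) in which each person's loadings deviate from common mean loadings by independent normal deviations, with all numerical values and function names hypothetical. Under these assumptions the pooled covariance matrix equals that of a standard one-factor model whose unique variances are inflated by the loading variance.

```python
import random

def simulate_heterogeneous(n_subjects, loadings, loading_sd, error_sd, seed=1):
    """Draw one p-variate score vector per subject from a one-factor model
    whose loadings differ randomly between subjects (model (5) with q = 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        eta = rng.gauss(0.0, 1.0)  # person-specific factor score
        row = []
        for lam in loadings:
            lam_i = lam + rng.gauss(0.0, loading_sd)  # person-specific loading
            row.append(lam_i * eta + rng.gauss(0.0, error_sd))
        data.append(row)
    return data

def covariance_matrix(data):
    """Sample covariance matrix of the pooled (inter-individual) data."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    return [[sum((row[j] - means[j]) * (row[k] - means[k]) for row in data) / (n - 1)
             for k in range(p)] for j in range(p)]

loadings = [0.8, 0.6, 0.7, 0.5]   # assumed mean loadings
loading_sd, error_sd = 0.5, 0.6   # assumed spread of loadings and error SD
p = len(loadings)
sample_cov = covariance_matrix(
    simulate_heterogeneous(50000, loadings, loading_sd, error_sd))

# Covariance implied by a standard one-factor model with the mean loadings
# and unique variances equal to loading variance plus error variance.
implied_cov = [[loadings[j] * loadings[k]
                + ((loading_sd ** 2 + error_sd ** 2) if j == k else 0.0)
                for k in range(p)] for j in range(p)]
max_abs_diff = max(abs(sample_cov[j][k] - implied_cov[j][k])
                   for j in range(p) for k in range(p))
```

In this sketch the pooled second-order moments carry no trace of the person-specific loadings beyond larger unique variances, which is consistent with the reported insensitivity of the fit, although the published studies and the formal proof cover more general settings.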
Evidently, the finding that the standard factor model of inter-individual covariation is insensitive to the presence of extreme qualitative heterogeneity in the population of subjects, created by the person-specific matrices of factor loadings $\boldsymbol{\Lambda}_i$, i = 1,2,…, in (5), raises serious questions. To reiterate, nothing in the results obtained with the standard factor analyses based on model (4) indicates that the true state of affairs is in severe violation of the assumptions underlying this model. The standard factor models yield satisfactory fits to the data generated according to model (5). Consequently, the presence of substantial qualitative heterogeneity in the simulated data remains entirely hidden in the standard factor analyses based on inter-individual covariation. Before discussing some of the consequences of this finding, it is noted that there exist a priori reasons to expect that wide-spread qualitative heterogeneity actually exists in human populations. The reasons have to do with the way in which cortical neural networks grow and adapt during the life span, namely by means of self-organizing epigenetic processes (cf. Molenaar et al., 1993). Self-organizing growth and adaptation give rise to emergent endogenous variation in neural network connections, even between homologous structures located at the left and right sides of the brain within the same subject (cf. Edelman, 1987). In so far as cognitive information processing is associated with cortical neural activity, one can expect that these endogenously generated differences in neural network architectures will become discernible as qualitative heterogeneity of the structure of observed behavior of different subjects (see Molenaar, 2006, for further elaboration and mathematical-biological modeling of these epigenetic processes).
One direct consequence of the fact that standard factor analysis of inter-individual covariation is insensitive to qualitative heterogeneity is the following. Suppose that the standard q-factor model (4) yields a satisfactory fit to the data obtained with a test composed of p subtests (e.g., items). Let $\mathrm{est}[\boldsymbol{\Lambda}]$ denote the estimated (p,q)-dimensional matrix of factor loadings thus obtained. Suppose also that in reality qualitative heterogeneity is present in the population of subjects, so that the true (p,q)-dimensional matrix of factor loadings $\boldsymbol{\Lambda}_P$ for a given subject P differs substantially from the nominal loading matrix $\mathrm{est}[\boldsymbol{\Lambda}]$. For instance, several of the p subtests have negative or zero loadings in $\boldsymbol{\Lambda}_P$ whereas the analogous loadings in $\mathrm{est}[\boldsymbol{\Lambda}]$ are high and positive. Of course $\boldsymbol{\Lambda}_P$ is unknown in the context of standard factor analysis of inter-individual variation. The estimate of P’s factor score, $\mathrm{est}[\boldsymbol{\eta}_P]$, is based on the nominal loading matrix $\mathrm{est}[\boldsymbol{\Lambda}]$ and, because $\mathrm{est}[\boldsymbol{\Lambda}]$ is a poor approximation of the true $\boldsymbol{\Lambda}_P$, this estimate $\mathrm{est}[\boldsymbol{\eta}_P]$ will be substantially biased. For quantitative details about this bias the reader is referred to the publications mentioned above (Molenaar, 1999; Molenaar et al., 2003; Kelderman & Molenaar, 2006).
Another consequence of the insensitivity of standard factor analysis of inter-individual variation to qualitative heterogeneity concerns the fact that the semantic interpretation of factors thus obtained is inappropriate at the person-specific level. Suppose that standard factor analysis of personality test scores yields the expected pattern of factor loadings in $\mathrm{est}[\boldsymbol{\Lambda}]$ corresponding to the Big Five theory (cf. Borkenau & Ostendorf, 1998). Then, if qualitative heterogeneity is present, the factor loadings in $\boldsymbol{\Lambda}_P$ for a particular person P may not at all conform to the Big Five pattern and hence the semantic interpretation of the factors for P will be different. Stated more specifically, the nominal semantic interpretation of the five factors obtained in standard factor analysis is inappropriate for P. The reader is referred to Hamaker, Dolan, & Molenaar (2005) for an elaborate illustration based on empirical personality test scores.
5. Heterogeneity in time
To reiterate, a (psychological) process should obey two criteria in order to qualify as an ergodic process. Firstly, the trajectory of each subject in the ensemble should conform to exactly the same dynamical laws (homogeneity of the ensemble). Secondly, each trajectory should have constant statistical characteristics in time (stationarity, i.e., constant mean level and serial dependencies which depend only upon relative time differences). In the previous sections attention has been confined to psychological processes which are non-ergodic because they violate the first criterion, i.e., the different trajectories in the ensemble are heterogeneous. Whereas the first criterion involves a comparison between different trajectories, the second (stationarity) criterion involves a comparison of the same trajectory at different times. In this section we will consider psychological processes which are non-ergodic because they violate the second criterion, i.e., they are non-stationary.
In general, non-stationarity implies that parameters of a dynamic system are time-varying. Prime examples of non-stationary systems are developmental systems, which typically have time-varying parameters such as waxing and/or waning factor loadings. For this reason developmental systems are non-ergodic and their analysis should be based on intra-individual variation. There exists a long tradition in theoretical developmental psychology in which it is argued that developmental processes should be analyzed at the level of intra-individual variation (time series data). This tradition is generally known as Developmental Systems Theory (DST). Important contributions to DST include Wohlwill’s (1973) monograph on the concept of developmental functions describing intra-individual variation, Ford and Lerner’s (1992) integrative approach based on the interplay between intra-individual variation and inter-individual variation and change, and Gottlieb’s (1992, 2003) theoretical work on probabilistic epigenetic development.
Intra-individual analysis of non-stationary multivariate time series requires the availability of sophisticated statistical modeling techniques. We developed such a technique based on a systems model with arbitrarily time-varying parameters (Molenaar, 1994; Molenaar & Newell, 2003). Our model can be conceived of as a suitably generalized factor model for non-stationary p-variate time series $\mathbf{y}(t)$, t = 1,2,...,T. Its schematic form is:

(6a) $\mathbf{y}(t) = \boldsymbol{\Lambda}[\boldsymbol{\theta}(t)]\,\boldsymbol{\eta}(t) + \boldsymbol{\varepsilon}(t)$

(6b) $\boldsymbol{\eta}(t+1) = \mathbf{B}[\boldsymbol{\theta}(t)]\,\boldsymbol{\eta}(t) + \boldsymbol{\zeta}(t+1)$

(6c) $\boldsymbol{\theta}(t+1) = \boldsymbol{\theta}(t) + \boldsymbol{\xi}(t+1)$

In (6a) $\mathbf{y}(t)$ denotes the observed p-variate time series, $\boldsymbol{\eta}(t)$ is the q-variate latent factor series (system state process), and $\boldsymbol{\varepsilon}(t)$ is the p-variate measurement error process. The factor loadings in $\boldsymbol{\Lambda}[\boldsymbol{\theta}(t)]$ depend upon the r-variate time-varying parameter vector $\boldsymbol{\theta}(t)$. (6b) describes the evolution of the latent factor series $\boldsymbol{\eta}(t)$ by means of a q-variate stochastic difference equation (autoregression) relating $\boldsymbol{\eta}(t+1)$ to $\boldsymbol{\eta}(t)$, where $\boldsymbol{\zeta}(t+1)$ denotes the q-variate residual process. The (q,q)-dimensional matrix of regression weights $\mathbf{B}[\boldsymbol{\theta}(t)]$ depends upon the r-variate time-varying parameter vector $\boldsymbol{\theta}(t)$. (6c) describes the time-dependent variation of the unknown parameters. The r-variate parameter vector process $\boldsymbol{\theta}(t)$ obeys a special stochastic difference equation: a random walk with r-variate innovations process $\boldsymbol{\xi}(t)$.

The system of equations (6a), (6b) and (6c) allows for the modeling of a large class of multivariate non-stationary (non-ergodic) processes. Equations (6a) and (6b) have the same formal structure as the well-known inter-individual longitudinal q-factor model, which helps in their interpretation. Yet the system of equations (6a), (6b) and (6c) is applied to analyze the structure of intra-individual variation underlying the observed p-variate time series $\mathbf{y}(t)$ obtained with a single subject. Generalization of this model to accommodate multivariate time series obtained in a replicated time series design is straightforward. Extension of the model with arbitrary mean trend functions and covariate processes having time-varying effects is also straightforward.
The fit of equations (6a), (6b) and (6c) to an observed p-variate time series $\mathbf{y}(t)$, t=1,2,...,T, where T is the number of repeated measurements obtained with a single subject P, is based on advanced statistical analysis techniques taken from the engineering sciences (Bar-Shalom et al., 2001; Ristic et al., 2004). It consists of a combination of recursive estimation (filtering), smoothing, and iteration (EKFIS: Extended Kalman Filter with Iteration and Smoothing). The EKFIS yields a time series (trajectory) of estimated values for each of the r parameters in $\boldsymbol{\theta}(t)$: $\theta_k(t)$, t=1,2,...,T; k=1,2,...,r.
To illustrate the performance of the EKFIS, the following small simulation study has been carried out. A 4-variate (p = 4) time series $\mathbf{y}(t)$ has been generated by means of the state-space model with time-varying parameters (6a), (6b) and (6c). The model has a univariate (q = 1) latent state process $\eta(t)$. The autoregressive coefficient $\mathbf{B}[\boldsymbol{\theta}(t)] = b(t)$ in the process model (6b) for the latent state increases linearly from 0.0 to 0.9 over the observation interval comprising T = 100 time points: b(t) = 9t/1000, t=1,2,…,100. Hence the sequential dependence (autocorrelation) of the latent state process (latent factor series) increases from zero to 0.9 across 100 time points and therefore is highly time-varying (non-stationary, hence non-ergodic). Depicted in Figure 1 is the estimate of this autoregressive weight b(t) obtained by means of the EKFIS based on a single-subject time series $\mathbf{y}(t)$, t=1,2,...,100. It is clear that the estimated trajectory closely tracks the true time-varying path of this parameter.
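The data-generating side of this simulation can be sketched as follows. The sketch covers only the generation of a 4-variate series with a univariate latent state and b(t) = 9t/1000. It does not implement the EKFIS estimator. The factor loadings (held constant here), the noise standard deviations, the seed and the function names are assumptions, since the text does not report the values used in the original study.

```python
import random

def b(t):
    """True time-varying autoregressive weight from the text: b(t) = 9t/1000."""
    return 9.0 * t / 1000.0

def simulate_series(T=100, loadings=(1.0, 0.8, 0.6, 0.4),
                    state_noise_sd=1.0, error_sd=0.5, seed=42):
    """Generate a p-variate series from (6a) and (6b) with q = 1.

    Loadings and noise standard deviations are illustrative assumptions.
    """
    rng = random.Random(seed)
    eta = rng.gauss(0.0, state_noise_sd)  # initial latent state
    y, states = [], []
    for t in range(1, T + 1):
        # (6a): observed scores are loadings times the state plus measurement error.
        y.append([lam * eta + rng.gauss(0.0, error_sd) for lam in loadings])
        states.append(eta)
        # (6b): the state evolves with the time-varying autoregressive weight b(t).
        eta = b(t) * eta + rng.gauss(0.0, state_noise_sd)
    return y, states

y, states = simulate_series()
```

A series generated in this way could serve as input to any estimator of time-varying parameters; recovering b(t) from it would require a filtering and smoothing procedure such as the one described in the text.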
6. Discussion and conclusion
In this chapter some of the implications of the classical ergodic theorems have been considered in the contexts of classical test theory, factor analysis of inter-individual covariation, and the analysis of non-stationary developmental processes. In each of these contexts the classical ergodic theorems imply that instead of using standard statistical approaches based on analysis of inter-individual variation, it is necessary to use single-subject time series analysis of intra-individual variation. This conclusion holds for individual assessment based on classical test theory, for testing the assumption of homogeneity (fixed factor loadings across subjects) in factor analysis of inter-individual covariation, and for the analysis of non-stationary processes such as learning and developmental processes.
The consequences of the classical ergodic theorems in these and many other contexts in psychology imply that time series designs and time series analysis techniques will have to be assigned a much more prominent place than is currently the case in psychological methodology. The overall aim of scientific research in psychology still should be to arrive at general (nomothetic) laws that hold for all subjects in a well-defined population. But for those psychological processes which are non-ergodic, the inductive tools to arrive at such general laws have to be fundamentally different from the currently standard approaches. Only if a psychological process is ergodic, i.e., obeys the two criteria of homogeneity and stationarity, can results obtained by means of analysis of inter-individual variation be generalized to the level of intra-individual variation. But the two criteria for ergodicity are very strict and many psychological processes of interest will fail to obey these criteria. Psychologists have to understand that ergodicity is the special case, whereas non-ergodicity is the rule. For non-ergodic psychological processes, analysis of inter-individual variation yields results that may not apply to any of the individual subjects in the population.
In conclusion, the inductive tools which are necessary to arrive at general (nomothetic) laws for non-ergodic processes involve the search for commonalities between single-subject process models fitted to time series data obtained in replicated time series designs. This search for commonalities between single-subject process models can be based on standard mixed modeling techniques (see the excellent textbook of Demidenko, 2004).
Having available appropriate time series models for each individual subject opens up possibilities which are entirely new in psychology. These possibilities involve the optimal control of ongoing psychological processes. For instance, consider the following special instance of the system of equations (6a), (6b):

(7a) $\mathbf{y}(t) = \boldsymbol{\Lambda}\boldsymbol{\eta}(t) + \boldsymbol{\varepsilon}(t)$

(7b) $\boldsymbol{\eta}(t+1) = \mathbf{B}\boldsymbol{\eta}(t) + \boldsymbol{\Gamma}\mathbf{u}(t) + \boldsymbol{\zeta}(t+1)$

Here the same definitions apply as for equations (6a), (6b). Notice that in (7a) and (7b) the (p,q)-dimensional matrix of factor loadings $\boldsymbol{\Lambda}$ and the (q,q)-dimensional matrix of regression weights $\mathbf{B}$ are assumed to be constant in time. This is to ease the presentation; generalization of what follows to the non-stationary model given by (6a), (6b) and (6c) is straightforward. Notice also that (7b) contains a new component: $\mathbf{u}(t)$. The s-variate process $\mathbf{u}(t)$ represents a known process that can be manipulated, for instance dose of medication or environmental stimulation. $\boldsymbol{\Gamma}$ is a (q,s)-dimensional matrix of regression weights.
Suppose that (7a) and (7b) provide a faithful description of the p-variate time series $\mathbf{y}(t)$ for subject P. It then is possible to determine $\mathbf{u}(t)$ in such a way that the state process $\boldsymbol{\eta}(t)$ is steered to its desired level $\boldsymbol{\eta}^{\#}$, where $\boldsymbol{\eta}^{\#}$ is chosen by the controller. The optimal input $\mathbf{u}^{@}(t)$ is determined according to the following schematic feedback function:

(8) $\mathbf{u}^{@}(t) = \mathbf{F}[\mathbf{y}(t), t]$

where $\mathbf{F}[.]$ denotes an (s,p)-dimensional nonlinear feedback function. Application of $\mathbf{u}^{@}(t)$ at time t guarantees that the state process $\boldsymbol{\eta}(t+1)$ at the next time point t+1 will be as close as possible to the desired level $\boldsymbol{\eta}^{\#}$.
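The idea of steering the state towards a desired level can be illustrated with a minimal scalar sketch (q = s = 1). It assumes that the latent state is directly available to the controller, whereas the feedback function in (8) operates on the observed series, so that in practice the state would first have to be estimated from the observations. The weights, target level, noise level and function names are hypothetical, and the one-step-ahead rule shown is only one simple choice among many control strategies.

```python
import random

def control_input(eta, target, b, gamma):
    """One-step-ahead input that makes the predicted next state equal the target.

    Scalar version of (7b): eta(t+1) = b*eta(t) + gamma*u(t) + zeta(t+1).
    """
    return (target - b * eta) / gamma

def run_closed_loop(T, target, b=0.7, gamma=0.5, noise_sd=0.0, eta0=0.0, seed=3):
    """Simulate the controlled state process for T time points."""
    rng = random.Random(seed)
    eta, path = eta0, []
    for _ in range(T):
        u = control_input(eta, target, b, gamma)
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        eta = b * eta + gamma * u + noise
        path.append(eta)
    return path

noise_free = run_closed_loop(T=10, target=2.0)             # process noise switched off
noisy = run_closed_loop(T=5000, target=2.0, noise_sd=0.3)  # with process noise
noisy_mean = sum(noisy) / len(noisy)
```

Without process noise the state reaches the target after a single step; with noise it fluctuates around the target with the variance of the residual process, which is the sense in which the next state is as close as possible to the desired level.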
Optimal control is an important field of research in the engineering sciences. There exists a vast literature on many different variants of optimal control (cf. Kwon, 2005, for a thorough explanation of the currently most advanced approaches). These control techniques can be applied straightforwardly in analyses of intra-individual variation in order to steer psychological processes in desired directions (cf. Molenaar, 1987, for an application to the optimal control of a psychotherapeutic process). This opens up an entirely new and promising field of applied psychology: person-specific modeling and adaptive control of ongoing psychological processes.
References
Anderson, T.W. (1971). The statistical analysis of time series. New York: Wiley.

Bar-Shalom, Y., Li, X.R., & Kirubarajan, T. (2001). Estimation with applications to tracking and navigation. New York: Wiley.

Birkhoff, G.D. (1931). Proof of the ergodic theorem. Proceedings of the National Academy of Sciences USA, 17, 656-660.

Borkenau, P., & Ostendorf, F. (1998). The Big Five as states: How useful is the five-factor model to describe intra-individual variations over time? Journal of Research in Personality, 32, 202-221.

Brown, T.A. (2006). Confirmatory factor analysis for applied research. New York: Guilford Press.

Cattell, R.B. (1952). The three basic factor-analytic designs: Their interrelations and derivatives. Psychological Bulletin, 49, 499-520.

Choe, G.H. (2005). Computational ergodic theory. Berlin: Springer.

De Groot, A.D. (1954). Scientific personality diagnosis. Acta Psychologica, 10, 220-241.

Demidenko, E. (2004). Mixed models: Theory and applications. Hoboken, NJ: Wiley.

Edelman, G.M. (1987). Neural Darwinism: The theory of neuronal group selection. New York: Basic Books.

Ford, D.H., & Lerner, R.M. (1992). Developmental systems theory. Newbury Park: Sage.

Gescheider, G.A. (1997). Psychophysics: The fundamentals. Mahwah, NJ: Erlbaum.

Gottlieb, G. (1992). Individual development and evolution: The genesis of novel behavior. New York: Oxford University Press.

Gottlieb, G. (2003). On making behavioral genetics truly developmental. Human Development, 46, 337-355.

Hamaker, E.L., Dolan, C.V., & Molenaar, P.C.M. (2005). Statistical modeling of the individual: Rationale and application of multivariate time series analysis. Multivariate Behavioral Research, 40, 207-233.

Hogan, J.A., & Lakey, J.D. (2005). Time-frequency and time-scale methods: Adaptive decompositions, uncertainty principles, and sampling. Boston: Birkhäuser.

Houtveen, J.H., & Molenaar, P.C.M. (2001). Comparison between the Fourier and wavelet methods of spectral analysis applied to stationary and non-stationary heart period data. Psychophysiology, 38, 729-735.

Jensen, A.R. (2006). Clocking the mind: Mental chronometry and individual differences. Amsterdam: Elsevier.

Kelderman, H., & Molenaar, P.C.M. (2006). The effect of individual differences in factor loadings on the standard factor model (to appear in Multivariate Behavioral Research).

Kwon, W.H. (2005). Receding horizon control: Model predictive control for state models. London: Springer.

Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.

Molenaar, P.C.M. (1987). Dynamic assessment and adaptive optimization of the therapeutic process. Behavioral Assessment, 9, 389-416.

Molenaar, P.C.M., & Roelofs, J.W. (1987). The analysis of multiple habituation profiles of single trial evoked potentials. Biological Psychology, 24, 1-21.

Molenaar, P.C.M., Boomsma, D.I., & Dolan, C.V. (1993). A third source of developmental differences. Behavior Genetics, 23, 519-524.

Molenaar, P.C.M. (1994). Dynamic latent variable models in developmental psychology. In: A. von Eye & C.C. Clogg (Eds.), Analysis of latent variables in developmental research. Newbury Park: Sage, pp. 155-180.

Molenaar, P.C.M. (1997). Time series analysis and its relationship with longitudinal analysis. International Journal of Sports Medicine, 19, 232-237.

Molenaar, P.C.M. (1999). Longitudinal analysis. In: H.J. Ader & G.J. Mellenbergh (Eds.), Research methodology in the social, behavioral and life sciences. London: Sage, pp. 143-167.

Molenaar, P.C.M., Huizenga, H.M., & Nesselroade, J.R. (2003). The relationship between the structure of interindividual and intraindividual variability: A theoretical and empirical vindication of Developmental Systems Theory. In: U.M. Staudinger & U. Lindenberger (Eds.), Understanding human development: Dialogues with life-span psychology. Dordrecht: Kluwer, pp. 339-360.

Molenaar, P.C.M. (2003). State space techniques in structural equation modeling: Transformation of latent variables in and out of latent variable models. 111 pages. Website: http://www.hhdev.psu.edu/hdfs/faculty/molenaar.html

Molenaar, P.C.M., & Newell, K.M. (2003). Direct fit of a theoretical model of phase transition in oscillatory finger motions. British Journal of Mathematical and Statistical Psychology, 56, 199-214.

Molenaar, P.C.M. (2004). A manifesto on psychology as idiographic science: Bringing the person back into scientific psychology, this time forever. Measurement, 2, 201-218.

Molenaar, P.C.M. (2006). On the implications of the classic ergodic theorems: Analysis of developmental processes has to focus on intra-individual variation (submitted).

Nayfeh, A.H., & Balachandran, B. (1995). Applied nonlinear dynamics: Analytical, computational, and experimental methods. New York: Wiley.

Petersen, K. Ergodic theory. Cambridge: Cambridge University Press.

Priestley, M.B. (1988). Non-linear and non-stationary time series analysis. London: Academic Press.

Ristic, B., Arulampalam, S., & Gordon, N. (2004). Beyond the Kalman filter: Particle filters for tracking applications. London: Artech House.

Walters, P. (1982). An introduction to ergodic theory. 2nd edition. New York: Springer.

Wohlwill, J.F. (1973). The study of behavioral development. New York: Academic Press.