© Michael Thomas, 2007
www.psyc.bbk.ac.uk/research/DNL/stats/Thomas_trajectories.html
Worksheet on using SPSS to analyse and compare cross-sectional developmental trajectories
The following worksheet steps through the use of SPSS to characterise cross-sectional developmental trajectories. In particular, we focus on comparing typically developing trajectories with those derived from a group of individuals with a developmental disorder. This worksheet accompanies the submitted paper: Thomas, M. S. C., Annaz, D., Ansari, D., Scerif, G., Jarrold, C., & Karmiloff-Smith, A. (2007). The use of developmental trajectories in studying genetic developmental disorders.
In the first section, we begin by characterising a single developmental trajectory, including generating and plotting confidence intervals around the trajectory, checking for outliers, assessing the linearity of the trajectory, and comparing goodness-of-fit of different linear and non-linear functions. Readers familiar with linear regression methods may wish to skip this section.
Focusing on the use of linear methods, we then introduce a between-groups comparison of trajectories that allows one to evaluate whether developmental trajectories generated from different groups differ significantly in terms of their gradients or intercepts. We contrast comparisons between the groups for trajectories plotted according to chronological age versus those plotted according to mental age. In cases where the typically developing group produces a reliable trajectory but the disorder group does not, we then offer a new method to distinguish between two different types of null trajectory in the disorder group: a zero trajectory, where there is no improvement with age; and no systematic relationship, where the task performance is essentially random with respect to the participant’s age.
In the third section, we show how SPSS can be used to carry out repeated measures linear regressions to compare two trajectories generated by a single group based on performance in two tasks. With these trajectories, we also show how confidence intervals can be used to demonstrate the age at which the trajectories for the two tasks reliably converge or diverge.
In the fourth section, we demonstrate the use of SPSS to analyse mixed design regressions, for example with one between-groups factor and one repeated measure. For example, where the typically developing group is characterised by a divergence in development on two tasks, one might want to assess whether a disorder group demonstrates the same pattern of divergence.
Where appropriate, analyses are illustrated with worked examples using sample data. Charts are mostly generated within Excel and statistical results in SPSS 12.0 for Windows.
1. Characterising a single developmental trajectory
The sample data in sample TD trajectory.sav contain a single cross-sectional developmental trajectory for a sample of 25 typically developing children, aged from 2 years and 9 months (2;9) to 12;5. The data depict age in months of the children and the accuracy of their performance on a given experimental task across the age range. Using the Analyze-Regression-Linear function, SPSS demonstrates that a straight line produces a reliable fit to these data (R² = .880, F(1, 23) = 168.22, p<.001). Inspection of the (unstandardized) regression coefficients reveals an intercept (constant) of .013 and a gradient of .006. The linear developmental trajectory, predicting task performance based on the chronological age of the children, is therefore
$$\text{Task} = .006 \times \text{CA} + .013$$
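As a cross-check outside SPSS, the same kind of straight-line fit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fit_line` and the data values are hypothetical (the contents of sample TD trajectory.sav are not reproduced here), and only the standard library is used.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + gradient * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    gradient = sxy / sxx
    intercept = y_bar - gradient * x_bar
    ss_residual = syy - gradient * sxy   # unexplained sum of squares
    df_residual = len(x) - 2             # two parameters were estimated
    return {
        "intercept": intercept,
        "gradient": gradient,
        "r_squared": 1 - ss_residual / syy,
        "f_ratio": (syy - ss_residual) / (ss_residual / df_residual),
        "df": (1, df_residual),
    }


# Hypothetical data: chronological age in months, proportion correct on the task.
ca = [33, 45, 58, 70, 84, 96, 110, 125, 137, 149]
task = [0.20, 0.31, 0.35, 0.46, 0.50, 0.62, 0.66, 0.80, 0.83, 0.93]

fit = fit_line(ca, task)
print(f"Task = {fit['gradient']:.4f} * CA + {fit['intercept']:.3f}")
print(f"R2 = {fit['r_squared']:.3f}, F{fit['df']} = {fit['f_ratio']:.2f}")
```

The residual degrees of freedom are the number of children minus the two estimated parameters, which is why a sample of 25 gives F(1, 23).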
Here is an Excel chart of these data (employing the XY (Scatter) chart function; after the chart was created, the Add Trendline function under the Chart menu was used to add a best-fit linear trend; the Options tab in this dialogue permits display of the R² value and the regression equation on the chart).
[Figure: Task performance (0% to 100%) plotted against chronological age (months) for the TD group, with best-fit linear trendline y = 0.0061x + 0.0135, R² = 0.881]
Illustrative Excel dialogues:
Several further pieces of information are important to explore a single trajectory. First, we wish to find out whether any of the data points exerts undue influence on the trajectory (i.e., constitutes an outlier). For this, we use Cook’s distance (Cook’s D). Second, we wish to assess the reliability of the parameters (the intercept and gradient). Third, we may wish to generate confidence intervals for the trajectory itself (i.e., the region within which the best-fit line falls with 95% confidence). To generate the additional bits of information, in the Linear Regression dialogue, click on the Statistics button and make sure both Estimates and Confidence intervals are selected. Click on the Save button and select Cook’s under Distances and Mean under Prediction Intervals (95% confidence is offered as the default value; this may be changed as desired). Then run the regression again.
The Save function in the dialogue has added three additional columns of data to the SPSS Data Editor. The first is Cook’s distance (COO_1). Cook’s D combines diagnostic information about Distance (useful for identifying potential outliers in the dependent variable, here Task) and Leverage (useful for identifying potential outliers in the independent variable, here CA) to identify unusually influential observations on the regression line (see Howell, 2007, pp. 516-520 for discussion). Cook’s D assesses how much the residuals of all cases would change if a particular case were to be excluded from the calculation of the regression coefficients. This is repeated for each case so that each data point is assigned a value measuring influence. There is no general rule for what value of Cook’s D definitively indicates that a given point is an outlier. However, as a rule of thumb, a Cook’s D of over 1.00 suggests that a data point exerts undue influence on the regression. In this case, the analysis should be re-run without the data point in question.
Note, however, that unless there are a priori grounds to exclude this participant’s data from the analysis (e.g., if existing experimenter notes indicate that the child was not paying attention during testing or there was an equipment malfunction), then the results must be reported for the analyses both with and without the identified data point(s). Treatment of variability in disorders is an important issue, since behaviourally defined developmental disorders frequently exhibit marked variability, while variability is also found in disorders where an independent genetic diagnosis is available (see Thomas, 2003, for discussion and related analytical techniques). For the sample trajectory, no value of Cook’s D exceeds 0.2 and so no value is identified as a potential outlier.
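For readers who want to see what the saved COO_1 column contains, the standard textbook formula for Cook's D in a simple regression can be sketched as follows. This is an illustrative sketch rather than SPSS's own code: the function name `cooks_distances` and the data are hypothetical, and the last data point has been made deliberately aberrant so that it trips the rule of thumb.

```python
from statistics import mean


def cooks_distances(x, y):
    """Cook's D for each case in a simple regression of y on x."""
    n = len(x)
    n_params = 2  # intercept and gradient
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    gradient = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - gradient * x_bar
    residuals = [yi - (intercept + gradient * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - n_params)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this case's age lies from the mean age.
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        # Cook's D combines the size of the residual with the leverage.
        distances.append(e ** 2 / (n_params * mse) * leverage / (1 - leverage) ** 2)
    return distances


# Hypothetical data; the oldest child's score is deliberately aberrant.
ca = [33, 45, 58, 70, 84, 96, 110, 125, 137, 149]
task = [0.20, 0.31, 0.35, 0.46, 0.50, 0.62, 0.66, 0.80, 0.83, 0.20]

d = cooks_distances(ca, task)
for age, value in zip(ca, d):
    flag = "  <- exceeds 1.00" if value > 1.00 else ""
    print(f"CA {age:3d}: Cook's D = {value:.3f}{flag}")
```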
The two additional variables (LMCI_1, UMCI_1) created by the Save function are the lower mean confidence interval and the upper mean confidence interval for the dependent variable Task for each value of the predictor (i.e., for each age). The ‘mean’ confidence interval represents the region within which there is 95% chance that the actual mean (trendline) sits. Note that SPSS also gives you the option to save the ‘individual’ confidence intervals. These demarcate the region within which there is 95% probability that the individual data points sit (these confidence intervals are typically wider).
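The difference between the ‘mean’ and ‘individual’ intervals can be made concrete with the usual regression formulas. The sketch below is illustrative only: `trajectory_intervals` is a hypothetical helper, the data are invented, and the critical t value is supplied by hand (2.306 is the two-tailed 95% value for 8 residual degrees of freedom, matching the 10 invented cases; a sample of 25 would need the value for 23).

```python
from math import sqrt
from statistics import mean


def trajectory_intervals(x, y, x0, t_crit):
    """Fitted value at age x0 with its mean and individual intervals."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    gradient = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - gradient * x_bar
    ss_residual = sum((yi - (intercept + gradient * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(ss_residual / (n - 2))            # residual standard error
    fitted = intercept + gradient * x0
    spread = 1 / n + (x0 - x_bar) ** 2 / sxx   # grows away from the mean age
    half_mean = t_crit * s * sqrt(spread)            # uncertainty in the line
    half_individual = t_crit * s * sqrt(1 + spread)  # adds scatter of single cases
    return (
        fitted,
        (fitted - half_mean, fitted + half_mean),
        (fitted - half_individual, fitted + half_individual),
    )


# Hypothetical data: chronological age in months, proportion correct.
ca = [33, 45, 58, 70, 84, 96, 110, 125, 137, 149]
task = [0.20, 0.31, 0.35, 0.46, 0.50, 0.62, 0.66, 0.80, 0.83, 0.93]

fitted, mean_ci, individual_ci = trajectory_intervals(ca, task, x0=90, t_crit=2.306)
print(f"fitted = {fitted:.3f}")
print(f"mean CI = ({mean_ci[0]:.3f}, {mean_ci[1]:.3f})")
print(f"individual interval = ({individual_ci[0]:.3f}, {individual_ci[1]:.3f})")
```

Both intervals are narrowest at the mean age and widen towards the youngest and oldest ages, which is why the plotted confidence bands curve away from the trendline at the ends.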
Confidence intervals can be added to the Excel chart by adding (CA, LMCI_1) and (CA, UMCI_1) as two additional X-Y scatter series. Click on your original chart to select it; under the Chart menu, select Chart-Source, select the Data-Series tab; click on the Add button and select the column of CA values and the column of LMCI_1 values as the X values and Y values respectively; repeat for CA and UMCI_1. To present these as thin lines as depicted below, select the data series (click on any point), right-click to Format data series, select None under Marker, and select Custom under Line, then select desired line format (thin, dotted, etc.). (Note that due to a bug in the charting function in Excel, we have sometimes found that the data series need to be ordered with the ages low-to-high on the spreadsheet for these to come out nicely; we do not know why. This can be done by selecting the full chart data range, using the Data-Sort function, and sorting in ascending mode by the column that contains age.)
Excel dialogues:
Lastly, in SPSS, selecting ‘confidence intervals’ under Statistics in the Linear Regression dialogue provides information on the upper and lower bounds of the intercept and gradient of the developmental trajectory. The results are shown in the SPSS Coefficients table. For our sample trajectory, since the upper and lower bounds of the confidence interval on the intercept span zero (-.076 to .102), this indicates that the intercept is not significantly different from zero (reflected in the non-significant t-test result on this coefficient). By contrast, the gradient is reliably greater than zero, indicating improvement with age.
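[Figure: Task performance (0% to 100%) plotted against chronological age (months) for the TD group, with best-fit linear trendline y = 0.0061x + 0.0135, R² = 0.881, and the lower and upper mean confidence intervals (Lower mean CI, Upper mean CI) drawn as thin lines]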
Linearity and model comparison
The sample data represent a trajectory that is reasonably linear. What if we are not confident that a simple line gives a best fit to the data? We can, of course, check that the residuals (the difference between the actual performance of each individual and the performance that is predicted by the trajectory given each individual’s age) are normally distributed and do not vary systematically across the age range. These indicators (along with the R²) would warn if a linear function does not capture the cross-sectional trajectory very well. SPSS, however, also allows us to assess whether another non-linear function fits the data better.
SPSS permits multiple functions to be simultaneously fitted to the same set of data points using the Analyze-Regression-Curve Estimation function. Parameter information and proportion of variance explained can be derived for several functions at once, linking performance (Y) to age (t) using parameters b0, b1, b2 . . . in the following ways:
(Taken from SPSS 12.0 for Windows help function under ‘Curve Estimation’)
Linear. Model whose equation is Y = b0 + (b1 * t). The series values are modeled as a linear function of time
Logarithmic. Model whose equation is Y = b0 + (b1 * ln(t))
Inverse. Model whose equation is Y = b0 + (b1 / t)
Quadratic. Model whose equation is Y = b0 + (b1 * t) + (b2 * t**2). The quadratic model can be used to model a series which "takes off" or a series which dampens
Cubic. Model defined by the equation Y = b0 + (b1 * t) + (b2 * t**2) + (b3 * t**3)
Power. Model whose equation is Y = b0 * (t**b1) or ln(Y) = ln(b0) + (b1 * ln(t))
Compound. Model whose equation is Y = b0 * (b1**t) or ln(Y) = ln(b0) + (ln(b1) * t)
S-curve. Model whose equation is Y = e**(b0 + (b1/t)) or ln(Y) = b0 + (b1/t)
Logistic. Model whose equation is Y = 1 / (1/u + (b0 * (b1**t))) or ln(1/Y - 1/u) = ln(b0) + (ln(b1) * t) where u is the upper boundary value. After selecting Logistic, specify the upper boundary value to use in the regression equation. The value must be a positive number, greater than the largest dependent variable value [for our sample data, largest value of Task, e.g., choose 101]
Growth. Model whose equation is Y = e**(b0 + (b1 * t)) or ln(Y) = b0 + (b1 * t)
Exponential. Model whose equation is Y = b0 * (e**(b1 * t)) or ln(Y) = ln(b0) + (b1 * t)

Coefficients(a) (SPSS output for the linear regression of Task on CA, with 95% confidence intervals for B)
                 Unstandardized       Standardized
                 Coefficients         Coefficients                       95% Confidence Interval for B
Model 1          B       Std. Error   Beta            t        Sig.     Lower Bound    Upper Bound
(Constant)       .013    .043                         .305     .763     -.076          .102
CA               .006    .000         .938            12.970   .000     .005           .007
a. Dependent Variable: task
A plot of these fits can also be obtained by clicking on the Plot models box:
For our sample trajectory, most of the functions give a reliable fit to the data. Here are their R² values (taken from the ANOVA table for each function in the SPSS output; see below):
Function       R²        Parameters estimated (b0, b1, etc.)
Linear         .87972    2
Logarithmic    .84760    2
Inverse        .75463    2
Quadratic      .87975    3
Cubic          .87979    4
Compound       .81382    2
Power          .85834    2
S              .84181    2
Growth         .81382    2
Exponential    .81382    2
Logistic       .81435    2
Cubic gives the highest R², while Inverse gives the lowest (Compound, Growth, and Exponential share the next lowest value).
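Several of the functions listed for Curve Estimation are straight lines in a transformed predictor, so a comparison of this kind can be sketched with a single routine. The example below is a hedged illustration, not the SPSS implementation: it covers only the Linear, Logarithmic and Inverse forms (which leave Y untransformed, so their R² values are directly comparable), and the data and the helper name `r_squared` are hypothetical.

```python
from math import log
from statistics import mean


def r_squared(x, y):
    """R-squared of the least-squares line predicting y from x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


# Each model is linear in a transformed version of age (t).
transforms = {
    "Linear": lambda t: t,            # Y = b0 + b1 * t
    "Logarithmic": lambda t: log(t),  # Y = b0 + b1 * ln(t)
    "Inverse": lambda t: 1 / t,       # Y = b0 + b1 / t
}

# Hypothetical data: chronological age in months, proportion correct.
ca = [33, 45, 58, 70, 84, 96, 110, 125, 137, 149]
task = [0.20, 0.31, 0.35, 0.46, 0.50, 0.62, 0.66, 0.80, 0.83, 0.93]

fits = {name: r_squared([f(t) for t in ca], task) for name, f in transforms.items()}
for name, value in sorted(fits.items(), key=lambda item: -item[1]):
    print(f"{name:12s} R2 = {value:.5f}")
```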
How do we decide which is the best function to fit the data? Here, the heuristic of parsimony comes in. We want to explain the most variance using the fewest parameters. For example, although the Quadratic function fits the data better than the Linear (i.e., has a larger R²), it does so with one more parameter; the Cubic fits marginally better than the Quadratic but uses one more parameter again.
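One quick way to see this trade-off is the adjusted R² that SPSS prints alongside R², which penalises each extra predictor. The sketch below applies the standard adjustment formula to the R² values tabulated above, assuming N = 25 cases; the function name is hypothetical, and the number of predictors is taken as the number of parameters estimated minus one (the intercept).

```python
def adjusted_r_squared(r_squared, n_cases, n_predictors):
    """Shrink R-squared according to the number of predictors used."""
    return 1 - (1 - r_squared) * (n_cases - 1) / (n_cases - n_predictors - 1)


# (function, R-squared from the table, predictors excluding the intercept)
models = [("Linear", 0.87972, 1), ("Quadratic", 0.87975, 2), ("Cubic", 0.87979, 3)]

for name, r2, predictors in models:
    adjusted = adjusted_r_squared(r2, n_cases=25, n_predictors=predictors)
    print(f"{name:10s} R2 = {r2:.5f}  adjusted R2 = {adjusted:.5f}")
```

Although the raw R² rises slightly from Linear to Cubic, the adjusted value falls, which is the parsimony argument in numerical form. The formal tests described next make the same comparison inferentially.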
There are two statistical tests that can be used to determine which model/function to choose in these cases. One method, called the ‘extra sum-of-squares’ test, is only applicable for nested models. A nested model is one where one model is a subset of the other, that is, the first model is a version of the second model but with one of the parameters set to zero. Thus the Linear model is a version of the Quadratic model with the x² coefficient set to zero, and a version of the Cubic model with the x² and x³ coefficients set to zero. Nested models will have different degrees of freedom. The extra sum-of-squares approach derives an F-ratio from the relative increase in the sum-of-squares and the relative increase in the degrees of freedom reflecting the number of parameters used (this information is available in the ANOVA table for each regression fit). These two values are played off against each other, where an increase in model fit is good and an increase in parameters is bad. For regression fits 1 and 2, the equation is
$$F = \frac{(SS_1 - SS_2)/(DF_1 - DF_2)}{SS_2/DF_2}$$
where SS stands for sum-of-squares and DF for degrees of freedom. This F-ratio has DF1 - DF2 degrees of freedom for the numerator and DF2 degrees of freedom for the denominator (see Motulsky & Christopoulos, 2004, for more details).
For example, let us compare the Linear and Cubic fits to the sample data. The SPSS printouts provide the sum-of-squares and degrees-of-freedom information (the relevant values are those in each Analysis of Variance table):
Dependent variable.. task          Method.. LINEAR
Listwise Deletion of Missing Data
Multiple R           .93793
R Square             .87972
Adjusted R Square    .87449
Standard Error       .07629
Analysis of Variance:
               DF   Sum of Squares   Mean Square
Regression      1        .97897124     .97897124
Residuals      23        .13385276     .00581969
F = 168.21722       Signif F = .0000
Variables in the Equation
Variable              B       SE B       Beta         T    Sig T
CA              .006060    .000467    .937933    12.970    .0000
(Constant)      .013136    .043018                 .305    .7628
Dependent variable.. task          Method.. CUBIC
Listwise Deletion of Missing Data
Multiple R           .93797
R Square             .87979
Adjusted R Square    .86261
Standard Error       .07981
Analysis of Variance:
               DF   Sum of Squares   Mean Square
Regression      3        .97904851     .32634950
Residuals      21        .13377549     .00637026
F = 51.23016        Signif F = .0000
Variables in the Equation
Variable                        B          SE B        Beta        T    Sig T
CA                        .007201       .011874    1.114523     .606    .5507
CA**2        -1.254574265500E-05       .000141    -.351645    -.089    .9298
CA**3         4.212775775628E-08    5.1479E-07     .179103     .082    .9356
(Constant)               -.017479       .303698                -.058    .9546
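For this model comparison, taking the residual sums of squares and residual degrees of freedom from the two printouts (SS1 = .13385276, DF1 = 23 for Linear; SS2 = .13377549, DF2 = 21 for Cubic), the F-test produces the following outcome: F(2, 21) = .006, p = .994. This indicates that the greater number of parameters in the Cubic model is much more expensive than the slightly greater fit to the data that the function provides. Therefore the simpler Linear model is the better model.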
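The extra sum-of-squares calculation can be sketched directly from the formula. This is an illustrative sketch under one explicit assumption: SS and DF are taken from the Residuals rows of the Linear and Cubic printouts. The function names are hypothetical, and the closed-form p-value shown is valid only when the numerator has exactly 2 degrees of freedom; in other cases an F-distribution routine would be needed.

```python
def extra_sum_of_squares_f(ss1, df1, ss2, df2):
    """F-ratio comparing a simpler model (1) with a nested, more complex model (2)."""
    numerator = (ss1 - ss2) / (df1 - df2)   # gain in fit per extra parameter
    denominator = ss2 / df2                 # residual variance of the complex model
    return numerator / denominator, (df1 - df2, df2)


def p_value_two_numerator_df(f_ratio, df_denominator):
    """Upper-tail p-value of F(2, df); this closed form holds only for 2 numerator df."""
    return (1 + 2 * f_ratio / df_denominator) ** (-df_denominator / 2)


# Residual sum of squares and residual DF from the Linear and Cubic printouts.
f_ratio, dfs = extra_sum_of_squares_f(ss1=0.13385276, df1=23, ss2=0.13377549, df2=21)
p_value = p_value_two_numerator_df(f_ratio, dfs[1])
print(f"F{dfs} = {f_ratio:.4f}, p = {p_value:.3f}")
```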
For non-nested models, the ‘extra sum-of-squares’ method is not applicable. This becomes evident if one tries to compare models that have the same number of parameters. The formula requires the denominator to calculate the difference in degrees of freedom but the difference (DF1 - DF2) is now zero. Division by zero generally causes things to go badly wrong.
In the case of non-nested models, a hypothesis testing approach to comparing the models may be replaced by one drawn from information theory. This technique employs Akaike’s Information Criterion (see Motulsky & Christopoulos, 2004, p.143, for further details). In this case, the result of the test does not indicate the likelihood of the more complicated model fitting the data better by chance. Instead, it computes the relative likelihood of each model being correct.
The Akaike’s Information Criterion (AIC) for each model is
$$AIC = N \ln\left(\frac{SS}{N}\right) + 2K$$
where N is the number of data points, K is the number of parameters fit by the regression plus one, and SS is the sum of the squares value taken from the regression equation. Let us compare the AIC scores for the Linear and Logistic models for the sample data, both of which fit two parameters (‘CA’ and ‘constant’ below).
Dependent variable.. task          Method.. LGSTIC
Listwise Deletion of Missing Data
Multiple R           .90241
R Square             .81435
Adjusted R Square    .80628
Standard Error       .20389
Analysis of Variance:
               DF   Sum of Squares   Mean Square
Regression      1        4.1941911     4.1941911
Residuals      23         .9561769      .0415729
F = 100.88760       Signif F = .0000
Variables in the Equation
Variable               B       SE B       Beta          T    Sig T
CA               .987535    .001233    .405590    800.739    .0000
(Constant)      6.000506    .689910                 8.698    .0000
Applying the equation,
AIC for Linear fit = -124.7
AIC for Logistic fit = -75.6
The model with the smallest AIC value is most likely to be correct; in this case, the Linear model.
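The two AIC values can be reproduced from the Residuals sums of squares in the Linear and Logistic printouts. The sketch below is illustrative: the function names are hypothetical, and the final step uses the standard Akaike-weight formula for two models, which is an assumption about how the relative probability is meant to be computed rather than something taken from the worksheet.

```python
from math import exp, log


def aic(n_points, ss, n_fitted_params):
    """AIC = N * ln(SS / N) + 2K, with K = fitted parameters + 1."""
    k = n_fitted_params + 1
    return n_points * log(ss / n_points) + 2 * k


def probability_first_is_correct(aic_first, aic_second):
    """Relative probability (Akaike weight) of the first of two models."""
    delta = aic_second - aic_first
    return 1 / (1 + exp(-0.5 * delta))


# N = 25 children; residual sums of squares from the two printouts.
aic_linear = aic(n_points=25, ss=0.13385276, n_fitted_params=2)
aic_logistic = aic(n_points=25, ss=0.9561769, n_fitted_params=2)

print(f"AIC Linear   = {aic_linear:.1f}")
print(f"AIC Logistic = {aic_logistic:.1f}")
print(f"P(Linear correct) = {probability_first_is_correct(aic_linear, aic_logistic):.6f}")
```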
Excel formulae for computing the extra sum-of-squares test and Akaike’s Information Criterion can be found here (including use of corrected AIC values and the way to compute the relative probability that you are correct if you choose one or other model).
What do you do if a non-linear function provides a better fit to the data? In the following, we focus on linear methods to compare trajectories. The primary motivation for this is that linear methods render interaction terms more interpretable and thus allow us to distinguish different types of descriptive delays. It is a practical rather than a theoretical decision, since there is no requirement that development should occur at a constant rate, and in many cases it does not do so (e.g., the rate of vocabulary acquisition in children is famously non-linear). However, the methods presented with linear functions are in principle extendible to non-linear regression methods, where, for example, differences in the intercept parameter can index delays in onset and other parameters can index differences in rates of non-linear growth (see Motulsky & Christopoulos, 2004, for a review of non-linear regression methods with biological data).
Linear methods may be used in cases where the relationship between age and performance is non-linear by transforming either or both of these dimensions to improve the linearity of the relationship between them (so long as the transformation is applied to both the typical and the disorder group). Alternatively, subsections of the full non-linear trajectory may be explored where development appears to be more linear. For example, in cases where there are early floor or late ceiling effects, the portion of the trajectory between floor and ceiling may be more linear. Thus, for an S-shaped curve, only the central part of the trajectory might be considered with linear methods. The more restricted analysis would allow you to identify differences in the average age at which experimental groups reach ceiling performance on the task. Analyses run over subsets of the experimental group will, of course, compromise statistical power.
We now turn to consider methods to compare the developmental trajectory generated by a disorder group to the typically developing profile.
2. Between-group comparisons of two developmental trajectories
The SPSS data file sample TD disorder between group.sav contains data for two groups, the typically developing (TD) group of 25 children with ages spanning from 2;9 to 12;5, and a group of 16 children with a developmental disorder with chronological ages ranging from 5;4 to 11;2. Note that all children have also been given a standardised test to produce a mental age (i.e., a test age equivalent score). For the TD group, their mental ages range from 3;3 to 12;10, with the average MA 4.7 months in advance of CA (that is, in advance of the sample of TD kids on whom the standardised test was normed; information about sampling would be required to decide whether the TD or norming sample is in some sense any more or less ‘normal’).
For the disorder group, MAs range from 4;7 to 10;4, with the average MA 18.1 months behind CA. In the SPSS file, group is encoded with the Group variable (coded 1 for TD, 2 for disorder in our example). CA and MA are coded in months and task performance is coded in proportion correct on the experimental task.
Note that, by design, the TD group’s age range spans from the youngest mental age of the disorder group on any of the standardised tests used to assess this group, to the oldest CA of the disorder group. This is because it is only sensible to compare developmental trajectories for overlapping chronological or mental age ranges. Comparing non-overlapping trajectories necessitates extrapolating a prediction of task performance for one or other group outside of the age or ability range over which performance has been measured.
Charting in Excel, these trajectories are as follows:
Are the individuals with the disorder performing on the task as you would expect given their chronological age? One way to answer this question is on a case-by-case basis. This can be done using the confidence intervals around the TD trajectory derived in the previous section. For each child with the disorder, we can see whether, when their performance is plotted on the chart according to their chronological age, the data point falls within the 95% confidence intervals around the TD trajectory.
However, our intention here is to characterise a developmental trajectory for the disorder group as a whole, given that we have reasonable participant numbers (N=16) across an age range. We therefore need to generate a trajectory for the disorder group and compare it to the TD trajectory. Are the two trajectories significantly different, and if so, in what way?
SPSS does not include a direct method to compare linear regressions. Instead, we need to adapt the Analysis of Covariance function within the General Linear Model. Assuming we have already verified the approximate linearity of the disorder trajectory on its own, we begin by comparing the developmental trajectories for the two groups as they are predicted by chronological age. Select Analyze-General Linear Model-Univariate. Add Task to the Dependent Variable box, Group to the Fixed Factor(s) box, and CA to the Covariate(s) box. (The Save dialogue may be used to generate Cook’s D values for the two trajectories if this has not been done previously.)
[Figure: Task performance (0% to 100%) plotted against chronological age (months) for the TD group (trendline y = 0.0061x + 0.0135, R² = 0.881) and the Disorder group (trendline y = 0.0022x + 0.1181, R² = 0.104)]
Importantly, as it stands, the SPSS ANCOVA function has a default configuration to ‘partial out’ differences in the dependent variable due to differences in the covariate. For example, the ANCOVA is frequently used in behavioural studies where one wants to ‘partial out’ differences in IQ and so focus on differences in performance that solely arise from manipulation of the independent variables. In ‘partialling out’, the influence of the Group factor is effectively evaluated after each participant’s dependent variable score (performance) has been adjusted for their covariate score (age); the adjustment implicitly assumes a linear relationship between performance and the covariate.
However, the default setting of SPSS prevents us from examining whether, for our data, the relationship between performance and age differs between the two groups; that is, there is no Group x Age interaction term included in the statistical model design. We must therefore add this interaction term by hand in a Custom Model. (Note, the Univariate Model dialogue box seems to imply that its default model is already Full factorial. This is not the case since the interaction term is missing. We have to use the Custom model to construct the fully factorial model.)
We add the Group x Age interaction term as follows. Click on the Model button. In the Model dialogue, select Custom. Set the Build Terms drop down to Main effects. Click on Group(F) and CA(C) and use the right arrow button to move these across to the Model box. (N.b., the F and C stand for Fixed and Covariate, respectively.) Then set the Build Terms drop down to Interaction. Select both Group(F) and CA(C) by clicking on them in turn, then click on the right arrow button to add this interaction term to the model. The dialogue box should now look like this:
Click on Continue, and then run the analysis by clicking on OK in the main
Univariate dialogue.
Two results tables are now of interest. The Tests of Between-Subjects Effects table allows us to assess how much of the variance in the data we have explained. The overall R² is .774 (calculated by dividing the sum-of-squares for the Error, .409, by the Corrected Total sum-of-squares, 1.809, and subtracting the result from 1). The model explains a significant proportion of this variance [F(3, 37)=42.23, p<.001, η²=.774].
Inspection of the results for each factor indicates that there is no overall effect of Group [F(1, 37)=.52, p=.474, η²=.014]. This tells us that the intercepts of the two groups are not reliably different. There is no Delayed Onset in development for the disorder group here.
With the groups combined, chronological age significantly predicts level of performance [F(1, 37)=32.88, p<.001, η²=.470]. However, crucially, there is a significant Group x CA interaction [F(1, 37)=7.40, p=.010, η²=.167]. The disorder group is developing more slowly on this task: they are exhibiting a Slower Rate of development.
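The logic behind the Group x CA test can be sketched outside SPSS as a comparison of two models: one in which each group has its own gradient, and one in which the groups share a common gradient but keep separate intercepts. The sketch below is a hedged illustration with hypothetical data and function names; because the interaction is the highest-order term in the model, its F-ratio should correspond to the one SPSS reports for Group * CA, but this has not been checked against the worksheet's data file.

```python
from statistics import mean


def group_sums(x, y):
    """Sums of squares and cross-products about one group's own means."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxx, sxy, syy


def interaction_f(groups):
    """F-test of Group x Age: separate gradients versus one common gradient."""
    sums = [group_sums(x, y) for x, y in groups]
    n_total = sum(len(x) for x, _ in groups)
    # Full model: every group has its own intercept and gradient.
    ss_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)
    df_full = n_total - 2 * len(groups)
    # Reduced model: separate intercepts, one pooled within-group gradient.
    pooled_sxx = sum(s[0] for s in sums)
    pooled_sxy = sum(s[1] for s in sums)
    ss_reduced = sum(s[2] for s in sums) - pooled_sxy ** 2 / pooled_sxx
    df_extra = len(groups) - 1
    f_ratio = ((ss_reduced - ss_full) / df_extra) / (ss_full / df_full)
    return f_ratio, (df_extra, df_full)


# Hypothetical data: (ages in months, proportion correct) for each group.
td = ([33, 45, 58, 70, 84, 96, 110, 125, 137, 149],
      [0.20, 0.31, 0.35, 0.46, 0.50, 0.62, 0.66, 0.80, 0.83, 0.93])
disorder = ([64, 72, 80, 91, 99, 108, 117, 126, 134],
            [0.22, 0.30, 0.24, 0.35, 0.28, 0.40, 0.33, 0.41, 0.38])

f_ratio, dfs = interaction_f([td, disorder])
print(f"Group x CA: F{dfs} = {f_ratio:.2f}")
```

A large F-ratio here indicates that allowing the groups different gradients reduces the unexplained variance by more than would be expected from one extra parameter, which is the sense in which the rates of development differ.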
Interpretation: There is a subtlety in interpreting these results. When there is no difference in rate, differences in onset are unambiguous. But when there is a difference in rate, an absence of a difference in onset can be more ambiguous. Clearly, the disorder trajectory falls below that of the TD group. The lack of significance in the Group factor, which is notionally the statistic to tell us that the disorder group is performing at a different level, arises because the difference in intercepts is evaluated where the trajectories meet the y-axis, i.e., when age is zero. This is not especially meaningful because our analyses only pertain to the age range for which we have measured performance (let alone the idea that individuals could be good at the task the moment they are born!). Across the range we are measuring, the disorder group trajectory falls well below the TD group. However, in terms of statistically characterising this difference, it arises from a difference in rate rather than onset. The two trajectories appear to have begun at the same level at some point in the past, but to have developed at different rates.
The Parameter Estimates table allows us to reconstruct the regression equations for the two trajectories (and include confidence intervals on these estimates). These parameters allow us to quantify the difference between the trajectories.
In interpreting this table, note that SPSS selects one group to have the derived values of the intercept and gradient (CA), and then provides a modifier to these values if the group membership is different. Thus the intercept for the disorder group (Group 2) is .118 and the gradient is .002, while the intercept for the TD group (Group 1) is (.118-.104)=.014 and the gradient is (.002+.004)=.006. These values should correspond to the parameters generated either by carrying out individual linear regressions for each trajectory in SPSS or by Excel’s trendline fitting algorithm in the X-Y Scatter-plot Chart.
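The reference-group coding described above can be unpacked with simple arithmetic. In this small sketch the function name is hypothetical, the values are those quoted in the text, and the modifier on the intercept is entered as -.104, the sign implied by the subtraction (.118-.104).

```python
def group_equations(intercept, group_offset, gradient, gradient_offset):
    """Return (intercept, gradient) for the reference group and the other group."""
    reference = (intercept, gradient)
    other = (intercept + group_offset, gradient + gradient_offset)
    return reference, other


# Disorder (Group 2) is the reference group; the TD group (Group 1) gets the modifiers.
disorder, td = group_equations(
    intercept=0.118, group_offset=-0.104, gradient=0.002, gradient_offset=0.004
)
print(f"Disorder: performance = {disorder[1]:.3f} * age + {disorder[0]:.3f}")
print(f"TD:       performance = {td[1]:.3f} * age + {td[0]:.3f}")
```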
Tests of Between-Subjects Effects
Dependent Variable: task
Source             Type III Sum of Squares    df    Mean Square    F         Sig.    Partial Eta Squared
Corrected Model    1.400(a)                   3     .467           42.228    .000    .774
Intercept          .009                       1     .009           .820      .371    .022
Group              .006                       1     .006           .523      .474    .014
CA                 .363                       1     .363           32.876    .000    .470
Group * CA         .082                       1     .082           7.404     .010    .167
Error              .409                       37    .011
Total              10.365                     41
Corrected Total    1.809                      40
a. R Squared = .774 (Adjusted R Squared = .756)
Parameter Estimates
Dependent Variable: task
                                                           95% Confidence Interval
Parameter            B        Std. Error    t        Sig.    Lower Bound    Upper Bound    Partial Eta Squared
Intercept            .118     .132          .893     .377    -.149          .384           .021
[Group=1.00]         -.104    .144          -.723    .474    -.397          .188           .014
[Group=2.00]         0(a)     .             .        .       .              .              .
CA                   .002     .001          1.686    .100    .000           .005           .071
[Group=1.00] * CA    .004     .001          2.721    .010    .001           .007           .167
[Group=2.00] * CA    0(a)     .             .        .       .              .              .
a. This parameter is set to zero because it is redundant.
The two regression equations are as follows:
$$\text{Performance}_{TD} = .006 \times \text{age} + .014$$
$$\text{Performance}_{disorder} = .002 \times \text{age} + .118$$
Straightforwardly, we can say that the disorder group is developing at a third of the rate of the TD group (i.e., .002/.006=.333).
In terms of the onset, we can describe this difference in one of two ways. Remembering that the comparison must take place within the range of ages where the two trajectories overlap (i.e., the ages for which we have collected data), we can either:
(a) report the performance difference between the groups at the youngest age of the disorder group, corresponding to a performance deficit of 15% [youngest disorder age: 64 months; TD trajectory: .006x64+.014=.398; disorder trajectory: .002x64+.118=.246; difference: .398-.246=.152];
or:
(b) report the age difference at the lowest performance of the disorder group derived in (a), corresponding to an onset delay of 25 months [predicted performance at youngest age of 64 months: .002x64+.118=.246; age for TD trajectory at which performance is .246: (.246-.014)/.006=38.67; difference: 64-39=25 months].
The statistical comparison of the trajectories tells us that, in this case, there is a 15% performance deficit / 25-month disparity owing not to a difference in onset but a difference in the rate at which the two groups are developing.
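The performance deficit, age disparity and rate ratio all follow from the two regression equations, as the sketch below illustrates. The helper names are hypothetical; the coefficients and the youngest disorder-group age of 64 months are those given in the text. Note that the code keeps the TD age at 38.67 months rather than rounding it to 39, so the disparity comes out slightly above 25 months.

```python
def predicted(intercept, gradient, age):
    """Performance predicted by a linear trajectory at a given age."""
    return intercept + gradient * age


def age_at(intercept, gradient, performance):
    """Age at which a linear trajectory reaches a given performance."""
    return (performance - intercept) / gradient


td = (0.014, 0.006)        # (intercept, gradient) for the TD group
disorder = (0.118, 0.002)  # (intercept, gradient) for the disorder group
youngest = 64              # youngest disorder-group age in months

disorder_level = predicted(*disorder, youngest)
deficit = predicted(*td, youngest) - disorder_level   # option (a)
delay = youngest - age_at(*td, disorder_level)        # option (b)
rate_ratio = disorder[1] / td[1]

print(f"Performance deficit at {youngest} months: {deficit:.3f}")
print(f"Age disparity: {delay:.1f} months")
print(f"Rate ratio (disorder / TD): {rate_ratio:.3f}")
```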
Plotting performance against mental age
The analysis so far has indicated that the disorder group is not at a level we would expect given their chronological age. In some cases, this is unsurprising for a disorder, particularly if we are examining an area of apparent weakness (such as reading in developmental dyslexia). In other cases, the CA analysis may be central, such as when we believe we are examining an area of potential strength or normal development in the disorder (e.g., non-verbal reasoning in developmental dyslexia).
Given that performance is not at CA level in the disorder group, our next question becomes: is the performance of the disorder group at a level we would expect given their level of cognitive development, as measured by our selected standardised test? If we plotted the disorder trajectory according to mental age (MA) rather than CA, would the disorder trajectory now fall on top of the TD trajectory?
Note that this is a theory-dependent comparison, because it relies on us having made the right, theoretically motivated choice about which standardised test is appropriate to evaluate the level of cognitive development in the domain that pertains to our experimental task (e.g., one might use a test of receptive grammar to generate MAs for a disorder group in an experimental task investigating sentence repetition; one might, more tentatively, use Matrices; one might be less likely to use a block design or face recognition test, although one could of course try).
To carry out this second comparison, we run the statistical test again, but now substituting MA as the covariate. Remember, this requires that we re-specify the Custom model to include the following factors: Group, MA, and Group*MA.
When task performance is plotted against MA, the trajectories look like this:
[Figure: Task performance (0% to 100%) plotted against mental age (months) for the TD group and the disorder group, with linear trendlines. TD group: y = 0.0064x - 0.0462, R² = 0.8728. Disorder group: y = 0.0062x - 0.1762, R² = 0.7559.]
The results of comparing disorder and TD trajectories based on MA yield the following SPSS results tables:
Tests of Between-Subjects Effects
Dependent Variable: task

Source            Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Corrected Model   1.589(a)                  3    .530          89.149    .000   .878
Intercept         .032                      1    .032          5.388     .026   .127
Group             .011                      1    .011          1.881     .178   .048
MA                .756                      1    .756          127.286   .000   .775
Group * MA        .000                      1    .000          .021      .885   .001
Error             .220                      37   .006
Total             10.365                    41
Corrected Total   1.809                     40
a. R Squared = .878 (Adjusted R Squared = .869)
As suggested by the data plot, the two trajectories are now parallel: there is no reliable interaction of Group x MA [F(1, 37)=.02, p=.885, η²=.001]. While mental age is a strong predictor of performance over all participants [F(1, 37)=127.29, p<.001, η²=.775], there is no group difference [F(1, 37)=1.88, p=.178, η²=.048]. While the disorder group performs at a marginally lower level than the TD group, this difference is not statistically reliable.
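As a quick consistency check on reported statistics of this kind, partial eta squared can be recovered from the F ratio and its degrees of freedom, since partial η² = SS(effect) / (SS(effect) + SS(error)). The Python sketch below applies this identity to the F values in the SPSS table for the MA analysis; the function name `partial_eta_squared` is our own.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values taken from the Tests of Between-Subjects Effects table (MA analysis).
f_values = {"Group": 1.881, "MA": 127.286, "Group * MA": 0.021}
DF_EFFECT, DF_ERROR = 1, 37

effect_sizes = {
    name: round(partial_eta_squared(f, DF_EFFECT, DF_ERROR), 3)
    for name, f in f_values.items()
}
for name, eta in effect_sizes.items():
    print(name, eta)
```

The three values agree with the Partial Eta Squared column to three decimal places, which is a useful way of catching transcription slips when copying statistics into a report.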
In short, there is no delay in onset or rate: statistically, the trajectory is normalised by plotting according to MA. Therefore, in the disorder group, performance on this task is in line with general development in the domain.
Note, however, we could not say that the disorder group is developing normally on this task. We have already established that according to CA, the disorder group is developing at a slower rate. The MA analysis merely demonstrates that the slower rate is in keeping with the slower development exhibited by this domain as a whole (to the extent that the standardised test we have chosen is a valid measure of the domain).
Comparing CA- and MA-based trajectories
Note that the R² value for the disorder group increases from 0.104 in the CA plot [F(1, 14)=1.60, p=.227, η²=.103] to 0.76 for the MA plot [F(1, 14)=42.37, p<.001, η²=.752]. For disorders with learning disability, this is a frequently observed pattern. Where the standardised test is indeed relevant to the experimental task, or where the standardised test loads heavily on the general factor of intelligence (such as Raven's Matrices), MA will usually be a better predictor of task performance than CA. This is especially the case in a cross-sectional design. In a longitudinal design, MA and CA may correlate more strongly and therefore their predictive power may be more equal. (See main paper for a theoretical discussion of the predictive power of CA vs. MA.)
For the disorder group in our sample data, the correlation between CA and MA was only 0.58. Stepwise regression indicates that MA predicts reliably more of the variance in task performance than CA [R² produced by entering both CA and MA into the linear model = .805, F(2, 13)=26.78, p<.001; R² change produced by removing MA = .702, F(1, 13)=46.73, p<.001].
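The F change statistic follows directly from the two R² values. As a rough check, the Python sketch below recomputes it from the rounded figures quoted above (full model R² = .805; CA-only R² = .805 - .702 = .103). The function name `f_change` is our own, and because the inputs are rounded the result only approximates the SPSS value.

```python
def f_change(r2_full, r2_reduced, df_change, df_error):
    """F statistic for the change in R-squared between two nested models.

    df_change: number of predictors added or removed.
    df_error:  residual degrees of freedom of the full model.
    """
    return ((r2_full - r2_reduced) / df_change) / ((1.0 - r2_full) / df_error)


# Disorder group, n = 16: the full model has 2 predictors, so df_error = 13.
f_removing_ma = f_change(0.805, 0.103, df_change=1, df_error=13)

# The same formula with an empty reduced model gives the overall F(2, 13).
f_overall = f_change(0.805, 0.0, df_change=2, df_error=13)

print(round(f_removing_ma, 1), round(f_overall, 1))
```

Both results land within rounding error of the reported F(1, 13)=46.73 and F(2, 13)=26.78.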
Parameter Estimates
Dependent Variable: task

Parameter           B       Std. Error   t        Sig.   95% CI Lower Bound   95% CI Upper Bound   Partial Eta Squared
Intercept           -.180   .085         -2.121   .041   -.352                -.008                .108
[Group=1.00]        .134    .098         1.372    .178   -.064                .332                 .048
[Group=2.00]        0(a)    .            .        .      .                    .                    .
MA                  .006    .001         6.226    .000   .004                 .008                 .512
[Group=1.00] * MA   .000    .001         .145     .885   -.002                .002                 .001
[Group=2.00] * MA   0(a)    .            .        .      .                    .                    .
a. This parameter is set to zero because it is redundant.
Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
1       .897(a)   .805       .775                .06785                       .805              26.777     2     13    .000
2       .320(b)   .103       .038                .14015                       -.702             46.732     1     13    .000
a. Predictors: (Constant), MA, CA
b. Predictors: (Constant), CA

CA- and MA-based analyses may be compared by examining the confidence intervals on the parameters to see whether they overlap or not. They may be compared more directly by including age_type as an additional factor in a between-subjects comparison (ensuring all 2-way and 3-way interactions are specified in the Custom model), although this is a conservative comparison since the two trajectories are treated as between- rather than repeated measures. We won’t go into any further details on this analysis here.
Interpreting null trajectories in the disorder group
Consider the following data, drawn from one author’s studies (DA). These relate to the development of perceptual thresholds in two tasks. In both cases, the TD group generates a reliable cross-sectional trajectory showing improvement on perceptual recognition. In both cases, the disorder group fails to produce a reliable trajectory, so that chronological age does not predict performance (in this case, nor did any of the mental ages derived from standardised tests deemed to be relevant to the task).
[Figure, two panels: "Task 1: Perceptual Threshold" and "Task 2: Perceptual Threshold". Each plots task performance against age (months) for the TD group and the disorder group, with linear trendlines. Task 1: TD group R² = 0.3824, disorder group R² = 0.0812. Task 2: TD group R² = 0.2103, disorder group R² = 0.0041.]
Closer inspection of the disorder trajectories for the two tasks suggests there does seem to be a difference between them. In Task 2, performance seems to be random with regard to the predictor of age: the trajectory is no more than drawing an arbitrary line through a data cloud. One might interpret this to indicate that other cognitive factors predict performance and these factors differ between the children. Let us call this pattern no systematic relationship. However, in Task 1, it seems to be more the case that the children with the disorder have prematurely reached the best level they can achieve (note the measure is still in the sensitive range; the children are not at floor performance). This looks like a real trajectory whose gradient is zero. Such a pattern would be consistent with the interpretation that the processing constraints of these children’s cognitive systems mean that development simply cannot get past a certain level of performance for the age range we are examining. Let us call this a zero trajectory.
Unfortunately, however, despite this apparent visual difference, statistically the two cases are identical with respect to linear regression methods: the disorder group produces null trajectories in both tasks.
These two types of (theoretically different) null results appear the same statistically. Why is this? It is for two reasons. First, the linear regression model simply evaluates whether the gradient of the line is statistically different from zero. If it is not, then values of the predictor (age) are of no use in predicting values of the dependent variable (performance). They may be of no use because the dependent variable is random or because it is always the same value.
Second, and more technically, the regression equation is calculated according to the
standardised residuals of the data points from the derived trajectory. In other words,
although the data points for the disorder group may appear to be clustered
more
tightly around the flat trajectory in Task 1 than Task 2, the regression model rescales
this difference so that the two flat trajectories look equally noisy.
However, statistical models depend on the assumptions built into them. In this case, because of the experimental design (a comparison between TD and disorder groups), we may add an additional assumption into the statistical model: the TD group provides us with an independent verification of the range of variability expected in the task. Therefore, we can add in the assumption that residuals need not be standardised. If we pay attention to the tighter clustering of the disorder data points in Task 1 than Task 2, it becomes possible to distinguish between the two types of null result.
The method we have developed to do this relies on rotating the data in X-Y coordinates. A flat line produces a gradient of zero. A line at 45 degrees produces a gradient of 1. If the tightly clustered trajectory is real, the R² value should change from around zero to around 1 (for the ideal case) if the graph is simply rotated by 45 degrees anti-clockwise. However, if one rotates a random data cloud by 45 degrees anti-clockwise, it should make no difference. The cloud should still be a cloud with a flat line through it; the R² value of the originally-null trajectory should remain close to zero. In this way, repeating the linear regressions on rotated data should distinguish between a zero trajectory and one where there is no systematic relationship.
The Excel file rotating trajectories two sample.xls includes formulae for this transformation for two idealised cases of a zero trajectory and no-systematic-relationship. The method involves 2 steps:
(1) Rescale the trajectories so that values on both axes fall between 0 and 1. Ages are scaled to be proportions of the total age range in both TD and disorder groups; performance levels are scaled to be proportions of the total performance range in both groups. It is here that the TD group variability is used to scale the disorder group’s variability.
(2) Rotate the rescaled TD and disorder trajectories on the new axes, using the formulae

x’ = x cos θ - y sin θ
y’ = x sin θ + y cos θ

where x and y are the original x-y coordinates, x’ and y’ are the rotated coordinates, and θ is the angle of rotation in units of radians.
The regressions on the rotated trajectories are shown below. For Task 1, the suspected zero trajectory, the initial R² increases from .08 [F(1, 21)=1.86, p=.187] to a statistically reliable .55 [F(1, 21)=25.54, p<.001]. For Task 2, the suspected no systematic relationship, the R² changes from .00 [F(1, 21)=.09, p=.772] to .05 [F(1, 21)=1.15, p=.295].
Our zero trajectory has become reliable after rotation while our no systematic relationship has not. Statistically, therefore, the rotation method has proved sufficient to distinguish between the two null results.
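For readers who prefer to see the two steps outside Excel, the following Python sketch rescales and rotates two small idealised data sets and compares R² before and after rotation. Everything in it is an assumption made for illustration: the data are invented (not the author's data), the pooled age and performance ranges are hypothetical, and the function names are our own. It is not a transcription of the formulae in the Excel file.

```python
import math


def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy ** 2 / (sxx * syy)


def rescale(values, lowest, highest):
    """Step 1: express values as proportions of the pooled range."""
    return [(v - lowest) / (highest - lowest) for v in values]


def rotate(xs, ys, theta):
    """Step 2: rotate points anti-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return ([x * c - y * s for x, y in zip(xs, ys)],
            [x * s + y * c for x, y in zip(xs, ys)])


def rotated_r_squared(ages, scores, age_range, score_range, theta=math.pi / 4):
    xs = rescale(ages, *age_range)
    ys = rescale(scores, *score_range)
    return r_squared(*rotate(xs, ys, theta))


# Hypothetical pooled ranges (TD and disorder groups together).
AGE_RANGE = (30, 170)      # months
SCORE_RANGE = (0.0, 0.7)   # task performance

ages = list(range(60, 141, 10))
# Invented "zero trajectory": tightly clustered around a flat line.
flat = [0.61, 0.59, 0.61, 0.59, 0.61, 0.59, 0.61, 0.59, 0.61]
# Invented "no systematic relationship": widely scattered, unrelated to age.
cloud = [0.55, 0.25, 0.45, 0.30, 0.60, 0.30, 0.45, 0.25, 0.55]

zero_before = r_squared(ages, flat)
zero_after = rotated_r_squared(ages, flat, AGE_RANGE, SCORE_RANGE)
cloud_before = r_squared(ages, cloud)
cloud_after = rotated_r_squared(ages, cloud, AGE_RANGE, SCORE_RANGE)

print(f"zero trajectory: {zero_before:.3f} -> {zero_after:.3f}")
print(f"data cloud:      {cloud_before:.3f} -> {cloud_after:.3f}")
```

In this idealised example the flat, tightly clustered data move from an R² near zero to one near 1, while the scattered data stay near zero. Note that in this sketch the rotated R² depends on how small the spread in performance is relative to the spread in age after rescaling, which is why the choice of pooled ranges in step 1 matters.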
Remember, the rotation method only works because we have independent grounds for not standardising the residuals: we know the variability produced by the TD group on the same task. The rotation method would not be applicable to characterising a single trajectory, nor would it be applicable for comparing two null trajectories produced by a single group (unless we have some strong reason to believe variability in one measure should be related to variability in a second).
Note also that the rotation method does not provide an interpretation. Where one group shows a zero trajectory and the other does not, one must ensure that this is not because one of the groups is at floor or ceiling. Tracing back a putative zero trajectory to cognitive developmental constraints requires confidence that the experimental task is providing a sensitive measure of performance in both groups.
[Figure, two panels: "Task 1: Rotated Trajectories" and "Task 2: Rotated Trajectories". Each plots y’ against x’ for the TD group and the disorder group, with linear trendlines. Task 1: disorder group R² = 0.5488, TD group R² = 0.1071. Task 2: disorder group R² = 0.052, TD group R² = 0.0097.]
3. Repeated measures linear regression: Comparing cross-sectional developmental trajectories for two tasks carried out by the same group
The SPSS data file sample TD repeated measures.sav contains performance data for the typically developing (TD) group of 25 children (ages 2;9 to 12;5) on two experimental tasks.
The respective trajectories are:
[Figure: Task performance (0% to 100%) plotted against chronological age (months) for the TD group on two tasks, with linear trendlines. TD group Task 1: y = 0.0061x + 0.0135, R² = 0.881. TD group Task 2: y = 0.0045x + 0.2809, R² = 0.7116.]
To compare these two trajectories treating Task as a repeated (within-participants) measure, select Analyze > General Linear Model > Repeated Measures. Define a Within-Subject Factor ‘task’ with two levels. Click on Define. In the Repeated Measures dialogue, add the two variables Task1 and Task2 as Within-Subjects Variables and TD_CA as the covariate. Select Estimates of effect size and Parameter Estimates in the Options dialogue. Cook’s distance information may also be generated using the Save dialogue, to check for outliers. Then run the analysis by clicking on OK in the Repeated Measures dialogue.
Overall, performance significantly improves with age [F(1, 23)=197.52, p<.001, η²=.896]. The repeated measure indicates a significant difference between the tasks, with performance on Task 2 producing reliably higher scores with a medium effect size [F(1, 23)=14.28, p=.001, η²=.383]. Finally, there is a marginally significant interaction between age and task [F(1, 23)=3.88, p=.061, η²=.144], suggesting a difference in the rate of development on the two tasks.
Since 100% is the maximum score in both tasks, there is a possibility that this interaction stems from a ceiling effect for Task 2. The performance advantage of 20% for Task 2 over Task 1 at around 30 months could not be replicated at 150 months since Task 1 is already at 90%: Task 2 would have to exceed the ceiling score.
Finally, the Parameter Estimates provide the coefficients for the two separate task trajectories in the TD group.
Tests of Within-Subjects Contrasts
Measure: MEASURE_1

Source         task     Type III Sum of Squares   df   Mean Square   F        Sig.   Partial Eta Squared
task           Linear   .112                      1    .112          14.278   .001   .383
task * TD_CA   Linear   .031                      1    .031          3.883    .061   .144
Error(task)    Linear   .181                      23   .008
Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Intercept   .136                      1    .136          17.943    .000   .438
TD_CA       1.499                     1    1.499         197.521   .000   .896
Error       .175                      23   .008
Parameter Estimates

Dependent Variable   Parameter   B      Std. Error   t        Sig.   95% CI Lower Bound   95% CI Upper Bound   Partial Eta Squared
Task1                Intercept   .014   .043         .317     .754   -.075                .102                 .004
Task1                TD_CA       .006   .000         13.035   .000   .005                 .007                 .881
Task2                Intercept   .281   .056         5.057    .000   .166                 .396                 .527
Task2                TD_CA       .005   .001         7.541    .000   .003                 .006                 .712
Using confidence intervals to assess when trajectories diverge/converge
In some circumstances, theory predicts that trajectories should converge or diverge. For instance, as children get better at recognising faces, they find it increasingly hard to detect differences in faces when they are presented upside-down. Therefore, normal development should produce an increasing trajectory of accuracy in upright face recognition and a decreasing trajectory of accuracy in inverted face recognition, and this is the pattern that has been observed in trajectory analysis for children between 6 and 12 years (Annaz, 2006). In cases like this, it is useful to derive the age at which two trajectories reliably diverge. This can be achieved by plotting the trajectories for the two tasks (or two groups for a between-participants comparison) along with their 95% confidence intervals, which can be generated using the linear regression function on each trajectory on its own (see section 1). The point at which the upper confidence interval of one trajectory and the lower confidence interval of the other cease to overlap provides an estimate of the age (or mental age) at which the trajectories diverge. The following plot illustrates this method for the repeated measures TD sample data. It indicates that the trajectories for the two tasks reliably converge above the age of 114 months (or reliably diverge below the age of 114 months).
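The same idea can be sketched numerically rather than read off a plot. The Python example below fits each trajectory separately, computes a confidence band around the fitted line, and scans for the youngest age at which the two bands overlap. Several things here are assumptions for illustration only: the data are invented, the band is taken to be the confidence interval for the fitted mean (the worksheet does not specify which interval SPSS plots), the critical t value is passed in by hand, and all function names are our own.

```python
import math


def fit_line(xs, ys):
    """Least-squares line plus the quantities needed for confidence bands."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return {"n": n, "mean_x": mean_x, "sxx": sxx, "slope": slope,
            "intercept": intercept, "resid_sd": math.sqrt(sse / (n - 2))}


def confidence_band(fit, x, t_crit):
    """Lower and upper confidence limits for the fitted mean at x."""
    y_hat = fit["intercept"] + fit["slope"] * x
    half_width = t_crit * fit["resid_sd"] * math.sqrt(
        1.0 / fit["n"] + (x - fit["mean_x"]) ** 2 / fit["sxx"])
    return y_hat - half_width, y_hat + half_width


def first_overlap_age(fit_a, fit_b, candidate_ages, t_crit):
    """Youngest candidate age at which the two bands overlap, else None."""
    for age in candidate_ages:
        lo_a, hi_a = confidence_band(fit_a, age, t_crit)
        lo_b, hi_b = confidence_band(fit_b, age, t_crit)
        if lo_a <= hi_b and lo_b <= hi_a:
            return age
    return None


# Invented data: 11 children aged 36 to 156 months, two tasks whose
# underlying lines converge with age, plus a fixed noise pattern.
ages = list(range(36, 157, 12))
noise = [0.03 if i % 2 == 0 else -0.03 for i in range(len(ages))]
task1 = [0.0061 * a + 0.0135 + e for a, e in zip(ages, noise)]
task2 = [0.0045 * a + 0.2809 + e for a, e in zip(ages, noise)]

T_CRIT_DF9 = 2.262  # two-tailed 5% critical value of t for 11 - 2 = 9 df

overlap_age = first_overlap_age(fit_line(ages, task1), fit_line(ages, task2),
                                range(36, 157), T_CRIT_DF9)
print("bands first overlap at", overlap_age, "months")
```

Below the returned age the bands are separate (the trajectories reliably differ); from that age upwards they overlap. With real data the answer will depend on sample size and residual variability, so it should be read as an estimate rather than a sharp boundary.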
[Figure: Task performance (0% to 100%) plotted against chronological age (months) for the TD group on two tasks, showing each trajectory with its 95% confidence intervals. TD group Task 1: y = 0.0061x + 0.0135, R² = 0.881. TD group Task 2: y = 0.0045x + 0.2809, R² = 0.7116.]
4. Analysing mixed design linear regressions: does the disorder group show the
same relationship between the development of two abilities as the TD group?
The SPSS data file sample TD disorder mixed design.sav contains performance data for the typically developing (TD) group of 25 children (ages 2;9 to 12;5) and the 16 children in the disorder group (ages 5;4 to 11;2) on two experimental tasks. We will therefore be constructing and comparing four developmental trajectories, two for each group.
The four trajectories constructed according to chronological age are:
[Figure: Task performance (0% to 100%) plotted against chronological age (months), with linear trendlines for four trajectories. TD group Task 1: y = 0.0061x + 0.0135, R² = 0.881. Disorder group Task 1: y = 0.0022x + 0.1181, R² = 0.104. TD group Task 2: y = 0.0045x + 0.2809, R² = 0.7116. Disorder group Task 2: y = 0.0033x - 0.1135, R² = 0.3343.]
We can assess the performance of the disorder group on an individual basis using the confidence intervals around the two TD trajectories. Say that a given individual from the disorder group scores at 20% on Task 1 and 33% on Task 2, a disparity of 13%. We can evaluate whether this pattern (20, 33) is found anywhere across the TD trajectories, with each point inside the respective TD trajectory’s confidence intervals; or, indeed, whether the disparity size of 13% is found anywhere across the TD trajectories. These questions can be answered irrespective of the age at which any such patterns are exhibited in the TD group. This is a theory-neutral comparison of individuals with the disorder to the normal pattern of development. However, as before, our primary interest lies with group comparisons.
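As a first pass at the disparity question, one can ask at what age the two TD trendlines are exactly 13% apart. The Python sketch below does this with the trendline equations for the TD group (Task 1: y = 0.0061x + 0.0135; Task 2: y = 0.0045x + 0.2809). It deliberately ignores the confidence intervals, which would widen the single age into a range, and the function name `age_with_disparity` is our own.

```python
# (slope, intercept) of the TD trendlines plotted against chronological age.
TD_TASK1 = (0.0061, 0.0135)
TD_TASK2 = (0.0045, 0.2809)


def age_with_disparity(line_low, line_high, disparity):
    """Age at which line_high exceeds line_low by exactly `disparity`."""
    (slope_low, icpt_low), (slope_high, icpt_high) = line_low, line_high
    return (disparity - (icpt_high - icpt_low)) / (slope_high - slope_low)


def level(line, age):
    slope, intercept = line
    return slope * age + intercept


disparity_age = age_with_disparity(TD_TASK1, TD_TASK2, 0.13)
task1_level = level(TD_TASK1, disparity_age)
task2_level = level(TD_TASK2, disparity_age)

print(f"13% disparity at {disparity_age:.1f} months "
      f"(Task 1 = {task1_level:.2f}, Task 2 = {task2_level:.2f})")
```

On these trendlines a 13% disparity does occur in the TD group, at roughly 86 months, but at much higher performance levels than the individual's (20, 33) pattern. Whether the pattern itself falls inside both confidence intervals at any age still has to be checked against the plotted intervals.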
To compare these trajectories statistically, we need to construct a mixed-design linear regression model in SPSS, with Group as a between-participants factor and Task as a within-participants factor. Select Analyze > General Linear Model > Repeated Measures. Define a Within-Subject Factor ‘task’ with two levels. Click on Define.
In the Repeated Measures dialogue, add the two variables Task1 and Task2 as Within-Subjects Variables. Add Group to the Between-Subjects Factor(s) box and CA to the Covariates box. Select Estimates of effect size and Parameter Estimates in the Options dialogue. Cook’s distance information may also be generated using the Save dialogue to check for outliers.
As with the between-groups analysis in Section 2, we now need to use the Model dialogue box to ensure that we have specified a fully factorial design for the analysis. Specify a Custom model. Select Main effects in the Build Terms dropdown menu, then highlight each of the factors task, Group, and CA(C) in turn and use the right arrow to transfer them to the Within-Subjects Model box and Between-Subjects Model box, respectively. Then select Interactions (or All 2-way does the same thing here) from the Build Terms dropdown menu, highlight Group and CA(C) by clicking on them, and then the right arrow to transfer the CA*Group interaction term to the Between-Subjects Model box. Then click on Continue and run the analyses by clicking on OK in the Repeated Measures dialogue.
We start with the key theoretical question: Does the disorder group show the same developmental relationship between the two tasks as that found in the TD group? This corresponds to a 3-way interaction between task x Group x CA. The answer is yes, for this interaction is non-significant [F(1, 37)=2.11, p=.155, η²=.054]. However, the two groups do show a different pattern of accuracy on each task, with the TD group performing more accurately on Task 2 than Task 1, while the disorder group performs more accurately on Task 1 than 2 [task x Group: F(1, 37)=7.07, p=.012, η²=.160; this interaction then renders the main effect of task non-significant].
Chronological age is a strong predictor of performance overall [CA: F(1, 37)=60.15, p<.001, η²=.619], but development once again occurs more slowly in the disorder group [Group x CA: F(1, 37)=6.06, p=.019, η²=.141].
The Excel chart clearly shows the disorder trajectories falling below the TD trajectories: why isn’t there a significant main effect of Group? As in Section 2, this occurs because of the slower rate of the disorder group and the fact that the comparison of intercepts is carried out at the y-axis (i.e., when x=0). Using the method outlined in the previous section for deriving a numerical value of onset delay from the regression equations, at the youngest age measured for the disorder group, there is already a performance decrement of 15% on Task 1 and 47% on Task 2. However, the analysis suggests that these accuracy differences stem from two systems for which accuracy levels did not initially differ (at an age before we started measuring) but which have diverged based on their different rates of growth. (Remember, trajectory analyses aim to offer a richer set of statistical descriptors that allow us to distinguish different ways in which trajectories can differ: in onset, rate, linearity, and so forth.)
The Parameter Estimates table allows the equations for each of the four trajectories to be constructed. Parameters are listed separately for each task. Within the task, the default intercept and gradient (onset and rate) are listed for Group 2, with Group 1 values corresponding to modifiers to these default values. These parameter values should correspond to the regression equations on the Excel Scatter-plot chart trendlines.
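To make the "default plus modifier" logic concrete, the Python sketch below rebuilds the four trajectories from the B column of the Parameter Estimates table. Two assumptions should be flagged: Group 2 is taken to be the disorder group and Group 1 the TD group (which is what the chart trendlines imply), and the negative signs on the Task 1 Group modifier and the Task 2 intercept are taken from the chart trendline equations. The function name is our own.

```python
def trajectories_from_parameters(intercept, slope,
                                 intercept_modifier, slope_modifier):
    """Return (slope, intercept) for the default group and the modified group.

    intercept, slope:    parameters for the default group (Group 2).
    *_modifier:          [Group=1.00] and [Group=1.00] * CA parameters.
    """
    default_group = (slope, intercept)
    modified_group = (slope + slope_modifier, intercept + intercept_modifier)
    return default_group, modified_group


# B values to three decimal places, as printed by SPSS.
task1_disorder, task1_td = trajectories_from_parameters(
    intercept=0.118, slope=0.002, intercept_modifier=-0.104, slope_modifier=0.004)
task2_disorder, task2_td = trajectories_from_parameters(
    intercept=-0.113, slope=0.003, intercept_modifier=0.394, slope_modifier=0.001)

for label, (slope, intercept) in [
    ("TD Task 1", task1_td), ("Disorder Task 1", task1_disorder),
    ("TD Task 2", task2_td), ("Disorder Task 2", task2_disorder),
]:
    print(f"{label}: y = {slope:.3f}x + {intercept:.3f}")
```

The results agree with the chart trendlines to within the three-decimal rounding of the table (for example, the TD Task 1 intercept of .014 against the trendline value of 0.0135).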
Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source       Type III Sum of Squares   df   Mean Square   F        Sig.   Partial Eta Squared
Intercept    .024                      1    .024          2.046    .161   .052
Group        .022                      1    .022          1.919    .174   .049
CA           .697                      1    .697          60.154   .000   .619
Group * CA   .070                      1    .070          6.060    .019   .141
Error        .429                      37   .012
Parameter Estimates

Dependent Variable   Parameter           B       Std. Error   t       Sig.   95% CI Lower Bound   95% CI Upper Bound   Partial Eta Squared
Task1                Intercept           .118    .130         .906    .371   -.146                .382                 .022
Task1                [Group=1.00]        -.104   .143         -.731   .470   -.394                .185                 .014
Task1                [Group=2.00]        0(a)    .            .       .      .                    .                    .
Task1                CA                  .002    .001         1.697   .098   .000                 .005                 .072
Task1                [Group=1.00] * CA   .004    .001         2.754   .009   .001                 .007                 .170
Task1                [Group=2.00] * CA   0(a)    .            .       .      .                    .                    .
Task2                Intercept           -.113   .126         -.901   .373   -.368                .141                 .021
Task2                [Group=1.00]        .394    .138         2.857   .007   .115                 .673                 .181
Task2                [Group=2.00]        0(a)    .            .       .      .                    .                    .
Task2                CA                  .003    .001         2.733   .010   .001                 .006                 .168
Task2                [Group=1.00] * CA   .001    .001         .878    .386   -.002                .004                 .020
Task2                [Group=2.00] * CA   0(a)    .            .       .      .                    .                    .
a. This parameter is set to zero because it is redundant.
Tests of Within-Subjects Contrasts
Measure: MEASURE_1

Source              task     Type III Sum of Squares   df   Mean Square   F       Sig.   Partial Eta Squared
task                Linear   .000                      1    .000          .037    .849   .001
task * Group        Linear   .066                      1    .066          7.066   .012   .160
task * CA           Linear   .000                      1    .000          .030    .864   .001
task * Group * CA   Linear   .020                      1    .020          2.112   .155   .054
Error(task)         Linear   .345                      37   .009
Plotting performance against mental age
Our next question is, is the developmental relation between the two tasks what we would expect in the disorder group given their level of cognitive development in the domain, as measured by our selected standardised test? In some circumstances, disorder groups can show an apparently atypical relationship between performance on two tasks that is in fact a sign of immaturity, i.e., it is commensurate with the overall stage of development in the cognitive domain. If this is the case, we would expect the developmental relationship to normalise when trajectories are constructed against MA instead of CA. (In that case, normalisation would be marked by a significant 3-way task x Group x CA interaction but a non-significant task x Group x MA interaction). Alternatively, a different relationship between the development of the two tasks according to MA would be suggestive of an atypically developing cognitive system (see Karmiloff-Smith et al., 2004; Thomas et al., 2001; Thomas et al., 2006, for examples of the application of the mixed-design method to test for atypical development in visuospatial and language domains, respectively).
For our current sample data, the trajectories plotted against MA are as follows:
[Figure: Task performance (0% to 100%) plotted against mental age (months), with linear trendlines for four trajectories. TD group Task 1: y = 0.0064x - 0.0462, R² = 0.8728. Disorder group Task 1: y = 0.0062x - 0.1762, R² = 0.7559. TD group Task 2: y = 0.0049x + 0.2306, R² = 0.7231. Disorder group Task 2: y = 0.0027x + 5E-05, R² = 0.1919.]
To carry out the analysis, replace CA with MA as the covariate in the Repeated Measures dialogue, remembering to ensure that the custom Model now contains Group, MA, and Group x MA as the Between-Subjects Factors.
For the sample data, the 3-way interaction of Task x Group x MA remains non-significant [F(1, 37)=1.05, p=.313, η²=.028], indicating a normal developmental relationship between the tasks in the disorder group.
What has changed in assessing task development with reference to the level of ability in the general domain (as indexed by the standardised test) as opposed to chronological age? The Task x Group effect has become non-significant, a result of the disorder group’s two task trajectories becoming less distinguishable at younger MAs. Group x MA is also non-significant: the disorder group’s task trajectories are mildly diverging and the TD group’s mildly converging: the average trajectory for each group is now roughly similar. Averaging over tasks, the disorder group is developing at the rate one would expect given their level of mental ability (again, note that the CA comparison shows it is not developing at a normal rate). However, mean task performance is now at a reliably lower level in the disorder group [F(1, 37)=5.08, p=.030, η²=.121]. That is, given their level of mental ability, development is at the normal rate but there is an onset delay. Performance is below the level one would expect based on the standardised test.
How much below? Plugging the youngest MA measured in the disorder group into
the
equations for the four lines, shown on the chart and also derivable from the Parameter
Estimates table, the initial task disparity between the groups is 14% for Task 1 and
33% for Task 2. This is an average performance disparity of 23%.
Lastly, note that in this mixed-design, we have principally focused on reporting the results involving the Group factor, either as a main effect or in interactions. We have generally found that mixed-designs are mostly useful for evaluating how disorder group status modifies the normal pattern of development (either when plotted against CA or against MA). Particularly when the variability within the groups is different, it is often not useful to explore main effects of task or CA or MA in the mixed-design analysis, since this serves to conflate the groups. Instead, our usual practice for data of the kind presented here would be to characterise the pattern of normal development using a repeated measures design just with the TD group, then characterise the pattern observed in the disorder group, again using a repeated measures design just with the disorder group, and then finally to test whether differences between the TD and disorder pattern are reliable by using a mixed-design and noting the involvement of the Group factor. (Since one is theoretically interested both in normal development and development within the disorder group, these individual group trajectories are planned comparisons. Therefore, the omnibus comparison need not be the first statistical analysis).
Tests of Within-Subjects Contrasts
Measure: MEASURE_1

Source              task     Type III Sum of Squares   df   Mean Square   F       Sig.   Partial Eta Squared
task                Linear   .064                      1    .064          7.485   .009   .168
task * Group        Linear   .003                      1    .003          .365    .550   .010
task * MA           Linear   .060                      1    .060          6.981   .012   .159
task * Group * MA   Linear   .009                      1    .009          1.048   .313   .028
Error(task)         Linear   .316                      37   .009
Tests of Between-Subjects Effects
Measure: MEASURE_1
Transformed Variable: Average

Source       Type III Sum of Squares   df   Mean Square   F         Sig.   Partial Eta Squared
Intercept    .000                      1    .000          .003      .959   .000
Group        .040                      1    .040          5.080     .030   .121
MA           .961                      1    .961          120.862   .000   .766
Group * MA   .013                      1    .013          1.689     .202   .044
Error        .294                      37   .008
Parameter Estimates

Dependent Variable   Parameter           B       Std. Error   t        Sig.   95% CI Lower Bound   95% CI Upper Bound   Partial Eta Squared
Task1                Intercept           -.176   .084         -2.100   .043   -.346                -.006                .106
Task1                [Group=1.00]        .130    .096         1.349    .186   -.065                .325                 .047
Task1                [Group=2.00]        0(a)    .            .        .      .                    .                    .
Task1                MA                  .006    .001         6.251    .000   .004                 .008                 .514
Task1                [Group=1.00] * MA   .000    .001         .198     .844   -.002                .002                 .001
Task1                [Group=2.00] * MA   0(a)    .            .        .      .                    .                    .
Task2                Intercept           .000    .114         .003     .998   -.231                .231                 .000
Task2                [Group=1.00]        .230    .131         1.755    .088   -.036                .496                 .077
Task2                [Group=2.00]        0(a)    .            .        .      .                    .                    .
Task2                MA                  .003    .001         2.008    .052   .000                 .005                 .098
Task2                [Group=1.00] * MA   .002    .001         1.439    .159   -.001                .005                 .053
Task2                [Group=2.00] * MA   0(a)    .            .        .      .                    .                    .
a. This parameter is set to zero because it is redundant.
If you have any queries or comments on these methods, please feel free to contact: [add your favourite email address here].
Acknowledgements
This research was supported by UK Medical Research Council Grant G0300188 to Michael Thomas.
References
Annaz, D. (2006). The development of visuospatial processing in children with autism, Down syndrome, and Williams syndrome. Unpublished PhD thesis. University of London.
Howell, D. C. (2007). Statistical methods for psychology (6th Ed.). Thomson Wadsworth. Belmont, CA.
Karmiloff-Smith, A., Thomas, M. S. C., Annaz, D., Humphreys, K., Ewing, S., Grice, S., Brace, N., Van Duuren, M., Pike, G., & Campbell, R. (2004). Exploring the Williams syndrome face processing debate: The importance of building developmental trajectories. Journal of Child Psychology and Psychiatry and Allied Disciplines, 45(7), 1258-1274.
Motulsky, H., & Christopoulos, A. (2004). Fitting models to biological data using linear and non-linear regression. Oxford: Oxford University Press.
Thomas, M. S. C. (2003). Multiple causality in developmental disorders: Methodological implications from computational modelling. Developmental Science, 6(5), 537-556.
Thomas, M. S. C., Dockrell, J. E., Messer, D., Parmigiani, C., Ansari, D., & Karmiloff-Smith, A. (2006). Speeded naming, frequency and the development of the lexicon in Williams syndrome. Language and Cognitive Processes, 21(6), 721-759.
Thomas, M. S. C., Grant, J., Gsödl, M., Laing, E., Barham, Z., Lakusta, L., Tyler, L. K., Grice, S., Paterson, S. & Karmiloff-Smith, A. (2001). Past tense formation in Williams syndrome. Language and Cognitive Processes, 16, 143-176.