An introduction to the concepts of multilevel modelling
Kelvyn Jones, School of Geographical Sciences, University of Bristol
Oxford Research Methods Festival
1st July 2004
MULTILEVEL MODELS
AKA random-effects models, hierarchical models, variance-components models, random-coefficient models, mixed models
Three KEY Notions
Modelling contextuality
- eg length of stay (LOS) varies from hospital to hospital
- eg LOS varies differentially for patients of different ages from hospital to hospital
Modelling heterogeneity
- standard regression models ‘averages’, ie the general relationship
- ML models variances
- Eg between-hospital AND between-patient, within-hospital variation
Modelling data with complex structure
- ML deals with complex structure deriving from reality AND design of study
Realistically complex modelling
‘Statistical models as a formal framework of analysis with a complexity of structure that matches the system being studied’
OUTLINE
- Complexity?
- Diagrams representing complexity
- Graphs of varying relations
- From graphs to equations
- What sort of questions does multilevel modelling answer?
- Resources for going further
COMPLEXITY?
Data from populations with a complex structure
Two Forms and Two Types
As ‘Structure’
- naturally occurring dependencies
  Eg: pupils (1) in schools (2); people (1) in neighbourhoods (2); many to fewer
- ‘imposed-by-design’ dependencies
  Eg: multistage sample
As ‘Missingness’
- naturally occurring imbalances
  Eg: not answering in a panel study
- ‘imposed-by-design’ imbalances
  Eg: rotational questions
ALL substantively interesting
Eg Contextuality (school effects, area effects); Heterogeneity (differentiated differentiation)
AND technically important
Eg dependency leading to mis-estimated precision; ecological fallacy, atomistic fallacy …
COMPLEXITY: UNIT DIAGRAMS
1: Hierarchical structures
a) Pupils nested within schools: modelling progress
b) Repeated measures of voting behaviour at the UK general election
c) Multivariate design for health-related behaviours
COMPLEXITY: UNIT DIAGRAMS (continued)
NB: multilevel structures: overwhelming majority of applications are hierarchical
BUT reality can be more complex…
2: Non-hierarchical structures
a) cross-classified structure
b) multiple membership with weights
Can represent reality by COMBINATIONS of different types of structures
But can get complex so…
COMPLEXITY: CLASSIFICATION DIAGRAMS (IOE)
a) 3-level hierarchical structure
b) cross-classified structure
c) multiple membership structure
d) spatial structure
CLASSIFICATION DIAGRAMS FOR ALSPAC
(Source: Jon Rasbash)
All children born in Avon in 1990 followed longitudinally
Multiple attainment measures on a pupil as they pass through primary school
Pupils span 3 school-year cohorts (say 1996, 1997, 1998)
Pupils move between teachers, schools, neighbourhoods
Pupils' progress potentially affected by their own changing characteristics, the pupils around them, their current and past teachers, schools and neighbourhoods
IS SUCH COMPLEXITY NEEDED?
Complex models are reducible to simpler models
Confounding of variation across levels (eg primary and secondary school variation)
[Classification diagram; node labels: M. occasions, Pupil, Teacher, School Cohort, Primary school, Area, N’hood]
VARYING RELATIONS
Multilevel modelling can handle
- multiple outcomes
- categorical & continuous predictors
- categorical and continuous responses
But KISS…
Single response: house prices
Single predictor: number of rooms
- centred for ease of interpretation
- average-sized house has 5 rooms
Original:   2   3   4   5   6   7   8
Centred:   -3  -2  -1   0   1   2   3
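The centring shown in the table can be sketched in a few lines of Python. This is only an illustration: the function name `centre` is an assumption made for this example, and the reference value of 5 rooms is the average-sized house from the slide.

```python
# Centre a predictor around a chosen reference value so that zero
# corresponds to the reference (here the 5-room, average-sized house).
def centre(values, reference):
    """Return each value minus the reference value."""
    return [v - reference for v in values]

rooms = [2, 3, 4, 5, 6, 7, 8]            # original number of rooms
centred_rooms = centre(rooms, reference=5)
print(centred_rooms)  # [-3, -2, -1, 0, 1, 2, 3]
```

After centring, the intercept of a model for price refers to a 5-room house rather than to a house with no rooms.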
Two level hierarchy: houses at level 1 nested within neighbourhoods at level 2
Set of characteristic plots…
VARYING RELATIONS (continued)
FROM GRAPHS TO EQUATIONS I
[Figure: Price plotted against Rooms - 5; lines labelled 1 to 6, G and M]
2-level model: individuals and neighbourhoods (VC)
Response = Fixed effects + (Random effects)
Price of house i in N’hood j = Cost of average-sized house + Cost of extra room + (N’hood effect + House effect)
$y_{ij} = \beta_0 + \beta_1 x_{1ij} + (\mu_j + \varepsilon_{ij})$
$\mu_j \sim N(0, \sigma^2_{\mu})$: between-neighbourhood variance (conditional)
$\varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon})$: within-neighbourhood, between-house variation
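Terminology (random = ’allowed to vary’)
$\beta_0, \beta_1$: fixed effects (grand mean, grand slope)
$\mu_j, \varepsilon_{ij}$: random effects or multilevel residuals
$\sigma^2_{\mu}, \sigma^2_{\varepsilon}$: random parameters (the variances of $\mu_j$ and $\varepsilon_{ij}$)
Specifying more complex models on handout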
FROM GRAPHS TO EQUATIONS II
In MLwiN software
[Figure: Price plotted against Rooms - 5; lines labelled 1 to 6, G and M]
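VP: within N’hood dependence:
$\sigma^2_u / (\sigma^2_u + \sigma^2_e)$ = 187.1 / (187.1 + 776.7) = 0.19
(where $\sigma^2_u$ is the between-neighbourhood variance and $\sigma^2_e$ the within-neighbourhood, between-house variance)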
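The VP calculation on this slide is simple arithmetic and can be checked with a short Python sketch. The function name is an assumption made for this illustration; the two variance estimates are the ones quoted on the slide.

```python
# Share of the total (conditional) variance that lies between
# neighbourhoods: between / (between + within).
def variance_partition(between_var, within_var):
    """Return the proportion of total variance at the higher level."""
    return between_var / (between_var + within_var)

vp = variance_partition(between_var=187.1, within_var=776.7)
print(round(vp, 2))  # 0.19
```

A value of 0.19 means that about a fifth of the remaining variation in price lies between neighbourhoods, which is also the degree of dependence between two houses in the same neighbourhood.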
WHAT IS A LEVEL?
WHAT IS DIFFERENCE BETWEEN A VARIABLE AND A LEVEL?
EG why are schools a level but gender a variable?
Schools = Level = a population of units from which we have taken a random sample
Gender = variable = Boy/girl exhaust the categories
Boy/girl ≠ sample out of all possible gender categories
Random classification
- generalisation of a Level eg households, neigh’hoods
- random effects come from a distribution; all schools contribute to between-school variance
Fixed classifications
- discrete categories of a variable
- not sample from a wider population
- fixed effects are averages
- only boys contribute to mean for boys
Households in the fixed part referring to specific households
Households in the random part referring to households in general
TYPES OF QUESTIONS TACKLED BY MULTILEVEL MODELLING (in relation to fixed and random)
2-level model: Current attainment given prior attainment of pupils (1) in schools (2); a random sample of schools from a population of schools.
Do boys make greater progress than girls? (F)
Are boys more or less variable in their progress than girls? (R)
What is the between-school variation in progress? (R)
Is school X different from other schools in the sample in its effect? (F)
Are schools more variable in their progress for pupils with low prior attainment? (R)
Does the gender gap vary across schools? (R)
Do pupils make more progress in denominational schools? (F)
Are pupils in denominational schools less variable in their progress? (R)
Do girls make greater progress in denominational schools? (F) - cross-level interaction
RESOURCES
- web-based guide to resources (text-based resources, references, software, training associated with software, freeware, individual web-pages…)
  http://www.ggy.bris.ac.uk/staff/information/mlwebresources.doc
- hands-on training (Essex Summer School; http://multilevel.ioe.ac.uk/)
MULTILEVEL MODELS:
Substantive: Modelling problems with complex structure
Interested in variability, heterogeneity
Technical: handles ‘clustering’ and dependencies
Otherwise tendency to infer a relationship where none exists
But make demanding assumptions (eg school effects follow a Normal distribution) and data hungry
And not needed when little structural complexity
MULTILEVEL WEB-BASED RESOURCES
http://www.ggy.bris.ac.uk/staff/information/mlwebresources.doc
These resources have been put together by Kelvyn Jones, Myles Gould and VS Subramanian. They derive from the questions we are frequently asked when teaching introductory multilevel courses. The listing of resources is designed as an ‘organised meta-site’ whereby other resources for multilevel modelling on the web can be accessed. We have tended to list a specific resource only once, so you may have to search through the document to find what you want. To that end we have put it on the web as a single document, and you can then use the Find or Search facility on your browser to scan the entire document.
General
The best general site is the Centre for Multilevel Modelling:
http://multilevel.ioe.ac.uk/
The multilevel mailing list is also a key general resource as it is searchable; it represents many years of accumulated
questions and answers:
http://www.mailbase.org.uk/lists/multilevel/
Another vital resource is provided by UCLA Academic Technology Services who maintain data and worked
examples in a number of different software packages for a number of different multilevel textbooks:
http://www.ats.ucla.edu/stat/mlm/default.htm
Book and other text-based downloads
Goldstein’s classic text (in its 2nd edition with corrections) on multilevel modelling can be downloaded from
http://www.arnoldpublishers.com/support/goldstein.htm
Supplementary material for Snijders, T and Bosker, R (1999) Multilevel Analysis: An introduction to basic and advanced multilevel modelling (including updates and corrections, data sets used in examples, with set-ups for running the examples in MLwiN and in HLM, and an introduction to MLwiN) can be found at:
http://stat.gamma.rug.nl/snijders/mlbook1.htm
Joop Hox has some downloadable example chapters from his textbook Multilevel Analysis: Techniques and Applications at
http://www.fss.uu.nl/ms/jh/mlbook/leabook.htm
The complete contents of his 1995 Applied Multilevel Analysis, Amsterdam: TT-Publikaties, are available from
http://www.fss.uu.nl/ms/jh/publist/amaboek.pdf
There is also a very brief introduction to modelling change with random effects models at:
http://www.gmu.edu/departments/psychology/ployhart/Gcourses/P892F01/Modeling%20Change%20handout.doc
To read a comparison of multilevel modelling with traditional approaches to running ANOVA, regression, and logistic regression with memories/events being "nested" within people/testing session see Wright, D. B. (1998). Modelling clustered data in autobiographical memory research: The multilevel approach. Applied Cognitive Psychology, 12, 339-357, which is available as a download from:
http://www.cogs.susx.ac.uk/users/danw/pdf/multil.pdf
To keep up to date with developments in the field have a look at the downloadable Multilevel Newsletters:
http://multilevel.ioe.ac.uk/publref/newsletters.html
References to published work on multilevel modelling
A list of references to material that uses multilevel modelling can be found at Jung-Ho Yang’s myschoolofeducation.net
http://www.mysoe.net/1multilevel/reference.htm
and the Centre for Multilevel Modelling has a very extensive and growing list of references
http://multilevel.ioe.ac.uk/publref/references.html
Wolfgang Ludwig-Mayerhofer’s annotated references on multilevel modelling
http://www.lrz-muenchen.de/~wlm/wlmmule.htm#Literature
Software in general
If you want to compare the different packages that are available for multilevel modelling, detailed comparisons are
being developed at
http://multilevel.ioe.ac.uk/softrev/index.html
If you want to see how a particular model can be fitted in particular software, there are the developing resources at UCLA
http://www.ats.ucla.edu/stat/mlm/default.htm
For those wishing to analyse longitudinal data, software instructions in a wide range of programs are provided by UCLA to accompany the textbook ‘Applied Longitudinal Data Analysis: Modelling Change and Event Occurrence’ by Judith D. Singer and John B. Willett
http://www.ats.ucla.edu/stat/examples/alda/
A listing of available software is also at Jung-Ho Yang’s myschoolofeducation.net
http://www.mysoe.net/1multilevel/software.htm
Training associated with software
A growing amount of web-based (or at least downloadable) training material is being developed. We have chosen to organise this section by the particular software that is being used, and rather arbitrarily separated commercial software from the freeware that follows.
aML
can be used to fit a range of multilevel models but has specific features for fitting multi-process or simultaneous equation models to hierarchical data where predictor variables may be non-random or endogenous, and other types of models used by economists such as multilevel Heckman selection models
http://www.applied-ml.com/product/multiprocess.html
HLM
how to undertake 2-level analyses
http://www.ssicentral.com/hlm/example.htm
HLM
Jason Newsom’s Multilevel Regression course that uses HLM, but covers a lot of other ground too
(eg handout ‘ Distinguishing between random and fixed: variables, effects, and coefficients’)
http://www.upa.pdx.edu/IOA/newsom/mlrclass/default.htm
MLWIN
you can download a version of the software, data and training manuals from TRAMSS (Teaching Resources and Materials for Social Scientists)
http://tramss.data-archive.ac.uk
MLWIN
James Brown has a multilevel course (with data) using MLwiN at
http://www.socstats.soton.ac.uk/courses/st622/workshops.html
MLWIN
the downloadable manuals are of themselves a course in the practical aspects of multilevel models
http://multilevel.ioe.ac.uk/download/manuals.html
MLWIN
The substantial enhancement of the MCMC procedures in MLwiN is discussed in full in 'MCMC Estimation in MLwiN', which is to be used with the development version of the program
http://multilevel.ioe.ac.uk/dev/develop.html
Mplus
This software allows structural equation modelling, multilevel modelling and mixture modelling; the home site has training downloads and examples
http://www.statmodel.com/mplus/examples/webnote.html
SAS
Judy Singer has a pdf download that shows how to fit multilevel models in PROC MIXED; it is
very well written
http://gseweb.harvard.edu/~faculty/singer/
UCLA has implemented the Singer example in other software (eg R/S-Plus, HLM, MLwiN, SPSS)
http://www.ats.ucla.edu/stat/paperexamples/singer/default.htm
SAS
C.J. Anderson has a lot of material for his course online at:
http://www.ed.uiuc.edu/courses/EdPsy490CK/
SAS
The code and data to fit the models contained in SAS System for Mixed Models (1996) by RC Littell, GA Milliken, WW Stroup, and RD Wolfinger, are to be found at:
http://www.sas.com/samples/A55235
SPSS
A useful HTML-based tutorial demonstrating the use of the recently introduced Linear Mixed Models procedure in SPSS Advanced Models is to be found at (search under Linear Mixed Models case studies):
http://www.spss.com/downloads/Papers.cfm
‘Freeware’
There are a number of programs that are available at low or nil cost; some of these are general (like R), others are
more specific but can have special features that make them particularly attractive; we have tried to identify these
special features below. We have also pointed to some appropriate training resources.
BAYESX
has a number of distinctive features including handling structured (correlated) and/or unstructured (uncorrelated) effects of spatial covariates (geographical data) and unstructured random effects of unordered group indicators. It allows non-parametric relationships between the response and the predictors (generalized additive models) and does this for continuous and discrete outcomes; it can also manipulate and display geographical maps:
http://www.stat.uni-muenchen.de/~lang/bayesx/bayesx.html
BUGS
Bayesian inference Using Gibbs Sampling is really a flexible language that allows the fitting of a very wide range of models using MCMC methods; this is a very rich site developed by the MRC Biostatistics Research Unit in Cambridge which has lots of freely downloadable software and detailed manuals
http://www.mrc-bsu.cam.ac.uk/bugs
BUGS
A number of courses using BUGS have been put online; a listing is given at
http://www.mrc-bsu.cam.ac.uk/bugs/weblinks/webresource.shtml
BUGS
Peter Congdon has written two books based around BUGS (Bayesian Statistical Modelling, and Applied Bayesian Modelling); data and programmes are available for both books at
ftp://www.wiley.co.uk/pub/books/congdon/
GeoBUGS
is an add-on to BUGS that has been developed by a team at Imperial College to fit spatial models and produce a range of maps as output.
http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/geobugs.shtml
GLLAMM
this software usefully undertakes multilevel latent class and factor analysis, uses adaptive quadrature to derive the full likelihood with discrete and normal responses, and has facilities for fitting non-parametric models in which the distribution at the higher level can be non-normal (you need STATA to run this software; preferably STATA 8); this software is particularly useful for the models listed above, but can be slow to converge. This site is also a rich one with a growing number of downloads of lectures and papers showing how the approach can be used in practice
http://www.iop.kcl.ac.uk/IoP/Departments/BioComp/programs/gllamm.html
MIX
These are a set of stand-alone programmes that fit a number of specific models including mixed-effects linear regression, mixed-effects logistic regression for nominal or ordinal outcomes, mixed-effects probit regression for ordinal outcomes, mixed-effects Poisson regression, and mixed-effects grouped-time survival analysis. They have a common interface, and importantly they calculate the likelihood directly, so allowing comparison of the change in deviance for nested models. There are versions for Windows as well as for PowerMac and Solaris
http://www.uic.edu/~hedeker/mix.html
R
R is a complete system for statistical computation and graphics; it can be seen as an Open Source implementation of the S language, which in turn underlies the S-Plus software. It is distributed freely under the GNU General Public License and can be used for commercial purposes. It operates across a very wide range of platforms. The latest version and documentation can be obtained via CRAN, the Comprehensive R Archive Network
http://cran.r-project.org
R
normal-theory models are fitted in R using the lme and nlme functions described in full in ‘Mixed-effects models in S and S-PLUS' by J. C. Pinheiro and D. M. Bates (2000); there is online support for this book at
http://nlme.stat.wisc.edu/MEMSS/
R
NLME: Software for mixed-effects models; further information with downloads can be found at
http://cm.bell-labs.com/cm/ms/departments/sia/project/nlme/
R
for discrete responses there is the function glmmPQL, which is discussed in the 4th edition of Modern Applied Statistics with S by W. N. Venables and B. D. Ripley; the book also covers normal-theory models; there is online support for the book at:
http://www.stats.ox.ac.uk/pub/MASS4/
R
Jeff Gill maintains a website that provides help, tutorials and references for those who want to use R
http://web.clas.ufl.edu/%7Ejgill/s-language.help.html
Useful macros and other software
PreML
There is a very useful utility, written to export an SPSS file into an MLwiN worksheet; it is downloadable from Tom Snijders’ webpage
http://stat.gamma.rug.nl/snijders/PreML.inc
Diagnostics
Tom Snijders’ homepage contains a set of MLwiN macros for producing diagnostics
http://stat.gamma.rug.nl/snijders/mlnmac.htm
PINT
For determining appropriate required sample sizes and power in a two-level model; there is a manual
http://stat.gamma.rug.nl/snijders/multilevel.htm#progPINT
OD
is another program for power analysis and optimal design; it has excellent graphical output but as yet no manual, so you will need to read the published papers by Steve Raudenbush and Xiao-Feng Liu
http://www.ssicentral.com/other/hlmod.htm
DismapWin
is public domain software for the statistical analysis of epidemiological maps; it allows the analysis of unobserved heterogeneity using mixture models; the program offers a Poisson regression approach which links disease and exposure data
http://ftp.ukbf.fu-berlin.de/sozmed/DismapWin.html
Websites maintained by individuals
Douglas Bates, who developed the LME and NLME functions in R and S-Plus, has a website at
http://franz.stat.wisc.edu/~bates/bates.html
Bill Browne (who has made major contributions to the MCMC component of MLwiN) has a large number of downloadable papers at
http://www.maths.nott.ac.uk/personal/pmzwjb/bill.html
David Draper’s home page has a lot of material about the Bayesian approach to hierarchical models
http://www.cse.ucsc.edu/~draper/
Harvey Goldstein, who is the instigator of the MLwiN software, has a number of downloadable papers at his personal website
http://www.ioe.ac.uk/hgpersonal
Don Hedeker, who has been behind the MIX set of programs, has lecture transparencies and class notes on longitudinal analysis at
http://tigger.uic.edu/~hedeker/
Joop Hox’s webpage has papers, programs and lectures to download at
http://www.fss.uu.nl/ms/jh/
Bengt Muthen, who is the developer of Mplus, which allows multilevel factor analysis, has a site at
http://www.gseis.ucla.edu/faculty/muthen/muthen3.htm
Jon Rasbash, who has written most of the code for MLwiN, has downloadable papers at
http://multilevel.ioe.ac.uk/team/jon.html
Steve Raudenbush’s LAMMP website has publications and pre-prints and links to the projects he is currently working on:
http://www-personal.umich.edu/~rauden/
Tom Snijders homepage
http://stat.gamma.rug.nl/snijders/multilevel.htm
Fiona Steele has a number of downloadable papers, particularly on multilevel event history analysis
http://multilevel.ioe.ac.uk/team/fiona.html
Tutorials in MCMC estimation
MCMC estimation is increasingly being used to estimate complex models; there are a number of sites with really helpful resources to get you started:
Simon Jackman’s Estimation and Inference via Markov chain Monte Carlo: a resource for social scientists
http://tamarama.stanford.edu/mcmc/
Jeff Gill’s homepage is a mine of information in this area; it includes some downloadable chapters from his 2002 book Bayesian Methods for the Social and Behavioral Sciences, which is to be thoroughly recommended
http://web.clas.ufl.edu/%7Ejgill/
Sujit Sahu’s tutorial on MCMC
http://www.maths.soton.ac.uk/staff/Sahu/utrecht/
There is a lot of background material on MCMC in 'MCMC Estimation in MLwiN'
http://multilevel.ioe.ac.uk/dev/develop.html
A Brief Introduction to Graphical Models and Bayesian Networks is to be found at
http://www.ai.mit.edu/~murphyk/Bayes/bayes.html
To keep up to date in this area, you can visit the MCMC preprint service
http://www.statslab.cam.ac.uk/~mcmc/
Specification of multilevel models
A simple single-level regression model of student achievement can be specified as follows:
University score = Constant + A-level score + Gender effect + (Student effect)
$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + \beta_2 x_{2i} + (\varepsilon_{0i} x_{0i})$
where
$y_i$ is the grade point average score for student i at the end of their degree course (the approach can also handle binary data, such as good degree or not, or multiple categories such as good, poor, fail);
$x_{0i}$ is the constant, the value 1 for every student;
$x_{1i}$ is the continuous predictor of the pre-entry score; it makes a lot of sense to ‘centre’ this value around some plausible value such as the average score for all students;
$x_{2i}$ has the value of 1 if the student is a female, 0 otherwise.
With this specification, $\beta_0$, the intercept, is the average university score for a male student with an average pre-entry score; $\beta_1$ (if positive) is the increase in the university score as the pre-entry score goes up by one unit; and $\beta_2$ represents the difference on average between males and females on university score after taking account of the pre-entry score. The $\beta$’s are averages that give the general result across all students; they are ‘fixed’ parameters that do not vary from student to student. The final term, $\varepsilon_{0i}$, is allowed to vary between students, and this ‘random’ term (signified by brackets) represents the difference between the actual university score and that predicted by the model. The random term is assumed to come from a distribution (in this case a Gaussian one because of the continuous response) which can be summarised by a single variance term, $\varepsilon_{0i} \sim N(0, \sigma^2_{\varepsilon 0})$. This variance term assesses how much variability remains between students in the university score after taking account of their pre-entry score and gender. This type of model is known as a homoscedastic one, as each student is assumed to have the same variance after taking account of the pre-entry score.
We can develop this single-level model by including additional terms associated with gender and pre-entry score in the random part of the model:
$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + \beta_2 x_{2i} + (\varepsilon_{0i} x_{0i} + \varepsilon_{1i} x_{1i} + \varepsilon_{2i} x_{2i})$
These three random terms allow for variability in university achievement that is differentiated by type of student. The exact pattern depends on the size and nature of the estimates, but it could be that females are more consistent in their performance, while students with low entry scores are more variable. This random-coefficient model can only be estimated by multilevel software such as the package MLwiN.
We turn now to a two-level hierarchical model, and re-specify the previous model to reflect this structure:
$y_{ij} = \beta_{0j} x_{0ij} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + (\varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$
There are two changes. The subscript is now ij as student i is nested in school j. The intercept has been indexed ($\beta_{0j}$) as there is now a set of intercepts, one for each school. We also write a school-level model with its distributional assumptions:
$\beta_{0j} = \beta_0 + (\mu_{0j}); \quad \mu_{0j} \sim N(0, \sigma^2_{\mu 0})$
$\mu_{0j}$ is the differential effect for having attended school j and it is allowed to vary around the overall intercept, $\beta_0$. If a student’s university performance is not influenced by the school attended, the $\mu_{0j}$ will be close to zero, as will the between-school variance, $\sigma^2_{\mu 0}$.
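To make the idea of a between-school variance concrete, the sketch below estimates the two variance components for a small made-up data set. It rests on several assumptions that are not in the handout: the data are balanced (the same number of students per school), there are no predictors, and the estimator is the classical one-way ANOVA (method of moments) estimator rather than the estimation method used by multilevel software. The function name and the scores are hypothetical.

```python
from statistics import mean

def anova_variance_components(groups):
    """Method-of-moments estimates for a balanced two-level model
    with no predictors.

    groups: list of equal-sized lists, one list of responses per school.
    Returns (between_school_variance, within_school_variance).
    """
    n = len(groups[0])                      # students per school
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes balanced data")
    n_schools = len(groups)
    school_means = [mean(g) for g in groups]
    grand_mean = mean(school_means)
    # Mean square between schools and mean square within schools
    ms_between = n * sum((m - grand_mean) ** 2 for m in school_means) / (n_schools - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, school_means) for y in g
    ) / (n_schools * (n - 1))
    # The between-school variance cannot be negative, so truncate at zero
    between_var = max((ms_between - ms_within) / n, 0.0)
    return between_var, ms_within

# Hypothetical scores for three schools with three students each
scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
between_var, within_var = anova_variance_components(scores)
print(between_var, within_var)
```

If schools made no difference, the school means would differ only by chance and the estimated between-school variance would be close to zero.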
This model is known as a variance components or random-intercepts model and implies that schools make a difference but do so in a general way for all types of students. A more complex random-coefficient model would index the slope of the university/pre-entry relationship in the student-level micro-model:
$y_{ij} = \beta_{0j} x_{0ij} + \beta_{1j} x_{1ij} + \beta_2 x_{2ij} + (\varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$
and there would be two macro-models for the random intercepts and slopes:
$\beta_{0j} = \beta_0 + (\mu_{0j})$
$\beta_{1j} = \beta_1 + (\mu_{1j})$
Whereas $\beta_1$ is the general relationship across all schools, the $\mu_{1j}$ are the differential slopes. If this is a positive value, a pupil going to this school does better at university than their A-level score would suggest; a negative value for this term would suggest they do not do as well as predicted in general by their A-level score. There are now two random terms at the school level and we can summarise their distribution as a joint multivariate Gaussian distribution:
$(\mu_{0j}, \mu_{1j}) \sim N\left(0, \begin{bmatrix} \sigma^2_{\mu 0} & \sigma_{\mu 01} \\ \sigma_{\mu 01} & \sigma^2_{\mu 1} \end{bmatrix}\right)$
The off-diagonal covariance term is important. For example, if $\sigma_{\mu 01}$, the covariance between the differential intercept and the differential slope for pre-entry, is positive, the variance between schools will be greatest for students with high pre-entry scores (a positive $\mu_{0j}$ being associated with a positive $\mu_{1j}$). If none of these variance-covariance terms is significantly different from zero, the school a student attended has no effect on subsequent performance once gender and pre-entry score have been taken into account.
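The role of the covariance can be seen by writing the between-school variance as a function of the pre-entry score: the variance of $\mu_{0j} + \mu_{1j} x_1$ is $\sigma^2_{\mu 0} + 2\sigma_{\mu 01} x_1 + \sigma^2_{\mu 1} x_1^2$. The Python sketch below evaluates this function; the function name and the three parameter values are hypothetical and chosen only to illustrate a positive covariance.

```python
def between_school_variance(x1, var_intercept, cov_intercept_slope, var_slope):
    """Variance of (mu_0j + mu_1j * x1) for a given centred pre-entry score x1."""
    return var_intercept + 2 * cov_intercept_slope * x1 + var_slope * x1 ** 2

# Hypothetical school-level variance-covariance terms (positive covariance)
params = dict(var_intercept=4.0, cov_intercept_slope=1.0, var_slope=0.5)

# Evaluate at a low, an average and a high centred pre-entry score
variances = {x1: between_school_variance(x1, **params) for x1 in (-2, 0, 2)}
for x1, v in variances.items():
    print(x1, v)
```

With these illustrative values the between-school variance is smallest for students with low pre-entry scores and largest for those with high scores, which is the pattern described in the text for a positive covariance.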
If there is significant variation between schools we would like to account for it by school characteristics. We do this by including in the macro-model either continuous or categorical measures. Thus, we may assess whether fee-paying schools produce an increase in the performance of the average student but attenuate the university/pre-entry relationship. We would then include a categorical predictor ($w_{1j}$ equals 1 for fee-paying, otherwise 0) in both macro-models:
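$\beta_{0j} = \beta_0 + \alpha_1 w_{1j} + (\mu_{0j})$
$\beta_{1j} = \beta_1 + \alpha_2 w_{1j} + (\mu_{1j})$
and these can be combined into an overall multilevel model:
$y_{ij} = \beta_0 x_{0ij} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \alpha_1 w_{1j} x_{0ij} + \alpha_2 w_{1j} x_{1ij} + (\mu_{0j} x_{0ij} + \mu_{1j} x_{1ij} + \varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$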
The key terms are $\beta_1$, which represents the relation between pre-entry score and attainment for non-fee-paying schools, and $\alpha_2$, which represents the differential for those who attended a fee-paying school. The latter is a cross-level interaction, for it involves variables at both levels: the school type and the student pre-entry score.
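The way the cross-level interaction changes the pre-entry slope can be sketched by evaluating only the fixed part of the combined model. The coefficient values below are hypothetical and chosen purely for illustration; the function names are likewise assumptions made for this example.

```python
def fixed_part(x1, x2, w1, beta0, beta1, beta2, alpha1, alpha2):
    """Fixed-part prediction: x1 = centred pre-entry score,
    x2 = 1 for female, w1 = 1 for a fee-paying school."""
    return beta0 + beta1 * x1 + beta2 * x2 + alpha1 * w1 + alpha2 * w1 * x1

def pre_entry_slope(w1, beta1, alpha2):
    """Slope of the university/pre-entry relationship for school type w1."""
    return beta1 + alpha2 * w1

# Hypothetical coefficients (not estimates from any data set)
coef = dict(beta0=60.0, beta1=2.0, beta2=1.5, alpha1=3.0, alpha2=-0.5)

slope_non_fee = pre_entry_slope(0, coef["beta1"], coef["alpha2"])
slope_fee = pre_entry_slope(1, coef["beta1"], coef["alpha2"])
pred_non_fee = fixed_part(x1=1, x2=0, w1=0, **coef)
pred_fee = fixed_part(x1=1, x2=0, w1=1, **coef)
print(slope_non_fee, slope_fee, pred_non_fee, pred_fee)
```

With a positive $\alpha_1$ and a negative $\alpha_2$, as assumed here, fee-paying schools raise the predicted score of the average student but flatten the relationship with pre-entry score.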
More complex models can be developed in the same way, by including a subscript for each classification or level and by including variance-covariance terms at each level associated with particular variables.
This form of specification can become very unwieldy with more complex crossings and nestings. An alternative, simplified and very general notation has been introduced. Level 1 is always represented by subscript i. Higher levels are then defined by a classification of these level 1 units, using the classification names as subscripts for the random terms. No distinction is made between levels or between crossings: each classification above level 1 is simply indexed separately as a source of variation; for example,
$y_i = \beta_0 + \beta_1 x_{1i} + \mu^{(2)}_{school(i)} + \mu^{(3)}_{n'hood(i)} + \varepsilon_i$
with distributional assumptions
$\mu^{(2)}_{school(i)} \sim N(0, \sigma^2_{\mu(2)}); \quad \mu^{(3)}_{n'hood(i)} \sim N(0, \sigma^2_{\mu(3)}); \quad \varepsilon_i \sim N(0, \sigma^2_{\varepsilon})$
The superscript (2) identifies a school classification and superscript (3) identifies a neighbourhood classification (the level 1 classification is assumed to be the first and its identification is omitted). If there are two or more random coefficients associated with a classification, say with random school effects for pre-test or gender parameters, then the subscripts 0, 1, … are used as in the standard notation. This notation does not increase in complexity as classifications are added, but it does not convey information on crossing and nesting. Consequently it is useful to accompany the model with a classification diagram.
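Because the notation itself does not say whether two classifications are nested or crossed, that information has to come from the data. A minimal Python sketch of such a check follows; the function name, the identifiers and the `region` classification are hypothetical, and each list gives the classification unit of level 1 unit i.

```python
def is_nested(lower, higher):
    """Return True if every unit of the lower classification belongs
    to exactly one unit of the higher classification."""
    seen = {}
    for lo, hi in zip(lower, higher):
        # setdefault records the first higher unit seen for this lower unit
        if seen.setdefault(lo, hi) != hi:
            return False
    return True

# Hypothetical classifications for five pupils (level 1 units)
school = ["s1", "s1", "s2", "s2", "s3"]   # school(i)
nhood = ["n1", "n2", "n1", "n2", "n2"]    # n'hood(i)
region = ["r1", "r1", "r1", "r1", "r2"]   # a classification above school

schools_in_nhoods = is_nested(school, nhood)    # pupils of s1 live in n1 and n2
schools_in_regions = is_nested(school, region)  # each school lies in one region
print(schools_in_nhoods, schools_in_regions)
```

In this made-up example schools are crossed with neighbourhoods but nested within regions, which is exactly the kind of information a classification diagram is meant to convey.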