An introduction to the concepts of multilevel modelling







Kelvyn Jones, School of Geographical
Sciences, University of Bristol




Oxford Research Methods Festival

1st July 2004

MULTILEVEL MODELS

AKA random-effects models, hierarchical models, variance-components models, random-coefficient models, mixed models

Three KEY Notions

Modelling contextuality
- eg LOS (length of stay) varies from hospital to hospital
- eg LOS varies differentially for patients of different ages from hospital to hospital

Modelling heterogeneity
- standard regression models ‘averages’, ie the general relationship
- ML models variances
- eg between-hospital AND between-patient, within-hospital variation

Modelling data with complex structure
- ML deals with complex structure deriving from reality AND design of study


Realistically complex modelling

‘Statistical models as a formal framework of analysis with a complexity of structure that matches the system being studied’





OUTLINE

- Complexity?
- Diagrams representing complexity
- Graphs of varying relations
- From graphs to equations
- What sort of questions does multilevel modelling answer?
- Resources for going further


COMPLEXITY?

Data from populations with a complex structure

Two Forms and Two Types

As ‘Structure’
- naturally occurring dependencies
  Eg: pupils (1) in schools (2); people (1) in neighbourhoods (2); many to fewer
- ‘imposed-by-design’ dependencies
  Eg: multistage sample

As ‘Missingness’
- naturally occurring imbalances
  Eg: not answering in a panel study
- ‘imposed-by-design’ imbalances
  Eg: rotational questions

ALL substantively interesting
Eg: Contextuality (school effects, area effects); Heterogeneity (differentiated differentiation)

AND technically important
Eg: dependency leading to mis-estimated precision; ecological fallacy, atomistic fallacy …

COMPLEXITY: UNIT DIAGRAMS

1: Hierarchical structures

a) Pupils nested within schools: modelling progress

b) Repeated measures of voting behaviour at the UK general election

c) Multivariate design for health-related behaviours

COMPLEXITY: UNIT DIAGRAMS (continued)

NB: multilevel structures: overwhelming majority of applications are hierarchical

BUT reality can be more complex …

2: Non-hierarchical structures

a) cross-classified structure

b) multiple membership with weights

Can represent reality by COMBINATIONS of different types of structures

But can get complex so …

COMPLEXITY: CLASSIFICATION DIAGRAMS (IOE)

a) 3-level hierarchical structure
b) cross-classified structure
c) multiple membership structure
d) spatial structure

CLASSIFICATION DIAGRAMS FOR ALSPAC

(Source: Jon Rasbash)

- All children born in Avon in 1990 followed longitudinally
- Multiple attainment measures on a pupil as they pass through primary school
- Pupils span 3 school-year cohorts (say 1996, 1997, 1998)
- Pupils move between teachers, schools, neighbourhoods
- Pupils’ progress potentially affected by their own changing characteristics, the pupils around them, their current and past teachers, schools and neighbourhoods

IS SUCH COMPLEXITY NEEDED?

- Complex models are reducible to simpler models
- Confounding of variation across levels (eg primary and secondary school variation)

[Classification diagram; classifications shown: M. occasions, Pupil, Teacher, School Cohort, Primary school, Area, N’hood]

VARYING RELATIONS

Multilevel modelling can handle
- multiple outcomes
- categorical & continuous predictors
- categorical and continuous responses

But KISS …

Single response
- house prices

Single predictor
- number of rooms
- centred for ease of interpretation
- average-sized house has 5 rooms

Original:   2   3   4   5   6   7   8
Centred:   -3  -2  -1   0   1   2   3
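The centring step in the Original/Centred rows can be sketched in a few lines of Python. This is only an illustration: the function name `centre` and the constant `REFERENCE_ROOMS` are made up for the example, and the reference value of 5 is the average-sized house quoted on the slide.

```python
# Centre the predictor on a reference value so that the intercept of a
# fitted model refers to a house of that size (here: 5 rooms).
REFERENCE_ROOMS = 5  # the "average-sized house" on the slide


def centre(values, reference):
    """Return each value expressed as a deviation from the reference."""
    return [v - reference for v in values]


rooms = [2, 3, 4, 5, 6, 7, 8]
centred_rooms = centre(rooms, REFERENCE_ROOMS)
print(centred_rooms)
```

After centring, a value of 0 means a 5-room house, so the intercept is the price of an average-sized house rather than of a house with no rooms.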


Two level hierarchy
- houses at level 1 nested within
- neighbourhoods at level 2

Set of characteristic plots …

VARYING RELATIONS (continued)

FROM GRAPHS TO EQUATIONS I

[Figure: six panels (1 to 6) plotting Price (axis ticks 100 to 250) against Rooms-5 (axis ticks -3 to 3)]

2-level model: individuals and neighbourhoods (VC)

Response = Fixed effects + (Random effects)

Price of house i in N’hood j = Cost of average-size house + Cost of extra room + (N’hood effect + House effect)

$y_{ij} = \beta_0 + \beta_1 x_{1ij} + (\mu_j + \varepsilon_{ij})$

$\mu_j \sim N(0, \sigma^2_{\mu})$: between-neighbourhood variance (conditional)
$\varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon})$: within-neighbourhood, between-house variation

Terminology (random = ‘allowed to vary’)
$\beta_0$, $\beta_1$: fixed effects (grand mean, grand slope)
$\mu_j$, $\varepsilon_{ij}$: random effects or multilevel residuals
$\sigma^2_{\mu}$, $\sigma^2_{\varepsilon}$: random parameters
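One way to get a feel for the variance components model is to simulate from it. The sketch below uses only the Python standard library. The function name and the values of `beta0` and `beta1` are hypothetical. The two variances are illustrative defaults that happen to match the estimates quoted on the MLwiN slide.

```python
import random


def simulate_house_prices(n_hoods=50, houses_per_hood=20,
                          beta0=150.0, beta1=20.0,       # hypothetical fixed effects
                          var_hood=187.1, var_house=776.7,
                          seed=1):
    """Return (neighbourhood j, centred rooms x, price y) tuples.

    y_ij = beta0 + beta1 * x_ij + mu_j + eps_ij, with mu_j shared by every
    house in neighbourhood j and eps_ij specific to each house.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_hoods):
        mu_j = rng.gauss(0.0, var_hood ** 0.5)          # neighbourhood effect
        for _ in range(houses_per_hood):
            x = rng.randint(2, 8) - 5                    # rooms centred on 5
            eps_ij = rng.gauss(0.0, var_house ** 0.5)    # house effect
            rows.append((j, x, beta0 + beta1 * x + mu_j + eps_ij))
    return rows


rows = simulate_house_prices()
print(len(rows), rows[0])
```

Because every house in neighbourhood j shares the same draw of the neighbourhood effect, prices within a neighbourhood are correlated, which is the dependency a single-level regression ignores.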


Specifying more complex models on handout


FROM GRAPHS TO EQUATIONS II

In MLwiN software

[Figure: six panels (1 to 6) plotting Price (axis ticks 100 to 250) against Rooms-5 (axis ticks -3 to 3)]




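VP: within-N’hood dependence:

$\sigma^2_u / (\sigma^2_u + \sigma^2_e)$ = 187.1 / (187.1 + 776.7) = 0.19

(MLwiN notation: u for the neighbourhood level, e for the house level)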
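The arithmetic behind the 0.19 can be checked directly. The helper name below is made up for the example; the two variance estimates are the ones quoted on the slide.

```python
def variance_partition_coefficient(var_between, var_within):
    """Share of the total variance that lies between higher-level units."""
    return var_between / (var_between + var_within)


vpc = variance_partition_coefficient(187.1, 776.7)
print(round(vpc, 2))
```

On this reading, roughly a fifth of the variation in price, after allowing for size, lies between neighbourhoods rather than between houses within a neighbourhood.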






WHAT IS A LEVEL?

WHAT IS DIFFERENCE BETWEEN A VARIABLE AND A LEVEL?

EG why are schools a level but gender a variable?

Schools = Level = a population of units from which we have taken a random sample

Gender = variable = Boy/girl exhaust the categories
Boy/girl ≠ sample out of all possible gender categories

Random classification
- generalisation of a Level eg households, neigh’hoods
- random effects come from a distribution
- all schools contribute to between-school variance

Fixed classifications
- discrete categories of a variable
- not sample from a wider population
- fixed effects are averages
- only boys contribute to mean for boys

Households in the fixed part referring to specific households

Households in the random part referring to households in general

TYPES OF QUESTIONS TACKLED BY MULTILEVEL MODELLING (in relation to fixed and random)

2-level model: Current attainment given prior attainment of pupils (1) in schools (2); a random sample of schools from a population of schools.

- Do boys make greater progress than girls? (F)
- Are boys more or less variable in their progress than girls? (R)
- What is the between-school variation in progress? (R)
- Is school X different from other schools in the sample in its effect? (F)
- Are schools more variable in their progress for pupils with low prior attainment? (R)
- Does the gender gap vary across schools? (R)
- Do pupils make more progress in denominational schools? (F)
- Are pupils in denominational schools less variable in their progress? (R)
- Do girls make greater progress in denominational schools? (F) - cross-level interaction


RESOURCES

- web-based guide to resources (text-based resources, references, software, training associated with software, freeware, individual web-pages …)
  http://www.ggy.bris.ac.uk/staff/information/mlwebresources.doc

- hands-on training (Essex Summer School; http://multilevel.ioe.ac.uk/)

MULTILEVEL MODELS:

Substantive: Modelling problems with complex structure
  Interested in variability, heterogeneity

Technical: handles ‘clustering’ and dependencies
  Otherwise tendency to infer a relationship where none exists

But make demanding assumptions (eg school effects follow a Normal distribution) and data hungry

And not needed when little structural complexity

MULTILEVEL WEB-BASED RESOURCES

http://www.ggy.bris.ac.uk/staff/information/mlwebresources.doc

These resources have been put together by Kelvyn Jones, Myles Gould and VS Subramanian. The listing derives from the questions we are frequently asked when teaching introductory multilevel courses. It is designed as an ‘organised meta-site’ whereby other resources for multilevel modelling on the web can be accessed. We have tended to list a specific resource only once, so you may have to search through the document to find what you want. To that end we have put it on the web as a single document, and you can use the Find or Search facility on your browser to scan the entire document.


General

The best general site is the Centre for Multilevel Modelling:

http://multilevel.ioe.ac.uk/

The multilevel mailing list is also a key general resource as it is searchable; it represents many years of accumulated questions and answers:

http://www.mailbase.org.uk/lists/multilevel/


Another vital resource is provided by UCLA Academic Technology Services who maintain data and worked
examples in a number of different software packages for a number of different multilevel textbooks:


http://www.ats.ucla.edu/stat/mlm/default.htm



Book and other text-based downloads

Goldstein’s classic text (in its 2nd edition with corrections) on multilevel modelling can be downloaded from

http://www.arnoldpublishers.com/support/goldstein.htm

Supplementary material for Snijders, T and Bosker, R (1999) Multilevel Analysis: An introduction to basic and advanced multilevel modelling (including updates and corrections, data sets used in examples, with set-ups for running the examples in MLwiN and in HLM, and an introduction to MLwiN) can be found at:

http://stat.gamma.rug.nl/snijders/mlbook1.htm

Joop Hox has some down-loadable example chapters from his textbook Multilevel Analysis: Techniques and Applications at

http://www.fss.uu.nl/ms/jh/mlbook/leabook.htm

The complete contents of his 1995 Applied Multilevel Analysis, Amsterdam: TT-Publikaties, are available from

http://www.fss.uu.nl/ms/jh/publist/amaboek.pdf

There is also a very brief introduction to modelling change with random effects models at:

http://www.gmu.edu/departments/psychology/ployhart/Gcourses/P892F01/Modeling%20Change%20handout.doc

To read a comparison of multilevel modelling with traditional approaches to running ANOVA, regression, and logistic regression with memories/events being "nested" within people/testing session see Wright, D. B. (1998). Modelling clustered data in autobiographical memory research: The multilevel approach. Applied Cognitive Psychology, 12, 339-357, which is available as a download from:

http://www.cogs.susx.ac.uk/users/danw/pdf/multil.pdf



To keep up to date with developments in the field have a look at the downloadable Multilevel Newsletters:

http://multilevel.ioe.ac.uk/publref/newsletters.html

References to published work on multilevel modelling

A list of references to material that uses multilevel modelling can be found at Jung-Ho Yang’s myschoolofeducation.net

http://www.mysoe.net/1multilevel/reference.htm

and the Centre for Multilevel Modelling has a very extensive and growing list of references

http://multilevel.ioe.ac.uk/publref/references.html

Wolfgang Ludwig-Mayerhofer’s annotated references on multilevel modelling

http://www.lrz-muenchen.de/~wlm/wlmmule.htm#Literature


Software in general

If you want to compare the different packages that are available for multilevel modelling, detailed comparisons are
being developed at


http://multilevel.ioe.ac.uk/softrev/index.html


If you want to see how a particular model can be fitted in particular software, there are the developing resources at UCLA

http://www.ats.ucla.edu/stat/mlm/default.htm

For those wishing to analyse longitudinal data, software instructions in a wide range of programs are provided by UCLA to accompany the textbook ‘Applied Longitudinal Data Analysis: Modelling Change and Event Occurrence’ by Judith D. Singer and John B. Willett

http://www.ats.ucla.edu/stat/examples/alda/

A listing of available software is also at Jung-Ho Yang’s myschoolofeducation.net


http://www.mysoe.net/1multilevel/software.htm




Training associated with software

A growing number of web-based (or at least down-loadable) training materials are being developed. We have chosen to organise this section by the particular software that is being used, and rather arbitrarily separated commercial software from the freeware that follows.

aML: can be used to fit a range of multilevel models but has specific features for fitting multi-process or simultaneous equation models to hierarchical data where predictor variables may be non-random or endogenous, and other types of models used by economists such as multilevel Heckman selection models

http://www.applied-ml.com/product/multiprocess.html


HLM: how to undertake 2-level analyses

http://www.ssicentral.com/hlm/example.htm

HLM: Jason Newsom’s Multilevel Regression course that uses HLM, but covers a lot of other ground too (eg handout ‘Distinguishing between random and fixed: variables, effects, and coefficients’)

http://www.upa.pdx.edu/IOA/newsom/mlrclass/default.htm

MLwiN: you can download a version of the software, data and training manuals from TRAMSS (Teaching Resources and Materials for Social Scientists)

http://tramss.data-archive.ac.uk

MLwiN: James Brown has a multilevel course (with data) using MLwiN at

http://www.socstats.soton.ac.uk/courses/st622/workshops.html

MLwiN: the down-loadable manuals are of themselves a course in the practical aspects of multilevel models

http://multilevel.ioe.ac.uk/download/manuals.html

MLwiN: the substantial enhancement of the MCMC procedures in MLwiN is discussed in full in 'MCMC Estimation in MLwiN', which is to be used with the development version of the program

http://multilevel.ioe.ac.uk/dev/develop.html

Mplus: this software allows structural equation modelling, multilevel modelling and mixture modelling; the home site has training downloads and examples

http://www.statmodel.com/mplus/examples/webnote.html

SAS: Judy Singer has a pdf download that shows how to fit multilevel models in PROC MIXED; it is very well written

http://gseweb.harvard.edu/~faculty/singer/

UCLA has implemented the Singer example in other software (eg R\Splus; HLM; MLwiN; SPSS)

http://www.ats.ucla.edu/stat/paperexamples/singer/default.htm

SAS: C.J. Anderson has a lot of course material online at:

http://www.ed.uiuc.edu/courses/EdPsy490CK/

SAS: the code and data to fit the models contained in SAS System for Mixed Models (1996) by RC Littell, GA Milliken, WW Stroup, and RD Wolfinger are to be found at:

http://www.sas.com/samples/A55235

SPSS: a useful HTML-based tutorial demonstrating the use of the recently introduced Linear Mixed Models procedure in SPSS Advanced Models is to be found at: (search under Linear Mixed Models case studies)

http://www.spss.com/downloads/Papers.cfm

‘Freeware’

There are a number of programs that are available at low or nil cost; some of these are general (like R), others are more specific but can have special features that make them particularly attractive; we have tried to identify these special features below. We have also pointed to some appropriate training resources.



BAYESX: has a number of distinctive features including handling structured (correlated) and/or unstructured (uncorrelated) effects of spatial covariates (geographical data) and unstructured random effects of unordered group indicators. It allows non-parametric relationships between the response and the predictors (generalized additive models) and does this for continuous and discrete outcomes; it can manipulate and display geographical maps:

http://www.stat.uni-muenchen.de/~lang/bayesx/bayesx.html

BUGS: Bayesian inference Using Gibbs Sampling is really a flexible language that allows the fitting of a very wide range of models using MCMC methods; this is a very rich site developed by the MRC Biostatistics Research Unit in Cambridge which has lots of freely down-loadable software and detailed manuals

http://www.mrc-bsu.cam.ac.uk/bugs

BUGS: a number of courses using BUGS have been put online; a listing is given at

http://www.mrc-bsu.cam.ac.uk/bugs/weblinks/webresource.shtml

BUGS: Peter Congdon has written two books based around BUGS (Bayesian Statistical Modelling, and Applied Bayesian Modelling); data and programmes are available for both books at

ftp://www.wiley.co.uk/pub/books/congdon/

GeoBUGS: is an add-on to BUGS that has been developed by a team at Imperial College to fit spatial models and produce a range of maps as output.

http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/geobugs.shtml

GLLAMM: this software usefully undertakes multilevel latent class and factor analysis, uses adaptive quadrature to derive the full likelihood with discrete and normal responses, and has facilities for fitting non-parametric models in which the distribution at the higher level can be non-normal (you need STATA to run this software; preferably STATA 8); this software is particularly useful for the models listed above, but can be slow to converge. This site is also a rich one, with a growing number of downloads of lectures and papers showing how the approach can be used in practice

http://www.iop.kcl.ac.uk/IoP/Departments/BioComp/programs/gllamm.html

MIX: these are a set of stand-alone programmes that fit a number of specific models including mixed-effects linear regression, mixed-effects logistic regression for nominal or ordinal outcomes, mixed-effects probit regression for ordinal outcomes, mixed-effects Poisson regression, and mixed-effects grouped-time survival analysis. They have a common interface, and importantly they calculate the likelihood directly so allowing comparison of the change in deviance for nested models. There are versions for Windows as well as for PowerMac and Solaris

http://www.uic.edu/~hedeker/mix.html

R: R is a complete system for statistical computation and graphics; it can be seen as an Open Source implementation of the S language which in turn underlies the S-Plus software. It is distributed freely under the GNU General Public License and can be used for commercial purposes. It operates across a very wide range of platforms. The latest version and documentation can be obtained via CRAN, the Comprehensive R Archive Network

http://cran.r-project.org

R: normal-theory models are fitted in R using the lme and nlme functions described in full in ‘Mixed-effects models in S and S-PLUS' by J. C. Pinheiro and D. M. Bates (2000); there is online support for this book at

http://nlme.stat.wisc.edu/MEMSS/

R: NLME: Software for mixed-effects models; further information with downloads can be found at

http://cm.bell-labs.com/cm/ms/departments/sia/project/nlme/

R: for discrete responses there is the function glmmPQL, which is discussed in the 4th edition of Modern Applied Statistics with S by W. N. Venables and B. D. Ripley; the book also covers normal theory models; there is online support for the book at:

http://www.stats.ox.ac.uk/pub/MASS4/

R: Jeff Gill maintains a website that provides help, tutorials and references for those who want to use R

http://web.clas.ufl.edu/%7Ejgill/s-language.help.html



Useful macros and other software

PreML: there is a very useful utility written so as to export an SPSS file into a MLwiN worksheet; it is down-loadable from Tom Snijders’ webpage

http://stat.gamma.rug.nl/snijders/PreML.inc

Diagnostics: Tom Snijders’ homepage contains a set of MLwiN macros for producing diagnostics

http://stat.gamma.rug.nl/snijders/mlnmac.htm

PINT: for determining appropriate required sample sizes and power in a two-level model; there is a manual

http://stat.gamma.rug.nl/snijders/multilevel.htm#progPINT

OD: is another program for power analysis and optimal design; it has excellent graphical output, but as yet no manual, so you will need to read the published papers by Steve Raudenbush and Xiao-Feng Liu

http://www.ssicentral.com/other/hlmod.htm

DismapWin: is public domain software for the statistical analysis of epidemiological maps; it allows the analysis of unobserved heterogeneity using mixture models; the program offers a Poisson regression approach which links disease and exposure data

http://ftp.ukbf.fu-berlin.de/sozmed/DismapWin.html



Websites maintained by individuals

Douglas Bates, who developed the LME and NLME functions in R and S-plus, has a website at

http://franz.stat.wisc.edu/~bates/bates.html

Bill Browne (who has made major contributions to the MCMC component of MLwiN) has a large number of down-loadable papers at

http://www.maths.nott.ac.uk/personal/pmzwjb/bill.html

David Draper’s home page has a lot of material about the Bayesian approach to hierarchical models

http://www.cse.ucsc.edu/~draper/

Harvey Goldstein, who is the instigator of the MLwiN software, has a number of down-loadable papers at his personal website

http://www.ioe.ac.uk/hgpersonal

Don Hedeker, who has been behind the MIX set of programs, has lecture transparencies and class notes on longitudinal analysis at

http://tigger.uic.edu/~hedeker/

Joop Hox’s webpage has papers, programs and lectures to download at

http://www.fss.uu.nl/ms/jh/

Bengt Muthen, who is the developer of Mplus which allows multilevel factor analysis, has a site at

http://www.gseis.ucla.edu/faculty/muthen/muthen3.htm

Jon Rasbash, who has written most of the code for MLwiN, has down-loadable papers at

http://multilevel.ioe.ac.uk/team/jon.html

Steve Raudenbush’s LAMMP website has publications and pre-prints and links to the projects he is currently working on:

http://www-personal.umich.edu/~rauden/

Tom Snijders’ homepage

http://stat.gamma.rug.nl/snijders/multilevel.htm

Fiona Steel has a number of down-loadable papers, particularly on multilevel event history analysis

http://multilevel.ioe.ac.uk/team/fiona.html




Tutorials in MCMC estimation

MCMC estimation is increasingly being used to estimate complex models; there are a number of sites with really helpful resources to get you started:

Simon Jackman’s Estimation and Inference via Markov chain Monte Carlo: a resource for social scientists

http://tamarama.stanford.edu/mcmc/

Jeff Gill’s homepage is a mine of information in this area; it includes some down-loadable chapters from his 2002 book Bayesian Methods for the Social and Behavioral Sciences, which is to be thoroughly recommended

http://web.clas.ufl.edu/%7Ejgill/

Sujit Sahu’s tutorial on MCMC

http://www.maths.soton.ac.uk/staff/Sahu/utrecht/

There is a lot of background material on MCMC in 'MCMC Estimation in MLwiN'

http://multilevel.ioe.ac.uk/dev/develop.html

A Brief Introduction to Graphical Models and Bayesian Networks is to be found at

http://www.ai.mit.edu/~murphyk/Bayes/bayes.html

To keep up to date in this area, you can visit the MCMC preprint service

http://www.statslab.cam.ac.uk/~mcmc/

Specification of multilevel models

A simple single-level regression model of student achievement can be specified as follows:

University Score = Constant + A-level Score + Gender Effect + Student Effect

$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + \beta_2 x_{2i} + (\varepsilon_{0i} x_{0i})$

where

$y_i$ is the grade point average score for student i at the end of their degree course (the approach can also handle binary data, such as good degree or not, or multiple categories such as good, poor, fail);

$x_{0i}$ is the constant, the value 1 for every student;

$x_{1i}$ is the continuous predictor of the pre-entry score; it makes a lot of sense to ‘centre’ this value around some plausible value such as the average score for all students;

$x_{2i}$ has the value of 1 if the student is a female, 0 otherwise.

With this specification, $\beta_0$, the intercept, is the average university score for a male student with an average pre-entry score; $\beta_1$ (if positive) is the increase in the university score as the pre-entry score goes up by one unit; and $\beta_2$ represents the difference on average between males and females on university score after taking account of the pre-entry score. The $\beta$’s are averages that give the general result across all students; they are ‘fixed’ parameters that do not vary from student to student. The final term, $\varepsilon_{0i}$, is allowed to vary between students, and this ‘random’ term (signified by brackets) represents the difference between the actual university score and that predicted by the model. The random term is assumed to come from a distribution (in this case a Gaussian one because of the continuous response) which can be summarised by a single variance term, $\varepsilon_{0i} \sim N(0, \sigma^2_{\varepsilon 0})$. This variance term assesses how much variability remains between students in the university score after taking account of their pre-entry score and gender. This type of model is known as a homoscedastic one, as each student is assumed to have the same variance after taking account of the pre-entry score.


We can develop this single-level model by including additional terms associated with gender and pre-entry score in the random part of the model:

$y_i = \beta_0 x_{0i} + \beta_1 x_{1i} + \beta_2 x_{2i} + (\varepsilon_{0i} x_{0i} + \varepsilon_{1i} x_{1i} + \varepsilon_{2i} x_{2i})$

These three random terms allow for variability in university achievement differentiated by type of student. The exact pattern depends on the size and nature of the estimates, but it could be that females are more consistent in their performance, while students with low entry scores are more variable. This random-coefficient model can only be estimated by multilevel software such as the package MLwiN.


We turn now to a two-level hierarchical model, and re-specify the previous model to reflect this structure:

$y_{ij} = \beta_{0j} x_{0ij} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + (\varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$

There are two changes. The subscript is now ij, as student i is nested in school j. The intercept has been indexed ($\beta_{0j}$), as there is a set of intercepts, one for each school. We also write a school-level model with its distributional assumptions:

$\beta_{0j} = \beta_0 + (\mu_{0j})$; $\mu_{0j} \sim N(0, \sigma^2_{\mu 0})$

$\mu_{0j}$ is the differential effect of having attended school j, and it is allowed to vary around the overall intercept, $\beta_0$. If a student’s university performance is not influenced by the school attended, the $\mu_{0j}$ will be close to zero, as will the between-school variance, $\sigma^2_{\mu 0}$.


This model is known as a variance components or random intercepts model and implies that schools make a difference but do so in a general way for all types of students. A more complex random-coefficient model would index the slope of the university/pre-entry relationship in the student-level, micro-model:

$y_{ij} = \beta_{0j} x_{0ij} + \beta_{1j} x_{1ij} + \beta_2 x_{2ij} + (\varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$

and there would be two macro models for the random intercepts and slopes:

$\beta_{0j} = \beta_0 + (\mu_{0j})$
$\beta_{1j} = \beta_1 + (\mu_{1j})$

Whereas $\beta_1$ is the general relationship across all schools, the $\mu_{1j}$ are the differential slopes. If this is a positive value, a pupil going to this school does better at university than their A-level score would suggest; a negative value for this term would suggest they do not do as well as predicted in general by their A-level score. There are now two random terms at the school level and we can summarise their distribution as a joint multivariate Gaussian distribution:

$(\mu_{0j}, \mu_{1j}) \sim N\left(0, \begin{bmatrix} \sigma^2_{\mu 0} & \sigma_{\mu 01} \\ \sigma_{\mu 01} & \sigma^2_{\mu 1} \end{bmatrix}\right)$

The off-diagonal covariance term is important. For example, if $\sigma_{\mu 01}$, the association between the differential intercept and the differential slope for pre-entry, is positive, the variance between schools will be greatest for students with high pre-entry scores (a positive $\mu_{0j}$ being associated with a positive $\mu_{1j}$). If none of these variance-covariance terms is significantly different from zero, the school a student attended has no effect on subsequent performance once gender and pre-entry score have been taken into account.
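The role of the covariance term can be made concrete through the between-school variance function. With a random intercept and a random slope on the pre-entry score x, the variance of the school contribution (intercept residual plus slope residual times x) is the intercept variance, plus twice the covariance times x, plus the slope variance times x squared. The sketch below evaluates that quadratic; the function name and the three numerical values are hypothetical, chosen only so that the covariance is positive.

```python
def between_school_variance(x, var_intercept, cov_intercept_slope, var_slope):
    """Variance of (mu_0j + mu_1j * x) at centred pre-entry score x."""
    return var_intercept + 2.0 * cov_intercept_slope * x + var_slope * x ** 2


# Hypothetical estimates with a positive intercept/slope covariance.
VAR_INTERCEPT, COVARIANCE, VAR_SLOPE = 4.0, 1.0, 0.5

for x in (-2, 0, 2):
    print(x, between_school_variance(x, VAR_INTERCEPT, COVARIANCE, VAR_SLOPE))
```

With these assumed values the between-school variance is smaller for students below the average pre-entry score and larger for students above it, which is the pattern described for a positive covariance.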


If there is significant variation between schools we would like to account for it by school characteristics. We do this by including in the macro model either continuous or categorical measures. Thus, we may assess whether fee-paying schools produce an increase in the performance of the average student, but attenuate the university/pre-entry relationship. We would then include a categorical predictor ($w_{1j}$ equals 1 for fee-paying, otherwise 0) in both macro-models:

$\beta_{0j} = \beta_0 + \alpha_1 w_{1j} + (\mu_{0j})$
$\beta_{1j} = \beta_1 + \alpha_2 w_{1j} + (\mu_{1j})$

and these can be combined into an overall multilevel model:

$y_{ij} = \beta_0 x_{0ij} + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \alpha_1 w_{1j} x_{0ij} + \alpha_2 w_{1j} x_{1ij} + (\mu_{0j} x_{0ij} + \mu_{1j} x_{1ij} + \varepsilon_{0ij} x_{0ij} + \varepsilon_{1ij} x_{1ij} + \varepsilon_{2ij} x_{2ij})$


The key terms are $\beta_1$, which represents the relation between pre-entry score and attainment for non-fee-paying schools, and $\alpha_2$, which represents the differential for those who attended a fee-paying school. The latter is a cross-level interaction, for it involves variables at both levels: the school type and the student pre-entry score. More complex models can be developed in the same way, including a subscript for each classification or level, and including variance-covariance terms at each level associated with particular variables.
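The cross-level interaction is easiest to see in the fixed part of the combined model. The sketch below computes the fixed-part prediction and the implied pre-entry slope for each school type; all coefficient values are hypothetical and the function names are made up for the example.

```python
def fixed_part(x1, x2, w1, beta0, beta1, beta2, alpha1, alpha2):
    """Fixed-part prediction: x1 = centred pre-entry score, x2 = 1 if female,
    w1 = 1 if the student attended a fee-paying school."""
    return beta0 + beta1 * x1 + beta2 * x2 + alpha1 * w1 + alpha2 * w1 * x1


def pre_entry_slope(w1, beta1, alpha2):
    """Slope of university score on pre-entry score for a given school type."""
    return beta1 + alpha2 * w1


# Hypothetical coefficients: fee-paying schools raise the average (alpha1 > 0)
# but attenuate the pre-entry relationship (alpha2 < 0).
coef = dict(beta0=60.0, beta1=2.0, beta2=1.5, alpha1=3.0, alpha2=-0.5)

print(pre_entry_slope(0, coef["beta1"], coef["alpha2"]))  # non-fee-paying
print(pre_entry_slope(1, coef["beta1"], coef["alpha2"]))  # fee-paying
print(fixed_part(x1=2.0, x2=1, w1=1, **coef))
```

Under these assumed values the slope for non-fee-paying schools is the first slope coefficient alone, while for fee-paying schools the interaction coefficient is added to it, which is the sense in which the interaction coefficient is a differential.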


This form of specification can become very unwieldy with more complex crossings and nestings, so an alternative, simplified and very general notation can be used. Level 1 is always represented by subscript i. Higher levels are then defined by a classification of these level 1 units, using the classification names as subscripts for the random terms. No distinction is made between levels or between crossings: each classification above level 1 is simply indexed separately as a source of variation; for example,

$y_i = \beta_0 + \beta_1 x_{1i} + \mu^{(2)}_{\text{school}(i)} + \mu^{(3)}_{\text{n'hood}(i)} + \varepsilon_i$

with distributional assumptions

$\mu^{(2)}_{\text{school}(i)} \sim N(0, \sigma^2_{\mu(2)})$; $\mu^{(3)}_{\text{n'hood}(i)} \sim N(0, \sigma^2_{\mu(3)})$; $\varepsilon_i \sim N(0, \sigma^2_{\varepsilon})$

The superscript (2) identifies a school classification and superscript (3) identifies a neighbourhood classification (the level 1 classification is assumed to be the first and its identification is omitted). If there are two or more random coefficients associated with a classification, say with random school effects for pre-test or gender parameters, then the subscripts 0, 1, … are used as in the standard notation. This notation does not increase in complexity as classifications are added, but it does not convey information on crossing and nesting. Consequently it is useful to accompany the model with a classification diagram.