Introduction to Quantile Regression




Taupo, Biometrics 2009


David Baird


VSN NZ, 40 McMahon Drive, Christchurch, New Zealand



email: David@vsn.co.nz




Reasons to use quantiles rather than means


Analysis of distribution rather than average


Robustness


Skewed data


Interested in representative value


Interested in tails of distribution


Unequal variation of samples



E.g. income distribution is highly skewed, so the median relates more to a typical person than the mean does.




Quantiles


Cumulative Distribution Function

$F(y) = \mathrm{Prob}(Y \le y)$

Quantile Function

$Q(\tau) = \min\{\, y : F(y) \ge \tau \,\}$

Discrete step function
[Figure: CDF and Quantile (n=20) plots]
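
As a rough illustration of the two definitions on this slide, the sketch below computes the empirical CDF and the quantile function Q(tau) = min{y : F(y) >= tau} for a sample of size 20. The function names and the simulated normal sample are assumptions made for the example, not part of the talk.

```python
import numpy as np

def empirical_cdf(sample, y):
    """F(y): proportion of sample values less than or equal to y."""
    return float(np.mean(np.asarray(sample, dtype=float) <= y))

def empirical_quantile(sample, tau):
    """Q(tau) = min{y : F(y) >= tau}, for 0 < tau <= 1."""
    ordered = np.sort(np.asarray(sample, dtype=float))
    n = ordered.size
    # The smallest y with F(y) >= tau is the k-th smallest value,
    # where k = ceil(n * tau). The small offset guards against
    # floating-point error in the product n * tau.
    k = max(int(np.ceil(n * tau - 1e-9)), 1)
    return float(ordered[k - 1])

# Hypothetical sample of size 20
rng = np.random.default_rng(0)
sample = rng.normal(size=20)
median = empirical_quantile(sample, 0.5)
upper_decile = empirical_quantile(sample, 0.9)
print(median, upper_decile)
```

Because the empirical CDF only changes at observed values, the quantile function is a step function that always returns one of the data points.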



Optimality Criteria


Linear absolute loss

Mean optimizes

$\min_{\mu} \sum_i (y_i - \mu)^2$

Quantile $\tau$ optimizes

$\min_{\mu} \sum_i e_i \,(\tau - I(e_i < 0)), \quad e_i = y_i - \mu$

$I$ = 0,1 indicator function

[Figure: loss function plots]
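
A small numerical check of the quantile criterion may help here. The sketch below evaluates the summed loss e (tau - I(e < 0)) at every data value and picks the minimizer, which should coincide with the sample quantile. The skewed exponential sample and the helper names are assumptions chosen for illustration.

```python
import numpy as np

def check_loss(e, tau):
    """Asymmetric linear loss e * (tau - I(e < 0))."""
    e = np.asarray(e, dtype=float)
    return e * (tau - (e < 0))

def best_constant(y, tau):
    """Data value that minimizes the summed check loss.

    The objective is convex and piecewise linear in the constant,
    so a minimizer can always be found among the data values.
    """
    totals = [check_loss(y - m, tau).sum() for m in y]
    return float(y[int(np.argmin(totals))])

# Hypothetical skewed sample
rng = np.random.default_rng(1)
y = rng.exponential(size=201)

q50 = best_constant(y, 0.5)   # median
q90 = best_constant(y, 0.9)   # upper decile
print(q50, q90, y.mean())
```

With tau = 0.5 the loss is proportional to the absolute error, so the minimizer is the median. Other values of tau tilt the loss so that positive and negative residuals are penalized unequally.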





Regression Quantile



Optimize

$\min_{\beta} \sum_i e_i \,(\tau - I(e_i < 0)), \quad e_i = y_i - X_i \beta$

Solution found by Simplex algorithm

Add slack variables: split $e_i$ into positive and negative residuals

$e_i = u_i - v_i, \quad u_i \ge 0, \quad v_i \ge 0$

Solution at vertex of feasible region, so solution passes through n data points

May be non-unique solution (along edge)
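
The linear programme described above can be sketched with SciPy's `linprog`, using its HiGHS dual simplex option. This is a minimal sketch under stated assumptions: the function names and simulated data are hypothetical, and the dense constraint matrix is only sensible for small samples. The residual is split as e = u - v with u, v >= 0, and the cost is tau * sum(u) + (1 - tau) * sum(v).

```python
import numpy as np
from scipy.optimize import linprog

def check_loss_total(beta, X, y, tau):
    """Summed check loss for coefficients beta."""
    e = y - X @ beta
    return float(np.sum(e * (tau - (e < 0))))

def regression_quantile(X, y, tau):
    """Solve min sum_i e_i (tau - I(e_i < 0)) as a linear programme."""
    n, p = X.shape
    # Variables: beta (free), u (positive parts), v (negative parts)
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    # Equality constraints: X beta + u - v = y
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs-ds")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

# Hypothetical data with variance increasing in x
rng = np.random.default_rng(2)
n = 60
x = rng.uniform(0, 10, n)
y = 1 + 2 * x + rng.normal(scale=0.5 + 0.3 * x)
X = np.column_stack([np.ones(n), x])

beta_median = regression_quantile(X, y, 0.5)
residuals = y - X @ beta_median
n_interpolated = int(np.sum(np.abs(residuals) < 1e-6))
print(beta_median, n_interpolated)
```

A vertex solution typically has zero residuals for at least as many observations as there are fitted coefficients, which is what `n_interpolated` counts for this straight-line fit.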



Simple Linear Regression

Food Expenditure vs Income

Engel 1857 survey of 235 Belgian households

Range of Quantiles

Change of slope at different quantiles?




Variation of Parameter with Quantile




Estimation of Confidence Intervals


Asymptotic approximation of variation

Bootstrapping

Novel approach to bootstrapping by reweighting rather than resampling

$W_i \sim \mathrm{Exponential}(1)$

Resampling is a discrete approximation of exponential weighting

Avoids changing design points so faster and identical quantiles produced
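
The reweighting idea can be sketched as follows: each bootstrap replicate draws weights W_i ~ Exponential(1) and refits the weighted criterion, so the design matrix stays fixed and only the cost vector of the linear programme changes. The helper name, the simulated data, the number of replicates and the use of simple percentile limits are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog

def weighted_rq(X, y, tau, w):
    """Minimize sum_i w_i * e_i * (tau - I(e_i < 0)) by linear programming."""
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), tau * w, (1 - tau) * w])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])   # X beta + u - v = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

# Hypothetical data
rng = np.random.default_rng(3)
n = 80
x = rng.uniform(0, 10, n)
y = 2 + 0.5 * x + rng.normal(scale=0.5 + 0.2 * x)
X = np.column_stack([np.ones(n), x])
tau = 0.75

estimate = weighted_rq(X, y, tau, np.ones(n))

# Each replicate reweights the observations instead of resampling them
n_boot = 200
boot = np.array([weighted_rq(X, y, tau, rng.exponential(1.0, n))
                 for _ in range(n_boot)])
lower, upper = np.percentile(boot, [2.5, 97.5], axis=0)
print(estimate, lower, upper)
```

Since the rows of `X` never change between replicates, the constraint matrix could be built once and reused, which is where the speed advantage over resampling would come from in a more careful implementation.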


Bootstrap Confidence Limits




Polynomials


Support points




Groups and interactions




Splines


Generate basis functions

Motorcycle Helmet data: Acceleration vs Time from impact
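
One way to read "generate basis functions" is to expand the time variable into a spline basis and then fit a linear quantile regression on the basis columns. The sketch below uses a truncated-power cubic basis and simulated data loosely shaped like an impact curve. The basis choice, knot positions, helper names and data are assumptions, not the motorcycle helmet data or the GenStat procedure.

```python
import numpy as np
from scipy.optimize import linprog

def truncated_power_basis(t, knots, degree=3):
    """Columns 1, t, ..., t^degree plus (t - knot)_+^degree for each knot."""
    t = np.asarray(t, dtype=float)
    cols = [t ** d for d in range(degree + 1)]
    cols += [np.clip(t - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def linear_rq(X, y, tau):
    """Linear quantile regression via the slack-variable linear programme."""
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

# Hypothetical data: a dip followed by recovery, with growing spread
rng = np.random.default_rng(4)
time = np.sort(rng.uniform(0, 60, 150))
accel = -100 * np.exp(-((time - 20) / 6) ** 2) + rng.normal(scale=5 + 0.3 * time)

# Rescale time to [0, 1] to keep the cubic terms well conditioned
knots = np.linspace(10, 50, 5) / 60
basis = truncated_power_basis(time / 60, knots)

fitted = {tau: basis @ linear_rq(basis, accel, tau) for tau in (0.1, 0.5, 0.9)}
print({tau: float(np.mean(accel < curve)) for tau, curve in fitted.items()})
```

The printed proportions of observations below each fitted curve should sit close to the corresponding tau, which is a handy sanity check on any smooth quantile fit.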




Loess


Generate moving weights using kernel and specified window width
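
The moving-weights idea can be sketched with a tricube kernel: at each evaluation point the observations receive weights that fall to zero at the edge of the window, and a weighted quantile is taken. For brevity this sketch fits a local constant rather than the local polynomial a loess fit would normally use. The kernel choice, function names and simulated data are assumptions for illustration.

```python
import numpy as np

def tricube(u):
    """Tricube kernel: positive inside the window |u| < 1, zero outside."""
    a = np.abs(u)
    return np.where(a < 1, (1 - a ** 3) ** 3, 0.0)

def weighted_quantile(y, w, tau):
    """Smallest y whose cumulative normalized weight reaches tau."""
    order = np.argsort(y)
    cum = np.cumsum(w[order]) / np.sum(w)
    idx = min(int(np.searchsorted(cum, tau, side="left")), y.size - 1)
    return float(y[order][idx])

def kernel_quantile_smooth(x, y, grid, tau, width):
    """Local-constant quantile estimate at each grid point."""
    out = []
    for x0 in grid:
        w = tricube((x - x0) / width)   # moving weights centred on x0
        out.append(weighted_quantile(y, w, tau))
    return np.array(out)

# Hypothetical data
rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.2 + 0.05 * x)

grid = np.linspace(1, 9, 17)
low = kernel_quantile_smooth(x, y, grid, 0.1, width=2.0)
high = kernel_quantile_smooth(x, y, grid, 0.9, width=2.0)
print(np.round(low, 2), np.round(high, 2))
```

A wider window gives smoother curves at the cost of local detail. The grid here is kept inside the data range so that every window contains observations.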




Non-Linear Quantile Regression

Run Linear quantile regression in non-linear optimizer

Quantiles for exponential model
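
One possible reading of this slide is a profile approach: for a model such as y = a + b exp(-c x), the parameters a and b enter linearly once c is fixed, so a linear quantile regression can be run inside a one-dimensional optimizer over c. The sketch below follows that reading. The model form, search bounds, function names and simulated data are assumptions, and the profile objective is not guaranteed to have a single minimum.

```python
import numpy as np
from scipy.optimize import linprog, minimize_scalar

def linear_rq(X, y, tau):
    """Linear quantile regression; returns coefficients and minimized loss."""
    n, p = X.shape
    cost = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p], float(res.fun)

def design(c, x):
    """Design matrix for a + b * exp(-c * x) at a fixed rate c."""
    return np.column_stack([np.ones_like(x), np.exp(-c * x)])

def profile_loss(c, x, y, tau):
    """Check loss after fitting the linear parameters for this c."""
    return linear_rq(design(c, x), y, tau)[1]

# Hypothetical exponential decay data
rng = np.random.default_rng(6)
x = np.linspace(0, 10, 100)
y = 5 + 10 * np.exp(-0.5 * x) + rng.normal(scale=1.0, size=x.size)
tau = 0.9

opt = minimize_scalar(profile_loss, bounds=(0.05, 3.0),
                      args=(x, y, tau), method="bounded")
c_hat = float(opt.x)
beta_hat, _ = linear_rq(design(c_hat, x), y, tau)
resid = y - design(c_hat, x) @ beta_hat
print(c_hat, beta_hat, float(np.mean(resid < 0)))
```

Repeating the fit for several values of `tau` would give a family of exponential quantile curves for the same data.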




Example Melbourne Temperatures







Wool Strength Data

5 Farms

Breaking strength and cross-sectional area of individual wool fibres measured




Fitted Quantiles




Wool Strength Data




Between Farm Comparisons




Software for Quantile Regression


SAS Proc QUANTREG (experimental v 9.1)

R Package quantreg

GenStat 12th Edition procedures: RQLINEAR & RQSMOOTH

Menu: Stats | Regression | Quantile Regression




Reference


Roger Koenker, 2005. Quantile Regression, Cambridge University Press.