Limited Dependent Variables

strangerwineAI and Robotics

Oct 19, 2013




Often we are interested in explaining a dependent
variable that has only limited measurement, that is,
one that can take only a few distinct values


Frequently it is even dichotomous.

Examples


War(1) vs. no War(0)


Vote vs. no vote


Regime change vs. no change




These are often Probability Models


E.g.


Power disparity leads to war:

$$Y_t = \alpha + \beta X_t + e_t$$

Where $Y_t$ is the occurrence (or not) of war, and $X_t$
is a measure of power disparity


We call this a Linear Probability Model

Problems with LPM Regression


OLS in this case is called the Linear
Probability Model


Running regression produces some problems


Errors are not distributed normally


Errors are heteroskedastic


Predicted Ys can fall outside the 0.0-1.0 bounds
required for a probability
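The out-of-bounds problem is easy to reproduce. The sketch below fits a bivariate OLS line to a small, entirely hypothetical war/no-war data set (the values of `x` and `y` are made up for illustration and are not the course data) and collects the fitted values that are not valid probabilities.

```python
def ols_fit(x, y):
    """Bivariate OLS; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: x is power disparity, y is war (1) or no war (0).
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

a, b = ols_fit(x, y)
fitted = [a + b * xi for xi in x]

# Fitted values below 0 or above 1 cannot be probabilities.
out_of_bounds = [p for p in fitted if p < 0.0 or p > 1.0]
print(out_of_bounds)
```

With these numbers the fitted line dips below 0 at the lowest X and rises above 1 at the highest X, which is the motivation for the bounded models that follow.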


Logistic Model


We need a model that produces true probabilities


The Logit, or cumulative logistic distribution, offers one
approach:

$$P_i = \frac{1}{1 + e^{-(\alpha + \beta X_i)}}$$






This produces a sigmoid curve.


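Look at the equation under 2 conditions (for a positive slope):


$X_i = +\infty$: the probability approaches 1


$X_i = -\infty$: the probability approaches 0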
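The two limiting cases can be checked numerically. This is a minimal sketch of the cumulative logistic function; the coefficient values `alpha` and `beta` are hypothetical, and the function is split into two branches only to avoid floating-point overflow for very large negative inputs.

```python
import math

def logistic(z):
    """Cumulative logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

alpha, beta = -2.0, 0.5  # hypothetical coefficients, beta > 0

for x in (-1000.0, -10.0, 0.0, 4.0, 10.0, 1000.0):
    p = logistic(alpha + beta * x)
    print(x, p)  # p stays inside [0, 1] for every x
```

However extreme X becomes, the result never leaves the unit interval, and at `alpha + beta * x = 0` the probability is exactly one half.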


Sigmoid curve


Probability Ratio


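Note that

$$\frac{P_i}{1 - P_i} = e^{Z_i}$$

Where $Z_i = \alpha + \beta X_i$ and $P_i$ is the probability that $Y_i = 1$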

Log Odds Ratio


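The logit is the log of the odds ratio, and is given
by:

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta X_i$$

where $P_i$ is the probability that $Y_i = 1$.


This model gives us a coefficient that may be
interpreted as the change in the log odds of
the dependent variable for a one-unit change in $X_i$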
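A short numerical check of that interpretation, using hypothetical coefficients: raising X by one unit changes the log odds by the slope, and multiplies the odds by the exponential of the slope.

```python
import math

def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Probability implied by log odds z."""
    return 1.0 / (1.0 + math.exp(-z))

alpha, beta = -1.0, 0.7  # hypothetical coefficients
x = 2.0

p0 = inv_logit(alpha + beta * x)
p1 = inv_logit(alpha + beta * (x + 1.0))

change_in_log_odds = logit(p1) - logit(p0)            # equals beta
odds_multiplier = (p1 / (1.0 - p1)) / (p0 / (1.0 - p0))  # equals exp(beta)
print(change_in_log_odds, odds_multiplier)
```

The change in the probability itself is not constant; it depends on where on the sigmoid curve the case sits, which is why coefficients are read on the log-odds scale.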


Estimation of Model


We estimate this with maximum likelihood


The significance tests are z statistics


We can generate a Pseudo $R^2$, which is an attempt to
measure the percent of variation of the underlying
logit function explained by the independent
variables


We test the full model with the Likelihood Ratio
(LR) test, which has a $\chi^2$ distribution with k degrees
of freedom
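To make these quantities concrete, here is a minimal sketch of maximum likelihood for a logit with a single regressor, using Newton-Raphson steps. Several things are assumptions rather than anything stated in the slides: the data are hypothetical, the Pseudo $R^2$ shown is McFadden's version (one of several definitions), and the fixed iteration count stands in for a proper convergence check.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(a, b, x, y):
    """Bernoulli log-likelihood of a one-regressor logit."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic(a + b * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def score_and_information(a, b, x, y):
    """Gradient (g0, g1) and information matrix entries (h00, h01, h11)."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = logistic(a + b * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11

def fit_logit(x, y, iterations=25):
    """Newton-Raphson from (0, 0); returns (intercept, slope)."""
    a = b = 0.0
    for _ in range(iterations):
        g0, g1, h00, h01, h11 = score_and_information(a, b, x, y)
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical data, not perfectly separated, so the MLE is finite.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

a_hat, b_hat = fit_logit(x, y)

# z statistic for the slope: estimate divided by its standard error.
_, _, h00, h01, h11 = score_and_information(a_hat, b_hat, x, y)
se_b = math.sqrt(h00 / (h00 * h11 - h01 * h01))
z_stat = b_hat / se_b

# LR test against the intercept-only model (k = 1 degree of freedom here).
ll_full = log_likelihood(a_hat, b_hat, x, y)
p_bar = sum(y) / len(y)
ll_null = sum(y) * math.log(p_bar) + (len(y) - sum(y)) * math.log(1.0 - p_bar)
lr_stat = 2.0 * (ll_full - ll_null)

# McFadden's Pseudo R-squared (an assumed choice of definition).
pseudo_r2 = 1.0 - ll_full / ll_null
print(a_hat, b_hat, z_stat, lr_stat, pseudo_r2)
```

The LR statistic is compared with a $\chi^2$ critical value; with one regressor the 5% critical value is about 3.84.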

Neural Networks


The alternate formulation is representative of a
single-layer perceptron in an artificial neural
network.

Probit


If we can assume that the dependent variable is
actually the result of an underlying (and
unobservable) propensity or utility, we can use the
cumulative normal probability function to estimate
a Probit model


It is also more appropriate if the categories (or their
propensities) are likely to be normally distributed


It looks just like a logit model in practice

The Cumulative Normal Density Function


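The normal distribution is given by:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$


The Cumulative Normal Density Function is:

$$F(x_0) = \int_{-\infty}^{x_0} f(x) \, dx$$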

The Standard Normal CDF


We assume that there is an underlying threshold
value ($I_i$) such that a case exceeding it will be a 1, and 0
otherwise.


We can standardize and estimate this as

$$P(Y_i = 1) = \Phi(\alpha + \beta X_i)$$

where $\Phi$ is the standard normal CDF
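As an illustration, the standard normal CDF can be computed from the error function in Python's standard library, and a probit probability is that CDF evaluated at the linear index. The coefficients below are hypothetical.

```python
import math

def standard_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_probability(alpha, beta, x):
    """P(Y = 1) under a probit model with one regressor."""
    return standard_normal_cdf(alpha + beta * x)

alpha, beta = -1.5, 0.4  # hypothetical coefficients

for x in (0.0, 2.0, 4.0, 6.0, 8.0):
    print(x, probit_probability(alpha, beta, x))
```

Like the logistic curve, this traces a sigmoid bounded by 0 and 1, which is why the two models usually tell the same story in practice.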

Probit estimates


Again, maximum likelihood estimation


Again, a Pseudo $R^2$


Again, an LR test with k degrees of freedom

Assumptions of Models


All Ys are in the {0,1} set


They are statistically independent


No multicollinearity


$P(Y_i = 1)$ follows the cumulative normal distribution for probit, and
the logistic function for logit

Ordered Probit


If the dependent variable can take on ordinal
levels, we can extend the dichotomous Probit
model to an n-chotomous, or ordered, Probit
model


It simply has several threshold values
estimated


Ordered logit works much the same way
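A sketch of how the estimated thresholds turn into category probabilities in an ordered probit: each category's probability is the normal CDF mass between two adjacent thresholds, shifted by the linear index. The function name, the index value `xb`, and the cutpoints are all hypothetical.

```python
import math

def standard_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probabilities(xb, cutpoints):
    """P(Y = j) for j = 0..len(cutpoints), given increasing cutpoints."""
    cdf_values = [0.0]
    cdf_values += [standard_normal_cdf(c - xb) for c in cutpoints]
    cdf_values += [1.0]
    # Probability of each category is the CDF mass between adjacent cutpoints.
    return [upper - lower for lower, upper in zip(cdf_values, cdf_values[1:])]

# Two hypothetical thresholds give three ordered categories.
probs = ordered_probit_probabilities(xb=0.0, cutpoints=[-1.0, 1.0])
print(probs)
```

Replacing `standard_normal_cdf` with the logistic function would give the corresponding ordered logit probabilities.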

Multinomial Logit


If our dependent variable takes on several
values, but they are nominal (unordered), we use a
multinomial logit model
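For illustration, multinomial logit probabilities can be sketched as follows, assuming the usual normalization in which one base category has its coefficients fixed at zero. The function name and coefficient values are hypothetical.

```python
import math

def multinomial_logit_probabilities(x, coefficients):
    """Category probabilities for one regressor x.

    coefficients holds one (alpha_j, beta_j) pair per non-base category;
    the base category (index 0) has both coefficients fixed at 0.
    """
    scores = [0.0] + [a + b * x for a, b in coefficients]
    largest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three nominal outcomes: base category plus two others (hypothetical values).
probs = multinomial_logit_probabilities(1.0, [(0.5, -0.2), (-1.0, 0.8)])
print(probs)
```

With only one non-base category this collapses to the ordinary dichotomous logit.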

Some additional info


The modal category is a good benchmark


Present the % correctly predicted


This can be calculated from the fitted probabilities and presented.


This, when compared to the share of cases in the modal category,
gives us a good indication of fit.
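The comparison can be sketched in a few lines. The outcomes and fitted probabilities below are made up for illustration, and the 0.5 cutoff for classifying a case as a predicted 1 is a common convention, assumed here.

```python
def percent_correctly_predicted(y, probabilities, cutoff=0.5):
    """Share of cases whose predicted category matches the observed one."""
    predictions = [1 if p >= cutoff else 0 for p in probabilities]
    hits = sum(1 for yi, pi in zip(y, predictions) if yi == pi)
    return 100.0 * hits / len(y)

def modal_category_benchmark(y):
    """Percent correct from always guessing the most common category."""
    ones = sum(y)
    return 100.0 * max(ones, len(y) - ones) / len(y)

# Hypothetical outcomes and fitted probabilities from some binary model.
y = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]
probabilities = [0.1, 0.2, 0.3, 0.7, 0.4, 0.6, 0.55, 0.2, 0.45, 0.1]

pcp = percent_correctly_predicted(y, probabilities)
benchmark = modal_category_benchmark(y)
print(pcp, benchmark)
```

A model that barely beats the modal benchmark adds little, even if its percent correctly predicted looks high in isolation.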

Stata


Use Leadership Change data


(1992 cross section)

1992-Stata

Test different models


Dependent variable Leadership change


Examine distribution

tabulate ledchan1


Independent variables


Try different


Try corr and then pwcorr


Try the following

regress ledchan1 grwthgdp hlthexp illit_f polity2

logit ledchan1 grwthgdp hlthexp illit_f polity2

logistic ledchan1 grwthgdp hlthexp illit_f polity2

probit ledchan1 grwthgdp hlthexp illit_f polity2

ologit ledchan1 grwthgdp hlthexp illit_f polity2

oprobit ledchan1 grwthgdp hlthexp illit_f polity2

mlogit ledchan1 grwthgdp hlthexp illit_f polity2

tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll



Tobit


Assumes a mass of observations at 0, and then a continuous scale above it


E.g., the decision to incarcerate


0 or 1


(Imprison or not)


If imprisoned, then for how many years?
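A minimal sketch of the log-likelihood of a standard Tobit model censored from below at 0, which combines a probit-like term for the zeros with a normal regression term for the positive values. The data (sentence length in years against a hypothetical severity score) and the parameter values are made up for illustration.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_log_likelihood(alpha, beta, sigma, x, y):
    """Log-likelihood of a Tobit model left-censored at 0."""
    total = 0.0
    for xi, yi in zip(x, y):
        mean = alpha + beta * xi
        if yi <= 0.0:
            # Censored case: probability that the latent value is at or below 0.
            total += math.log(normal_cdf(-mean / sigma))
        else:
            # Uncensored case: normal density of the observed value.
            total += math.log(normal_pdf((yi - mean) / sigma) / sigma)
    return total

# Hypothetical data: x is offence severity, y is years imprisoned (0 = none).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 1.0, 0.0, 3.0, 5.0]

print(tobit_log_likelihood(alpha=-3.0, beta=1.2, sigma=1.5, x=x, y=y))
```

Maximizing this function over `alpha`, `beta`, and `sigma` gives the Tobit estimates.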

Other models


This leads to many other models


Count models & Poisson regression


Duration/Survival/hazard models


Censoring and truncation models


Selection bias models