# Limited Dependent Variables

AI and Robotics

Oct 19, 2013 (4 years and 8 months ago)

89 views

Limited Dependent Variables

Often there are occasions where we are
interested in explaining a dependent
variable that has only limited
measurement

Frequently it is even dichotomous.

Examples

War(1) vs. no War(0)

Vote vs. no vote

Regime change vs. no change

These are often Probability Models

E.g.

Where Y
t

is the occurrence (or not) of war, and X
t
is a measure of power disparity

We call this a Linear Probability Model

Problems with LPM Regression

OLS in this case is called the Linear
Probability Model

Running regression produces some problems

Errors are not distributed normally

Errors are heteroskedastic

Predicted Ys can be outside the 0.0
-
1. bounds
required for probability

Logistic Model

We need a model that produces true probabilities

The Logit, or cumulative logistic distribution offers one
approach.

This produces a sigmoid curve.

Look at equation under 2 conditions:

X
i

= +

X
i

=
-

Sigmoid curve

Probability Ratio

Note that

Where

Log Odds Ratio

The logit is the log of the odds ratio, and is given
by:

This model gives us a coefficient that may be
interpreted as a change in the weighted odds of
the dependent variable

Estimation of Model

We estimate this with maximum likelihood

The significance tests are z statistics

We can generate a Pseudo R
2
which is an attempt to
measure the percent of variation of the underlying
logit function explained by the independent
variables

We test the full model with the Likelihood Ratio
test (LR), which has a
χ
2
distribution with k degrees
of freedom

Neural Networks

The alternate formulation is representative of a
single
-
layer perceptron in an artificial neural
network.

Probit

If we can assume that the dependent variable is
actually the result of an underlying (and
immeasurable) propensity or utility, we can use the
cumulative normal probability function to estimate
a Probit model

Also, more appropriate if the categories (or their
propensities) are likely to be normally distributed

It looks just like a logit model in practice

The Cumulative Normal Density
Function

The normal distribution is given by:

The Cumulative Normal Density Function is:

The Standard Normal CDF

We assume that there is an underlying threshold
value (I
i
) that if the case exceeds will be a 1, and 0
otherwise.

We can standardize and estimate this as

Probit estimates

Again, maximum likelihood estimation

Again, a Pseudo R2

Again, a LR ratio with k degrees of freedom

Assumptions of Models

All Y

s are in {0,1} set

They are statistically independent

No multicollinearity

The P(Y
i
=1) is normal density for probit, and
logistic function for logit

Ordered Probit

If the dependent variable can take on ordinal
levels, we can extend the dichotomous Probit
model to an n
-
chotomous, or ordered, Probit
model

It simply has several threshold values
estimated

Ordered logit works much the same way

Multinomial Logit

If our dependent variable takes on different
values, but they are nominal, this is a
multinomial logit model

The Modal category is good benchmark

Present % correctly predicted

This can be calculated and presented.

This, when compared to the modal category,
gives us a good indication of fit.

Stata

(1992 cross section)

1992
-
Stata

Test different models

Examine distribution

tables ledchan1

Independent variables

Try different

Try
corr

and then (
pwcorr
)

Try the following

regress ledchan1 grwthgdp hlthexp illit_f polity2

logit ledchan1 grwthgdp hlthexp illit_f polity2

logistic ledchan1 grwthgdp hlthexp illit_f polity2

probit ledchan1 grwthgdp hlthexp illit_f polity2

ologit ledchan1 grwthgdp hlthexp illit_f polity2

oprobit ledchan1 grwthgdp hlthexp illit_f polity2

mlogit ledchan1 grwthgdp hlthexp illit_f polity2

tobit ledchan1 grwthgdp hlthexp illit_f polity2, ul ll

Tobit

Assumes a 0 value, and then a scale

E.g., the decision to incarcerate

0 or 1

(Imprison or not)

If Imprison, than for how many years?

Other models

This leads to many other models

Count models & Poisson regression

Duration/Survival/hazard models

Censoring and truncation models

Selection bias models