# Conditional Stereotype Logistic Regression: A New Estimation Command

Nov 30, 2013

Rob Woodruff

Battelle Memorial Institute, Health & Analytics

Email: woodruffr@battelle.org

Cynthia Ferre

Centers for Disease Control and Prevention

Overview

- What is it?
  - Stereotype Logistic Regression
  - Conditional on what?
- What's it good for?
- Syntax and Examples

Constrained Multinomial Logistic Regression

Multinomial Model

- Categorical Outcome Variable
- Vector of Explanatory Variables
- Related through the m logits:
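A standard way to write the m logits is sketched below. The symbols are chosen here for illustration and are not necessarily the authors' notation: Y is the outcome with categories 0, 1, ..., m (0 as the reference), x is the vector of p explanatory variables, and each logit has its own intercept and slope vector.

```latex
\log\frac{\Pr(Y = k \mid \mathbf{x})}{\Pr(Y = 0 \mid \mathbf{x})}
  = \alpha_k + \boldsymbol{\beta}_k^{\top}\mathbf{x},
  \qquad k = 1, \dots, m .
```

Written this way, each of the m logits carries one intercept and p slopes, for m(p+1) parameters in total.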

Constrained Multinomial (continued)

- The stereotype model imposes the constraints:
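In the usual formulation of the stereotype model (Anderson 1984), the constraint replaces each category-specific slope vector by a scalar multiple of one common vector. The notation below is an assumption made for illustration: alpha_k is the intercept of the k-th logit, beta is the common p-vector of slopes, and phi_k is the scalar for category k.

```latex
\boldsymbol{\beta}_k = \phi_k\,\boldsymbol{\beta}, \qquad k = 1, \dots, m,
\qquad\text{so that}\qquad
\log\frac{\Pr(Y = k \mid \mathbf{x})}{\Pr(Y = 0 \mid \mathbf{x})}
  = \alpha_k + \phi_k\,\boldsymbol{\beta}^{\top}\mathbf{x} .
```

Because only the products phi_k * beta are identified, one phi has to be fixed (for example phi_m = 1), which leaves m - 1 free phi parameters.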

Note: The phi's are scalar quantities

- Full multinomial has m(p+1) parameters
- Stereotype model has (m - 1) + m + p = 2m - 1 + p parameters
- The phi parameters give a way to quantify the ordinality of the outcome variable. If the phi's are monotonically ordered across the outcome categories, then we have evidence of an ordinal effect.
- They also allow tests of distinguishability of outcome categories
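The parameter counts can be checked with a few lines of arithmetic. This is a minimal sketch; the function names are made up for illustration, and the values m = 3 and p = 3 are taken from the preterm birth example later in the deck (three case categories against one control category, and the covariates ohd25_total, edu and vitamin).

```python
def multinomial_params(m: int, p: int) -> int:
    """Full multinomial model: m logits, each with an intercept and p slopes."""
    return m * (p + 1)


def stereotype_params(m: int, p: int) -> int:
    """Stereotype model: m intercepts, m - 1 free phi's, and p common slopes."""
    return m + (m - 1) + p


m, p = 3, 3  # values from the deck's example, used here only for illustration
full = multinomial_params(m, p)
stereo = stereotype_params(m, p)

# The saving equals (m - 1) * (p - 1) slope parameters.
print(f"multinomial: {full}, stereotype: {stereo}, difference: {full - stereo}")
```

The difference of 4 parameters for m = 3 and p = 3 agrees with the 4 degrees of freedom reported for the comparison with the conditional multinomial model in the example output.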

So what's the condition?

- The multinomial and stereotype logistic regression models are implemented in Stata by mlogit and slogit
- Both assume independence of observations, which does not hold for matched case-control data
- For a matched case-control study, only the matched groups (strata, panels, clusters, etc.) are independent
- For 1:M matching, condition on the stratum total for the outcome variable and focus instead on the conditional likelihood
- Do I have to?
- Why condition on this particular event?

Conditional vs. Unconditional Likelihood
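One way to see what conditioning does is to compute the conditional likelihood contribution of a single matched set under the stereotype model. The sketch below assumes that the conditioning event is the collection of outcome categories observed in the set, so the contribution is the weight of the observed assignment divided by the total weight of every distinct reassignment of those outcomes to the subjects; stratum-specific intercepts cancel from that ratio. The function name, the argument layout and the linear-predictor values are illustrative assumptions, and nothing here is taken from the cstereo source.

```python
import math
from itertools import permutations


def conditional_likelihood(y, eta, phi):
    """Conditional likelihood contribution of one matched set.

    y   : observed outcome category of each subject in the set (0 = control)
    eta : linear predictor beta'x of each subject
    phi : phi[k] is the stereotype scale of category k, with phi[0] = 0
    """
    def weight(assignment):
        # exp of the sum over subjects of phi[category] * linear predictor
        return math.exp(sum(phi[k] * e for k, e in zip(assignment, eta)))

    # Every distinct way of handing the observed outcomes to the subjects.
    arrangements = set(permutations(y))
    return weight(tuple(y)) / sum(weight(a) for a in arrangements)


# phi1 and phi2 are the estimates shown later in the deck; phi3 = 1 is assumed
# to be the fixed reference. The eta values are hypothetical.
phi = [0.0, 0.8764578, 0.9398113, 1.0]
print(conditional_likelihood(y=[3, 0, 0], eta=[-0.2, 0.1, 0.3], phi=phi))
```

With one case and one control and phi = [0, 1], the expression reduces to the familiar conditional logistic contribution exp(eta_case) / (exp(eta_case) + exp(eta_control)).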


CSTEREO

cstereo command

Basic syntax:

    . cstereo depvar indepvars [if] [in], group(varname) [options]

Example with Real Data: Preterm Birth and Vitamin D

- 1:2 (some 1:1) pooled, matched case-control study of 2,583 mothers in 870 matched groups
- A case is defined as gestational age at delivery of <37 weeks
- outcome4=3 (<32 weeks), outcome4=2 (32-35 weeks), outcome4=1 (36 weeks) and outcome4=0 (control: 37+ weeks)
- Primary exposure variable of interest: vitamin D level, ohd25_total: blood serum concentration of 25(OH)D in ng/ml
- Sample of other covariates measured:
  - edu = 0/1 indicator of post-high school education
  - vitamin = 0/1 indicator of vitamin use during pregnancy

Example Continued (nolog option):

    . cstereo outcome4 ohd25_total edu vitamin, group(matchgroup) nolog
    note: 77 groups (139 obs) dropped because of all positive or
          all negative outcomes.
    Log-Likelihood from Conditional Multinomial Model: -835.83679
    Chi2 value on 4 degrees of freedom: 10.604861
    P-value: .03138281
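The chi2 value and p-value in this output can be reproduced from the two log likelihoods in the deck: -835.83679 for the conditional multinomial model and -841.13922 for the conditional stereotype fit. The sketch below is an arithmetic check, not the command's own code; the helper name is made up, and it uses the closed form of the chi-square upper tail that holds for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))


ll_multinomial = -835.83679  # conditional multinomial model
ll_stereotype = -841.13922   # conditional stereotype model

# Likelihood-ratio statistic: twice the gap between the two log likelihoods.
lr = 2.0 * (ll_multinomial - ll_stereotype)
p_value = chi2_sf_even_df(lr, df=4)
print(f"chi2(4) = {lr:.5f}, p = {p_value:.5f}")
```

A small p-value here means the less restrictive multinomial model fits noticeably better than the stereotype model.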

Example Continued:

                                                    Number of obs   =       2322
                                                    Wald chi2(3)    =       1.85
    Log likelihood = -841.13922                     Prob > chi2     =     0.6048

    ------------------------------------------------------------------------------
        outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    xb           |
     ohd25_total |  -.0073684   .0144916    -0.51   0.611    -.0357714    .0210346
             edu |  -.4010391    .431587    -0.93   0.353    -1.246934    .4448559
         vitamin |   .1301369   .1954516     0.67   0.506    -.2529413    .5132151
    -------------+----------------------------------------------------------------
    phi1         |
           _cons |   .8764578   1.268331     0.69   0.490    -1.609424     3.36234
    -------------+----------------------------------------------------------------
    phi2         |
           _cons |   .9398113   1.206139     0.78   0.436    -1.424178      3.3038
    ------------------------------------------------------------------------------

Interpretation of cstereo output:

- Estimated beta coefficient of ohd25_total = -0.0074 with 95% confidence interval (-0.0358, 0.0210)
- Odds ratio of being in the <32 weeks gestational age category compared to control is exp(-0.0074) = 0.993 (0.965, 1.021)
- For the odds ratios for the 32-35 weeks and 36 week case categories, we need the products of the parameters: phi2*beta and phi1*beta
- For standard errors, use the Delta Method via nlcom
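The arithmetic behind these statements can be sketched as follows. The coefficient and standard error values are the ones reported in the cstereo output; the function names are made up for illustration. The covariance between the beta and phi estimates is not shown on the slides, so it appears only as an argument of the delta-method helper and no value is assumed for it.

```python
import math
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # two-sided 95% normal quantile, about 1.96


def odds_ratio_ci(coef: float, se: float):
    """Odds ratio with a Wald 95% interval, exponentiated from the log-odds scale."""
    return math.exp(coef), math.exp(coef - Z975 * se), math.exp(coef + Z975 * se)


def product_se(beta, phi, var_beta, var_phi, cov_beta_phi):
    """First-order delta-method standard error of the product phi * beta."""
    var = phi**2 * var_beta + beta**2 * var_phi + 2.0 * phi * beta * cov_beta_phi
    return math.sqrt(var)


beta, se_beta = -0.0073684, 0.0144916  # [xb]ohd25_total
phi2 = 0.9398113                       # [phi2]_cons

# <32 weeks versus control: phi is fixed at 1, so beta is used directly.
or_lt32, lo, hi = odds_ratio_ci(beta, se_beta)
print(f"OR (<32 weeks) = {or_lt32:.3f} ({lo:.3f}, {hi:.3f})")

# 32-35 weeks versus control: the log odds ratio is the product phi2 * beta.
log_or_32_35 = phi2 * beta
print(f"log OR (32-35 weeks) = {log_or_32_35:.7f}")
```

In practice nlcom evaluates the same kind of first-order expression using the full estimated covariance matrix, which is why its standard error cannot be rebuilt from the table's standard errors alone.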

Interpretation continued:

    . nlcom [xb]ohd25_total*[phi2]_cons

           _nl_1:  [xb]ohd25_total*[phi2]_cons

    ------------------------------------------------------------------------------
        outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           _nl_1 |  -.0069249   .0072757    -0.95   0.341     -.021185    .0073351
    ------------------------------------------------------------------------------

Exponentiating gives the odds ratio of being in the 32-35 weeks case category compared to controls of 0.994 with a 95% C.I. of (0.983, 1.004)

Constraints:

Are the 36 week and 32-35 weeks case categories distinguishable?

    . constraint 1 [phi1]_cons=[phi2]_cons
    . cstereo outcome4 ohd25_total edu vitamin, group(matchgroup) nolog constraints(1)
    note: 77 groups (139 obs) dropped because of all positive or
          all negative outcomes.

Constraint Output

                                                    Number of obs   =       2322
                                                    Wald chi2(3)    =       1.73
    Log likelihood = -841.1454                      Prob > chi2     =     0.6293

     ( 1)  [phi1]_cons - [phi2]_cons = 0
    ------------------------------------------------------------------------------
        outcome4 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    xb           |
     ohd25_total |  -.0068382    .013172    -0.52   0.604    -.0326548    .0189784
             edu |  -.3909924   .4289348    -0.91   0.362    -1.231689    .4497043
         vitamin |   .1294806   .1888154     0.69   0.493    -.2405908    .4995519
    -------------+----------------------------------------------------------------
    phi1         |
           _cons |   .9417836    1.24291     0.76   0.449    -1.494276    3.377843
    -------------+----------------------------------------------------------------
    phi2         |
           _cons |   .9417836    1.24291     0.76   0.449    -1.494276    3.377843
    ------------------------------------------------------------------------------

Constraint Output

- The log-likelihood from the constrained model is -841.145, compared to -841.139 for the unconstrained stereotype model
- The difference of 0.006 gives a chi2 value of 0.012 on 1 degree of freedom
- P-value = 0.91
- The unconstrained stereotype model does not fit significantly better than the constrained model, so the two case categories are indistinguishable
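The distinguishability test is a likelihood-ratio comparison on 1 degree of freedom, and its p-value can be checked from the two reported log likelihoods. This is a minimal sketch with a made-up helper name; it relies on the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal, so its upper tail can be written with the complementary error function.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


ll_unconstrained = -841.13922  # stereotype model with phi1 and phi2 free
ll_constrained = -841.1454     # stereotype model with phi1 = phi2

lr = 2.0 * (ll_unconstrained - ll_constrained)
p_value = chi2_sf_1df(lr)
print(f"chi2(1) = {lr:.3f}, p = {p_value:.2f}")
```

A p-value this close to 1 says the data give no reason to keep separate phi parameters for the 36 week and 32-35 weeks categories.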

Relationship to Other Models for Ordered/Categorical Outcomes

- Constrained Multinomial
- Not as parsimonious as the proportional odds model (ologit), but the proportional odds model is not valid under outcome-dependent sampling
- Adjacent category model is (basically) a constrained stereotype model. It is also valid under outcome-dependent sampling

Limitations

- Convergence issues
- Currently only a one-dimensional stereotype model
- Cannot currently force an ordering on the stereotype parameters

References:

- Ferre C, et al; Maternal 25-Hydroxyvitamin D Status and the Risk of Preterm Delivery: A Multi-Center Nested Case Control Study; preprint
- Mukherjee B, Liu I, Sinha S; Analysis of matched case-control data with multiple ordered disease states; Statistics in Medicine 2007
- Ahn J, et al; Missing Exposure Data in Stereotype Regression Model; Biometrics 2011
- Andersen EB; Asymptotic Properties of Conditional Maximum-Likelihood Estimators; Journal of the Royal Statistical Society 1970
- Liang KY, Stewart WF; Polychotomous Logistic Regression Methods for Matched Case-Control Studies with Multiple Case or Control Groups; American Journal of Epidemiology 1987
- Scott AJ, Wild CJ; Fitting Regression Models to Case-Control Data by Maximum Likelihood; Biometrika 1997
- Anderson JA; Regression and Ordered Categorical Variables; Journal of the Royal Statistical Society 1984
- Greenland S; Alternative Models for Ordinal Logistic Regression; Statistics in Medicine 1994