Worksheet #5: Regression

22 Feb 2014

Stat 20 Study Group
Faculty: Professor Hank Ibser
Study Group Leader: Larry Wang, larry@csrjjsmp.com
Location: MW 1-2, 201A Chavez, http://www.csrjjsmp.com/stat20.html

Community through Academics and Leadership



1. Three doctors want to test a large group of patients for lupus. Dr. Cameron measures their white blood cell counts and finds them to be normally distributed with a mean of 200 and an SD of 20. Dr. Chase measures their sedimentation rates, which are normally distributed with a mean of 100 and an SD of 30, and Dr. Foreman measures their antibody levels, which are also normally distributed.

a. What percentage of patients have a sedimentation rate between 160 and 190?

2.1%

b. Dr. Foreman found that a patient with an antibody level of 100 was in the 90th percentile, and a patient with an antibody level of 70 was in the 60th percentile. Find the mean and standard deviation.

Mean ≈ 63, SD ≈ 29

c. Suppose that in reality, there is a correlation of 1 between antibody levels and lupus, a correlation of 0 between lupus and white blood cell count, and a correlation of 0 between lupus and sedimentation rate. What will be the correlations between the three test results?

Antibody-WBC: r = 0
Antibody-sed: r = 0
WBC-sed: no information
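
As a check on parts a and b, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The variable names are illustrative and not part of the worksheet. It uses exact normal quantiles, so the results can differ by about one unit from answers read off a rounded normal table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Part a: sedimentation rates are Normal(mean=100, SD=30).
sed = NormalDist(mu=100, sigma=30)
pct_between = (sed.cdf(190) - sed.cdf(160)) * 100
print(f"a. {pct_between:.1f}% of patients are between 160 and 190")

# Part b: two (value, percentile) pairs determine the mean and SD.
# 100 = mean + z90 * sd and 70 = mean + z60 * sd, so subtract and solve.
z90 = std_normal.inv_cdf(0.90)
z60 = std_normal.inv_cdf(0.60)
sd = (100 - 70) / (z90 - z60)
mean = 70 - z60 * sd
print(f"b. mean is about {mean:.1f}, SD is about {sd:.1f}")
```

Run as written, this should give roughly 2.1% for part a, and a mean near 63 with an SD near 29 for part b.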

2. There is a positive correlation between husband IQ and wife IQ. Both averages are 100 with an SD of 25. The scatter diagram is football shaped.

a. True or False: Of those wives who have an IQ of 125, less than 50% have husbands with IQs larger than 125. If true, explain. If not, why not?

True. If r is less than 1, regression predicts that the average husband of these wives has an IQ below 125. Since the IQs in the vertical strip are approximately normal, fewer than 50% of them lie above 125, a value larger than the strip's average. If r = 1, all of these husbands have an IQ of exactly 125, so none are larger.

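b. If the correlation were 0.6, perform a calculation that justifies your answer to part a).

By regression, the predicted average husband IQ is 100 + 0.6 × 25 = 115, and the RMS error is sqrt(1 - 0.6²) × 25 = 20. An IQ of 125 is then 0.5 in standard units within the strip, so we predict ~31% have husbands with IQs larger than 125.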
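
The calculation for part b can be sketched in Python as follows. This is one way to carry out the regression method under the worksheet's football-shaped assumption; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, r = 100, 25, 0.6  # both IQ averages are 100 with SD 25
wife_iq = 125

# Regression estimate: move r standard units for each standard unit of wife IQ.
z_wife = (wife_iq - mean) / sd
predicted_husband = mean + r * z_wife * sd

# Spread of husbands' IQs inside the vertical strip above wife_iq.
rms_error = sqrt(1 - r**2) * sd

# Treat the strip as normal, centered at the prediction with SD = RMS error.
strip = NormalDist(mu=predicted_husband, sigma=rms_error)
pct_above = (1 - strip.cdf(125)) * 100
print(f"predicted husband IQ: {predicted_husband:.0f}")
print(f"RMS error: {rms_error:.0f}")
print(f"husbands above 125: {pct_above:.0f}%")
```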

3. The average GPA during the first year of college is 2.1, with an SD of 0.8, while the average GPA during the second year is 2.5, with an SD of 0.8. The correlation is 0.4, and the scatter plot is football shaped.

a. Predict the second year GPA of someone who had a first year GPA of 2.5.

2.66

b. Predict the first year GPA of someone who had a second year GPA of 3.0.

2.3

c. Predict the second year GPA of someone who had a first year GPA of 2.1.

2.5

d. Predict the first year GPA of someone with an unknown second year GPA.

2.1

e. True or false: people tend to have a higher GPA their second year.

True

f. Circle the best choice: Someone in the 40th percentile their first year tends to be ____ (at a percentile larger than / at a percentile smaller than / at) the 40th percentile the second year.

larger

g. True or False: The person at the top of the class the first year is predicted to be at the top of the class the second year.

False

h. An undergraduate advisor is doing a study that predicts the result of the second year, based on the GPA during the first year, using regression. What is the correlation between the investigator’s prediction and the actual GPA during the second year?

0.4. The predicted value is a linear function (a change of scale) of the first year GPA, and a change of scale with positive slope leaves the correlation unchanged.
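
The predictions in parts a through c and the percentile comparison in part f can be reproduced with a short Python sketch. The helper `regression_predict` and the constant names are hypothetical, introduced here only for illustration.

```python
from statistics import NormalDist


def regression_predict(x, mean_x, sd_x, mean_y, sd_y, r):
    """Regression estimate of y given x: convert x to standard units,
    multiply by r, then convert back to the units of y."""
    z_x = (x - mean_x) / sd_x
    return mean_y + r * z_x * sd_y


Y1_MEAN, Y1_SD = 2.1, 0.8  # first year GPA
Y2_MEAN, Y2_SD = 2.5, 0.8  # second year GPA
R = 0.4

# Parts a and c predict second year from first year.
part_a = regression_predict(2.5, Y1_MEAN, Y1_SD, Y2_MEAN, Y2_SD, R)
part_c = regression_predict(2.1, Y1_MEAN, Y1_SD, Y2_MEAN, Y2_SD, R)
# Part b goes the other way: first year from second year.
part_b = regression_predict(3.0, Y2_MEAN, Y2_SD, Y1_MEAN, Y1_SD, R)

# Part f: work in standard units, so the means and SDs drop out.
std_normal = NormalDist()
z_first = std_normal.inv_cdf(0.40)
predicted_percentile = std_normal.cdf(R * z_first) * 100

print(f"a. {part_a:.2f}  b. {part_b:.2f}  c. {part_c:.2f}")
print(f"f. predicted second year percentile: {predicted_percentile:.0f}")
```

For part f the predicted percentile comes out above 40 but below 50, which is the regression effect: the prediction moves toward the average.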

4. There is a negative correlation between hours watching television and GPA. The scatter diagram is football shaped.

a. Greg is at the 60th percentile for GPA. True or false and explain: The prediction for Greg’s percentile for hours watching television is less than 50.

True. His GPA is about 0.25 in standard units, which regression sends to a negative z-score for TV hours because r is negative. A negative value in standard units means a percentile less than 50.

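b. Lisa sees the negative correlation and decides to watch less television in an attempt to boost her GPA. Based on the study, will this increase her GPA?

No. A correlation between the two variables does not by itself show that watching less television causes a higher GPA.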
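
The argument in part a does not depend on the exact value of r, only on its sign. The sketch below illustrates this in Python; the three correlation values are hypothetical, since the worksheet only states that r is negative.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Greg's GPA in standard units (60th percentile, about 0.25).
z_gpa = std_normal.inv_cdf(0.60)

# Hypothetical negative correlations, chosen only for illustration.
predicted_tv_percentile = {}
for r in (-0.2, -0.5, -0.9):
    z_tv = r * z_gpa  # regression prediction in standard units
    predicted_tv_percentile[r] = std_normal.cdf(z_tv) * 100

for r, pct in predicted_tv_percentile.items():
    print(f"r = {r}: predicted TV percentile is about {pct:.0f}")
```

Every predicted percentile falls below 50, and it gets closer to 40 as r approaches -1.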

5. There is a correlation between hours of exercise and body fat percent. One person uses the baseline method to predict body fat percent while a second person uses regression to predict body fat percent from hours of exercise. The ratio of the RMS error for the second person’s technique to the RMS error for the first person’s is 0.6. What is the correlation between hours of exercise and body fat percent? Remember, fat is burned when you exercise.

r = -0.8
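
A short Python sketch of the reasoning: the baseline method has an RMS error equal to the SD of body fat percent, and regression has an RMS error of sqrt(1 - r²) times that SD, so the ratio equals sqrt(1 - r²). The variable names are illustrative.

```python
from math import sqrt

ratio = 0.6  # RMS error of regression divided by RMS error of baseline

# ratio = sqrt(1 - r^2), so r^2 = 1 - ratio^2.
r_magnitude = sqrt(1 - ratio**2)

# More exercise goes with less body fat, so the correlation is negative.
r = -r_magnitude
print(f"r = {r:.1f}")
```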