Basic Statistical Concepts
Chapter 2 Reading instructions
• 2.1 Introduction: Not very important
• 2.2 Uncertainty and probability: Read
• 2.3 Bias and variability: Read
• 2.4 Confounding and interaction: Read
• 2.5 Descriptive and inferential statistics: Repetition
• 2.6 Hypothesis testing and p-values: Read
• 2.7 Clinical significance and clinical equivalence: Read
• 2.8 Reproducibility and generalizability: Read
Bias and variability
Bias: Systematic deviation from the true value
$\text{Bias} = E[\hat{\theta}] - \theta$
Design, Conduct, Analysis, Evaluation
Lots of examples on pages 49-51
Bias and variability
Larger study does not decrease bias
$\hat{\theta}_n \to \theta + \text{bias}$ as $n \to \infty$
[Figure: Distribution of sample means of Drug X − Placebo (mm Hg, axis ticks at −10, −7, −4) for n=40, n=200 and n=2000, with the population mean and the bias marked.]
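The point that a larger study does not decrease bias can be illustrated with a small simulation. The sketch below is not from the slides: the true difference, the size of the systematic error and the standard deviation are all hypothetical values chosen for illustration. Each simulated study estimates the mean of observations that carry a fixed systematic error, so the spread of the estimates shrinks with n while their average error stays put.

```python
import random
import statistics


def simulate_mean_estimates(n, true_diff, bias, sd, n_studies, rng):
    """Return the estimated difference from each of n_studies simulated studies.

    Every observation is drawn from Normal(true_diff + bias, sd), i.e. each
    study measures the true difference plus a fixed systematic error.
    """
    estimates = []
    for _ in range(n_studies):
        sample = [rng.gauss(true_diff + bias, sd) for _ in range(n)]
        estimates.append(statistics.fmean(sample))
    return estimates


rng = random.Random(1)
TRUE_DIFF = -7.0  # hypothetical true Drug X - Placebo difference (mm Hg)
BIAS = 1.5        # hypothetical systematic error (mm Hg)
SD = 10.0         # hypothetical standard deviation of one observation (mm Hg)

summary = {}
for n in (40, 200, 2000):
    est = simulate_mean_estimates(n, TRUE_DIFF, BIAS, SD, n_studies=300, rng=rng)
    mean_error = statistics.fmean(est) - TRUE_DIFF  # estimates the bias
    spread = statistics.stdev(est)                  # estimates the variability
    summary[n] = (mean_error, spread)
    print(f"n={n:5d}  mean error={mean_error:+.2f}  spread={spread:.2f}")
```

The spread column should fall roughly like 1/sqrt(n), while the mean error column should stay close to the assumed bias for every n.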
Bias and variability
There is a multitude of sources of bias
• Publication bias: Positive results tend to be published while negative or inconclusive results tend not to be published.
• Selection bias: The outcome is correlated with the exposure. As an example, treatments tend to be prescribed to those thought to benefit from them. Can be controlled by randomization.
• Exposure bias: Differences in exposure, e.g. compliance with treatment, could be associated with the outcome, e.g. patients with side effects stop taking their treatment.
• Detection bias: The outcome is observed with different intensity depending on the exposure. Can be controlled by blinding investigators and patients.
• Analysis bias: Essentially the type I error, but also bias caused by model misspecification and choice of estimation technique.
• Interpretation bias: Strong preconceived views can influence how analysis results are interpreted.
Bias and variability
Variability: Amount of difference between observations
• True biological: Variation between subjects due to biological factors (covariates) including the treatment.
• Temporal: Variation over time (and space). Often within subjects.
• Measurement error: Related to instruments or observers.
Design, Conduct, Analysis, Evaluation
[Figure: Raw blood pressure data. DBP (mmHg) at baseline and at 8 weeks for Placebo and Drug X; a subset of the data is plotted.]
Bias and variability
Variation in observations = Explained variation + Unexplained variation
[Figure: Y plotted against X.]
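For a straight-line fit of Y on X, the split of the variation into an explained and an unexplained part can be checked numerically. The sketch below assumes a simple least-squares line and uses made-up data; the function name and the numbers are illustrative only.

```python
import statistics


def variation_decomposition(x, y):
    """Fit y = a + b*x by least squares and split the total variation in y.

    Returns (total, explained, unexplained) sums of squares.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    fitted = [a + b * xi for xi in x]
    total = sum((yi - my) ** 2 for yi in y)
    explained = sum((fi - my) ** 2 for fi in fitted)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return total, explained, unexplained


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
total, explained, unexplained = variation_decomposition(x, y)
print(f"total={total:.3f} explained={explained:.3f} unexplained={unexplained:.3f}")
print(f"share explained = {explained / total:.3f}")
```

The identity total = explained + unexplained holds exactly for a least-squares fit with an intercept; with other estimation methods it need not.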
Bias and variability
Is there any difference between drug A and drug B?
[Figure: Outcome for Drug A and Drug B.]
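Bias and variability
Model: $Y_{ij} = \mu_i + \beta x_{ij} + \varepsilon_{ij}$
Drug A: $Y = \mu_A + \beta x$
Drug B: $Y = \mu_B + \beta x$
x = age
[Figure: Y against x with two parallel lines, one per drug, with intercepts $\mu_A$ and $\mu_B$.]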
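A model with one intercept per drug and a common slope for age can be fitted by least squares with a pooled within-group slope. The sketch below is an illustration under that assumption, with noise-free hypothetical data in which the drug B patients are older; the function name and all numbers are made up.

```python
import statistics


def fit_parallel_lines(groups):
    """Least-squares fit of Y_ij = mu_i + beta * x_ij + eps_ij.

    groups maps a group label to a list of (x, y) pairs.
    Returns (beta, {label: mu_i}).
    """
    sxy = sxx = 0.0
    means = {}
    for label, pairs in groups.items():
        mx = statistics.fmean([x for x, _ in pairs])
        my = statistics.fmean([y for _, y in pairs])
        means[label] = (mx, my)
        # Pool the within-group cross products and squares
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    beta = sxy / sxx
    mu = {label: my - beta * mx for label, (mx, my) in means.items()}
    return beta, mu


# Hypothetical (age, outcome) pairs: drug B patients are older
data = {
    "A": [(40, 18.0), (50, 20.0), (60, 22.0)],
    "B": [(60, 24.0), (70, 26.0), (80, 28.0)],
}
beta, mu = fit_parallel_lines(data)
raw_diff = (statistics.fmean([y for _, y in data["B"]])
            - statistics.fmean([y for _, y in data["A"]]))
adjusted_diff = mu["B"] - mu["A"]
print(f"beta={beta:.2f} raw difference={raw_diff:.1f} adjusted difference={adjusted_diff:.1f}")
```

In this constructed example most of the raw difference between the drugs comes from the age difference between the groups; the difference in intercepts is what remains after accounting for age.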
Confounding
[Diagram: Predictors of treatment influence the treatment allocation (A or B); predictors of outcome influence the outcome. Confounders are predictors of both treatment and outcome.]
Example
Smoking cigarettes is not so bad but watch out for cigars or pipes (at least in Canada)

Variable                   Non-smokers   Cigarette smokers   Cigar or pipe smokers
Mortality rate*            20.2          20.5                35.5
Average age                54.9          50.5                65.9
Adjusted mortality rate*   20.2          26.4                24.0

*) per 1000 person-years
Cochran, Biometrics 1968
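One way to see how an age difference alone can distort crude mortality rates is direct standardization: apply each group's age-specific rates to a common age distribution. The sketch below uses entirely hypothetical strata, rates and age distributions; it does not reproduce the data or the adjustment method of the cited paper.

```python
def weighted_rate(stratum_rates, weights):
    """Weighted average of age-specific rates (weights must sum to 1)."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(r * w for r, w in zip(stratum_rates, weights))


# Hypothetical age-specific mortality rates (per 1000 person-years) for
# three age strata (young, middle, old); identical in both groups.
rates = [5.0, 20.0, 60.0]

# Hypothetical age distributions: group 2 is older than group 1.
age_dist_group1 = [0.5, 0.3, 0.2]
age_dist_group2 = [0.2, 0.3, 0.5]

crude1 = weighted_rate(rates, age_dist_group1)
crude2 = weighted_rate(rates, age_dist_group2)

# Direct standardization: use the same (standard) age distribution for both.
standard = age_dist_group1
adjusted1 = weighted_rate(rates, standard)
adjusted2 = weighted_rate(rates, standard)

print(f"crude:    {crude1:.1f} vs {crude2:.1f}")
print(f"adjusted: {adjusted1:.1f} vs {adjusted2:.1f}")
```

Here the crude rates differ although the age-specific rates are the same, so the whole crude difference is due to age; after standardization the difference disappears.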
Confounding
The effects of two or more factors cannot be separated.
Example: Compare survival for surgery and drug
R (randomization): life-long treatment with drug, or surgery at time 0
Looks OK but:
• Surgery only if healthy enough
• Patients in the surgery arm may take drug
• Compliance in the drug arm may be poor
[Figure: Survival against time.]
Confounding
Can sometimes be handled in the design
Example: Different effects in males and females
Imbalance between genders affects result
Stratify by gender
[Diagram: Randomizing (R) all patients to A or B gives balance between genders on average. Randomizing (R) to A or B separately within each gender (M, F) always gives balance.]
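Stratified randomization can be sketched with permuted blocks within each stratum. This is one common way to do it, not necessarily the procedure behind the slide; the function name, the block of size four and the patient list are assumptions made for illustration.

```python
import random


def stratified_block_randomization(strata, block=("A", "A", "B", "B"), seed=0):
    """Assign treatments within each stratum using shuffled (permuted) blocks.

    strata is a list of stratum labels, one per patient in enrolment order.
    Returns a list of treatment labels in the same order.
    """
    rng = random.Random(seed)
    pending = {}      # stratum -> treatments left in the current block
    allocation = []
    for stratum in strata:
        if not pending.get(stratum):
            # Start a new block for this stratum and shuffle it
            new_block = list(block)
            rng.shuffle(new_block)
            pending[stratum] = new_block
        allocation.append(pending[stratum].pop())
    return allocation


# Hypothetical enrolment order by gender
patients = ["M", "F", "M", "M", "F", "M", "F", "F"]
alloc = stratified_block_randomization(patients)
for gender in ("M", "F"):
    arms = [a for g, a in zip(patients, alloc) if g == gender]
    print(gender, "A:", arms.count("A"), "B:", arms.count("B"))
```

Because every completed block contains equally many A and B, the arms are balanced within each gender whenever the number of patients in a stratum is a multiple of the block size.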
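Interaction
The effect of one variable on the outcome depends on the value of another variable.
Example: Interaction between two drugs
[Diagram: Crossover design. Patients are randomized (R) to A followed by B, or B followed by A, with a wash-out between the periods.]
A = AZD1234
B = AZD1234 + Clarithromycin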
Interaction
Example: Drug interaction
[Figure: Mean plasma concentration (µmol/L, linear scale) against time after dose (0 to 24) for AZD0865 alone and for the combination of clarithromycin and AZD0865.]
AUC AZD1234: 19.75 µmol*h/L
AUC AZD1234 + Clarithromycin: 36.62 µmol*h/L
Ratio: 0.55 [0.51, 0.61]
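The AUC is the area under the plasma concentration curve. A minimal sketch of how such an area could be computed with the linear trapezoidal rule follows; the sampling times and concentrations are hypothetical and are not read off the figure, and the ratio with its confidence interval on the slide presumably comes from a statistical model that is not reproduced here.

```python
def auc_trapezoid(times, concentrations):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2
    return area


# Hypothetical sampling times (h) and mean concentrations (µmol/L)
times = [0, 1, 2, 4, 8, 12, 24]
alone = [0.0, 2.0, 1.6, 1.2, 0.6, 0.3, 0.0]
combination = [0.0, 3.0, 2.8, 2.2, 1.2, 0.7, 0.1]

auc_alone = auc_trapezoid(times, alone)
auc_combination = auc_trapezoid(times, combination)
ratio = auc_alone / auc_combination
print(f"AUC alone={auc_alone:.2f}, AUC combination={auc_combination:.2f}, ratio={ratio:.2f}")
```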
Interaction
Example: Treatment by center interaction
[Figure: Treatment difference in diastolic blood pressure (mmHg, axis from −25 to 15) against ordered center number (0 to 30).]
Average treatment effect: −4.39 [−6.3, −2.4] mmHg
Treatment by center: p=0.01
What can be said about the treatment effect?
Descriptive and inferential statistics
The presentation of the results from a clinical trial can be split into three categories:
• Descriptive statistics
• Inferential statistics
• Explorative statistics
Descriptive and inferential statistics
Descriptive statistics aims to describe various aspects of the data obtained in the study.
• Listings.
• Summary statistics (Mean, Standard Deviation…).
• Graphics.
Descriptive and inferential statistics
Inferential statistics forms a basis for a conclusion regarding a prespecified objective addressing the underlying population.
Confirmatory analysis: Hypothesis → Results → Conclusion
Descriptive and inferential statistics
Explorative statistics aims to find interesting results that can be used to formulate new objectives/hypotheses for further investigation in future studies.
Explorative analysis: Results → Hypothesis → Conclusion?
Hypothesis testing, p-values and confidence intervals
Objectives
Variable
Design
Statistical Model
Null hypothesis
Estimate
p-value
Confidence interval
Results
Interpretation
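Hypothesis testing, p-values
Statistical model: Observations $X = (X_1, \ldots, X_n) \in \mathbb{R}^n$ from a class of distribution functions $\mathcal{P} = \{P_\theta : \theta \in \Theta\}$
Hypothesis test: Set up a null hypothesis $H_0: \theta \in \Theta_0$ and an alternative $H_1: \theta \in \Theta_1$
Reject $H_0$ if $X \in S_c \subset \mathbb{R}^n$ (rejection region), where $P_\theta(X \in S_c) \le \alpha$ for all $\theta \in \Theta_0$ (significance level $\alpha$)
p-value: The smallest significance level for which the null hypothesis can be rejected.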
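The definition of the p-value as the smallest significance level at which the null hypothesis can be rejected can be checked numerically. The sketch below assumes a two-sided z-test with a known standard error, which is a simplification chosen so that only the standard library is needed; the estimate and standard error are hypothetical.

```python
from statistics import NormalDist


def two_sided_z_test(estimate, standard_error, null_value=0.0):
    """Return the z statistic and the two-sided p-value."""
    z = (estimate - null_value) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def rejects(z, alpha):
    """True if z falls in the two-sided rejection region |z| > c at level alpha."""
    c = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > c


# Hypothetical estimate and standard error (mmHg)
z, p = two_sided_z_test(estimate=-2.1, standard_error=0.75)
print(f"z={z:.2f}, p={p:.4f}")
print("rejected just above p:", rejects(z, p + 1e-6))
print("rejected just below p:", rejects(z, p - 1e-6))
```

For any significance level above the p-value the test rejects, and for any level below it the test does not, which is exactly the "smallest significance level" reading.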
Confidence intervals
A confidence set $C(X)$ is a random subset of $\Theta$ covering the true parameter value with probability at least $1 - \alpha$.
Let $\phi(X, \theta^*) = 1$ if $H_0^*: \theta = \theta^*$ is rejected, and $\phi(X, \theta^*) = 0$ if $H_0^*: \theta = \theta^*$ is not rejected (critical function)
Confidence set: $C(X) = \{\theta : \phi(X, \theta) = 0\}$
The set of parameter values corresponding to hypotheses that cannot be rejected.
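The construction of a confidence set from the hypotheses that cannot be rejected can be sketched by testing a grid of candidate parameter values and keeping those that pass. The sketch assumes a two-sided z-test; the estimate, the standard error and the grid are hypothetical choices for illustration.

```python
from statistics import NormalDist


def confidence_set_by_inversion(estimate, standard_error, alpha, grid):
    """Keep every candidate theta0 for which H0: theta = theta0 is not rejected."""
    c = NormalDist().inv_cdf(1 - alpha / 2)
    return [t for t in grid if abs((estimate - t) / standard_error) <= c]


# Hypothetical estimate and standard error (mmHg)
estimate, standard_error, alpha = -3.7, 0.46, 0.05
grid = [i / 100 for i in range(-1000, 1)]  # candidates from -10.00 to 0.00

accepted = confidence_set_by_inversion(estimate, standard_error, alpha, grid)
lower, upper = min(accepted), max(accepted)
print(f"95% confidence set on the grid: [{lower:.2f}, {upper:.2f}]")

# Closed-form interval for comparison
c = NormalDist().inv_cdf(1 - alpha / 2)
print(f"closed form: [{estimate - c * standard_error:.2f}, {estimate + c * standard_error:.2f}]")
```

On a fine enough grid the accepted values agree with the familiar interval estimate ± c × standard error, up to the grid resolution.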
Example
Objective: To compare the sitting diastolic blood pressure (DBP) lowering effect of hypersartan 16 mg with that of hypersartan 8 mg
Variable: The change from baseline to end of study in sitting DBP (sitting SBP) will be described with an ANCOVA model, with treatment as a factor and baseline blood pressure as a covariate
Model: $y_{ij} = \mu + \tau_i + \beta (x_{ij} - \bar{x}_{\cdot\cdot}) + \varepsilon_{ij}$
where $\tau_i$ is the treatment effect, $i = 1, 2, 3$ {16 mg, 8 mg, 4 mg}
Parameter space: $\Theta = \mathbb{R}^4$, with $\tau_1 + \tau_2 + \tau_3 = 0$
Null hypotheses (subsets of $\Theta$):
$H_{01}: \tau_1 = \tau_2$ (DBP)
$H_{02}: \tau_1 = \tau_2$ (SBP)
$H_{03}: \tau_2 = \tau_3$ (DBP)
$H_{04}: \tau_2 = \tau_3$ (SBP)
Example continued

Hypothesis         Variable      LS Mean     CI (95%)       p-value
1: 16 mg vs 8 mg   Sitting DBP   −3.7 mmHg   [−4.6, −2.8]   <0.001
2: 16 mg vs 8 mg   Sitting SBP   −7.6 mmHg   [−9.2, −6.1]   <0.001
3: 8 mg vs 4 mg    Sitting DBP   −0.9 mmHg   [−1.8, 0.0]    0.055
4: 8 mg vs 4 mg    Sitting SBP   −2.1 mmHg   [−3.6, −0.6]   0.005
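This is a t-test where the test statistic follows a t-distribution
Rejection region: $\{X : |T(X)| > c\}$
p-value: The null hypothesis can be rejected at $\alpha = 0.001$
[Figure: Distribution of $T(X)$ with the critical values $-c$ and $c$; the confidence interval from −4.6 to −2.8 shown relative to 0.]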
The p-value says nothing about the size of the effect!
Example: Simulated data. The difference between treatment and placebo is 0.3 mmHg

No. of patients per group   Estimation of effect   p-value
10                          1.94 mmHg              0.376
100                         −0.65 mmHg             0.378
1000                        0.33 mmHg              0.129
10000                       0.28 mmHg              <0.0001
100000                      0.30 mmHg              <0.0001

A statistically significant difference does NOT need to be clinically relevant!
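A simulation of this kind can be sketched as follows. It will not reproduce the numbers on the slide: the standard deviation of 5 mmHg, the random seed and the use of a z-test with known standard deviation are all assumptions made for illustration. Only the true difference of 0.3 mmHg is taken from the example.

```python
import random
from statistics import NormalDist, fmean


def simulate_trial(n_per_group, true_diff, sd, rng):
    """Simulate one two-arm trial; return (estimated difference, two-sided p-value)."""
    treated = [rng.gauss(true_diff, sd) for _ in range(n_per_group)]
    placebo = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
    diff = fmean(treated) - fmean(placebo)
    se = sd * (2 / n_per_group) ** 0.5          # known-SD standard error
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, p_value


rng = random.Random(2024)
TRUE_DIFF = 0.3  # mmHg, as in the example
SD = 5.0         # hypothetical standard deviation (mmHg)

results = {}
for n in (10, 100, 1000, 10000, 100000):
    results[n] = simulate_trial(n, TRUE_DIFF, SD, rng)
    print(f"n={n:6d}  estimate={results[n][0]:+.2f} mmHg  p={results[n][1]:.4f}")
```

With a large enough sample the same small true difference yields a very small p-value, even though the estimated effect stays around 0.3 mmHg.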
Statistical and clinical significance
Statistical significance: Is there any difference between the evaluated treatments?
Clinical significance: Does this difference have any meaning for the patients?
Health economic relevance: Is there any economic benefit for society in using the new treatment?
Statistical and clinical significance
A study comparing gastroprazole 40 mg and mygloprazole 30 mg with respect to healing of erosive esophagitis after 8 weeks of treatment.

Drug                  Healing rate
gastroprazole 40 mg   87.6%
mygloprazole 30 mg    84.2%

Cochran-Mantel-Haenszel p-value = 0.0007
Statistically significant!
Clinically significant?
Health economically relevant?
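To judge clinical relevance it helps to look at the size of the difference rather than the p-value. The short calculation below uses only the two healing rates from the table; expressing the difference as a number needed to treat is an addition for illustration and is not part of the original example.

```python
healing_gastroprazole = 0.876  # healing rate, gastroprazole 40 mg
healing_mygloprazole = 0.842   # healing rate, mygloprazole 30 mg

# Absolute difference in healing rate
risk_difference = healing_gastroprazole - healing_mygloprazole

# Number of patients to treat with gastroprazole instead of mygloprazole
# for one additional healed patient
number_needed_to_treat = 1 / risk_difference

print(f"difference in healing rate: {100 * risk_difference:.1f} percentage points")
print(f"number needed to treat: about {number_needed_to_treat:.0f}")
```

Whether a difference of this size matters to patients or to society is a clinical and health economic judgement that the p-value cannot settle.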