Uploaded by frizzflower (category: Urban Planning Works), 29 Nov 2013

Module Eight: Comparative Study for Inter-laboratory Testing

When inter-laboratory testing is conducted, the analysis of the testing results may include:

- Determine the best estimate of the variable of interest and its corresponding uncertainty: $x_{\text{Best}} \pm U$

- Make an interval estimate of the variable of interest based on the corresponding distribution (a confidence interval): $x_{\text{Best}} \pm kU$

- Conduct a comparative study:

1. Comparing with the reference standard.

2. Comparing the effects between two groups, when two samples are tested independently. For example, two testing methods are to be compared. Twenty units of similar material are randomly assigned to the two methods, 10 units per method. The purpose is to compare the difference between these two testing methods.

3. Comparing the changes of a response before and after (or with/without) a treatment. For example, suppose the toxicity of a chemical compound is tested with and without an additive in ten labs. Each compound sample is divided into two sub-samples. Each lab tests the pair of sub-samples, one with the additive and the other without. The difference between the two results from a lab is due to the additive. Note that in this comparative study the two sub-samples in each pair are the same or very similar. This is a paired sample problem.

4. Comparing the effects among several groups, when a treatment has more than two levels. This type of comparative study is common in inter-laboratory testing. For example, one is interested in studying the compressive strength of concrete made with five different formulas. Ten specimens are produced using each formula, and their compressive strengths are tested. This is a one-factor experiment with five factor levels. Our interest is to compare the strengths and to determine which formula gives the highest strength. If the only difference among the formulas is the dosage of an additive (1%, 1.5%, 2%, 2.5% and 3%), then, in addition to comparing the strength among the formulas, we can also fit a prediction model to determine the dosage level that results in the maximum strength.

5. In many experiments, there may be more than one factor. The study aims not only to understand the effect of each factor, but also to study the interaction effect between the factors. This is a multifactor study. For example, in the concrete compressive strength study, in addition to the five formulas, the temperature during concrete formation is another critical factor. We should consider both the formula factor and the temperature factor when producing the specimens for the strength test. Suppose we would like to test three levels of temperature. We then have a 5x3 two-factor factorial design. For each treatment combination, four specimens are produced, giving a total of 3x5x4 = 60 specimens for strength testing. We are interested in comparing the strength among the formulas, among the temperatures, and among the formulas at each temperature level.

6. Another type of study in lab testing examines the variance components of factors, for the purpose of identifying factor levels that reduce the variability of the response variable. For example, in a metal alloy casting process, each casting is broken into small bars that are used for other applications. The tensile strength of the alloy is critical to its intended use, and there is a specification for the strength. If the variation of the strength is excessively large, a large number of bars will not meet the specification limits. An experiment can be designed to identify factors and level combinations that produce bars with small variability. This is a variance component problem.
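The 5x3 factorial layout described in item 5 can be enumerated programmatically. The following is a minimal sketch only: the formula and temperature labels and the random seed are hypothetical placeholders, and only the counts (5 formulas, 3 temperatures, 4 specimens per combination) come from the text. Shuffling the run order is a common practice that is assumed here, not something the text prescribes.

```python
import itertools
import random

# Hypothetical labels; only the counts (5, 3, 4) come from the text.
formulas = ["F1", "F2", "F3", "F4", "F5"]
temperatures = ["T1", "T2", "T3"]
replicates = 4

# One run per (formula, temperature, replicate) combination.
runs = [
    (f, t, r)
    for f, t in itertools.product(formulas, temperatures)
    for r in range(1, replicates + 1)
]

# Randomize the run order (assumed practice; fixed seed for repeatability).
rng = random.Random(2013)
rng.shuffle(runs)

print(len(runs))   # total number of specimens: 5 x 3 x 4
print(runs[:3])    # first three runs in the randomized order
```

Each tuple identifies one specimen to be produced and tested, so the list can serve directly as a run sheet.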


In this module, we will discuss comparative studies 1, 2 and 3. In Module Ten, we will discuss comparative study 4, the one-factor design and analysis. In Module Eleven, we will focus on comparative study 5, multifactor designs and analysis. Module Twelve will study variance component problems.

Comparative Study One: Comparing testing results with a
given reference or a given standard

In a lab testing study, one may be interested in comparing the testing results with a given standard or a reference measurement. The following steps may be applied to plan such a study:

1. Identify the given standard or reference measurement, and make sure the source that developed the standard meets your purpose.

2. Set up an adequate lab testing environment and testing procedure.

3. The operator of the testing should be adequately trained to reduce unexpected errors.

4. Plan the experimental procedure and determine the number of experimental runs to be conducted.

5. Prepare the needed experimental units, and make sure these units are as homogeneous as possible.

6. Conduct the lab testing and carefully collect the data of interest. It is good practice to record any special events that occur during the testing.

Now, a data set is collected, and we would like to make a
comparison with a given reference. Steps for this analysis
may include:

1. Carefully check the data for unusual measurements that may be due to systematic error or special causes. Techniques for detecting outliers can be applied here.

2. Compute descriptive summaries, and graph a histogram, a box plot for identifying outliers, and a normal probability plot for checking the normality assumption.

3. If there is a serious violation of the normality assumption, one may choose to transform the data. If there are outliers, one should go back to check for possible special causes, and decide whether to keep or drop these outliers before the analysis.

4. The comparison is a one-sample test. The procedure for conducting the comparison follows.

One-sample t-test for comparing the testing results with a given reference.

Example: The brightness of a certain type of paper is measured on a scale of 1 to 100. The reference brightness for this type of paper is 60. A lab is experimenting with a new process for producing this type of paper, and would like to test its brightness to see if the paper meets the required brightness. A random sample of 30 sheets is chosen and tested by a lab. Here are the collected data:

55  42  59  64  59  68  60  52  56  59
55  62  59  57  63  58  52  55  58  61
65  63  52  58  62  54  58  59  64  63

A quick visual check immediately identifies a value of 42, which is much smaller than the rest.

We first draw a box plot and a normal probability plot to identify outliers and to check the normality assumption.
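The visual check can be backed by a simple numerical screen. The sketch below applies the common 1.5 x IQR fence rule used by many box plots; that rule and the "inclusive" quartile method are assumptions made for illustration, since the text does not say which outlier technique was used.

```python
import statistics

# Brightness readings for the 30 sheets, as listed in the example.
brightness = [55, 42, 59, 64, 59, 68, 60, 52, 56, 59,
              55, 62, 59, 57, 63, 58, 52, 55, 58, 61,
              65, 63, 52, 58, 62, 54, 58, 59, 64, 63]

# Quartiles by linear interpolation between order statistics.
q1, _, q3 = statistics.quantiles(brightness, n=4, method="inclusive")
iqr = q3 - q1

# Assumed rule: flag anything beyond 1.5 x IQR from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in brightness if v < lower_fence or v > upper_fence]

print(f"Q1 = {q1}, Q3 = {q3}, fences = ({lower_fence}, {upper_fence})")
print("Flagged values:", outliers)
```

A flagged value is only a candidate for removal; the lab records still have to be checked for a special cause before it is dropped.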

[Figure: Boxplot of brightness using the entire data set (axis: brightness, 40 to 70).]

[Figure: Boxplot of brightness_1, with the outlier '42' deleted (axis: brightness_1, 50 to 70).]

[Figure: Normality test for the brightness, excluding the outlier. Normal probability plot of brightness_1 (axis: 52 to 67). Average: 58.9655, StDev: 4.11862, N: 29. Anderson-Darling Normality Test: A-Squared 0.302, P-Value 0.554.]
Reviewing the records from the lab testing, it was noticed that the reading of 42 was due to a special cause: wrong timing in the testing process. It is therefore removed from further analysis.

The normality test indicates that the remaining data follow the normal curve very well.
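For readers without Minitab, the Anderson-Darling statistic can be computed by hand. The sketch below uses the textbook formula with the mean and standard deviation estimated from the sample; it is an illustration under that assumption, and its result may differ slightly from a package's value because of rounding or small-sample adjustments.

```python
import math
import statistics

# Brightness data with the reading of 42 removed (29 sheets).
brightness_1 = [55, 59, 64, 59, 68, 60, 52, 56, 59,
                55, 62, 59, 57, 63, 58, 52, 55, 58, 61,
                65, 63, 52, 58, 62, 54, 58, 59, 64, 63]

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def anderson_darling(data):
    """A-squared statistic against a normal with estimated mean and s.d."""
    x = sorted(data)
    n = len(x)
    m = statistics.mean(x)
    s = statistics.stdev(x)
    f = [normal_cdf((v - m) / s) for v in x]
    total = sum(
        (2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n

a2 = anderson_darling(brightness_1)
print(f"n = {len(brightness_1)}, mean = {statistics.mean(brightness_1):.4f}")
print(f"A-squared = {a2:.3f}")
```

A small A-squared indicates close agreement with the normal curve; converting it to a p-value requires tables or software.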


The Concept and Procedure for Performing the One-sample t-test

When we conduct a hypothesis test for comparing with a given reference, there are usually two choices: one is the hypothesis we intend to establish in our study, the other is the opposite. To make the testing procedure easier, we define these two hypotheses as $H_0$ and $H_a$. $H_a$ is the one we intend to establish. For this paper brightness test, $H_a$ is that the actual average brightness of the paper is significantly different from the given reference.

Typical notation for the hypotheses is:

$H_0: \mu = \mu_0$, $H_a: \mu \neq \mu_0$

For the paper brightness study, we have:

$H_0: \mu = 60$, $H_a: \mu \neq 60$

Q: When and how do we decide to take $H_0$ or $H_a$?

As we see, if the average of the sample data is either much larger or much smaller than 60, we will choose $H_a$; otherwise, we choose $H_0$.

Q: But how far is far enough to make such a conclusion?

If the sample average is, say, 59.5 or 60.4, we would not consider it far enough from 60 to conclude $H_a$. Therefore, we need two critical average brightness values, $\bar{x}_1$ and $\bar{x}_2$, so that when the sample average obtained from the sample data is beyond these two values, we will conclude $H_a$, that is, the brightness of the paper is significantly different from the reference brightness, 60.

Q: How do we determine the two critical values?

This can be answered by bringing in the distribution of $\bar{X}$. The following distribution is the distribution of $\bar{X}$ under $H_0$.

[Figure: sampling distribution of $\bar{X}$ under $H_0$, centered at 60. $H_0$ is accepted between the critical values $\bar{x}_1$ and $\bar{x}_2$ and rejected in each tail, with tail area $\alpha/2 = .025$. On the standardized scale, $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$, the critical values are $-t_{(\alpha/2,\, n-1)}$ and $t_{(\alpha/2,\, n-1)}$.]

Our common experience suggests that the probability of rejecting $H_0$ when it is true should be small, so that only when the sample average is far away from 60 will we conclude $H_a$. Therefore, a typical probability for rejecting $H_0$ is 5% or 1%.

The standardized form of $\bar{X}$ is used for making a proper comparison; it follows the t-distribution.

Procedure for conducting the one-sample t-test:

1. Set up $H_0$ and $H_a$.

2. Determine the rejection and acceptance regions for $H_0$ from the t-distribution, based on the type of hypothesis.

3. From the sample data, compute the t-value from the sample average:

$t_{\text{observed}} = \dfrac{\bar{x}_{\text{observed}} - \mu_0}{s/\sqrt{n}}$

4. Compare $t_{\text{observed}}$ with the critical t-values, $-t_{(\alpha/2,\, n-1)}$ and $t_{(\alpha/2,\, n-1)}$, from the t-table to determine whether $t_{\text{observed}}$ falls in the acceptance region or in the rejection region.
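The four steps can be traced on the paper brightness data (with the reading of 42 removed). This is a sketch, not output quoted from the slides: the significance level of 5% and the critical value 2.048, taken from a standard t-table for 28 degrees of freedom, are assumptions made here, and the resulting decision is computed by this example.

```python
import math
import statistics

# Brightness data with the outlier removed (29 sheets).
brightness_1 = [55, 59, 64, 59, 68, 60, 52, 56, 59,
                55, 62, 59, 57, 63, 58, 52, 55, 58, 61,
                65, 63, 52, 58, 62, 54, 58, 59, 64, 63]

mu0 = 60                        # reference brightness (step 1: H0: mu = 60)
n = len(brightness_1)
x_bar = statistics.mean(brightness_1)
s = statistics.stdev(brightness_1)

# Step 3: standardize the sample average.
se = s / math.sqrt(n)
t_observed = (x_bar - mu0) / se

# Steps 2 and 4: two-sided rule at alpha = 5% (assumed), df = n - 1 = 28.
t_critical = 2.048              # t(.025, 28) from a standard t-table
reject_h0 = abs(t_observed) > t_critical

print(f"x_bar = {x_bar:.4f}, s = {s:.4f}, SE = {se:.4f}")
print(f"t_observed = {t_observed:.3f}, reject H0: {reject_h0}")
```

Because the observed t-value lies between the two critical values, these data would not support the claim that the mean brightness differs from 60 at the 5% level.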



NOTE: Computer output gives both $t_{\text{observed}}$ and the observed level of significance, namely the p-value.

The p-value for this two-sided test is $2P(t > |t_{\text{observed}}|)$.

The decision rule based on the p-value is:

If p-value < $\alpha$, we reject $H_0$, that is, we decide to take $H_a$.

If p-value $\geq \alpha$, we conclude $H_0$ (that is, we do not reject it).
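When no statistical package is at hand, the two-sided p-value can be approximated by integrating the t density numerically. The sketch below uses Simpson's rule with the standard library only; the function names and the number of integration steps are choices made for this illustration. The two checks use values that appear in this module: the critical value 2.145 for 14 degrees of freedom, and the t-value of -4.13 from the paired-sample Minitab output.

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_obs, df, steps=2000):
    """Approximate 2 * P(T > |t_obs|) with Simpson's rule (steps must be even)."""
    a = abs(t_obs)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3    # area between -|t_obs| and |t_obs|
    return 1 - central_area

alpha = 0.05
for t_obs in (2.145, -4.13):
    p = two_sided_p(t_obs, df=14)
    print(f"t = {t_obs}: p-value = {p:.4f}, reject H0: {p < alpha}")
```

The first value sits right at the 5% boundary, as expected for a critical value, while the second gives a p-value of about 0.001.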



Right-side and Left-side tests

$H_a$ is the hypothesis we intend to establish. Therefore, in applications, other than two-side tests, there are two common hypotheses:

Right-side test: $H_0: \mu = \mu_0$, $H_a: \mu > \mu_0$

Left-side test: $H_0: \mu = \mu_0$, $H_a: \mu < \mu_0$

How do we choose the test for our need?

If our intention is to find out whether the mean is much larger than the reference value, the right-side test should be applied. For example, suppose the reference brightness of paper, 60, is the minimum required. Our goal is then to decide whether the new process produces significantly brighter paper.

If our intention is to find out whether the mean is much lower than the reference value, the left-side test should be applied. For example, suppose the reference brightness of paper, 60, is the maximum allowed. Our goal is then to decide whether the new process produces significantly less bright paper.

If our intention is to find out whether the mean is different from the reference value in either direction, the two-side test should be applied. For example, suppose the reference brightness of paper, 60, is the given standard. Our goal is then to decide whether the new process produces paper of significantly different brightness:

$H_0: \mu = 60$, $H_a: \mu \neq 60$

Hands-on Activity: Comparative Study with a Given Reference

In testing the tensile strength of a new type of concrete, the goal is to make sure that the tensile strength meets the minimum of 300 psi. A lab is assigned to test this new concrete. 20 samples are tested. The tensile strengths (psi) are:

320  305  293  295  313  306  298  325  304  316
307  308  307  305  319  294  295  295  300  312

Perform an appropriate test to determine if the new type of concrete meets the minimum tensile strength of 300 psi.

Comparative Study for Inter-laboratory Testing: two-group cases

Using the example of paper brightness, there are many situations in which the testing may involve two treatment groups. Here are some possible situations:

1. When a chemical component is changed, the brightness could change dramatically. A comparative study can be planned to compare the effect of two different levels of this chemical component.

2. When papers are tested by two different labs, there may be between-lab differences. Such differences should be controlled to minimize the systematic error of a given lab when testing the same material using the same testing procedure.

3. When papers are tested using two different testing procedures, it is important to identify the difference between these two procedures.

A comparative two-group study may compare two types of material, two different treatments, two testing procedures, or two labs. We now discuss a method for making the two-group comparison. As with the comparison between a given reference and sample data, it is important to keep in mind that we need to conduct outlier analysis and distribution checking.

The issue of designing experiments for a two-sample comparative study

Consider the example of comparing the reaction of a chemical component in lab testing.

Treatment: Two levels of chemical component.

We will discuss two types of experimental design:

1. Design A, paired sample design: Each specimen is split into two sub-samples. The Level A component is added to one sub-sample and the Level B component to the other. Test n = 15 pairs; each pair is tested together. The units assigned to the two treatments each time are very similar, since they are from the same specimen.

2. Design B, independent sample design: The Level A component is added to 15 units and the Level B component to another 15 units (test n = 15 units for each level). Each treatment is assigned to 15 units, which are independent of the units for the other treatment.

NOTE: A paired-sample comparison is usually referred to as a Before/After Treatment or Pre/Post Treatment experiment. The variable of interest is observed before and after a treatment. This type of design occurs often in testing the effect of a treatment over time. For example, one may be interested in studying the chemical residue 5 days and 10 days after the chemical is sprayed on a certain vegetable.

[Figure: time line of a paired-sample design. Treatment: spray the chemical on n randomly chosen subjects. Test the residue five days after from the subjects. Test the residue ten days after from the same subjects.]

[Figure: time line of a before/after design. Before diet treatment: observe weight, BMI, age, gender, etc., from each subject. Treatment is given, e.g., a diet treatment for three months. Three months after: observe weight, BMI, etc., from the same subjects.]

Hands-on Activity

For the same study, one can design a two-independent-sample study as well. Design a two-independent-sample study for studying the chemical residue, and discuss the advantages and disadvantages of paired-sample vs. independent-sample designs.

The difference between Experiment A and Experiment B is:

Samples obtained from Experiment A can be considered as 15 pairs, with each pair taken from the same specimen. The possible sources of error are the same for the two samples, except for the level of the component. The experimental units are similar.

Samples obtained from Experiment B are two independent samples. Each is obtained from a process that is independent of the other. Possible sources of error include not only the levels of the component but also the differences between the processes. Therefore, the paper units for testing the brightness may have higher variation.

The analyses of the data resulting from these two experiments are different.

Experiment A is a paired sample problem, while Experiment B is an independent sample problem.

Hands-on Activity

From the projects you have conducted, identify one paired-sample project and one independent-sample project.

Analysis of Paired Sample Problem

Consider the experiment for testing the chemical residue.

Experiment:

15 pots of a certain vegetable are used as the experiment units. The residue is
measured and recorded five days and ten days after the spray.

X: the residue five days after the chemical treatment.

Y: the residue ten days after the chemical treatment.

Testing Procedure: Each residue is the average of the residues of two specimens taken from the same pot, for the purpose of reducing random error.

Pot         1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
x          59  61  64  62  59  63  58  59  64  65  64  60  67  65  63
y          54  52  59  60  61  60  56  61  58  59  62  61  61  58  57
d = y - x  -5  -9  -5  -2   2  -3  -2   2  -6  -6  -2   1  -6  -7  -6

For each pot, the residues are observed five days and ten days after spraying. Hence the difference d = Y - X is the change in residue over the five-day period (a negative d is a reduction). To determine whether the reduction of residue is statistically significant, we can perform a one-sample test based on the difference, d. The hypotheses are:

$H_0: \mu_d = 0$, $H_a: \mu_d \neq 0$

Recall: To perform a one-sample t-test, we need $\bar{d}$, $s_d$ and $SE_{\bar{d}}$.

The following is the output from Minitab:

Paired T for 10 days - 5 days

                  N    Mean   StDev  SE Mean
10 days (y)      15  58.600   2.849    0.735
5 days (x)       15  62.200   2.731    0.705
Difference (d)   15  -3.600   3.376    0.872

95% CI for mean difference: (-5.470, -1.730)

T-Test of mean difference = 0 (vs not = 0): T-Value = -4.13  P-Value = 0.001
[Figure: Boxplot of differences of residues between 10 days and 5 days (with Ho and 95% t-confidence interval for the mean); axis: Differences, -10 to 0.]
Based on the p-value = .001 < 5%, we can conclude that the residue reduction is statistically significant at $\alpha$ = 5%. The average reduction is 3.6, based on data from 15 pots.

The 95% confidence interval is from -5.47 to -1.73. That is, we are 95% sure that the mean residue difference lies within

$\bar{d} \pm t_{(.025,\,14)}\, SE_{\bar{d}} = -3.6 \pm 2.145(.872) = -3.6 \pm 1.87$

Analysis of Two-independent Samples Problem

Consider the experiment for testing the chemical residue. We can design a two-independent-sample experiment for the residue study.

Experiment:

30 pots of a certain vegetable are used as the experimental units. 15 pots are randomly chosen for the 5-day residue testing. The other 15 are for the 10-day residue testing.

X: the residue five days after the chemical treatment from 15 randomly selected pots.

Y: the residue ten days after the chemical treatment from the other 15 pots.

Testing Procedure: Each residue is the average of the residues of two specimens taken from the same pot, for the purpose of reducing random error.

NOTE: This design is appropriate if each pot can be used for only one residue test.
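The random assignment of pots to the two testing times can be done with a random number generator. The following sketch is one possible way to do it; the pot numbering 1 to 30 and the seed are arbitrary choices for illustration, and the resulting split is not the one used in the study.

```python
import random

pots = list(range(1, 31))          # pot labels 1..30 (assumed numbering)

rng = random.Random(42)            # arbitrary seed so the split is repeatable
five_day = sorted(rng.sample(pots, 15))                  # tested at 5 days
ten_day = sorted(p for p in pots if p not in five_day)   # tested at 10 days

print("5-day group: ", five_day)
print("10-day group:", ten_day)
```

Every pot has the same chance of landing in either group, which is what justifies treating the two samples as independent.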

Pot   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
x    59  61  64  62  59  63  58  59  64  65  64  60  67  65  63

Pot  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
y    54  52  59  60  61  60  56  61  58  59  62  61  61  58  57

For each pot, the residue can be measured only once, either five days or ten days after spraying. The assignment of pots to residue testing is random, and thus the two samples are considered independent. The difference Y - X no longer reflects only the residue reduction, but also includes the pot-to-pot differences.

The residue after 5 days is a population with mean $\mu_1$ and variance $\sigma_1^2$. Similarly, the residue after 10 days is a different population with mean $\mu_2$ and variance $\sigma_2^2$.

Our purpose is to determine whether $\mu_2$ is statistically lower than $\mu_1$. This is a left-side test:

$H_0: \mu_2 - \mu_1 = 0$, $H_a: \mu_2 - \mu_1 < 0$

$H_a$ is concluded if the corresponding sample mean difference, $\bar{y} - \bar{x}$, is indeed much lower than zero. How much less than zero is considered significant?

Similar to the one-sample problem, we need to determine the distribution of $\bar{Y} - \bar{X}$, or equivalently, the distribution of the standardized form, $(\bar{y} - \bar{x})/SE_{\bar{Y}-\bar{X}}$.

NOTE: Most statistical hypothesis testing or estimation problems require the distribution of the best estimate of the variable of interest. This is usually accomplished by finding the distribution of the standardized best estimate. This is true for any test that involves the t-distribution, the chi-square distribution, the F-distribution, and so on.


What is the distribution of $(\bar{y} - \bar{x})/SE_{\bar{Y}-\bar{X}}$? How do we determine $SE_{\bar{Y}-\bar{X}}$?

Based on statistical theory, the t-distribution holds when the samples are randomly chosen from each population. The quantity $SE_{\bar{Y}-\bar{X}}$ is the uncertainty of the mean difference. The way to determine $SE_{\bar{Y}-\bar{X}}$ depends on the sample sizes and on whether the variances of the two populations are homogeneous.

When the population variances are not equal, $SE_{\bar{Y}-\bar{X}}$ is given by the following, where subscript 1 refers to the X sample and subscript 2 to the Y sample:

$s^2_{(\bar{y}-\bar{x})} = \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}$, therefore $SE_{\bar{Y}-\bar{X}} = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$

However, the degrees of freedom for this uncertainty measurement is a weighted d.f. of $n_1$ and $n_2$:

$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
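The unequal-variance formulas can be evaluated with the summary statistics of the residue example (n = 15 per group, standard deviations 2.731 and 2.849, means 62.20 and 58.60, all taken from the Minitab output in this module). This is a sketch for checking the arithmetic; the variable names are choices made here.

```python
import math

# Summary statistics from the residue example.
n1, x_bar, s1 = 15, 62.20, 2.731   # sample 1: X, five days after
n2, y_bar, s2 = 15, 58.60, 2.849   # sample 2: Y, ten days after

v1 = s1 ** 2 / n1                  # variance of the mean, sample 1
v2 = s2 ** 2 / n2                  # variance of the mean, sample 2

se = math.sqrt(v1 + v2)            # SE of (y_bar - x_bar), unequal variances
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
t_obs = (y_bar - x_bar) / se

print(f"SE = {se:.3f}, weighted df = {df:.2f}, t = {t_obs:.2f}")
```

The weighted degrees of freedom come out just below 28; Minitab appears to report the integer part, 27, in its unequal-variance output.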


When the population variances can be assumed equal, that is, $\sigma_1^2 = \sigma_2^2 = \sigma^2$, we can combine the two samples to obtain a better estimate of the common measurement uncertainty for $\bar{y} - \bar{x}$:

1. Obtain the pooled estimate of the common variance, $\sigma^2$, by:

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

2. Compute the SE of $\bar{y} - \bar{x}$:

$SE_{\bar{Y}-\bar{X}} = s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

In this case the degrees of freedom are $df = n_1 + n_2 - 2$.

The $100(1-\alpha)\%$ confidence interval for $\mu_2 - \mu_1$ can be determined by:

$(\bar{y} - \bar{x}) \pm t_{(\alpha/2,\, df)}\, SE_{\bar{Y}-\bar{X}}$
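The pooled calculation can be checked with the summary statistics of the residue example (n = 15 per group, standard deviations 2.731 and 2.849, as reported in the Minitab output in this module). This is a sketch for checking the arithmetic, not a replacement for the software output.

```python
import math

# Summary statistics from the residue example.
n1, s1 = 15, 2.731     # sample 1: five days after
n2, s2 = 15, 2.849     # sample 2: ten days after

# Step 1: pooled estimate of the common variance.
sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
sp = math.sqrt(sp2)

# Step 2: SE of the difference in sample means.
se = sp * math.sqrt(1 / n1 + 1 / n2)
df = n1 + n2 - 2

print(f"s_p = {sp:.2f}, SE = {se:.3f}, df = {df}")
```

With equal sample sizes the pooled SE equals the unequal-variance SE, so the two approaches differ here only in their degrees of freedom.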

 

We apply the t-test to test whether the population mean $\mu_2$ is statistically different from (greater or less than) the population mean $\mu_1$:

Two-side test: $H_0: \mu_2 - \mu_1 = 0$, $H_a: \mu_2 - \mu_1 \neq 0$
Right-side test: $H_0: \mu_2 - \mu_1 = 0$, $H_a: \mu_2 - \mu_1 > 0$
Left-side test: $H_0: \mu_2 - \mu_1 = 0$, $H_a: \mu_2 - \mu_1 < 0$

1. Compute the t-value:

$t_{\text{obs}} = \dfrac{\bar{y} - \bar{x}}{SE_{\bar{Y}-\bar{X}}}$

2. Compare $t_{\text{obs}}$ with the critical t-value:

For the two-side test: if $t_{\text{obs}}$ falls outside of $-t_{(\alpha/2,\, df)}$ and $t_{(\alpha/2,\, df)}$, then reject $H_0$.
For the right-side test: if $t_{\text{obs}} > t_{(\alpha,\, df)}$, then reject $H_0$.
For the left-side test: if $t_{\text{obs}} < -t_{(\alpha,\, df)}$, then reject $H_0$.

Alternatively, when computer software is available, the p-value is used for decision making. The same rule applies when using the p-value, regardless of the type of test:

If p-value < $\alpha$, then reject $H_0$ and conclude $H_a$.
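The three rejection rules can be written as one small function. The sketch below is illustrative: the function name and the labels for the alternatives are made up here, the critical values 2.048 and 1.701 are $t_{(.025,\,28)}$ and $t_{(.05,\,28)}$ from a standard t-table, and the observed value -3.53 is the residue example's t-value from the Minitab output with its sign reversed, because here the difference is taken as $\bar{y} - \bar{x}$.

```python
def reject_h0(t_obs, t_half_alpha, t_alpha, alternative):
    """Apply the critical-value rule for the chosen alternative.

    t_half_alpha: t(alpha/2, df), used for the two-side test.
    t_alpha:      t(alpha, df), used for the one-side tests.
    """
    if alternative == "two-side":
        return abs(t_obs) > t_half_alpha
    if alternative == "right-side":
        return t_obs > t_alpha
    if alternative == "left-side":
        return t_obs < -t_alpha
    raise ValueError(f"unknown alternative: {alternative}")

# Residue example, equal-variance case: df = 28, alpha = 5%.
t_obs = -3.53
for alt in ("two-side", "right-side", "left-side"):
    print(alt, reject_h0(t_obs, 2.048, 1.701, alt))
```

Only the left-side and two-side rules reject $H_0$ here, which matches the direction of the observed reduction.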

Case Example: A chemical residue study

Purpose: To determine whether the chemical residue ten days after spraying is significantly reduced compared with five days after.

Experiment:

30 pots of a certain vegetable are used as the experimental units. 15 pots are randomly chosen for the 5-day residue testing. The other 15 are for the 10-day residue testing. X: the residue five days after the chemical treatment from 15 randomly selected pots. Y: the residue ten days after the chemical treatment from the other 15 pots.

Testing Procedure: Each residue is the average of the residues of two specimens taken from the same pot, for the purpose of reducing random error.

Pot   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
x    59  61  64  62  59  63  58  59  64  65  64  60  67  65  63

Pot  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30
y    54  52  59  60  61  60  56  61  58  59  62  61  61  58  57

Variable  Treatment   N   Mean  Median  StDev  SE Mean
Residue   5-days     15  62.20   63.00  2.731    0.705
          10-days    15  58.60   59.00  2.849    0.735

[Figure: Normal probability plot of residue after 5 days (axis: 5-day, 58 to 67). Average: 62.2, StDev: 2.73078, N: 15. Anderson-Darling Normality Test: A-Squared 0.413, P-Value 0.296.]

[Figure: Normal probability plot of residues after 10 days (axis: 10-day, 52 to 62). Average: 58.6, StDev: 2.84856, N: 15. Anderson-Darling Normality Test: A-Squared 0.594, P-Value 0.101.]

[Figure: Test for equal variances for residue. Panels: 95% confidence intervals for sigmas, and boxplots of raw data, by factor level (1, 2). F-Test: test statistic 0.919, P-Value 0.877. Levene's Test: test statistic 0.044, P-Value 0.835.]

Diagnosis of assumptions:

- Both samples follow the normal distribution.

- The variances are similar.

Two-Sample T-Test and CI: Residue, Treatment (without assuming equal variances)

Treatment   N   Mean  StDev  SE Mean
1          15  62.20   2.73     0.71
2          15  58.60   2.85     0.74

Difference = mu (1) - mu (2)
Estimate for difference: 3.60
95% CI for difference: (1.51, 5.69)
T-Test of difference = 0 (vs >): T-Value = 3.53  P-Value = 0.001  DF = 27

Note: DF = 27 is computed to adjust for the unequal variances.

Note: Minitab takes the difference as mu (1) - mu (2), so the left-side test on $\mu_2 - \mu_1$ appears in this output as a right-side test (vs >).

Two-Sample T-Test: Residue, Treatment (assuming equal variances)

Difference = mu (1) - mu (2)
Estimate for difference: 3.60
T-Test of difference = 0 (vs >): T-Value = 3.53  P-Value = 0.001  DF = 28
Both use Pooled StDev = 2.79

Note: $s_p$ is used as the common s.d.
[Figure: Box plots for the residues, two independent samples. Residue (52 to 67) by treatment (1, 2), with group means 62.2 and 58.6.]
Conclusion: The s.d.'s are similar. Levene's test of homogeneity of variances gives p-value = .835. We can use either t-test to test the hypothesis that the residue 10 days after spraying is significantly reduced from the residue 5 days after. The two t-tests (assuming and not assuming equal variances) give the same conclusion:

P-value < 5%; therefore, the reduction of residue from 5 days to 10 days after the chemical spray is statistically significant.

Hands-on Activity

Perform the two-independent-sample test manually, and compare with the computer output.