Anova statistics

Nugurusaichandan 1,360 views 25 slides Jun 25, 2020





ANALYSIS OF VARIANCE (ANOVA)


Contents

• Basic Purpose of ANOVA
• Statement of Cochran’s theorem
• ANOVA one-way classification
• Solved problems in ANOVA one-way classification
• ANOVA two-way classification
• Solved problems in ANOVA two-way classification
• List of Questions asked in Previous Years

Learning Objectives

After completing this chapter, the student will be able to
• Define ANOVA
• Describe the basic purpose of ANOVA
• Analyze problems based on one-way classified data
• Analyze problems based on two-way classified data
• Distinguish between ANOVA one-way and ANOVA two-way classifications.


Basic purpose of ANOVA

The analysis of variance (ANOVA) is a powerful statistical tool for tests of significance.
The test of significance based on the t-distribution is suitable only for testing the
significance of the difference between two sample means. When we have three or more
samples to consider at a time, an alternative procedure is needed for testing the
hypothesis that all the samples are drawn from the same population, that is, that they
have the same mean.

For example, suppose five fertilizers are each applied to four plots of wheat and the
yield of wheat on each plot is recorded. We may be interested in finding out whether the
effects of these fertilizers on the yields differ significantly or, in other words, whether
the samples have come from the same normal population. The answer to this problem is
provided by the analysis of variance.

Thus the basic purpose of the analysis of variance is “to test the homogeneity of
several means”.

The term “Analysis of Variance” was introduced by Prof. R.A. Fisher in the 1920s to
deal with problems in the analysis of agronomical (agricultural) data.

Variation is inherent in nature. The total variation in any set of numerical data is
due to a number of causes which may be classified as
1. Assignable causes and
2. Chance causes.

The variation due to assignable causes can be detected and measured, whereas the
variation due to chance causes is beyond human control and cannot be traced
separately.

Definition: According to Prof. R.A. Fisher, “Analysis of Variance is the separation of
variance due to one group of causes from the variance due to other groups”.

Assumptions:
For the validity of the F-test in ANOVA, the following assumptions are made.

1. The observations are independent.
2. The parent population from which the observations are taken is normal.
3. Various treatment and environmental effects are additive in nature.


COCHRAN’S THEOREM
Let $X_1, X_2, \ldots, X_n$ denote a random sample from a normal population $N(0, \sigma^2)$.
Let the sum of the squares of these values be written in the form
$$\sum_{i=1}^{n} X_i^2 = Q_1 + Q_2 + \cdots + Q_k,$$
where $Q_j$ is a quadratic form in $X_1, X_2, \ldots, X_n$ with rank (degrees of freedom) $r_j$,
$j = 1, 2, \ldots, k$. Then the random variables $Q_1, Q_2, \ldots, Q_k$ are mutually independent and
$Q_j / \sigma^2$ is a $\chi^2$-variate with $r_j$ degrees of freedom if and only if
$$\sum_{j=1}^{k} r_j = n.$$

ANOVA ONE-WAY CLASSIFICATION
Lay Out:
Let us suppose that N observations $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$) of a random
variable X are grouped, on some basis, into k classes of sizes $n_1, n_2, \ldots, n_k$
respectively ($N = \sum_{i=1}^{k} n_i$), as exhibited below.

Classes   Observations                                            Totals      Means
1         $x_{11}, x_{12}, \ldots, x_{1j}, \ldots, x_{1n_1}$      $T_{1.}$    $\bar{x}_{1.}$
2         $x_{21}, x_{22}, \ldots, x_{2j}, \ldots, x_{2n_2}$      $T_{2.}$    $\bar{x}_{2.}$
:         :                                                       :           :
i         $x_{i1}, x_{i2}, \ldots, x_{ij}, \ldots, x_{in_i}$      $T_{i.}$    $\bar{x}_{i.}$
:         :                                                       :           :
k         $x_{k1}, x_{k2}, \ldots, x_{kj}, \ldots, x_{kn_k}$      $T_{k.}$    $\bar{x}_{k.}$


The total variation in the observations $x_{ij}$ can be split into the following two
components.
i. The variation between the classes, or the variation due to different bases of
classification, commonly known as treatments.
ii. The variation within the classes, i.e. the variation due to chance causes.

The main object of the analysis of variance is to examine whether there is a significant
difference between the class means in view of the inherent variability within the separate
classes.

For example, let us consider the effect of k different rations on the milk yield of
N cows divided into k classes of sizes $n_1, n_2, \ldots, n_k$ respectively ($N = \sum_{i=1}^{k} n_i$).
Mathematical model:
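In this case the linear mathematical model will be
$$x_{ij} = \mu + t_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_i$$
where
$x_{ij}$ = the yield of the j-th unit from the i-th treatment,
$\mu$ = the general mean effect,
$t_i$ = the treatment effect $= \mu_i - \mu$, where $\mu_i$ is the mean of the i-th class,
$\varepsilon_{ij}$ = the error effects, which are independently and identically distributed
(i.i.d.) $N(0, \sigma_e^2)$.
Also we have
$$\sum_{i=1}^{k} t_i = \sum_{i=1}^{k} (\mu_i - \mu) = \sum_{i=1}^{k} \mu_i - k\mu = k\mu - k\mu = 0
\quad \Rightarrow \quad \sum_{i=1}^{k} t_i = 0.$$
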
Assumptions to the model:
1. All the observations $x_{ij}$ are independent.
2. The parent population must be normal.
3. The $\varepsilon_{ij}$’s are independently and identically distributed (i.i.d.) $N(0, \sigma_e^2)$.
Null hypothesis:
We want to test the equality of the population means, i.e. the homogeneity of the
different treatments. Hence the null hypothesis is given by
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
or
$$H_0: t_1 = t_2 = \cdots = t_k = 0,$$
and the alternative hypothesis is given by
$$H_1: \text{the means } \mu_1, \mu_2, \ldots, \mu_k \text{ are not all equal.}$$
Statistical analysis:
In this classification, the linear mathematical model is given by
$$x_{ij} = \mu + t_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, n_i \qquad (1)$$
$$\varepsilon_{ij} = x_{ij} - \mu - t_i \qquad (2)$$
According to the principle of least squares, we have
$$E = \sum_{i=1}^{k}\sum_{j=1}^{n_i} \varepsilon_{ij}^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \mu - t_i)^2 \ \text{is minimum.}$$
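According to the principle of maxima and minima, we have
$$\frac{\partial E}{\partial \mu} = 0 \quad \text{and} \quad \frac{\partial E}{\partial t_i} = 0.$$
Consider
$$\begin{aligned}
\frac{\partial E}{\partial \mu} &= 0 \\
\Rightarrow\ 2 \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \mu - t_i)(-1) &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \mu - t_i) &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} - \sum_{i=1}^{k}\sum_{j=1}^{n_i} \mu - \sum_{i=1}^{k}\sum_{j=1}^{n_i} t_i &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} - N\mu - \sum_{i=1}^{k} n_i t_i &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} - N\mu - 0 &= 0 \\
\Rightarrow\ N\mu &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} \\
\Rightarrow\ \hat{\mu} &= \frac{1}{N} \sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij} = \bar{x}_{..} \ \text{(say)}
\end{aligned}$$
Here the side condition on the treatment effects is taken in the weighted form
$\sum_i n_i t_i = 0$, which reduces to $\sum_i t_i = 0$ when all the class sizes are equal.
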
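Similarly,
$$\begin{aligned}
\frac{\partial E}{\partial t_i} &= 0 \\
\Rightarrow\ 2 \sum_{j=1}^{n_i} (x_{ij} - \mu - t_i)(-1) &= 0 \\
\Rightarrow\ \sum_{j=1}^{n_i} (x_{ij} - \mu - t_i) &= 0 \\
\Rightarrow\ \sum_{j=1}^{n_i} x_{ij} - n_i \mu - n_i t_i &= 0 \\
\Rightarrow\ n_i t_i &= n_i \bar{x}_{i.} - n_i \mu, \quad \text{where } \bar{x}_{i.} = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij} \\
\Rightarrow\ t_i &= \bar{x}_{i.} - \mu \\
\Rightarrow\ \hat{t}_i &= \bar{x}_{i.} - \bar{x}_{..}
\end{aligned}$$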

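From equation (2), we get
$$\hat{\varepsilon}_{ij} = x_{ij} - \hat{\mu} - \hat{t}_i = x_{ij} - \bar{x}_{..} - (\bar{x}_{i.} - \bar{x}_{..}) = x_{ij} - \bar{x}_{i.}$$
Now, from equation (1), we get
$$\begin{aligned}
x_{ij} &= \hat{\mu} + \hat{t}_i + \hat{\varepsilon}_{ij} \\
x_{ij} &= \bar{x}_{..} + (\bar{x}_{i.} - \bar{x}_{..}) + (x_{ij} - \bar{x}_{i.}) \\
x_{ij} - \bar{x}_{..} &= (\bar{x}_{i.} - \bar{x}_{..}) + (x_{ij} - \bar{x}_{i.})
\end{aligned}$$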
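Squaring both sides and summing over i and j, we get
$$\begin{aligned}
\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{..})^2
&= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (\bar{x}_{i.} - \bar{x}_{..})^2
 + \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.})^2
 + 2 \sum_{i=1}^{k} (\bar{x}_{i.} - \bar{x}_{..}) \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.}) \\
&= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (\bar{x}_{i.} - \bar{x}_{..})^2
 + \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.})^2
 \quad [\text{the algebraic sum of deviations about the mean is zero}]
\end{aligned}$$
i.e.
$$S_T^2 = S_{TR}^2 + S_E^2$$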
where
$S_T^2$ = Total Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{..})^2$
$S_{TR}^2$ = Treatment Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{n_i} (\bar{x}_{i.} - \bar{x}_{..})^2$
$S_E^2$ = Error Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.})^2$
Degrees of freedom:
The degrees of freedom for totals = N – 1
The degrees of freedom for treatments = k – 1
The degrees of freedom for errors = N – 1 – (k – 1)
= N – k
Mean Sum of Squares (M.S.S.):
The mean sum of squares is obtained by dividing the sum of squares by the
corresponding degrees of freedom.
$$\text{M.S.S. due to treatments} = \frac{S_{TR}^2}{k - 1} = s_{tr}^2 \ \text{(say)}$$
$$\text{M.S.S. due to errors} = \frac{S_E^2}{N - k} = s_e^2 \ \text{(say)}$$


ANOVA TABLE

Source of    Degrees of   Sum of        Mean sum      F-ratio
variation    freedom      squares       of squares
Treatments   $k - 1$      $S_{TR}^2$    $s_{tr}^2$    $F = s_{tr}^2 / s_e^2 \sim F[k - 1, N - k]$, $(s_{tr}^2 > s_e^2)$
Errors       $N - k$      $S_E^2$       $s_e^2$
Totals       $N - 1$

Conclusion: If the calculated value of F is less than the tabulated value of F at the α%
level of significance (LOS), then we accept our null hypothesis $H_0$; otherwise we reject $H_0$.
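The one-way computation described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `one_way_anova`, the dictionary keys and the example data are all hypothetical, and the sums of squares are computed directly from their deviation definitions.

```python
def one_way_anova(groups):
    """One-way ANOVA from the deviation definitions of the sums of squares.

    `groups` is a list of lists, one inner list of observations per class.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # S_TR^2: between-class variation, each class mean weighted by its size.
    ss_treat = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # S_E^2: within-class variation about each class mean.
    ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    # S_T^2: total variation about the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

    df_treat, df_error = k - 1, n_total - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_treat": ss_treat, "ss_error": ss_error,
        "df_treat": df_treat, "df_error": df_error,
        "ms_treat": ms_treat, "ms_error": ms_error,
        "F": ms_treat / ms_error,
    }


# Hypothetical data for three treatments with three observations each.
example = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(example["ss_total"], example["ss_treat"], example["ss_error"], example["F"])
```

The calculated F would still have to be compared with a tabulated value of F for (k - 1, N - k) degrees of freedom, which this sketch does not look up.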

PROBLEMS
1. A test was given to 5 students taken at random from the 5th class of 3 schools of a
town. The individual scores are
School I 9 7 6 5 8
School II 7 4 5 4 5
School III 6 5 6 7 6
Carry out the analysis of variance and state your conclusion.
Solution:
For the given data, our null hypothesis is
$H_0$: There is no significant difference between the three schools,
and the alternative hypothesis is
$H_1$: There is a significant difference between the three schools.

             Scores       Total ($T_{i.}$)   $\sum_j x_{ij}^2$
School I     9 7 6 5 8    35                 255
School II    7 4 5 4 5    25                 131
School III   6 5 6 7 6    30                 182
                          G = 90             $\sum_{ij} x_{ij}^2 = 568$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 568 - \frac{90^2}{15} = 568 - 540 = 28$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{n_i} - \frac{G^2}{N}
= \frac{35^2}{5} + \frac{25^2}{5} + \frac{30^2}{5} - \frac{90^2}{15}
= 245 + 125 + 180 - 540 = 550 - 540 = 10$$
$$S_E^2 = S_T^2 - S_{TR}^2 = 28 - 10 = 18$$

ANOVA TABLE

Source of    Degrees of     Sum of     Mean sum      F-ratio
variation    freedom        squares    of squares
Treatments   3 - 1 = 2      10         5             F = 5 / 1.5 = 3.33 ~ F(2, 12)
Errors       14 - 2 = 12    18         1.5
Totals       15 - 1 = 14

The tabulated value of F(2, 12) = 3.89 at 5% LOS.

Conclusion: Since the calculated value of F is less than the tabulated value of F at the 5%
level of significance, we accept $H_0$,
i.e., there is no significant difference between the 3 schools.

2. Three processors A, B and C are tested to see whether their outputs are
equivalent. The following observations of outputs are made.
A 10 12 13 11 10 14 15 13
B 9 11 10 12 12
C 11 10 15 14 12 13
Carry out the analysis of variance and state your conclusion.

Solution:

For the given data, our null hypothesis is given by
$H_0$: There is no significant difference between the three processors,
and the alternative hypothesis is given by
$H_1$: There is a significant difference between the three processors.

Processors   Observations               Total ($T_{i.}$)   $\sum_j x_{ij}^2$
A            10 12 13 11 10 14 15 13    98                 1224
B            9 11 10 12 12              54                 590
C            11 10 15 14 12 13          75                 955
                                        G = 227            $\sum_{ij} x_{ij}^2 = 2769$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 2769 - \frac{227^2}{19} = 2769 - 2712.0526 = 56.9474$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{n_i} - \frac{G^2}{N}
= \frac{98^2}{8} + \frac{54^2}{5} + \frac{75^2}{6} - \frac{227^2}{19}
= 1200.5 + 583.2 + 937.5 - 2712.0526 = 2721.2 - 2712.0526 = 9.1474$$
$$S_E^2 = S_T^2 - S_{TR}^2 = 56.9474 - 9.1474 = 47.8$$

ANOVA TABLE

Source of    Degrees of     Sum of     Mean sum      F-ratio
variation    freedom        squares    of squares
Treatments   3 - 1 = 2      9.1474     4.5737        F = 4.5737 / 2.9875 = 1.5309 ~ F(2, 16)
Errors       18 - 2 = 16    47.8       2.9875
Totals       19 - 1 = 18

The tabulated value of F(2,16) = 3.63 at 5% level of significance.

Conclusion: Since the calculated value of F is less than the tabulated value of F at the 5%
level of significance, we accept our null hypothesis $H_0$,
i.e., there is no significant difference between the three processors A, B and C.
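The solved problems use a shortcut form of the sums of squares based on the class totals $T_{i.}$ and the correction factor $G^2/N$, which also handles unequal class sizes. The sketch below reproduces the processor figures that way; the function name `one_way_anova_from_totals` is hypothetical and chosen for this illustration only.

```python
def one_way_anova_from_totals(groups):
    """One-way ANOVA via class totals and the correction factor G^2 / N."""
    n_total = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)        # G
    correction = grand_total ** 2 / n_total          # G^2 / N
    raw_ss = sum(x * x for g in groups for x in g)   # sum of x_ij^2

    ss_total = raw_ss - correction
    # Each class total is squared and divided by its own class size n_i.
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_treat

    df_treat = len(groups) - 1
    df_error = n_total - len(groups)
    f_ratio = (ss_treat / df_treat) / (ss_error / df_error)
    return ss_total, ss_treat, ss_error, f_ratio


processors = [
    [10, 12, 13, 11, 10, 14, 15, 13],  # A
    [9, 11, 10, 12, 12],               # B
    [11, 10, 15, 14, 12, 13],          # C
]
ss_total, ss_treat, ss_error, f_ratio = one_way_anova_from_totals(processors)
print(round(ss_total, 4), round(ss_treat, 4), round(ss_error, 4), round(f_ratio, 4))
```

Obtaining the error sum of squares by subtraction mirrors the hand calculation; it relies on the identity $S_T^2 = S_{TR}^2 + S_E^2$.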

3. Three varieties of coal were analyzed by four chemists and the ash content in the
varieties was found to be

Varieties   Chemists
            1   2   3   4
A           8   5   5   7
B           7   6   4   4
C           3   6   5   4
Do the varieties differ significantly in their ash content?

Solution:

For the given data, our null hypothesis is
$H_0$: There is no significant difference between the three varieties of coal,
and the alternative hypothesis is
$H_1$: There is a significant difference between the three varieties of coal.

Varieties   Chemists           Total ($T_{i.}$)   $\sum_j x_{ij}^2$
            1   2   3   4
A           8   5   5   7      25                 163
B           7   6   4   4      21                 117
C           3   6   5   4      18                 86
                               G = 64             $\sum_{ij} x_{ij}^2 = 366$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 366 - \frac{64^2}{12} = 366 - 341.3333 = 24.6667$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{n_i} - \frac{G^2}{N}
= \frac{25^2}{4} + \frac{21^2}{4} + \frac{18^2}{4} - \frac{64^2}{12}
= 156.25 + 110.25 + 81 - 341.3333 = 6.1667$$
$$S_E^2 = S_T^2 - S_{TR}^2 = 24.6667 - 6.1667 = 18.5$$

ANOVA TABLE

Source of    Degrees of     Sum of     Mean sum      F-ratio
variation    freedom        squares    of squares
Treatments   3 - 1 = 2      6.1667     3.0834        F = 3.0834 / 2.0556 = 1.5 ~ F(2, 9)
Errors       11 - 2 = 9     18.5       2.0556
Totals       12 - 1 = 11

The tabulated value of F(2,9) = 4.26 at 5% level of significance.

Conclusion:
Since the calculated value of F is less than the tabulated value of F at the 5% level
of significance, we accept $H_0$,
i.e., there is no significant difference between the ash content of the three varieties of coal.

4. The following data show the lives in hours of four batches of electric lamps.
Batches   Lives (hours)
1         1600 1610 1650 1680 1700 1720 1800
2         1580 1640 1640 1700 1750
3         1460 1550 1600 1620 1640 1660 1740 1820
4         1510 1520 1530 1570 1600 1680
Perform an analysis of variance of these data and show that a significance test does
not reject their homogeneity.

Solution:
For the given data, our null hypothesis is
$H_0$: There is no significant difference between the four batches,
and the alternative hypothesis is
$H_1$: There is a significant difference between the four batches.
Since all the observations in the given data are large, for the sake of
simplicity in the computation we will use the technique of change of origin and
scale.
Shifting the origin to 1640 and then dividing by 10, i.e. replacing each observation x
by (x - 1640)/10, the given data reduce to

Batches   Observations (coded)           Total ($T_{i.}$)   $\sum_j x_{ij}^2$
1         -4 -3 1 4 6 8 16               28                 398
2         -6 0 0 6 11                    11                 193
3         -18 -9 -4 -2 0 2 10 18         -3                 853
4         -13 -12 -11 -7 -4 4            -43                515
                                         G = -7             $\sum_{ij} x_{ij}^2 = 1959$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 1959 - \frac{(-7)^2}{26} = 1959 - 1.8846 = 1957.1154$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{n_i} - \frac{G^2}{N}
= \frac{28^2}{7} + \frac{11^2}{5} + \frac{(-3)^2}{8} + \frac{(-43)^2}{6} - \frac{(-7)^2}{26}
= 112 + 24.2 + 1.125 + 308.1667 - 1.8846 = 443.6071$$
$$S_E^2 = S_T^2 - S_{TR}^2 = 1957.1154 - 443.6071 = 1513.5083$$

ANOVA TABLE

Source of    Degrees of     Sum of       Mean sum      F-ratio
variation    freedom        squares      of squares
Treatments   4 - 1 = 3      443.6071     147.8690      F = 147.8690 / 68.7958 = 2.1494 ~ F(3, 22)
Errors       25 - 3 = 22    1513.5083    68.7958
Totals       26 - 1 = 25

The tabulated value of F(3,22) = 3.05 at 5% level of significance.

Conclusion:
Since the calculated value of ‘F’ is less than the tabulated value of ‘F’ at 5% level
of significance, hence we accept our null hypothesis H0.
i.e., There is no significant difference between the four batches.
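A change of origin and scale multiplies every sum of squares by the same constant, so the F-ratio is the same for the raw and the coded data. The sketch below checks this on the lamp data; the helper name `f_ratio` is hypothetical and used here only for illustration.

```python
def f_ratio(groups):
    """F-ratio for one-way classified data (treatment M.S.S. / error M.S.S.)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    correction = sum(sum(g) for g in groups) ** 2 / n_total   # G^2 / N
    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_treat
    return (ss_treat / (k - 1)) / (ss_error / (n_total - k))


lamps = [
    [1600, 1610, 1650, 1680, 1700, 1720, 1800],
    [1580, 1640, 1640, 1700, 1750],
    [1460, 1550, 1600, 1620, 1640, 1660, 1740, 1820],
    [1510, 1520, 1530, 1570, 1600, 1680],
]
# Coding used in the solution: origin shifted to 1640, scale divided by 10.
coded = [[(x - 1640) / 10 for x in batch] for batch in lamps]

f_raw = f_ratio(lamps)
f_coded = f_ratio(coded)
print(round(f_raw, 4), round(f_coded, 4))
```

The sums of squares themselves do differ by a factor of 100 between the two versions, so the entries of the ANOVA table above are in coded units; only the F-ratio carries over unchanged.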


ANOVA TWO – WAY CLASSIFICATION (With one observation per cell)
Lay-Out:
Let us consider the case when there are two factors which may affect the variate
values $x_{ij}$ ($i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, h$). For example, the yield of milk may be affected
by differences in treatments (rations) as well as by differences in variety, i.e. the breed and
stock of the cows. Let us now suppose that the N cows are divided into h different groups
(blocks) according to their breed and stock, each group containing k cows, and let us consider
the effect of k treatments on the yield of milk $x_{ij}$ of the N = kh cows.
The yields may be expressed as variate values in the following k × h two-way
table.

Treatments   Blocks                                                              Total
             1          2          ...   j          ...   h
1            $x_{11}$   $x_{12}$   ...   $x_{1j}$   ...   $x_{1h}$               $T_{1.}$
2            $x_{21}$   $x_{22}$   ...   $x_{2j}$   ...   $x_{2h}$               $T_{2.}$
:            :          :                :                :                      :
i            $x_{i1}$   $x_{i2}$   ...   $x_{ij}$   ...   $x_{ih}$               $T_{i.}$
:            :          :                :                :                      :
k            $x_{k1}$   $x_{k2}$   ...   $x_{kj}$   ...   $x_{kh}$               $T_{k.}$
Total        $T_{.1}$   $T_{.2}$   ...   $T_{.j}$   ...   $T_{.h}$               G

Mathematical model

In this case the linear mathematical model will be

$$x_{ij} = \mu + t_i + b_j + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, h$$

where
$x_{ij}$ = the yield of the unit receiving the i-th treatment in the j-th block,
$\mu$ = the general mean effect,
$t_i$ = the i-th treatment effect,
$b_j$ = the j-th block effect,
$\varepsilon_{ij}$ = the error effects, which are independently and identically distributed
(i.i.d.) $N(0, \sigma_e^2)$.

Also we have
$$\sum_{i=1}^{k} t_i = \sum_{j=1}^{h} b_j = 0$$



Assumptions to the model:
1. All the observations $x_{ij}$ are independent.
2. The parent population must be normal.
3. The $\varepsilon_{ij}$’s are independently and identically distributed (i.i.d.) $N(0, \sigma_e^2)$.

Null hypothesis:
We want to test the equality of the population means, i.e. the homogeneity of the
different treatments as well as of the blocks. Hence the null hypotheses are given by
$$H_{0T}: \mu_{1.} = \mu_{2.} = \cdots = \mu_{k.}$$
$$H_{0B}: \mu_{.1} = \mu_{.2} = \cdots = \mu_{.h}$$
and the alternative hypotheses are given by
$$H_{1T}: \text{the treatment means } \mu_{1.}, \mu_{2.}, \ldots, \mu_{k.} \text{ are not all equal}$$
$$H_{1B}: \text{the block means } \mu_{.1}, \mu_{.2}, \ldots, \mu_{.h} \text{ are not all equal}$$

Statistical analysis
In this classification, the linear mathematical model is given by
$$x_{ij} = \mu + t_i + b_j + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k;\ j = 1, 2, \ldots, h \qquad (1)$$
$$\varepsilon_{ij} = x_{ij} - \mu - t_i - b_j \qquad (2)$$
According to the principle of least squares, we have
$$E = \sum_{i=1}^{k}\sum_{j=1}^{h} \varepsilon_{ij}^2 = \sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \mu - t_i - b_j)^2 \ \text{is minimum.}$$
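According to the principle of maxima and minima, we have
$$\frac{\partial E}{\partial \mu} = 0, \quad \frac{\partial E}{\partial t_i} = 0 \quad \text{and} \quad \frac{\partial E}{\partial b_j} = 0.$$
Consider
$$\begin{aligned}
\frac{\partial E}{\partial \mu} &= 0 \\
\Rightarrow\ 2 \sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \mu - t_i - b_j)(-1) &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \mu - t_i - b_j) &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{h} x_{ij} - \sum_{i=1}^{k}\sum_{j=1}^{h} \mu - \sum_{i=1}^{k}\sum_{j=1}^{h} t_i - \sum_{i=1}^{k}\sum_{j=1}^{h} b_j &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{h} x_{ij} - N\mu - h \sum_{i=1}^{k} t_i - k \sum_{j=1}^{h} b_j &= 0 \\
\Rightarrow\ \sum_{i=1}^{k}\sum_{j=1}^{h} x_{ij} - N\mu - 0 - 0 &= 0 \\
\Rightarrow\ N\mu &= \sum_{i=1}^{k}\sum_{j=1}^{h} x_{ij} \\
\Rightarrow\ \hat{\mu} &= \frac{1}{N} \sum_{i=1}^{k}\sum_{j=1}^{h} x_{ij} = \bar{x}_{..} \ \text{(say)}
\end{aligned}$$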
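Also,
$$\begin{aligned}
\frac{\partial E}{\partial t_i} &= 0 \\
\Rightarrow\ 2 \sum_{j=1}^{h} (x_{ij} - \mu - t_i - b_j)(-1) &= 0 \\
\Rightarrow\ \sum_{j=1}^{h} (x_{ij} - \mu - t_i - b_j) &= 0 \\
\Rightarrow\ \sum_{j=1}^{h} x_{ij} - h\mu - h t_i - \sum_{j=1}^{h} b_j &= 0 \\
\Rightarrow\ h t_i &= h \bar{x}_{i.} - h\mu \quad \Big[\because \sum_{j=1}^{h} b_j = 0\Big], \quad \text{where } \bar{x}_{i.} = \frac{1}{h} \sum_{j=1}^{h} x_{ij} \\
\Rightarrow\ t_i &= \bar{x}_{i.} - \mu \\
\Rightarrow\ \hat{t}_i &= \bar{x}_{i.} - \bar{x}_{..}
\end{aligned}$$
Similarly,
$$\hat{b}_j = \bar{x}_{.j} - \bar{x}_{..}, \quad \text{where } \bar{x}_{.j} = \frac{1}{k} \sum_{i=1}^{k} x_{ij}$$
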

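From equation (2), we get
$$\begin{aligned}
\hat{\varepsilon}_{ij} &= x_{ij} - \hat{\mu} - \hat{t}_i - \hat{b}_j \\
&= x_{ij} - \bar{x}_{..} - (\bar{x}_{i.} - \bar{x}_{..}) - (\bar{x}_{.j} - \bar{x}_{..}) \\
&= x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}
\end{aligned}$$
Now, from equation (1), we get
$$\begin{aligned}
x_{ij} &= \hat{\mu} + \hat{t}_i + \hat{b}_j + \hat{\varepsilon}_{ij} \\
x_{ij} &= \bar{x}_{..} + (\bar{x}_{i.} - \bar{x}_{..}) + (\bar{x}_{.j} - \bar{x}_{..}) + (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..}) \\
x_{ij} - \bar{x}_{..} &= (\bar{x}_{i.} - \bar{x}_{..}) + (\bar{x}_{.j} - \bar{x}_{..}) + (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})
\end{aligned}$$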
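Squaring both sides and summing over i and j, we get
$$\sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \bar{x}_{..})^2
= \sum_{i=1}^{k}\sum_{j=1}^{h} (\bar{x}_{i.} - \bar{x}_{..})^2
+ \sum_{i=1}^{k}\sum_{j=1}^{h} (\bar{x}_{.j} - \bar{x}_{..})^2
+ \sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$$
[the product terms vanish because the algebraic sum of deviations about the mean is zero]
i.e.
$$S_T^2 = S_{TR}^2 + S_B^2 + S_E^2$$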
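where
$S_T^2$ = Total Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \bar{x}_{..})^2$
$S_{TR}^2$ = Treatment Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{h} (\bar{x}_{i.} - \bar{x}_{..})^2$
$S_B^2$ = Block Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{h} (\bar{x}_{.j} - \bar{x}_{..})^2$
$S_E^2$ = Error Sum of Squares = $\sum_{i=1}^{k}\sum_{j=1}^{h} (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$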

Degrees of freedom:
The degrees of freedom for totals = N – 1
The degrees of freedom for treatments = k – 1
The degrees of freedom for blocks = h – 1
The degrees of freedom for errors = N – 1 – (k – 1) – (h – 1)
= N – k – h + 1
= kh – k – h + 1
= k(h – 1) – 1(h – 1)
= (h – 1)(k – 1)

Mean Sum of Squares (M.S.S.):
The mean sum of squares is obtained by dividing the sum of squares by the
corresponding degrees of freedom.
$$\text{M.S.S. due to treatments} = \frac{S_{TR}^2}{k - 1} = s_{tr}^2 \ \text{(say)}$$
$$\text{M.S.S. due to blocks} = \frac{S_B^2}{h - 1} = s_b^2 \ \text{(say)}$$
$$\text{M.S.S. due to errors} = \frac{S_E^2}{(k - 1)(h - 1)} = s_e^2 \ \text{(say)}$$

ANOVA TABLE

Source of    Degrees of          Sum of        Mean sum      F-ratio
variation    freedom             squares       of squares
Treatments   $k - 1$             $S_{TR}^2$    $s_{tr}^2$    $F_{TR} = s_{tr}^2 / s_e^2 \sim F[k - 1, (k - 1)(h - 1)]$, $(s_{tr}^2 > s_e^2)$
Blocks       $h - 1$             $S_B^2$       $s_b^2$       $F_B = s_b^2 / s_e^2 \sim F[h - 1, (k - 1)(h - 1)]$, $(s_b^2 > s_e^2)$
Errors       $(k - 1)(h - 1)$    $S_E^2$       $s_e^2$
Totals       $N - 1$

Conclusion: If the calculated values of F are less than the tabulated values of F at the α%
level of significance (LOS), then we accept our null hypotheses; otherwise we reject them.
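The two-way analysis with one observation per cell can be sketched as follows. This is an illustrative sketch only: the function name `two_way_anova`, the dictionary keys and the small 3 × 3 table are hypothetical, rows are taken as treatments and columns as blocks, and both F-ratios are formed with the error mean square in the denominator.

```python
def two_way_anova(table):
    """Two-way ANOVA with one observation per cell.

    `table[i][j]` is the observation for treatment i (row) and block j (column).
    """
    k, h = len(table), len(table[0])
    grand_mean = sum(map(sum, table)) / (k * h)
    row_means = [sum(row) / h for row in table]
    col_means = [sum(table[i][j] for i in range(k)) / k for j in range(h)]

    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_treat = h * sum((m - grand_mean) ** 2 for m in row_means)
    ss_block = k * sum((m - grand_mean) ** 2 for m in col_means)
    # Residual after removing both the treatment and the block effect.
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
        for i in range(k) for j in range(h)
    )

    ms_error = ss_error / ((k - 1) * (h - 1))
    return {
        "ss_total": ss_total, "ss_treat": ss_treat,
        "ss_block": ss_block, "ss_error": ss_error,
        "F_treat": (ss_treat / (k - 1)) / ms_error,
        "F_block": (ss_block / (h - 1)) / ms_error,
    }


# Hypothetical 3 x 3 table: 3 treatments (rows) by 3 blocks (columns).
result = two_way_anova([[1, 2, 3], [2, 3, 4], [3, 5, 7]])
print(result)
```

The error sum of squares is computed here from its definition rather than by subtraction, so comparing it with `ss_total - ss_treat - ss_block` gives a quick check of the partition identity.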

PROBLEMS
1. The following table gives quality rating of 10 service stations given by five
professional raters.
Raters Service Stations
1 2 3 4 5 6 7 8 9 10
A 99 70 90 99 65 85 75 70 85 92
B 96 65 80 95 70 88 70 51 84 91
C 95 60 48 87 48 75 71 93 80 93
D 98 65 70 95 67 82 73 94 86 80
E 97 65 62 99 60 80 76 92 90 89
Analyze the data and discuss whether there is any significant difference between
ratings or between service stations.

Solution: For the given data, our null hypothesis is defined as
$H_0$: There is no significant difference between the ratings as well as the service stations,
and the alternative hypothesis is
$H_1$: There is a significant difference between the ratings as well as the service stations.

Raters             Service Stations                                     Total ($T_{i.}$)   $\sum_j x_{ij}^2$
                   1    2    3    4    5    6    7    8    9    10
A                  99   70   90   99   65   85   75   70   85   92      830                70266
B                  96   65   80   95   70   88   70   51   84   91      790                64348
C                  95   60   48   87   48   75   71   93   80   93      750                59166
D                  98   65   70   95   67   82   73   94   86   80      810                66928
E                  97   65   62   99   60   80   76   92   90   89      810                67540
Total ($T_{.j}$)   485  325  350  475  310  410  365  400  425  445     G = 3990           $\sum_{ij} x_{ij}^2 = 328248$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 328248 - \frac{3990^2}{50} = 328248 - 318402 = 9846$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{h} - \frac{G^2}{N}
= \frac{830^2}{10} + \frac{790^2}{10} + \frac{750^2}{10} + \frac{810^2}{10} + \frac{810^2}{10} - \frac{3990^2}{50} = 368$$
$$S_B^2 = \sum_j \frac{T_{.j}^2}{k} - \frac{G^2}{N}
= \frac{485^2}{5} + \frac{325^2}{5} + \frac{350^2}{5} + \frac{475^2}{5} + \frac{310^2}{5} + \frac{410^2}{5}
+ \frac{365^2}{5} + \frac{400^2}{5} + \frac{425^2}{5} + \frac{445^2}{5} - \frac{3990^2}{50} = 6608$$
$$S_E^2 = S_T^2 - S_{TR}^2 - S_B^2 = 9846 - 368 - 6608 = 2870$$

ANOVA TABLE

Source of    Degrees of   Sum of     Mean sum      F-ratio
variation    freedom      squares    of squares
Treatments   4            368        92            $F_{TR}$ = 92 / 79.7222 = 1.1540 ~ F(4, 36)
Blocks       9            6608       734.2222      $F_B$ = 734.2222 / 79.7222 = 9.2097 ~ F(9, 36)
Errors       36           2870       79.7222
Totals       49

The tabulated value of F(4, 36) is 2.63 and
the tabulated value of F(9, 36) is 2.15 at the 5% level of significance.
Conclusion:
Since the calculated value of F for treatments is less than the tabulated value of
F, we accept our null hypothesis for treatments,
i.e., there is no significant difference between the ratings given by the five
professional raters.
Since the calculated value of F for blocks is greater than the tabulated value of
F, we reject our null hypothesis for blocks,
i.e., there is a significant difference between the ten service stations.

2. Perform the analysis of variance for the following data with suitable technique.
Observer Consignment
1 2 3 4 5 6
1 9 10 9 10 11 11
2 12 11 9 11 10 10
3 11 10 10 12 11 10
4 2 11 11 14 12 10
Solution:
For the given data, our null hypothesis is defined as
$H_0$: There is no significant difference between the observers as well as the consignments,
and the alternative hypothesis is
$H_1$: There is a significant difference between the observers as well as the consignments.

Observers          Consignment                      Total ($T_{i.}$)   $\sum_j x_{ij}^2$
                   1    2    3    4    5    6
1                  9    10   9    10   11   11      60                 604
2                  12   11   9    11   10   10      63                 667
3                  11   10   10   12   11   10      64                 686
4                  2    11   11   14   12   10      60                 686
Total ($T_{.j}$)   34   42   39   47   44   41      G = 247            $\sum_{ij} x_{ij}^2 = 2643$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 2643 - \frac{247^2}{24} = 2643 - 2542.0417 = 100.9583$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{h} - \frac{G^2}{N}
= \frac{60^2}{6} + \frac{63^2}{6} + \frac{64^2}{6} + \frac{60^2}{6} - \frac{247^2}{24}
= 600 + 661.5 + 682.6667 + 600 - 2542.0417 = 2.125$$
$$S_B^2 = \sum_j \frac{T_{.j}^2}{k} - \frac{G^2}{N}
= \frac{34^2}{4} + \frac{42^2}{4} + \frac{39^2}{4} + \frac{47^2}{4} + \frac{44^2}{4} + \frac{41^2}{4} - \frac{247^2}{24}
= 289 + 441 + 380.25 + 552.25 + 484 + 420.25 - 2542.0417 = 24.7083$$
$$S_E^2 = S_T^2 - S_{TR}^2 - S_B^2 = 100.9583 - 2.125 - 24.7083 = 74.125$$

ANOVA TABLE

Source of    Degrees of   Sum of     Mean sum      F-ratio
variation    freedom      squares    of squares
Treatments   3            2.125      0.7083        $F_{TR}$ = 4.9417 / 0.7083 = 6.9768 ~ F(15, 3)
Blocks       5            24.7083    4.9417        $F_B$ = 4.9417 / 4.9417 = 1 ~ F(15, 5)
Errors       15           74.125     4.9417
Totals       23

Here the error mean square is not smaller than the treatment and block mean squares, so
the larger mean square is placed in the numerator of each F-ratio.

The tabulated value of F(15, 3) is 8.70 and
the tabulated value of F(15, 5) is 4.62 at the 5% level of significance.

Conclusion:
Since the calculated values of F for both treatments and blocks are less than the
tabulated values of F at the 5% level of significance, we accept our null hypothesis,
i.e., there is no significant difference between the observers as well as the
consignments.

3. Perform analysis of variance with a suitable technique for the following data and
comment on your conclusions.
Varieties Chemists
1 2 3 4
A 8 5 5 7
B 7 6 4 4
C 3 6 5 4

Solution:
For the given data, our null hypothesis is defined as
$H_0$: There is no significant difference between the varieties as well as the chemists,
and the alternative hypothesis is
$H_1$: There is a significant difference between the varieties as well as the chemists.

Varieties   Chemists               Total ($T_{i.}$)   $\sum_j x_{ij}^2$
            1    2    3    4
A           8    5    5    7       25                 163
B           7    6    4    4       21                 117
C           3    6    5    4       18                 86
$T_{.j}$    18   17   14   15      G = 64             $\sum_{ij} x_{ij}^2 = 366$

$$S_T^2 = \sum_{ij} x_{ij}^2 - \frac{G^2}{N} = 366 - \frac{64^2}{12} = 366 - 341.3333 = 24.6667$$
$$S_{TR}^2 = \sum_i \frac{T_{i.}^2}{h} - \frac{G^2}{N}
= \frac{25^2}{4} + \frac{21^2}{4} + \frac{18^2}{4} - \frac{64^2}{12}
= 156.25 + 110.25 + 81 - 341.3333 = 6.1667$$
$$S_B^2 = \sum_j \frac{T_{.j}^2}{k} - \frac{G^2}{N}
= \frac{18^2}{3} + \frac{17^2}{3} + \frac{14^2}{3} + \frac{15^2}{3} - \frac{64^2}{12}
= 108 + 96.3333 + 65.3333 + 75 - 341.3333 = 3.3333$$
$$S_E^2 = S_T^2 - S_{TR}^2 - S_B^2 = 24.6667 - 6.1667 - 3.3333 = 15.1667$$

ANOVA TABLE

Source of    Degrees of   Sum of     Mean sum      F-ratio
variation    freedom      squares    of squares
Treatments   2            6.1667     3.0834        $F_{TR}$ = 3.0834 / 2.5278 = 1.2198 ~ F(2, 6)
Blocks       3            3.3333     1.1111        $F_B$ = 2.5278 / 1.1111 = 2.2750 ~ F(6, 3)
Errors       6            15.1667    2.5278
Totals       11

The tabulated value of F(2, 6) = 5.14
The tabulated value of F(6, 3) = 8.94 at 5% level of significance.

Conclusion:
Since the calculated values of ‘F’ for both the treatments and blocks are less than
the tabulated values of ‘F’ at 5% level of significance, hence we accept our null
hypothesis.
i.e., There is no significant difference between the varieties as well as chemists.

4. Apply two-way ANOVA to the row means 39, 41, 48 and the column means 38,
21, 69, 42 with 3 rows and 4 columns, given also that $\sum_{ij} x_{ij}^2 = 25{,}944$.
Solution:
For the given data, our null hypothesis is given by
$H_0$: There is no significant difference between the 3 rows as well as the 4 columns,
and the alternative hypothesis is given by
$H_1$: There is a significant difference between the 3 rows as well as the 4 columns.
Now, from the given information we have

Rows               Columns                  Means   Total ($T_{i.}$)
                   1     2     3     4
1                                           39      156
2                                           41      164
3                                           48      192
Means              38    21    69    42
Total ($T_{.j}$)   114   63    207   126

Grand total from the rows = 156 + 164 + 192 = 512
Grand total from the columns = 114 + 63 + 207 + 126 = 510
Since the grand total for the rows is not equal to the grand total for the columns,
the given information is not correct.
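The consistency check used in this problem is easy to automate: each row mean times the number of columns gives a row total, each column mean times the number of rows gives a column total, and both sets must add up to the same grand total. The function name `marginal_grand_totals` below is hypothetical, chosen for this sketch.

```python
def marginal_grand_totals(row_means, col_means):
    """Grand totals implied by the row means and by the column means."""
    n_rows, n_cols = len(row_means), len(col_means)
    total_from_rows = sum(m * n_cols for m in row_means)   # each row has n_cols cells
    total_from_cols = sum(m * n_rows for m in col_means)   # each column has n_rows cells
    return total_from_rows, total_from_cols


rows_total, cols_total = marginal_grand_totals([39, 41, 48], [38, 21, 69, 42])
print(rows_total, cols_total, rows_total == cols_total)  # prints: 512 510 False
```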

SHORT ANSWER QUESTIONS
1. Define analysis of variance.
Ans.: Analysis of variance was introduced by Prof. R.A. Fisher in the 1920s to deal with
problems in the analysis of agricultural data. The basic purpose of analysis of variance is
“to test the homogeneity of several means”.
According to Prof. R.A. Fisher, “analysis of variance is the separation of variance
due to one group of causes from the variance due to other groups”.

2. What are the basic assumptions for the validity of F-test in ANOVA?
Ans.: The following are the basic assumptions for the validity of the F-test in ANOVA.
• All the observations must be independent.
• The parent population from which the samples have been drawn must be normal.
• The various treatment and environmental effects are additive in nature.

3. What is the difference between “variability within classes” and “variability
between classes”? Explain with a suitable example.
Ans.: Variation among the observations of each specific class is called its internal
variation, and the totality of the internal variation is called the variability within classes.
The totality of variation within each class reflects chance variation and is often
called the experimental error. This variation is due to uncontrolled and non-specific factors.
The totality of variation from one class to another, i.e. the variation due to classes, is called
the variability between classes.
For example, let us consider a sample of 4 provinces and 10 sugar shops from
each province. We note down the prices of sugar in these 40 shops. The variation
between the prices in the ten shops of each province is their internal variation, and the
totality of this internal variation is the variability within provinces. The variation between
the 4 sample means is the variability between provinces.

4. Explain the utility and applications of the ANOVA technique.
Ans.: Utility: The technique of studying the homogeneity of populations by separating the
total variation into its various components has a very wide scope. It was
conceived by Prof. R.A. Fisher, who used it in analyzing agricultural data. Nowadays
it is used to handle the statistics of multiple groups in various other branches of study.
Applications:
• The main application of ANOVA is to test the homogeneity of the observations.
• To test the significance of additional terms in a regression equation.
• To test the linearity or non-linearity of fitted regression lines.
• To test the significance in the case of multiple regression.


5. Explain the concept of critical difference.
Ans.: If the treatments show a significant effect, then we would be interested in finding out
which pair(s) of treatments differ significantly. For this, instead of calculating Student’s t
for different pairs of treatment means, we calculate the least significant difference at the
given level of significance. This least difference is known as the Critical Difference
(C.D.), and the C.D. at the α% level of significance is given by
C.D. = (S.E. of the difference between two treatment means) × $t_{\alpha\%}$ for the error d.f.
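A minimal sketch of the C.D. formula follows. It assumes the usual form of the standard error of the difference between two treatment means, $\sqrt{s_e^2 (1/n_i + 1/n_j)}$, which the slides do not spell out. The function name `critical_difference` is hypothetical, and the tabulated t value must be supplied by the user from a t-table. The numbers reuse the error mean square (1.5 on 12 d.f.) and class size (5) of the school example purely for illustration, since the treatments there were not significant.

```python
import math


def critical_difference(ms_error, n_i, n_j, t_critical):
    """C.D. = (S.E. of the difference between two treatment means) x t.

    Assumption: S.E. of the difference = sqrt(ms_error * (1/n_i + 1/n_j)).
    `t_critical` is the tabulated t for the error d.f., read from a t-table.
    """
    se_diff = math.sqrt(ms_error * (1 / n_i + 1 / n_j))
    return se_diff * t_critical


# Illustrative inputs: s_e^2 = 1.5, five observations per class,
# t = 2.179 (two-tailed 5% value for 12 d.f., taken from a standard t-table).
cd = critical_difference(ms_error=1.5, n_i=5, n_j=5, t_critical=2.179)
print(round(cd, 4))
```

Two treatment means would be declared significantly different only if their absolute difference exceeds this value.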

LIST OF PREVIOUS QUESTIONS

1. Explain ANOVA and compare the ANOVA one way classification with that of
two way classification with the assumptions. [July, 2008]
2. Explain what do you understand by analysis of variance and give the ANOVA
one-way classification. [March, 2008]
3. Write the statistical analysis of one-way classification of ANOVA.
[March, 2007(OR)]
4. The following data refer to the output of three machines of the same make by
each of four operators. Perform the analysis of variance for one-way
classification and find whether the operators produce differently, or whether the
variation between the operators’ output can be attributed to chance errors.
[March, 2007(OR)]
            Operators
            A       B       C       D
Machine 1   174     173     171.5   173.5
Machine 2   173     172     171     171
Machine 3   173.5   173     173     172.5
5. State the mathematical model used in the analysis of variance in a two-way
classification. State the basic assumptions. Discuss the advantages of this method
over one-way classification, if any. [March, 2007]
6. Write about the analysis of variance for one-way classification: the mathematical model
and the assumptions of the model. [March, 2006]
7. Write the statistical analysis of one-way classification. [March, 2006]
8. Describe the technique of ANOVA. Write down the ANOVA table for one-way
layout. [June, 2005]
9. The following table shows the lives in hours of four batches of electric lamps.
[June, 2005]
Batches
1 1600 1610 1650 1680 1700 1720 1800
2 1580 1640 1640 1700 1750
3 1460 1550 1600 1620 1640 1660 1740 1820
4 1510 1520 1530 1570 1600 1680
10. What is ANOVA? Briefly mention its uses. [March, 2005]
11. Give the complete analysis of two-way classification. [March, 2005]
12. Mention the assumptions under ANOVA. Explain its advantages. [March, 2004]
13. Explain the analysis of two-way classification. [March, 2004]

LIST OF ESSAY QUESTIONS
1. Explain the basic purpose of ANOVA.
2. Explain ANOVA one-way classification
3. Explain ANOVA two-way classification

LIST OF SHORT ANSWER QUESTIONS
1. Define ANOVA
2. State the assumptions for the validity of F – test in ANOVA.
3. Distinguish between assignable causes and chance causes.
4. State Cochran’s theorem.
5. Explain the utility and importance of ANOVA
6. Define critical difference.