Tests of Significance
Presenter: Dr Vidya D C
Guide: Ms Radhika K, Dr N S Murthy
Acknowledgement: Mr Shivraj N S
Contents
Introduction
Terms and concepts in tests of significance
Methods of tests of significance
Steps in tests of significance
Parametric methods
Non-parametric methods
Introduction
Statistics is the methodology of collecting and meaningfully interpreting data. It has two branches: descriptive statistics and inferential statistics.
Introduction
Tests of significance are the statistical techniques used to judge whether the difference between estimates from different samples is due to sampling variation or reflects a real difference.
Terms and concepts in tests of significance
Null hypothesis: the hypothesis of no difference, set up as the first step in hypothesis testing.
Level of significance: the probability of committing a type I error, usually fixed at 5% or 1%.
Parameter: a population value, or a variable that describes the population.
Terms and concepts in tests of significance
Large sample: a sample with more than 30 observations.
Small sample: a sample with 30 or fewer observations.
Critical regions: the region of acceptance and the region of rejection.
Methods of tests of significance
PARAMETRIC TESTS: Assumptions
The observations are drawn from normally distributed populations.
These populations have the same variances.
The observations are independent.
The population parameters involved are the mean and standard deviation.
They require data on an interval or ratio scale (whole numbers or fractions).
Examples: height in inches (72, 60.5, 54.7); temperature (30-34 degrees Celsius).
PARAMETRIC TESTS (continued)
Critical ratio / Z-test
Paired t-test
Unpaired t-test
One-way ANOVA
Two-way ANOVA
Steps in tests for large samples
1) Set up the null hypothesis that the two samples come from the same population and that the difference between the two sample estimates is due to sampling variation.
2) Calculate the two means (x̄1 and x̄2) or two proportions (p1 and p2) for the two samples of size n1 and n2 respectively.
3) Calculate the standard deviations of the two samples and their standard errors SE1 and SE2 respectively.
Steps in tests for large samples (continued)
4) Calculate the standard error of the difference between the two sample estimates as √(SE1² + SE2²).
5) Calculate the critical ratio (CR) or z value: z = (difference between sample estimates) / (SE of the difference).
Steps in tests for large samples (continued)
6) Refer to the normal distribution table and, for this calculated value of z, find the probability P.
7) If P ≤ 0.05, reject the null hypothesis and conclude that the difference between the two sample estimates is significant.
Tests for large samples
SE of a mean = s/√n
SE of a proportion = √(pq/n)
SE of the difference between two means = √(s1²/n1 + s2²/n2)
SE of the difference between two proportions = √(p1q1/n1 + p2q2/n2)
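A minimal Python sketch of steps 1-7 for comparing two means (assuming SciPy is available; the summary figures are illustrative, not from the slides):

```python
from math import sqrt
from scipy.stats import norm

# Illustrative summary statistics for two large samples (not from the slides)
mean1, sd1, n1 = 12.4, 3.1, 100
mean2, sd2, n2 = 11.2, 3.4, 120

se1 = sd1 / sqrt(n1)                 # SE of each sample mean
se2 = sd2 / sqrt(n2)
se_diff = sqrt(se1**2 + se2**2)      # SE of the difference (step 4)

z = (mean1 - mean2) / se_diff        # critical ratio (step 5)
p = 2 * norm.sf(abs(z))              # two-sided P from the normal distribution (step 6)

print(f"z = {z:.2f}, P = {p:.4f}")   # reject the null hypothesis if P <= 0.05 (step 7)
```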
Example
Tests for small samples
Assumptions for the t-distribution:
The means of the two samples are normally distributed.
The means of the two samples are independently distributed.
The variances of the two samples are equal.
t-test for paired observations
The t statistic is calculated as t = d̄ / s_md with (n-1) degrees of freedom, where
n = number of pairs of observations in the sample
d̄ = mean of the differences in the variable before and after treatment
t-test for paired observations (continued)
s_md = SE of the mean difference = s_d / √n
s_d = standard deviation of the differences d (the differences in the variable before and after treatment)
t-test for paired observations (continued)
1) Set up the null hypothesis that d̄ = 0.
2) Calculate the difference d for each pair of observations before and after treatment and compute their mean d̄.
3) Calculate the SD of these differences: s_d = √[Σ(d - d̄)² / (n-1)], where n = number of pairs of observations.
4) Calculate the SE from the formula s_md = s_d / √n.
t-test for paired observations (continued)
5) Calculate the t statistic as t = d̄ / s_md.
6) Compute the degrees of freedom as (n-1).
7) From the t-distribution table, find the probability level corresponding to this value of t with (n-1) degrees of freedom.
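A minimal sketch of these paired t-test steps (assuming SciPy; the before/after values are illustrative, not from the slides):

```python
import numpy as np
from math import sqrt
from scipy import stats

# Illustrative before/after measurements on the same subjects (not from the slides)
before = np.array([142, 138, 150, 145, 160, 155, 148, 152])
after  = np.array([138, 136, 145, 144, 158, 150, 146, 149])

d = before - after
n = len(d)
d_bar = d.mean()
s_d = d.std(ddof=1)          # SD of the differences
s_md = s_d / sqrt(n)         # SE of the mean difference

t = d_bar / s_md
p = 2 * stats.t.sf(abs(t), df=n - 1)
print(f"t = {t:.3f}, df = {n - 1}, P = {p:.4f}")

# Cross-check with the library's paired t-test
print(stats.ttest_rel(before, after))
```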
Unpaired t-test
t = (x̄1 - x̄2) / s_md, where s_md is the estimated standard error of the difference between the two sample means:
s_md = √[{(n1 + n2)/(n1 n2)} × {(n1 - 1)s1² + (n2 - 1)s2²} / (n1 + n2 - 2)]
s1² and s2² are the variances (squared standard deviations) of the two samples; n1 and n2 are the respective sample sizes.
The test has (n1 + n2 - 2) degrees of freedom.
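A minimal sketch of the unpaired t-test using the pooled-variance formula above (assuming SciPy; the sample values are illustrative, not from the slides):

```python
import numpy as np
from math import sqrt
from scipy import stats

# Illustrative independent samples (not from the slides)
x1 = np.array([12.1, 11.4, 13.0, 12.7, 11.9, 12.5])
x2 = np.array([10.8, 11.1, 10.5, 11.6, 10.9, 11.3, 10.7])

n1, n2 = len(x1), len(x2)
s1_sq, s2_sq = x1.var(ddof=1), x2.var(ddof=1)

# Pooled SE of the difference, as in the slide formula
s_md = sqrt(((n1 + n2) / (n1 * n2)) *
            ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))

t = (x1.mean() - x2.mean()) / s_md
p = 2 * stats.t.sf(abs(t), df=n1 + n2 - 2)
print(f"t = {t:.3f}, df = {n1 + n2 - 2}, P = {p:.4f}")

# Cross-check with the equal-variance t-test in SciPy
print(stats.ttest_ind(x1, x2, equal_var=True))
```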
ANOVA
Assumptions:
The effects in the different groups are additive.
The sample of observations has been drawn using a random sampling procedure.
The samples come from normally distributed populations.
The variance is the same in the various groups.
One-way ANOVA
1) Calculate the sum of the observations in each group (village), Tj.
2) Calculate the sum of all observations, Σxij, where xij denotes each observation.
3) Calculate the square of the sum of all observations, (Σxij)².
4) Calculate the total sum of squares as Σ(xij²) - (Σxij)²/n, where n is the total number of observations; (Σxij)²/n is called the correction factor (CF).
One-way ANOVA (continued)
5) Calculate the sum of squares between groups as Σ(Tj²/kj) - CF, where Tj is the sum of observations in a group and kj is the number of observations in that group.
6) The sum of squares within groups is the difference between the total sum of squares and the sum of squares between groups:
[Σ(xij²) - CF] - [Σ(Tj²/kj) - CF] = Σ(xij²) - Σ(Tj²/kj).
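A minimal sketch of these one-way ANOVA steps (assuming SciPy; the group data are illustrative, not the village data):

```python
import numpy as np
from scipy import stats

# Illustrative observations from three groups (e.g. villages); not the slide data
groups = [
    np.array([10, 12, 11, 14, 13]),
    np.array([15, 14, 16, 13, 17]),
    np.array([20, 18, 19, 21, 22]),
]

all_x = np.concatenate(groups)
n = all_x.size
cf = all_x.sum() ** 2 / n                              # correction factor (Σx)²/n
ss_total = (all_x ** 2).sum() - cf                     # total sum of squares
ss_between = sum(g.sum() ** 2 / g.size for g in groups) - cf
ss_within = ss_total - ss_between

df_between, df_within = len(groups) - 1, n - len(groups)
f = (ss_between / df_between) / (ss_within / df_within)
p = stats.f.sf(f, df_between, df_within)
print(f"F = {f:.3f}, P = {p:.4f}")

# Library cross-check
print(stats.f_oneway(*groups))
```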
ANALYSIS OF VARIANCE TABLE
Source of variation   Degrees of freedom   Sum of squares   Mean sum of squares   F
Between villages      3                    349.525          116.5083              4.544
Within villages       36                   922.950          25.6375
Total                 39                   1272.475
Two-way ANOVA
Used to study differences within sub-classifications of the groups as well.
Analysis of variance table (two-way ANOVA)
Source of variation   df   Sum of squares   Mean sum of squares   F
Between villages      9    134.17           14.908                0.855
Between trimesters    2    436.72           218.36                12.526
Residual              18   313.78           17.432
Total                 29   884.67
NON-PARAMETRIC TESTS: Assumptions
The observations do not follow a normal distribution.
The populations do not have the same variances.
The observations may or may not be independent.
The population parameters involved are the median and mode.
Data may be nominally or ordinally scaled.
Examples: nominal (male/female); ordinal (good-better-best).
NON-PARAMETRIC TESTS (continued)
Chi-square test
Fisher's exact test
The sign test
Wilcoxon's signed rank test
Mann-Whitney U test
Kruskal-Wallis H test
Friedman test
McNemar test
Chi-square test
χ² = Σ [(observed frequency - expected frequency)² / expected frequency]
This ratio is calculated for each cell in the table, and the sum of these ratios is the total chi-square value.
df = (c-1)(r-1)
Expected frequencies (blood group by leprosy status)
Blood group   Non-leprosy            Lepromatous leprosy    Non-lepromatous leprosy
A             (131/471)x150 = 41.7   (131/471)x169 = 47.0   (131/471)x152 = 42.3
B             (145/471)x150 = 46.2   (145/471)x169 = 52.0   (145/471)x152 = 46.8
O             (154/471)x150 = 49.0   (154/471)x169 = 55.3   (154/471)x152 = 49.7
AB            (41/471)x150 = 13.1    (41/471)x169 = 14.7    (41/471)x152 = 13.2
Chi-square contributions per cell
Blood group   Non-leprosy   Lepromatous leprosy   Non-lepromatous leprosy   Total
A             3.28          0.09                  2.22                      5.59
B             4.12          0.17                  2.49                      6.78
O             0.08          0.25                  0.06                      0.39
AB            0.00          0.50                  0.59                      1.09
Total         7.48          1.01                  5.36                      13.85
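The same calculation can be run directly from an observed contingency table. A minimal sketch (assuming SciPy; the table below is illustrative, not the blood-group data):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative r x c table of observed frequencies (not the slide data)
observed = np.array([
    [30, 55, 46],
    [60, 50, 35],
    [45, 50, 59],
    [15, 14, 12],
])

chi2_stat, p, df, expected = chi2_contingency(observed, correction=False)
print(f"chi-square = {chi2_stat:.2f}, df = {df}, P = {p:.4f}")
print(expected)   # each expected cell = (row total x column total) / grand total
```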
Chi-square test for a 2x2 table
χ² = [(ad - bc)² × G] / [(a+b)(c+d)(a+c)(b+d)]
df = (2-1)(2-1) = 1
Filariasis infestation   Male      Female    Total
Yes                      28 (a)    20 (b)    48
No                       237 (c)   222 (d)   459
Total                    265       242       507 (G)
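A short sketch (assuming SciPy) that applies the 2x2 shortcut formula to the filariasis table above; the printed result comes from this calculation, not from the slides:

```python
from scipy.stats import chi2

# 2x2 table from the filariasis slide: rows = infestation yes/no, columns = male/female
a, b = 28, 20
c, d = 237, 222
G = a + b + c + d   # 507

chi_sq = ((a * d - b * c) ** 2 * G) / ((a + b) * (c + d) * (a + c) * (b + d))
p = chi2.sf(chi_sq, df=1)
print(f"chi-square = {chi_sq:.3f}, P = {p:.3f}")   # roughly 0.78, not significant at 5%
```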
Fisher's exact probability test
P = [(a+c)! (b+d)! (a+b)! (c+d)!] / [n! a! b! c! d!]
a, b, c and d are the cell frequencies in the 2x2 table; n is the total number of observations.
Habit                       Boys   Girls   Total
Exercise regularly          2      8       10
Do not exercise regularly   10     4       14
Total                       12     12      24
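A sketch (assuming SciPy) that evaluates the exact-probability formula for the table above and, for comparison, the usual two-sided P value from scipy.stats.fisher_exact, which sums the probabilities of this and all more extreme tables:

```python
from math import factorial
from scipy.stats import fisher_exact

# 2x2 table from the exercise slide: rows = exercise yes/no, columns = boys/girls
a, b = 2, 8
c, d = 10, 4
n = a + b + c + d

# Probability of exactly this table, as in the slide formula
p_table = (factorial(a + c) * factorial(b + d) * factorial(a + b) * factorial(c + d)) / (
    factorial(n) * factorial(a) * factorial(b) * factorial(c) * factorial(d)
)
print(f"P of this exact table = {p_table:.4f}")

# Conventional two-sided Fisher's exact test
_, p_two_sided = fisher_exact([[a, b], [c, d]])
print(f"Two-sided Fisher's exact P = {p_two_sided:.4f}")
```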
Sign test
The significance of the difference can be tested using the usual χ² test, as follows:
χ² = (|a - b| - 1)² / n with 1 df
a and b = number of (+) and (-) signs respectively
n = a + b
Sign test (continued)
All zero differences, i.e. pairs where the two tests give identical results, are omitted from the calculation, so n is always equal to (a + b). P is obtained from the χ² distribution table and the conclusion about significance is drawn as usual.
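A minimal sketch of this sign test (assuming SciPy; the counts of signs are illustrative, not from the slides):

```python
from scipy.stats import chi2

# Illustrative counts of + and - signs among the paired differences (ties already dropped)
a, b = 18, 7          # a = number of '+', b = number of '-' (not from the slides)
n = a + b

chi_sq = (abs(a - b) - 1) ** 2 / n      # continuity-corrected chi-square with 1 df
p = chi2.sf(chi_sq, df=1)
print(f"chi-square = {chi_sq:.3f}, P = {p:.4f}")
```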
Wilcoxon's signed rank test
Useful for testing the significance of differences in paired observations when the data are quantitative.
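A minimal sketch (assuming SciPy; the paired readings are illustrative, not from the slides) using the library routine scipy.stats.wilcoxon:

```python
import numpy as np
from scipy.stats import wilcoxon

# Illustrative paired quantitative readings (not from the slides)
before = np.array([8.2, 7.9, 9.1, 6.5, 7.4, 8.8, 9.0, 7.2])
after  = np.array([7.5, 7.8, 8.4, 6.9, 6.8, 8.1, 8.2, 7.0])

stat, p = wilcoxon(before, after)   # ranks the absolute differences and keeps their signs
print(f"W = {stat}, P = {p:.4f}")
```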
Kruskal-Wallis H test
This is the non-parametric equivalent of the F test used in one-way ANOVA. It tests whether or not several independent samples have been drawn from the same population. It is calculated as:
H = [12 / n(n+1)] Σ(Ri²/ni) - 3(n+1)
where Ri is the rank sum and ni the size of the i-th group, and n is the total number of observations.
Group 1 (score, rank)   Group 2 (score, rank)   Group 3 (score, rank)
10, 1                   11, 2                   30, 20
12, 3                   14, 4                   32, 22
15, 5                   22, 12                  24, 14
17, 7                   28, 18                  31, 21
23, 13                  20, 10                  26, 16
18, 8                   16, 6                   34, 24
19, 9                   29, 19                  27, 17
                        21, 11                  33, 23
                                                25, 15
n1 = 7, R1 = 46         n2 = 8, R2 = 82         n3 = 9, R3 = 172
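Using the scores and rank sums from the table above, a short sketch (assuming SciPy) that evaluates the H formula and cross-checks it with scipy.stats.kruskal:

```python
from scipy.stats import kruskal, chi2

# Scores from the three groups in the table above
g1 = [10, 12, 15, 17, 23, 18, 19]
g2 = [11, 14, 22, 28, 20, 16, 29, 21]
g3 = [30, 32, 24, 31, 26, 34, 27, 33, 25]

# H from the slide formula, using the rank sums R1 = 46, R2 = 82, R3 = 172
n = len(g1) + len(g2) + len(g3)          # 24 observations in total
R = [46, 82, 172]
sizes = [len(g1), len(g2), len(g3)]
H = 12 / (n * (n + 1)) * sum(r**2 / k for r, k in zip(R, sizes)) - 3 * (n + 1)
p = chi2.sf(H, df=len(sizes) - 1)
print(f"H = {H:.2f}, P = {p:.4f}")       # H is about 13.6 here

# The library routine ranks the data itself and gives the same H (no ties in these scores)
print(kruskal(g1, g2, g3))
```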
Friedman test
This is the non-parametric equivalent of two-way ANOVA, where each group may have further sub-divisions.
df = k - 1
The formula is:
χ² = [12 × Σ(Rj²)] / [nk(k+1)] - 3n(k+1)
Rj = sum of ranks in column j
n = number of rows
k = number of columns
Group of caregivers   Need 1 (score, rank)   Need 2 (score, rank)   Need 3 (score, rank)   Need 4 (score, rank)
Group I               13, 2                  9, 4                   10, 3                  16, 1
Group II              10, 2                  4, 3                   3, 4                   12, 1
Group III             12, 1                  6, 3                   2, 4                   10, 2
Rank sums             5                      10                     11                     4
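A sketch (assuming SciPy) that evaluates the Friedman formula using the column rank sums in the table above and cross-checks it with scipy.stats.friedmanchisquare:

```python
from scipy.stats import friedmanchisquare, chi2

# Scores from the table above: blocks = caregiver groups, treatments = needs 1-4
need1, need2, need3, need4 = [13, 10, 12], [9, 4, 6], [10, 3, 2], [16, 12, 10]

# Friedman chi-square from the slide formula, using the column rank sums 5, 10, 11, 4
n, k = 3, 4                               # rows (groups) and columns (needs)
R = [5, 10, 11, 4]
chi_sq = 12 * sum(r**2 for r in R) / (n * k * (k + 1)) - 3 * n * (k + 1)
p = chi2.sf(chi_sq, df=k - 1)
print(f"chi-square = {chi_sq:.2f}, df = {k - 1}, P = {p:.3f}")

# The library routine re-ranks within each row and gives the same statistic here (no ties)
print(friedmanchisquare(need1, need2, need3, need4))
```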
McNemar test
A type of 2x2 chi-square test used to compare variables from matched pairs; it uses information only from the discordant pairs (the paired observations are not independent). It reduces type I error.
χ² = (|f - g| - 1)² / (f + g), with df = 1
where f and g are the numbers of the two kinds of discordant pairs.
Alcohol addiction in matched case-control pairs
Alcohol addiction (controls)   Cases: Yes   Cases: No   Total
Yes                            10           30          40
No                             20           40          60
Total                          30           70          100
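A short sketch (assuming SciPy) that applies the continuity-corrected McNemar formula to the discordant pairs in the table above; the numerical result comes from this calculation, not the slides:

```python
from scipy.stats import chi2

# Discordant pairs from the matched case-control table above
f = 30   # control addicted, case not addicted
g = 20   # case addicted, control not addicted

chi_sq = (abs(f - g) - 1) ** 2 / (f + g)     # continuity-corrected McNemar statistic
p = chi2.sf(chi_sq, df=1)
print(f"chi-square = {chi_sq:.2f}, P = {p:.3f}")   # about 1.62, not significant at 5%
```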
Non-parametric tests: Advantages over parametric tests
Can be used for data measured on nominal and ordinal scales.
Easier to compute, understand and explain.
Make fewer assumptions about the distribution of the samples.
Can be used even for very small sample sizes.
The computational burden is so light that they are known as "short-cut" methods.
Advantages (continued)
Do not involve population parameters, so they can be used to analyse data from populations whose mean and standard deviation are unavailable or cannot be determined.
Results may be nearly as exact as those of parametric procedures, even when used on populations following a normal distribution.
Non-parametric tests: Disadvantages
Can test the statistical hypothesis but cannot estimate the parameter.
Lower power than the corresponding parametric tests.
Difficult to compute by hand for large samples.
Tables of critical values are not widely available.
Disadvantages (continued)
They waste time and data if all the assumptions of a statistical model are satisfied and a suitable parametric test exists.
NON-PARAMETRIC COMPLEMENTS OF PARAMETRIC TESTS
Parametric test   Non-parametric test
Paired t-test     Wilcoxon's signed rank test
Unpaired t-test   Mann-Whitney U test
One-way ANOVA     Kruskal-Wallis test
Two-way ANOVA     Friedman test
Clinical significance vs statistical significance
A possible antipyretic is tested in patients with fever: 500 received the drug and 500 received the placebo. Temperatures were measured 4 hours after dosing.

          N     Mean     St Dev   SE Mean
Drug      500   39.950   0.653    0.029
Control   500   40.058   0.699    0.031

P value = 0.011
Statistically significant? Yes.
Clinically significant? No: the temperature fell by only about 0.1 degree Celsius.
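A small check (assuming SciPy) that the large-sample z-test applied to the summary statistics above reproduces a P value close to the quoted 0.011, while the effect itself remains tiny:

```python
from math import sqrt
from scipy.stats import norm

# Summary statistics from the antipyretic example above
mean_drug, se_drug = 39.950, 0.029
mean_ctrl, se_ctrl = 40.058, 0.031

diff = mean_ctrl - mean_drug                     # about 0.1 degree Celsius
z = diff / sqrt(se_drug**2 + se_ctrl**2)
p = 2 * norm.sf(abs(z))
print(f"difference = {diff:.3f}, z = {z:.2f}, P = {p:.3f}")   # P close to the quoted 0.011
```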
Summary
Conclusion
Tests of significance draw conclusions about a population based on the results observed in a random sample. We should therefore be careful to choose the method that suits the hypothesis, bearing in mind the assumptions of that method.