Non-parametric Statistical tests for Hypotheses testing

2,364 views 24 slides Nov 13, 2018

About This Presentation

A complete guide to non-parametric statistical tests for hypothesis testing, with relevant examples. It covers the meaning of non-parametric tests, types of non-parametric test, the sign test, rank sum test, chi-square test, Wilcoxon signed-ranks test, McNemar test, Spearman’s rank correlation, sta...


Slide Content

Non-parametric statistics
Presented by: Thejaswini S, 2nd M.Com, G.F.G.C.W, Holenarsipura

Contents
Meaning of non-parametric test
Types of non-parametric test
Sign test
Rank sum test
Chi-square test
Wilcoxon signed-ranks test
McNemar test
Spearman’s rank correlation
Conclusion
Bibliography

Meaning
Non-parametric statistics is a branch of statistics. It refers to statistical methods in which the data are not required to fit a normal distribution. Non-parametric statistics often uses ordinal data, meaning it relies not on numerical values but on a ranking or order of sorts. For example, a survey conveying consumer preferences ranging from like to dislike would be considered ordinal data.

Cont’d
Non-parametric statistics does not assume that data are drawn from a normal distribution. Instead, the shape of the distribution is estimated from the data. This applies to descriptive statistics, statistical tests, statistical inference and models alike. No assumption is made about the sample size or about whether the observed data are quantitative. This type of statistics can be used without the mean, sample size, standard deviation or an estimate of any other parameter.

Types of non-parametric test
Sign test
Rank sum test
Chi-square test
Wilcoxon signed-ranks test
McNemar test
Spearman’s rank correlation

Sign test
The sign test is one of the non-parametric tests. As its name suggests, it is based on the direction of the plus (+) and minus (-) signs of observations in a sample. The sign test may be classified into two types:
One sample sign test
Two sample sign test

One sample sign test
The one sample sign test is a very simple non-parametric test, and the data can be non-symmetric in nature. It computes the statistical significance of a hypothesized median value for a single data set.
For example:
H0: population median = 63
H1: population median > 63
Each observation is marked + if it lies above 63 and - if it lies below 63:
Value   64   69   40   64   65   71   82   59   64   74
Sign     +    +    -    +    +    +    +    -    +    +
Number of + signs = 8
Number of - signs = 2
Total sample = 10
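The example above can be checked with a short sketch. It assumes the usual exact approach, in which the number of + signs follows a Binomial(n, 0.5) distribution under H0; the variable names are illustrative and not part of the original slides.

```python
from math import comb

# Observations from the slide; the hypothesized median is 63.
values = [64, 69, 40, 64, 65, 71, 82, 59, 64, 74]
hypothesized_median = 63

# Count the signs; a value equal to the median would be dropped.
plus = sum(v > hypothesized_median for v in values)
minus = sum(v < hypothesized_median for v in values)
n = plus + minus

# One-sided p-value for H1: median > 63, i.e. P(X >= plus) with X ~ Binomial(n, 0.5).
p_value = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n

print(plus, minus, round(p_value, 4))  # 8 2 0.0547
```

With 8 plus signs out of 10, the one-sided p-value is 56/1024, which is slightly above 0.05.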

Two sample sign test
The sign test has important applications in problems where we deal with paired data. Each pair of values is replaced with a plus (+) sign if the value from the first sample (say X) is greater than the corresponding value from the second sample (say Y), and with a minus (-) sign if the value of X is less than the value of Y. If the two values are equal, the pair is discarded.
For example:
By X: 1 2 3 1 2 2 3
By Y: 1 2 1 1 2
Signs (X-Y): + + + - + + + -
Total number of + signs = 6
Total number of - signs = 2
Hence, the sample size is 8, since there are 2 zeros in the sign row and those 2 pairs are discarded (10 - 2 = 8).

Cont’d
Formula
For small samples: $K = \frac{n - 1}{2} - 0.98\sqrt{n}$
For large samples: $Z = \frac{S - np}{\sqrt{np(1 - p)}}$
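As a rough sketch, the two formulas can be applied to the paired example (6 plus signs, 2 minus signs, n = 8). This assumes the small-sample critical value is K = (n - 1)/2 - 0.98 sqrt(n), that S is the count of the less frequent sign, that H0 is rejected when S <= K, and that p = 0.5 under H0. The function names are hypothetical.

```python
import math

def sign_test_critical_value(n):
    """Small-sample critical value K = (n - 1)/2 - 0.98 * sqrt(n) (assumed form)."""
    return (n - 1) / 2 - 0.98 * math.sqrt(n)

def sign_test_z(s, n, p=0.5):
    """Normal approximation Z = (S - np) / sqrt(np(1 - p)) for large samples."""
    return (s - n * p) / math.sqrt(n * p * (1 - p))

n = 8  # pairs kept after discarding the 2 ties
s = 2  # count of the less frequent sign (the minus signs)

k = sign_test_critical_value(n)
z = sign_test_z(s, n)

print(round(k, 3), round(z, 3))
print("reject H0" if s <= k else "do not reject H0")
```

Here S = 2 is larger than K (about 0.73), so under these assumptions H0 would not be rejected. The Z value is shown only for illustration, since n = 8 is not a large sample.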

Rank sum test
The rank sum tests are:
U test (Wilcoxon-Mann-Whitney test)
H test (Kruskal-Wallis test)
U test: This non-parametric test is used to determine whether two independent samples have been drawn from the same population. It requires data that can be ranked, i.e., ordered from lowest to highest (ordinal data).

Cont’d
For example:
The values of one sample: 53, 38, 69, 57, 46
The values of another sample: 44, 40, 61, 53, 32
We assign ranks to all observations, ranking from low to high as if all the items belonged to a single sample.
Formula
$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - \sum r_1$
where
$n_1$ = number of sample readings in one area
$n_2$ = number of sample readings in another area
$\sum r_1$ = sum of the ranks of the readings in the first sample
Values in ascending order   Rank
32                          1
38                          2
40                          3
44                          4
46                          5
53                          6.5
53                          6.5
57                          8
61                          9
69                          10
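The ranking and the U statistic for this example can be reproduced with a small sketch. It assumes tied values share the average of the ranks they occupy, as the 6.5 entries in the rank table indicate; the helper name `average_ranks` is illustrative.

```python
def average_ranks(values):
    """Rank values from low to high; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1   # 1-based position of the first occurrence
        count = ordered.count(v)       # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

sample_1 = [53, 38, 69, 57, 46]
sample_2 = [44, 40, 61, 53, 32]

# Rank the pooled observations as if they belonged to a single sample.
ranks = average_ranks(sample_1 + sample_2)
n1, n2 = len(sample_1), len(sample_2)
r1 = sum(ranks[:n1])   # rank sum of the first sample
r2 = sum(ranks[n1:])   # rank sum of the second sample

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

print(r1, r2, u1, u2)  # 31.5 23.5 8.5 16.5
```

As a sanity check, U1 + U2 equals n1 * n2 = 25.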

Cont’d
H test: The Kruskal-Wallis H test (also called the “one-way ANOVA on ranks”) is a rank-based non-parametric test that can be used to determine whether there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable.
For example: the H test could be used to understand whether exam performance, measured on a continuous scale from 0 to 100, differed based on test anxiety level. The dependent variable would be “exam performance” and the independent variable would be “test anxiety level”, which has three independent groups: students with “low”, “medium” and “high” test anxiety levels.

Cont’d
Formula
$H = \frac{12}{n(n + 1)} \sum \frac{T_i^2}{n_i} - 3(n + 1)$
where
$n_i$ = sample size for population i
$T_i$ = rank sum for population i
$n$ = total number of observations
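The slides give no data for the H test, so the sketch below reuses the two samples from the U test example purely to show the arithmetic. It applies the formula without any correction for ties, and the function name is hypothetical.

```python
def kruskal_wallis_h(groups):
    """H = 12 / (n(n+1)) * sum(T_i^2 / n_i) - 3(n+1), without tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    def rank(v):
        # Average rank of v in the pooled sample (ties share the mean rank).
        first = pooled.index(v) + 1
        return first + (pooled.count(v) - 1) / 2

    total = 0.0
    for g in groups:
        t_i = sum(rank(v) for v in g)   # rank sum of this group
        total += t_i ** 2 / len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Data borrowed from the U test example (an assumption for illustration only).
group_a = [53, 38, 69, 57, 46]
group_b = [44, 40, 61, 53, 32]

h = kruskal_wallis_h([group_a, group_b])
print(round(h, 3))  # 0.698
```

The function accepts any number of groups, so three groups such as low, medium and high anxiety would be passed as a list of three lists.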

Chi-square test
The chi-square test is a non-parametric test. It is used mainly when dealing with a nominal variable. The chi-square test has two main uses.
Goodness of fit: Goodness of fit refers to whether a significant difference exists between an observed number and an expected number of responses, people or other objects. For example, suppose that we flip a coin 20 times and record the frequency of occurrence of heads and tails. We should then expect 10 heads and 10 tails. Suppose our coin-flipping experiment yielded 12 heads and 8 tails. We would compare our expected frequencies (10-10) with our observed frequencies (12-8).
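For the coin example, the goodness-of-fit statistic can be worked out as a sum of (O - E)^2 / E over the two outcomes. The sketch below is a minimal illustration with assumed variable names; 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
observed = {"heads": 12, "tails": 8}
expected = {"heads": 10, "tails": 10}

# Chi-square statistic: sum of (O - E)^2 / E over all categories.
chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

critical_value_5pct = 3.841  # chi-square table value for 1 df at the 5% level
print(round(chi_sq, 2), df, chi_sq > critical_value_5pct)  # 0.8 1 False
```

A statistic of 0.8 is well below 3.841, so 12 heads and 8 tails are consistent with a fair coin at the 5% level.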

Cont’d
Independence: the test of independence examines differences between the frequencies of occurrence in two or more categories across two or more groups. For example, if educational attainment is classified as UG or PG and income is classified as low, middle or high, then we could use the chi-square test for independence.
Formula
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where
O = observed frequency
E = expected frequency
Educational attainment   Low   Middle   High   Total
UG                        13       16      1      30
PG                        43       51     60     154
Total                     56       67     61     184
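A sketch of the independence test on the table above follows. It assumes the usual expected frequency for each cell, (row total x column total) / grand total, and (rows - 1) x (columns - 1) degrees of freedom; the variable names are illustrative.

```python
# Observed counts from the slide: rows are UG and PG, columns are low, middle, high income.
observed = [
    [13, 16, 1],
    [43, 51, 60],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under independence.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(row_totals, col_totals, grand_total)
print(round(chi_sq, 2), df)
```

The statistic comes out at roughly 14.4 with 2 degrees of freedom, which exceeds the 5% critical value of 5.991, so attainment and income category would not be treated as independent in this table.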

Wilcoxon signed-ranks test
In research situations involving two related samples, where we can determine both the direction and the magnitude of the difference between matched values, we can use an important non-parametric test, the Wilcoxon matched-pairs test. To apply this test, we first find the difference between each pair of values and then rank the differences from smallest to largest without regard to sign.

Cont’d
For example: an experiment on brand name quality perception
Pair   Brand A   Brand B   Difference   Rank
1           25        32           -7    7.5
2           29        30           -1    2.5
3           10         8            2    5.5
4           31        32           -1    2.5
5           27        20            7    7.5
6           24        32           -8    9
7           26        27           -1    2.5
8           29        30           -1    2.5
9           30        32           -2    5.5
10          32        32            0    Omit
11          20        30          -10    10
12           5        32          -27    11
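The ranks in this table, and the signed rank sums that the test is built on, can be reproduced with the sketch below. It assumes zero differences are omitted, tied absolute differences share their average rank, and the test statistic T is the smaller of the two signed rank sums; the names are illustrative.

```python
brand_a = [25, 29, 10, 31, 27, 24, 26, 29, 30, 32, 20, 5]
brand_b = [32, 30, 8, 32, 20, 32, 27, 30, 32, 32, 30, 32]

# Differences A - B, with zero differences omitted.
diffs = [a - b for a, b in zip(brand_a, brand_b) if a != b]

# Rank the absolute differences from smallest to largest, averaging ties.
abs_sorted = sorted(abs(d) for d in diffs)

def avg_rank(x):
    first = abs_sorted.index(x) + 1
    return first + (abs_sorted.count(x) - 1) / 2

ranks = [avg_rank(abs(d)) for d in diffs]

# Sum the ranks separately for positive and negative differences.
w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
t_stat = min(w_plus, w_minus)

print(ranks)
print(w_plus, w_minus, t_stat)  # 13.0 53.0 13.0
```

The two sums add up to 66, which is n(n + 1)/2 for the n = 11 pairs that remain.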

McNemar test
The McNemar test is an important non-parametric test, often used when the data are nominal and relate to two related samples. As such, this test is especially useful with before-and-after measurements of the same subjects.
Example: a researcher wants to compare the attitudes of medical students toward confidence in statistical analysis before and after an intensive statistics course.
Formula
$\chi^2 = \frac{(b - c)^2}{b + c}$ (1 df)
where b and c are the numbers of subjects whose response changed in each of the two directions.
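The slides give no counts for this example, so the sketch below uses hypothetical values for b and c only to show how the formula is evaluated. Here b is taken as the number of students who changed from "not confident" to "confident" and c as the number who changed the other way; both counts are invented for illustration.

```python
def mcnemar_statistic(b, c):
    """Chi-square statistic (b - c)^2 / (b + c) with 1 degree of freedom."""
    if b + c == 0:
        raise ValueError("at least one subject must have changed response")
    return (b - c) ** 2 / (b + c)

# Hypothetical discordant counts, not taken from the slides.
b = 15  # changed from not confident to confident
c = 5   # changed from confident to not confident

chi_sq = mcnemar_statistic(b, c)
critical_value_5pct = 3.841  # chi-square table value for 1 df at the 5% level

print(chi_sq, chi_sq > critical_value_5pct)  # 5.0 True
```

Subjects whose response did not change do not enter the statistic at all, which is why only b and c appear in the formula.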

Spearman’s rank correlation
This method gives a measure of association that is based on the ranks of the observations and not on the numerical values of the data. It was developed by Charles Spearman in the early 1900s, and as such it is also known as Spearman’s rank correlation coefficient.

Cont’d
Formula
$r_s = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$
$D = R_1 - R_2$
where
$R_1$ = rating one
$R_2$ = rating two
$N$ = number of pairs
For example:
English (marks)   Maths (marks)   Rank (English)   Rank (Maths)   Difference of ranks (D)
56                66               9                4              5
75                70               3                2              1
45                40              10               10              0
71                60               4                7              3
62                65               6                5              1
64                56               5                9              4
58                59               8                8              0
80                77               1                1              0
76                67               2                3              1
61                63               7                6              1
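The coefficient for the marks above can be computed with a short sketch. It assumes rank 1 goes to the highest mark, as in the table, and it relies on there being no tied marks in either subject; the helper name is illustrative.

```python
english = [56, 75, 45, 71, 62, 64, 58, 80, 76, 61]
maths = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

def descending_ranks(marks):
    """Rank 1 for the highest mark; assumes no ties, as in this data set."""
    ordered = sorted(marks, reverse=True)
    return [ordered.index(m) + 1 for m in marks]

rank_english = descending_ranks(english)
rank_maths = descending_ranks(maths)

# Squared rank differences and the Spearman formula 1 - 6*sum(D^2) / (N(N^2 - 1)).
sum_d_squared = sum((r1 - r2) ** 2 for r1, r2 in zip(rank_english, rank_maths))
n = len(english)
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(rank_english)
print(rank_maths)
print(sum_d_squared, round(r_s, 4))  # 54 0.6727
```

A value of about 0.67 suggests a fairly strong positive association between the English and Maths rankings.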

Conclusion
Non-parametric tests are called “distribution-free” tests, since they make no assumptions regarding the population distribution. They may be applied to ranked data. They are easier to explain and easier to understand, but one should not forget that they are usually less efficient/powerful, as they are based on no assumptions. A non-parametric test is always valid, but not always efficient.

Differences between parametric and non-parametric tests
Parametric | Non-parametric
Information about the population is completely known | No information about the population is available
Specific assumptions are made regarding the population | No assumptions are made regarding the population
The null hypothesis is made on parameters of the population distribution | The null hypothesis is free from parameters
The test statistic is based on the distribution | The test statistic is arbitrary
Applicable only to variables | Applicable to both variables and attributes
No parametric test exists for nominal scale data | Non-parametric tests do exist for nominal and ordinal scale data
Powerful, if it exists | Not as powerful as a parametric test

Bibliography
Websites
https://www.slideshare.com
www.ciencecentral.com
https://www.statisticshowto.datasciencecentral.com
Book
Research Methodology, C.R. Kothari