Non-parametric tests

1,356 views 38 slides May 30, 2021

About This Presentation

Non parametric tests explained with examples


Slide Content

Non Parametric Tests

Mean / Median
The mean is a good measure of center when the data are bell-shaped, but it is sensitive to outliers and extreme values. When the data are skewed, the median is a better measure of center because it is a resistant measure. In such cases we may want to test the median rather than the mean. In a skewed distribution, the population median, typically denoted η, is a better typical value than the population mean, μ.
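A small sketch of why the median is called resistant. The numbers are hypothetical illustration data, not taken from the slides: adding one extreme value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

scores = [2, 3, 3, 4, 5]        # hypothetical, roughly symmetric data
with_outlier = scores + [100]   # same data plus one extreme value

# Without the outlier the two measures agree closely.
print(mean(scores), median(scores))              # 3.4 3

# One extreme value drags the mean upward; the median barely moves.
print(mean(with_outlier), median(with_outlier))  # 19.5 3.5
```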

Sign test
The sign test is a non-parametric or “distribution free” test, which means it doesn’t assume the data come from a particular distribution. It is used when we are interested in testing the population median and not the mean. For paired data, it compares two related groups by counting how many differences are positive and how many are negative. The sign test is an alternative to a one sample t test or a paired t test. It can also be used for ordered (ranked) categorical data. The null hypothesis for the sign test is that the difference between medians is zero.

One sample median test
The one sample median test checks whether or not there is a significant difference between our hypothesized median and the observed median of a sample.
The t-test for the difference between means of dependent samples requires both populations to be normally distributed. If the condition of normality cannot be satisfied, we can use the paired-sample sign test to test the difference between two population medians. The following conditions must be met:
1. A sample must be randomly selected from each population.
2. The samples must be dependent (paired).

We find the difference between corresponding data entries by subtracting the entry representing the second variable from the entry representing the first variable, and record the sign of the difference. Then we compare the number of + and - signs (the 0s are ignored).

Steps:-
1. State the hypotheses
2. Specify alpha
3. Specify the sample size
4. Find the critical value (from the sign test table when n <= 25, or from the z-table when n > 25)
5. Find the test statistic
6. Make the decision
7. Interpret the result

Test statistic
When n <= 25, the test statistic is the smaller of the number of positive signs and the number of negative signs.
When n > 25, the test statistic is calculated from the formula:
$$z = \frac{(x + 0.5) - 0.5\,n}{\sqrt{n}/2}$$
where x = the smaller number of signs and n = the total number of positive and negative signs.
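The two cases above can be sketched as one small function. This is an illustration, not a library routine: the name `sign_test_statistic` is made up here, and the large-sample branch assumes the usual continuity-corrected form z = ((x + 0.5) - 0.5n) / (sqrt(n)/2).

```python
import math

def sign_test_statistic(n_plus, n_minus):
    """Return the sign test statistic for the given sign counts.

    n <= 25: the smaller of the two counts.
    n > 25:  the continuity-corrected normal approximation z.
    """
    n = n_plus + n_minus      # zeros are assumed to be excluded already
    x = min(n_plus, n_minus)  # smaller number of signs
    if n <= 25:
        return x
    return ((x + 0.5) - 0.5 * n) / (math.sqrt(n) / 2)

small = sign_test_statistic(7, 2)    # n = 9, uses the count directly
large = sign_test_statistic(10, 20)  # n = 30 (hypothetical counts), uses z
print(small, round(large, 4))        # 2 -1.6432
```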

Example:- S and C represent two tasks: S is the spelling of 25 words presented separately, and C is the spelling of 25 words of equal difficulty presented as an integral part of a sentence (i.e., in context). A teacher wants to know which condition is favorable to higher scores. Test the hypothesis that C is better than S.

Of the 10 differences, 7 are plus (C higher than S), 2 are minus (S higher than C) and one is zero. Excluding the 0 as being neither + nor -, we have 9 differences, of which 7 are plus. Let alpha = 0.05 and n = 9. It is a one-tailed test.
Test statistic = 2 (the smaller of the two counts)
Critical value = 1 (sign test table, n = 9, one-tailed alpha = 0.05)
For the sign test we reject the null hypothesis only when the test statistic is less than or equal to the critical value. Since the test statistic is greater than the critical value, we fail to reject the null hypothesis.
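A sketch of where a sign test critical value comes from, assuming the null distribution of the smaller count is Binomial(n, 0.5). The function names are illustrative. The critical value is taken as the largest count c with P(X <= c) <= alpha; for a two-tailed test, pass alpha/2.

```python
from math import comb

def lower_tail(x, n):
    """P(X <= x) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(x + 1)) / 2 ** n

def sign_test_critical_value(n, alpha):
    """Largest c with P(X <= c) <= alpha, or None if no such c exists."""
    c = None
    for x in range(n + 1):
        if lower_tail(x, n) <= alpha:
            c = x
        else:
            break
    return c

n, x = 9, 2                                # 9 non-zero differences, 2 minus signs
crit = sign_test_critical_value(n, 0.05)   # one-tailed alpha = 0.05
p = lower_tail(x, n)                       # exact one-tailed p-value
print(crit, round(p, 4))                   # 1 0.0898
print("reject H0" if x <= crit else "fail to reject H0")
```

Because 2 is greater than the critical value (equivalently, 0.0898 > 0.05), the sketch prints "fail to reject H0".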

Example:- A college statistics professor claims that the median test score for his students is 58. The scores of 18 randomly selected tests are listed below. At alpha = 0.01, can you reject the professor's claim?
58 62 55 55 53 52 52 59 55 55 60 56 57 61 58 63 63 55
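The slides do not work this example through. One way to do so, sketched under the assumption that an exact two-sided binomial p-value is acceptable in place of a table lookup, is to count the scores above and below the claimed median, drop the scores equal to it, and compare the p-value with alpha.

```python
from math import comb

scores = [58, 62, 55, 55, 53, 52, 52, 59, 55,
          55, 60, 56, 57, 61, 58, 63, 63, 55]
claimed_median = 58
alpha = 0.01

n_plus = sum(s > claimed_median for s in scores)    # scores above 58
n_minus = sum(s < claimed_median for s in scores)   # scores below 58
n = n_plus + n_minus                                # scores equal to 58 are dropped
x = min(n_plus, n_minus)                            # sign test statistic

# Exact two-sided p-value under Binomial(n, 0.5), capped at 1.
p_two_sided = min(1.0, 2 * sum(comb(n, k) for k in range(x + 1)) / 2 ** n)

print(n_plus, n_minus, n, round(p_two_sided, 4))    # 6 10 16 0.4545
print("reject" if p_two_sided <= alpha else "fail to reject")
```

With 6 plus signs and 10 minus signs the p-value is far above 0.01, so on this reading the claim of a median of 58 is not rejected.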

Paired/Matched sample Sign test
Assumptions for the test (your data should meet these requirements before running the test) are:
The data should be from two samples.
The two dependent samples should be paired or matched. For example, depression scores from before a medical procedure and after.
Example:- This set of data represents test scores at the end of the Spring semester and the beginning of the Fall semester. The hypothesis is that the summer break means a significant drop in test scores.

H0: No difference in median of the signed differences.
H1: Median of the signed differences is less than zero.

Count the number of positives and negatives: 4 positives and 12 negatives.
Add up the number of items in your sample and subtract any with a difference of zero (in column 3). The sample size in this question was 17, with one zero, so n = 16.
Let alpha = 0.05 and n = 16 (one-tailed test).
Test statistic = 4 (the smaller of the two counts)
Critical value = 4 (sign test table, n = 16, one-tailed alpha = 0.05)
Since the test statistic is less than or equal to the critical value, we reject the null hypothesis. The exact one-tailed p-value is P(X <= 4) = 2517/65536 ≈ 0.038, which is below 0.05, so the data support a drop in test scores over the summer break.

Example: A new chemotherapy treatment is proposed for patients with breast cancer. Investigators are concerned with patients' ability to tolerate the treatment and assess their quality of life both before and after receiving the new chemotherapy treatment. Quality of life (QOL) is measured on an ordinal scale and for analysis purposes, numbers are assigned to each response category as follows: 1 = Poor, 2 = Fair, 3 = Good, 4 = Very Good, 5 = Excellent. The data are shown below.

Patient   QOL Before   QOL After   Difference (Before - After)   Sign
1         3            2            1                            +
2         2            3           -1                            -
3         3            4           -1                            -
4         2            4           -2                            -
5         1            1            0                            NA
6         3            4           -1                            -
7         2            4           -2                            -
8         3            3            0                            NA
9         2            1            1                            +
10        1            3           -2                            -
11        3            4           -1                            -
12        2            3           -1                            -

H0: there is no difference in the median of the two sets of values.
Ha: there is a difference in the median of the two sets of values.
Number of + signs = 2
Number of - signs = 8
n = 10 (the two zero differences are excluded)
Alpha = 0.05 (two-tailed)
Test statistic = 2
Critical value = 1 (sign test table, n = 10, two-tailed alpha = 0.05)
Conclusion:- test statistic > critical value, so we fail to reject the null hypothesis that there is no difference in the median of the two sets of values. There was no significant change in quality of life between before and after the chemotherapy treatment.
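The paired procedure for the chemotherapy data can be sketched as follows. The differences are taken as Before minus After, as in the table, and the exact two-sided binomial p-value stands in for the table lookup; that substitution is an assumption of this sketch.

```python
from math import comb

before = [3, 2, 3, 2, 1, 3, 2, 3, 2, 1, 3, 2]
after = [2, 3, 4, 4, 1, 4, 4, 3, 1, 3, 4, 3]

# Signed differences (Before - After); zero differences are discarded.
diffs = [b - a for b, a in zip(before, after)]
nonzero = [d for d in diffs if d != 0]

n_plus = sum(d > 0 for d in nonzero)
n_minus = sum(d < 0 for d in nonzero)
n = len(nonzero)
x = min(n_plus, n_minus)   # test statistic

# Exact two-sided p-value under Binomial(n, 0.5), capped at 1.
p_two_sided = min(1.0, 2 * sum(comb(n, k) for k in range(x + 1)) / 2 ** n)

print(n_plus, n_minus, n, x)   # 2 8 10 2
print(p_two_sided)             # 0.109375
```

The p-value is above 0.05, which agrees with the decision not to reject H0.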

Mood’s Median Test
Mood’s median test is used to compare the medians of two or more samples to find out if they are different. For example, you might want to compare the median number of positive calls to a hotline vs. the median number of negative comment calls to find out if you’re getting significantly more negative comments than positive comments (or vice versa).
This test is a nonparametric alternative to a one way ANOVA. Nonparametric means that you don’t have to know what distribution your sample came from (e.g. a normal distribution) before running the test. That said, your samples should have been drawn from distributions with the same shape.
Use this test instead of the sign test when you have two independent samples. The test is a particular case of the chi-square test of independence.
The null hypothesis for this test is that the medians are the same for both groups. The alternate hypothesis is that the medians are different for both groups.

Step 1: Make a 2 x k contingency table, where k is the number of samples.
Step 2: Find M, the overall median for all the data in your samples. To do this, list all of your data (from all samples) in a single set. Sort in ascending order and then find the middle number.
Step 3: List each individual sample’s data in ascending order. Count how many data points are greater than M (from Step 2) and list these counts in the first row of the contingency table. Then count how many data points are smaller than or equal to M and list these counts in the second row.
Step 4: Perform a chi-square test on the completed contingency table.
Step 5: Compare the chi-square statistic to the table value with degrees of freedom = (number of rows - 1) * (number of columns - 1).

Example: Perform Mood's median test for the following sets of data:-
Sample A: 11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31
Sample B: 34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29
Significance level α = 0.05, one-tailed test.
Sol:-
Step-1: Calculate the overall median of the two samples combined.
Sorted combined samples: 4, 9, 10, 11, 12, 12, 13, 14, 14, 15, 17, 18, 18, 22, 26, 28, 29, 29, 30, 31, 31, 34, 34, 35
n = 24
Median = (12th term + 13th term)/2 = (18 + 18)/2 = 18

Step-2: Create a 2×2 contingency table whose first row consists of the number of elements in each sample that are greater than the median and whose second row consists of the number of elements in each sample that are less than or equal to the median.

             Sample A   Sample B   Total
> Median     3          8          11
<= Median    9          4          13
Total        12         12         24

Step-3: Perform a chi-square test of independence.
State the hypotheses:
H0: the two categorical variables are independent.
H1: the two categorical variables are not independent.

Observed Frequencies
         B1    B2    Total
A1       3     8     11
A2       9     4     13
Total    12    12    24

Expected Frequencies
         B1     B2     Total
A1       5.5    5.5    11
A2       6.5    6.5    13
Total    12     12     24

Compute chi-square:
$$\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(3-5.5)^2}{5.5} + \frac{(8-5.5)^2}{5.5} + \frac{(9-6.5)^2}{6.5} + \frac{(4-6.5)^2}{6.5}$$
= 6.25/5.5 + 6.25/5.5 + 6.25/6.5 + 6.25/6.5
= 1.1364 + 1.1364 + 0.9615 + 0.9615
= 4.1958
Compute the degrees of freedom (df): df = (2-1)⋅(2-1) = 1.
For 1 df, p(χ² ≥ 4.1958) = 0.0405.
Test statistic = 4.1958
Critical value = 3.841 (chi-square table, df = 1, α = 0.05)
Since the test statistic > critical value (equivalently, p = 0.0405 < 0.05), we reject the null hypothesis H0.
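The steps of Mood's median test can be sketched for this example as below. It assumes two samples, so the table is 2×2 and df = 1; for df = 1 only, the chi-square upper tail equals erfc(sqrt(χ²/2)), which avoids any external library. No continuity correction is applied, matching the hand calculation.

```python
from math import erfc, sqrt
from statistics import median

sample_a = [11, 15, 9, 4, 34, 17, 18, 14, 12, 13, 26, 31]
sample_b = [34, 31, 35, 29, 28, 12, 18, 30, 14, 22, 10, 29]
samples = (sample_a, sample_b)

# Step 1: overall median of the pooled data.
grand_median = median(sample_a + sample_b)

# Step 2: counts above the median (row 1) and at or below it (row 2).
above = [sum(v > grand_median for v in s) for s in samples]
at_or_below = [len(s) - a for s, a in zip(samples, above)]
observed = [above, at_or_below]

# Step 3: chi-square statistic from observed and expected counts.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

p_value = erfc(sqrt(chi2 / 2))   # upper tail, valid for df = 1 only

print(grand_median, observed)    # 18.0 [[3, 8], [9, 4]]
print(round(chi2, 4))            # 4.1958
```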

Example: A major wheat supplier from Texas is analyzing the yields of various crop methods. He randomly assigned two different wheat crop methods to a very large number of acres of farm land and recorded the production rate (yield per acre) for each plot. We need to find out whether there is a difference between the two wheat crop methods.

Kruskal Wallis Test
The Kruskal Wallis test is the non parametric alternative to the One Way ANOVA. The test determines whether the medians of two or more groups are different. Like most statistical tests, you calculate a test statistic and compare it to a distribution cut-off point. The test statistic used in this test is called the H statistic. The hypotheses for the test are:
H0: population medians are equal.
H1: population medians are not equal.
The Kruskal Wallis test will tell you if there is a significant difference between groups. However, it won’t tell you which groups are different.
Examples of when the test applies:
You want to find out how test anxiety affects actual test scores. The independent variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high anxiety. The dependent variable is the exam score, rated from 0 to 100%.
You want to find out how socioeconomic status affects attitude towards sales tax increases. Your independent variable is “socioeconomic status” with three levels: working class, middle class and wealthy. The dependent variable is measured on a 5-point scale from strongly agree to strongly disagree.

The H test is used when the assumptions for ANOVA aren’t met (like the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points. It is used for comparing two or more independent samples of equal or different sample sizes.
Assumptions:-
One independent variable with two or more levels (independent groups). The test is more commonly used when you have three or more levels.
Ordinal scale, ratio scale or interval scale dependent variables.
Your observations should be independent. In other words, there should be no relationship between the members in each group or between groups.
All groups should have the same shape distributions.

The Kruskal-Wallis H Test is a nonparametric procedure that can be used to compare more than two populations in a completely randomized design. All n = n1 + n2 + … + nk measurements are jointly ranked (i.e. treated as one large sample). We use the sums of the ranks of the k samples to compare the distributions.

Rank the total measurements in all k samples from 1 to n. Tied observations are assigned the average of the ranks they would have gotten if not tied.
Calculate T_i = rank sum for the i-th sample, i = 1, 2, …, k, and the test statistic
$$H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{T_i^2}{n_i} - 3(n+1)$$

H0: the k distributions are identical, versus Ha: at least one distribution is different.
Test statistic: Kruskal-Wallis H.
When H0 is true, the test statistic H has an approximate chi-square distribution with df = k - 1.
Use a right-tailed rejection region or p-value based on the chi-square distribution.

Example: A shoe company wants to know if three groups of workers have different salaries:
Women: 23K, 41K, 54K, 66K, 90K.
Men: 45K, 55K, 60K, 70K, 72K.
Minorities: 20K, 30K, 34K, 40K, 44K.
Sol:-
Null Hypothesis H0: All groups are equal.
Alternative Hypothesis H1: At least one group is not equal.
Step 1: Sort the data for all groups/samples into ascending order in one combined set.
20K 23K 30K 34K 40K 41K 44K 45K 54K 55K 60K 66K 70K 72K 90K

Step 2: Assign ranks to the sorted data points. Give tied values the average rank.

Value   Rank
20K     1
23K     2
30K     3
34K     4
40K     5
41K     6
44K     7
45K     8
54K     9
55K     10
60K     11
66K     12
70K     13
72K     14
90K     15

Step 3: Add up the different ranks for each group/sample.
Women: 23K, 41K, 54K, 66K, 90K = 2 + 6 + 9 + 12 + 15 = 44.
Men: 45K, 55K, 60K, 70K, 72K = 8 + 10 + 11 + 13 + 14 = 56.
Minorities: 20K, 30K, 34K, 40K, 44K = 1 + 3 + 4 + 5 + 7 = 20.
Step 4: Calculate the H statistic:
$$H = \left[ \frac{12}{n(n+1)} \sum_{j=1}^{c} \frac{T_j^2}{n_j} \right] - 3(n+1)$$
Where: n = sum of sample sizes for all samples, c = number of samples, T_j = sum of ranks in the j-th sample, n_j = size of the j-th sample.

H = 12/(15 × 16) × (44² + 56² + 20²)/5 - 3(15 + 1) = 0.05 × 1094.4 - 48 = 6.72
Step 5: Find the critical chi-square value, with c - 1 degrees of freedom. For 3 - 1 = 2 degrees of freedom and an alpha level of .05, the critical chi-square value is 5.9915.
Step 6: Compare the H value from Step 4 to the critical chi-square value from Step 5. If the critical chi-square value is less than the H statistic, reject the null hypothesis that the medians are equal. If the critical chi-square value is not less than the H statistic, there is not enough evidence to suggest that the medians are unequal.
In this case, 5.9915 is less than 6.72, so we can reject the null hypothesis.
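Steps 1 to 6 can be sketched in code for the salary example. The helper names are illustrative, the salaries are the values used in the ranking steps (in thousands), and no tie correction is applied. For df = 2 only, the chi-square upper tail has the closed form exp(-H/2), which is used here for the p-value.

```python
from math import exp

def average_ranks(values):
    """Map each distinct value to its rank, averaging ranks across ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        ranks[ordered[i]] = (i + 1 + j) / 2   # mean of 1-based ranks i+1 .. j
        i = j
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, rank sums) using H = 12/(n(n+1)) * sum(T^2/n_j) - 3(n+1)."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sums = [sum(ranks[v] for v in g) for g in groups]
    h = 12 / (n * (n + 1)) * sum(t * t / len(g) for t, g in zip(rank_sums, groups))
    return h - 3 * (n + 1), rank_sums

women = [23, 41, 54, 66, 90]
men = [45, 55, 60, 70, 72]
minorities = [20, 30, 34, 40, 44]

h, rank_sums = kruskal_wallis_h([women, men, minorities])
p_value = exp(-h / 2)            # chi-square upper tail, valid for df = 2 only

print(rank_sums)                 # [44.0, 56.0, 20.0]
print(round(h, 2), h > 5.9915)   # 6.72 True
```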

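Example: Perform the Kruskal-Wallis test for the following data (the rank sum of each sample is shown after it):-
Sample 1: 8, 5, 7, 11, 9, 6 (rank sum 25.5)
Sample 2: 10, 12, 11, 9, 13, 12 (rank sum 64)
Sample 3: 11, 14, 10, 16, 17, 12 (rank sum 87.5)
Sample 4: 18, 20, 16, 15, 14, 22 (rank sum 123)
Significance level α = 0.05, right-tailed test.
$$H = \frac{12}{24 \cdot 25} \cdot \frac{25.5^2 + 64^2 + 87.5^2 + 123^2}{6} - 3(24+1) \approx 16.77$$
Critical value = 7.815 (chi-square table, df = 4 - 1 = 3)
Since H is greater than the critical value, we reject the null hypothesis.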
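This exercise contains tied values, so it is a useful check on the average-rank rule. The sketch below recomputes the rank sums and then H from the uncorrected formula. The function name is illustrative, and no tie correction is applied; software that applies one will report a slightly larger H.

```python
def rank_sums_with_ties(groups):
    """Rank the pooled data (ties get the average rank) and sum ranks per group."""
    pooled = sorted(v for g in groups for v in g)
    # 1-based positions occupied by each distinct value in the sorted pool.
    positions = {}
    for pos, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(pos)
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [sum(rank[v] for v in g) for g in groups]

groups = [
    [8, 5, 7, 11, 9, 6],
    [10, 12, 11, 9, 13, 12],
    [11, 14, 10, 16, 17, 12],
    [18, 20, 16, 15, 14, 22],
]

t = rank_sums_with_ties(groups)
n = sum(len(g) for g in groups)

# Uncorrected H = 12/(n(n+1)) * sum(T_j^2 / n_j) - 3(n+1)
h = 12 / (n * (n + 1)) * sum(tj ** 2 / len(g) for tj, g in zip(t, groups)) - 3 * (n + 1)

print(t)                 # [25.5, 64.0, 87.5, 123.0]
print(sum(t) == n * (n + 1) / 2, h > 7.815)   # True True
```

The rank sums add up to n(n+1)/2 = 300, which is a quick check that the ranks were assigned correctly.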

Mann Whitney U Test
The Mann-Whitney U test is the nonparametric equivalent of the two sample t-test. It is sometimes called the Mann Whitney Wilcoxon Test or the Wilcoxon Rank Sum Test. While the t-test makes an assumption about the distribution of a population, the Mann Whitney U Test makes no such assumption.
The test compares two populations. The null hypothesis is that the two samples come from the same population (i.e. that they both have the same median).
This test is often performed as a two-sided test and, thus, the research hypothesis indicates that the populations are not equal as opposed to specifying directionality. A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population as compared to the other.

Assumptions for the Mann Whitney U Test
The dependent variable should be measured on an ordinal scale or a continuous scale.
The independent variable should be two independent, categorical groups.
Observations should be independent. In other words, there should be no relationship between the two groups or within each group.
Observations need not be normally distributed. However, the two groups should follow the same shape (e.g. both are bell-shaped, or both are skewed left).
The result of performing a Mann Whitney U Test is a U statistic. For small samples, use the direct method to find the U statistic; for larger samples, a formula is necessary.

Formula
Either of these two formulas is valid for the Mann Whitney U Test:
$$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1 \qquad U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - R_2$$
R is the sum of ranks in the sample, and n is the number of items in the sample.

Example: Consider a Phase II clinical trial designed to investigate the effectiveness of a new drug to reduce symptoms of asthma in children. A total of n = 10 participants are randomized to receive either the new drug or a placebo. Participants are asked to record the number of episodes of shortness of breath over a 1 week period following receipt of the assigned treatment. The data are shown below.
Placebo: 7 5 6 4 12
New Drug: 3 6 4 2 1
Is there a difference in the number of episodes of shortness of breath over a 1 week period in participants receiving the new drug as compared to those receiving the placebo?
SOL:- In this example, the outcome is a count and in this sample the data do not follow a normal distribution. In addition, the sample size is small (n1 = n2 = 5), so a nonparametric test is appropriate. The hypotheses are given below, and we run the test at the 5% level of significance (i.e., α = 0.05).
H0: The two populations are equal, versus H1: The two populations are not equal.
The first step is to assign ranks, and to do so we order the data from smallest to largest. This is done on the combined or total sample (i.e., pooling the data from the two treatment groups (n = 10)), assigning ranks from 1 to 10, as follows.

Total Sample (Ordered Smallest to Largest) and Ranks

Value   Group      Rank
1       New Drug   1
2       New Drug   2
3       New Drug   3
4       Placebo    4.5
4       New Drug   4.5
5       Placebo    6
6       Placebo    7.5
6       New Drug   7.5
7       Placebo    9
12      Placebo    10

We produce a test statistic based on the ranks. First, we sum the ranks in each group. In the placebo group, the sum of the ranks is 37; in the new drug group, the sum of the ranks is 18. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 10(11)/2 = 55, which is equal to 37 + 18 = 55.
For the test, we call the placebo group 1 and the new drug group 2. We let R1 denote the sum of the ranks in group 1 (i.e., R1 = 37), and R2 denote the sum of the ranks in group 2 (i.e., R2 = 18). The test statistic for the Mann Whitney U Test is denoted U and is the smaller of U1 and U2.
Here U1 = n1 n2 + n1(n1+1)/2 - R1 = 5(5) + 5(6)/2 - 37 = 3 and U2 = n1 n2 + n2(n2+1)/2 - R2 = 5(5) + 5(6)/2 - 18 = 22, so U = 3.

In every test, we must determine whether the observed U supports the null or research hypothesis. We determine a critical value of U such that if the observed value of U is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value of U exceeds the critical value we do not reject H0.
To determine the appropriate critical value we need the sample sizes (here n1 = n2 = 5) and our two-sided level of significance (α = 0.05). The critical value is 2, and the decision rule is to reject H0 if U <= 2.
We do not reject H0 because 3 > 2. We do not have statistically significant evidence at α = 0.05 to show that the two populations of numbers of episodes of shortness of breath are not equal.
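The rank-sum calculation for the asthma example can be sketched as follows, assuming the common forms U1 = n1·n2 + n1(n1+1)/2 - R1 and U2 = n1·n2 + n2(n2+1)/2 - R2. The helper name is illustrative; the critical value 2 is the one quoted in the example.

```python
def average_ranks(values):
    """Map each distinct value to its rank in the pooled data, averaging ties."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return {v: sum(p) / len(p) for v, p in positions.items()}

placebo = [7, 5, 6, 4, 12]    # group 1
new_drug = [3, 6, 4, 2, 1]    # group 2

ranks = average_ranks(placebo + new_drug)
r1 = sum(ranks[v] for v in placebo)
r2 = sum(ranks[v] for v in new_drug)

n1, n2 = len(placebo), len(new_drug)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
u = min(u1, u2)               # test statistic

critical_value = 2            # n1 = n2 = 5, two-sided alpha = 0.05
print(r1, r2, u1, u2, u)      # 37.0 18.0 3.0 22.0 3.0
print("reject H0" if u <= critical_value else "do not reject H0")
```

A useful check is that U1 + U2 always equals n1·n2 (here 3 + 22 = 25).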

Example: A new approach to prenatal care is proposed for pregnant women living in a rural community. The new program involves in-home visits during the course of pregnancy in addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant women is designed to evaluate whether women who participate in the program deliver healthier babies than women receiving usual care. The outcome is the APGAR score measured 5 minutes after birth. Recall that APGAR scores range from 0 to 10, with scores of 7 or higher considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown below.
Usual Care: 8 7 6 2 5 8 7 3
New Program: 9 9 7 8 10 9 6
Is there statistical evidence of a difference in APGAR scores in women receiving the new and enhanced versus usual prenatal care?
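The slides leave this example unsolved. For small samples U can be found by direct counting instead of ranking: for every pair made of one observation from each group, count 1 when the first exceeds the second and 0.5 when they tie. The sketch below applies that idea to the APGAR data as listed; the function name is illustrative, and the tie handling (one half per tied pair) is an assumption that matches average ranks.

```python
usual_care = [8, 7, 6, 2, 5, 8, 7, 3]
new_program = [9, 9, 7, 8, 10, 9, 6]

def pair_count(xs, ys):
    """Number of (x, y) pairs with x > y, counting each tied pair as one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in xs for y in ys)

u_usual = pair_count(usual_care, new_program)   # pairs where usual care scores higher
u_new = pair_count(new_program, usual_care)     # pairs where the new program scores higher
u = min(u_usual, u_new)                         # test statistic

print(u_usual, u_new, u)                        # 8.5 47.5 8.5
```

The resulting U would then be compared with the tabulated critical value for n1 = 8 and n2 = 7 at the chosen significance level, rejecting H0 when U is less than or equal to it.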