Power study: 1-way ANOVA vs Kruskal-Wallis

doncua1 4,114 views 22 slides Oct 15, 2014

About This Presentation

A "Monday Journal Activity" presentation...


Slide Content

“Monday Journal Activity”

3 rounds of Monte Carlo duel: ANOVA (a parametric test) vs. Kruskal-Wallis (a non-parametric test), in a “many-sample location parameter tests” bout

Power study of ANOVA and Kruskal-Wallis Test. Tanja Van Hecke, Faculty of Applied Engineering Sciences, Ghent University, Ghent, Belgium. Presented by: Mr. Benito Jr. B. Cuanan, Scientific Publishing and Statistics Department, Strategic Center for Diabetes Research, King Saud University

Introduction: Suppose levels of Patotine (a non-existing hormone) are being investigated in different groups. Suppose the groups are: “the underweight group”, “the normal group” and “the overweight group”. Once data are collected, how should these three groups be compared?

Introduction: The most common approach would be to compare the mean or median levels of the different groups. But how? Take note: it would be very awkward if, in your research paper, you presented your data like this: “… figures in table 1.1 are the Tukey’s Biweight M-estimator …”

Introduction: Consider the table on the right. If we are to compare the 3 groups, which statistical test should we use? Looking at the type of data (in this case: continuous) and the number of groups to be compared (in this case: 3 groups), our “textbook” will surely suggest: … USE: one-way ANOVA

Introduction: Are you sure about that? Have you checked normality? What about scedasticity (equality of variances across groups)?

Abstract: This paper describes the comparison of ANOVA and the Kruskal-Wallis test by means of their power when the assumption of normally distributed populations is violated. The permutation method is used as a simulation method to determine the power of the test. It appears that in the case of asymmetric populations, the non-parametric Kruskal-Wallis test performs better than its parametric equivalent, ANOVA.

The 1-way ANOVA (Analysis of Variance): A parametric test. The most commonly used test for location. Used to analyze the differences between group means. It models the data as $y_{ij} = \mu_i + \varepsilon_{ij}$, where $\mu_i$ is the mean or expected response of data in the $i$-th treatment and the $\varepsilon_{ij}$ are independent, identically distributed normal random errors.

The 1-way ANOVA (Analysis of Variance): The one-way ANOVA is used to test the equality of $k$ ($k > 2$) population means, so the null hypothesis is $H_0: \mu_1 = \mu_2 = \dots = \mu_k$. Assumptions: The dependent variable is normally distributed in each group that is being compared. There is homogeneity of variances, meaning that the population variances in each group are equal. What if the assumptions are violated? One-way ANOVA may yield inaccurate estimates of the p-value when the data are far from normally distributed. If the sample sizes are equal or nearly equal, ANOVA is very robust. If not, the true p-value may be greater than the computed p-value.
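To make the F-test concrete, here is a minimal sketch of the one-way ANOVA computation using only the Python standard library. The Patotine values are made up for illustration. The helper names (`f_statistic`, `p_value_three_groups`) are hypothetical. The closed-form p-value works only for exactly three groups, where the F distribution has 2 numerator degrees of freedom; in practice a library routine such as `scipy.stats.f_oneway` would normally be used.

```python
from statistics import fmean

def f_statistic(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def p_value_three_groups(f, n_total):
    """Upper tail of F(2, n_total - 3). This closed form is valid for k = 3 only."""
    d2 = n_total - 3
    return (1 + 2 * f / d2) ** (-d2 / 2)

# Made-up Patotine levels (illustrative only, not from the paper).
underweight = [4.1, 3.8, 4.4, 4.0, 3.9]
normal = [4.6, 4.3, 4.9, 4.5, 4.7]
overweight = [5.2, 4.8, 5.5, 5.1, 5.0]
groups = [underweight, normal, overweight]

f = f_statistic(groups)
p = p_value_three_groups(f, sum(len(g) for g in groups))
print(f"F = {f:.3f}, p = {p:.5f}")
```

A small p-value here only says that the group means differ under the ANOVA model; it says nothing about whether the normality and equal-variance assumptions actually hold.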

The 1-way ANOVA (Analysis of Variance): This paper focuses on the violation of the first assumption in our list, that is, the NORMALITY of each group’s distribution.

The Kruskal-Wallis test: A non-parametric test. The “analogue” of ANOVA in testing for location. Used to compare the medians of 3 or more independent groups. It models the data as $y_{ij} = \eta_i + \varphi_{ij}$, where $\eta_i$ is the median response of data in the $i$-th treatment and the $\varphi_{ij}$ are independent, identically distributed continuous random errors.

The Kruskal-Wallis test: Unlike ANOVA, this test does not make assumptions about normality. However, it assumes that each group’s distribution has approximately the same shape. Like most non-parametric tests, it is performed on the ranks of the measurement observations. The null hypothesis of the Kruskal-Wallis test states that the samples are from identical populations. When the null hypothesis of the Kruskal-Wallis test is rejected, at least one sample stochastically dominates at least one other sample.
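The rank-based computation can be sketched as follows, again with the standard library only. The function names and the Patotine values are assumptions made for illustration. The statistic is the textbook H without tie correction, and the p-value uses the large-sample chi-squared approximation, which for three groups (2 degrees of freedom) reduces to `exp(-H / 2)`. A library routine such as `scipy.stats.kruskal` also applies a tie correction.

```python
import math

def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def p_value_three_groups(h):
    """Chi-squared upper tail with 2 degrees of freedom (k = 3 groups only)."""
    return math.exp(-h / 2)

# Made-up Patotine levels (illustrative only, not from the paper).
groups = [
    [4.1, 3.8, 4.4, 4.0, 3.9],  # underweight
    [4.6, 4.3, 4.9, 4.5, 4.7],  # normal
    [5.2, 4.8, 5.5, 5.1, 5.0],  # overweight
]
h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}, approximate p = {p_value_three_groups(h):.4f}")
```

Because only the ranks enter the statistic, replacing the largest value with an extreme outlier leaves H unchanged, which is one reason the test is less sensitive to heavy tails and skewness.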

Power by means of the permutation test. POWER: Statistical power is defined as the probability that the test correctly rejects the null hypothesis when the null hypothesis is false. It is the “sensitivity” of the test. In this paper, empirical power was calculated.
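In symbols, the definition and its empirical counterpart can be written as below. The notation is an assumption introduced here, not taken from the paper: $\beta$ is the type II error rate, $N$ the number of simulated data sets, and $p_r$ the p-value obtained on the $r$-th data set generated under the alternative.

```latex
\text{power} = P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - \beta,
\qquad
\widehat{\text{power}} = \frac{1}{N} \sum_{r=1}^{N} \mathbf{1}\{p_r \le \alpha\},
\qquad
\operatorname{SE}\bigl(\widehat{\text{power}}\bigr)
  = \sqrt{\frac{\text{power}\,(1 - \text{power})}{N}}
  \le \frac{1}{2\sqrt{N}}
```

The standard-error bound shows how the number of simulated data sets limits the precision of an empirical power estimate: quadrupling $N$ halves the worst-case standard error.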

Power by means of the permutation test
PERMUTATION TEST: The steps for a multiple-treatment permutation test:
• Compute the F-value of the given samples, called $F_{\text{obs}}$.
• Re-arrange the $kn$ observations into $k$ samples of size $n$.
• For each permutation of the data, compare the F-value with $F_{\text{obs}}$.
For the upper-tailed test, compute the p-value as $p = \#(F > F_{\text{obs}})/n_{\text{tot}}$, where $n_{\text{tot}}$ is the total number of permutations considered. If the p-value is less than or equal to the predetermined level of significance $\alpha$, then we reject $H_0$. If we have 3 groups with 20 observations each, there are $60!\,(20!)^{-3}$ possible different permutations of those observations into three groups. That’s equivalent to 577,831,214,478,475,823,831,865,900.

Comparing the power of ANOVA and Kruskal-Wallis: Monte Carlo simulation was used. Random data from a specified distribution with given parameters were generated. The ANOVA and Kruskal-Wallis tests were then conducted. 2500 samples from the chosen distributions were tested at $\alpha = 0.05$.
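A minimal sketch of such a simulation, for normal data only, is given below. Several details are assumptions and not taken from the paper: 20 observations per group, unit variance, group means 0, d and 2d, and distribution-based p-values (closed forms that hold for three groups) in place of permutation p-values to keep the run short. All function names are hypothetical.

```python
import math
import random
from statistics import fmean

def anova_p_value(groups):
    """One-way ANOVA p-value for exactly three groups."""
    n_total = sum(len(g) for g in groups)
    grand = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    d2 = n_total - 3
    f = (ss_between / 2) / (ss_within / d2)
    return (1 + 2 * f / d2) ** (-d2 / 2)  # closed-form upper tail of F(2, d2)

def kruskal_p_value(groups):
    """Kruskal-Wallis p-value for three groups of continuous (tie-free) data."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = [0] * n_total
    for rank, idx in enumerate(sorted(range(n_total), key=pooled.__getitem__), 1):
        ranks[idx] = rank
    total, start = 0.0, 0
    for g in groups:
        total += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return math.exp(-h / 2)  # chi-squared(2) upper tail, large-sample approximation

def empirical_power(sampler, tests, n_sim=2500, alpha=0.05):
    """Share of simulated data sets on which each test rejects at level alpha."""
    rejections = {name: 0 for name in tests}
    for _ in range(n_sim):
        groups = sampler()  # both tests see the same simulated data set
        for name, p_value in tests.items():
            rejections[name] += p_value(groups) <= alpha
    return {name: count / n_sim for name, count in rejections.items()}

def normal_sampler(rng, d, n=20):
    """Three normal groups with unit variance and means 0, d and 2d (assumed setup)."""
    return lambda: [[rng.gauss(shift, 1.0) for _ in range(n)]
                    for shift in (0.0, d, 2 * d)]

tests = {"ANOVA": anova_p_value, "Kruskal-Wallis": kruskal_p_value}
rng = random.Random(2014)
under_h0 = empirical_power(normal_sampler(rng, d=0.0), tests)
under_h1 = empirical_power(normal_sampler(rng, d=0.3), tests)
print("rejection rate with d = 0:", under_h0)
print("empirical power with d = 0.3:", under_h1)
```

Running the case d = 0 first is a useful sanity check: the rejection rate should sit near the nominal level before any power figure is trusted.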

The Test: The following hypotheses were considered: $H_0: \mu_1 = \mu_2 = \mu_3$ and $H_1: \mu_1 + d = \mu_2$ and $\mu_1 + 2d = \mu_3$. Example, at $d = 0.3$: if $\mu_1 = 1$, then $\mu_2 = 1.3$ and $\mu_3 = 1.6$. The two tests were compared on 3 distribution types, namely: normal, lognormal, and chi-squared.
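One way to generate data under this alternative is sketched below. The slides do not say how the shift was applied or which parameters were used, so everything here is an assumption: the shift is added to draws from a common base distribution, the normal and lognormal use parameters 0 and 1, and the chi-squared distribution with 3 degrees of freedom is drawn as a gamma distribution with shape 1.5 and scale 2. The names `BASE_DRAWS` and `shifted_groups` are hypothetical.

```python
import random
from statistics import fmean

# Base distributions (parameters are assumptions, not taken from the paper).
BASE_DRAWS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "lognormal": lambda rng: rng.lognormvariate(0.0, 1.0),
    # chi-squared with 3 df is a gamma distribution with shape 3/2 and scale 2
    "chi-squared (3 df)": lambda rng: rng.gammavariate(1.5, 2.0),
}

def shifted_groups(rng, dist, d, n=20):
    """Three groups drawn from one base distribution, shifted by 0, d and 2d."""
    draw = BASE_DRAWS[dist]
    return [[draw(rng) + shift for _ in range(n)] for shift in (0.0, d, 2 * d)]

# With large groups the sample means should differ by roughly d and 2d.
rng = random.Random(3)
mean_gaps = {}
for dist in BASE_DRAWS:
    g1, g2, g3 = shifted_groups(rng, dist, d=0.3, n=50_000)
    mean_gaps[dist] = (fmean(g2) - fmean(g1), fmean(g3) - fmean(g1))
    print(dist, [round(gap, 2) for gap in mean_gaps[dist]])
```

A pure shift moves the mean and the median by the same amount, so under this assumed setup both tests target the same location difference.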

[Figure: density curves of the normal distribution, the chi-squared distribution with 3 degrees of freedom, and the lognormal distribution (in red)]

Round 1: The Normal

Round 2: The Log Normal

Round 3: The Chi-squared

And the winner… (Conclusion) For non-symmetrical distributions, the non-parametric Kruskal-Wallis test results in a higher power than the classical one-way ANOVA. The results of the simulations show that an analysis of the data is needed before a test on differences in central tendencies is conducted. Although the literature and textbooks state that the F-test is robust under violations of its assumptions, these results show that its power suffers a significant decrease.