Hypothesis Testing: Parametric and Non-Parametric Tests in Statistics
In hypothesis testing, statistical tests are used to decide whether the null hypothesis should be rejected or not rejected. These tests assume a null hypothesis of no relationship or no difference between groups. In this article, we discuss the statistical tests used for hypothesis testing, covering both parametric and non-parametric tests.
Hypothesis testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is used to assess claims about population parameters or the relationship between two statistical variables. Let's look at a few examples of statistical hypotheses from real life: a teacher assumes that 60% of his college's students come from lower-middle-class families; a doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients. Now that you know what hypothesis testing is, let's look at the two types of hypotheses used in statistics.
Null Hypothesis and Alternate Hypothesis
The null hypothesis is the assumption that the event will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected. H₀ is the symbol for it, and it is pronounced "H-naught". The alternate hypothesis is the logical opposite of the null hypothesis: the acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H₁ is the symbol for it.
Let's understand this with an example. A sanitizer manufacturer claims that its product kills 95 percent of germs on average. To put this company's claim to the test, we create a null and an alternate hypothesis. H₀ (null hypothesis): the average is 95%. H₁ (alternative hypothesis): the average is less than 95%. Another straightforward example is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of heads is equal to the probability of tails, while the alternative hypothesis states that the two probabilities are very different.
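To make the coin example concrete, here is a minimal sketch of how it could be tested in Python with SciPy's binomial test (scipy.stats.binomtest, available in SciPy 1.7+); the flip counts are invented for illustration:

```python
from scipy import stats

# Hypothetical data: 100 coin flips, 62 of them heads (invented numbers).
n_flips, n_heads = 100, 62

# H0: P(heads) = 0.5 (fair coin); H1: P(heads) != 0.5.
result = stats.binomtest(n_heads, n=n_flips, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.4f}")

# Reject H0 at the 5% significance level if the p-value falls below 0.05.
if result.pvalue < 0.05:
    print("Reject H0: the coin does not appear to be fair.")
else:
    print("Fail to reject H0: no evidence the coin is unfair.")
```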
Parametric Tests
Parametric tests are tests for which we have prior knowledge of the population distribution (i.e., it is normal), or, if not, we can easily approximate it by a normal distribution, which is possible with the help of the Central Limit Theorem. The parameters of the normal distribution are the mean and the standard deviation. Parametric tests assume certain properties of the parent population from which we draw samples: assumptions such as the observations coming from a normal population, the sample size being large, and assumptions about population parameters such as the mean and variance must hold before parametric tests can be used.
But there are situations when the researcher cannot, or does not want to, make such assumptions. In such situations we use statistical methods for testing hypotheses that are called non-parametric tests, because such tests do not depend on any assumption about the parameters of the parent population. Besides, most non-parametric tests assume only nominal or ordinal data, whereas parametric tests require measurement on at least an interval scale. As a result, non-parametric tests need more observations than parametric tests to achieve the same Type I and Type II error rates.
Important Parametric Tests
The important parametric tests are: 1. the z-test; 2. the t-test; 3. the χ²-test; and 4. the F-test.
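As a quick, hedged sketch of one of these, the one-sample t-test below uses scipy.stats.ttest_1samp to test the sanitizer claim from earlier; the batch measurements are invented, and the test assumes the sample comes from an approximately normal population:

```python
import numpy as np
from scipy import stats

# Hypothetical germ-kill percentages from 10 sanitizer batches (invented).
sample = np.array([94.2, 95.1, 93.8, 94.9, 95.4, 93.5, 94.0, 95.2, 94.6, 93.9])

# Parametric one-sample t-test.
# H0: population mean = 95; H1: population mean < 95.
t_stat, p_value = stats.ttest_1samp(sample, popmean=95, alternative="less")
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```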
The main difference between the two families of tests is that parametric tests depend, to a certain extent, on population parameters and related assumptions such as the mean, standard deviation, variance, and the Central Limit Theorem, while non-parametric tests do not. All of these are parameters calculated on the available data. Almost every parametric test has a non-parametric counterpart or equivalent. In short: Parametric: you know about the population. Non-parametric: you don't know about the population.
Properties | Parametric | Non-parametric
Assumptions | Yes | No
Central tendency value | Mean | Median
Correlation | Pearson | Spearman
Probability distribution | Normal | Arbitrary
Population knowledge | Required | Not required
Used for | Interval data | Nominal data
Applicability | Variables | Attributes & variables
Examples | z-test, t-test, etc. | Kruskal-Wallis, Mann-Whitney
Parameters of Comparison | Parametric | Nonparametric
Definition | The test whose outcomes depend on the distribution. | The test whose outcomes do not depend on the distribution.
Statistical power | Higher statistical power. | Lower statistical power.
Versatility | Not applicable to all situations. | More robust; can be applied in a wider range of situations.
Central tendency value | Mean is the central tendency value. | Median is the central tendency value.
Type of distribution | Used on data that follows a normal distribution. | Used on data that follows any arbitrary distribution.
What is a Nonparametric Test?
Nonparametric tests are tests that do not depend on any assumptions about the distribution or parameters of the data being analyzed. They are also sometimes referred to as "distribution-free tests". Nonparametric doesn't necessarily mean that we know nothing about the population; it means that the data is skewed or not normally distributed. The reasons why we use nonparametric tests are:
* the data doesn't meet the assumptions about the population sample, or the data is skewed;
* the population sample size is too small; or
* the data being analyzed is nominal or ordinal.
The different types of nonparametric tests are: the Run Test for randomness of data, the Mann-Whitney U Test, the Wilcoxon Matched Pairs Rank Test, the Kruskal-Wallis Test, and the Kolmogorov-Smirnov Test.
Run Test for randomness of data
This test is sometimes called the Geary test. It is a test for randomness of dichotomous variables. (A dichotomous variable is one that takes on only one of two possible values when observed or measured.) What is a run? Suppose we have observations of the random variable X and observations of the random variable Y. Suppose we combine the two sets of independent observations into one larger collection of observations and then arrange the observations in increasing order of magnitude. If we label which set each of the ordered observations originally came from, we might observe something like this:
x y y x x x y x x y y y
where x denotes an observation of the random variable X and y denotes an observation of the random variable Y. (We might observe this, for example, if the X values were 0.1, 0.4, 0.5, 0.6, 0.8, and 0.9, and the Y values were 0.2, 0.3, 0.7, 1.0, 1.1, and 1.2.) That is, in this case, the smallest of all the observations is an X value, the second smallest is a Y value, the third smallest is a Y value, the fourth smallest is an X value, and so on. Each group of successive values of X or Y is what we call a run.
So, in this example, we have six runs. If we instead observed an ordered arrangement such as
x x x y y y y y y x x x
we would have three runs. And if we instead observed an ordered arrangement such as
x x y y x y y x x y x y
we would have eight runs.
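Counting runs is simple enough to sketch directly in Python; the snippet below rebuilds the labelled sequence from the six-run example above and counts maximal blocks of identical consecutive labels:

```python
def count_runs(labels):
    """Count runs: maximal blocks of identical consecutive labels."""
    runs = 0
    previous = None
    for label in labels:
        if label != previous:  # a new run starts whenever the label changes
            runs += 1
            previous = label
    return runs

# Rebuild the labelled sequence from the example's raw values.
x_vals = [0.1, 0.4, 0.5, 0.6, 0.8, 0.9]
y_vals = [0.2, 0.3, 0.7, 1.0, 1.1, 1.2]
combined = sorted([(v, "x") for v in x_vals] + [(v, "y") for v in y_vals])
labels = [label for _, label in combined]
print("".join(labels))     # -> xyyxxxyxxyyy
print(count_runs(labels))  # -> 6
```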
Why runs? The next obvious question is how knowing the number of runs might help in testing the null hypothesis of the equality of the distributions F(x) and G(y). Let's investigate that question by looking at a few examples. Let's start with the case in which the distributions are equal. In that case, we might observe an arrangement such as
x y y x x y x y y x x y
in which the x's and y's are well interspersed; this particular example has eight runs. This kind of picture suggests that when the distributions are equal, the number of runs will likely be large.
Now, let's take a look at one way in which the distribution functions could be unequal. One possibility is that one of the distribution functions is at least as great as the other at all points z. This situation might look something like
x x x x x x y y y y y y
in which case there are only two runs. This kind of situation suggests that when one of the distribution functions is at least as great as the other, the number of runs will likely be small. Note that this is what the ordering might look like if the median of Y were greater than the median of X.
Mann-Whitney U Test
This test compares two independent samples by testing hypotheses about the two population medians. Also called the Mann-Whitney-Wilcoxon (MWW/MWU) test, the Wilcoxon rank-sum test, or the Wilcoxon-Mann-Whitney test, it is used to test whether two samples are likely to derive from the same population (i.e., that the two populations have the same shape). (Nonparametric tests used on two dependent samples are the sign test and the Wilcoxon signed-rank test.) (Nonparametric tests are framed in terms of the median, not the mean and variance.) The null and two-sided research hypotheses for the nonparametric test are stated as follows: H₀: the two populations are equal, versus H₁: the two populations are not equal.
Test Statistic for the Mann-Whitney U Test
The test statistic for the Mann-Whitney U test is denoted U and is the smaller of U₁ and U₂, defined as
U₁ = n₁n₂ + n₁(n₁ + 1)/2 − R₁ and U₂ = n₁n₂ + n₂(n₂ + 1)/2 − R₂,
where R₁ is the sum of the ranks for group 1, R₂ is the sum of the ranks for group 2, and n₁ and n₂ are the two sample sizes.
Key concept: for any Mann-Whitney U test, the theoretical range of U is from 0 (complete separation between groups; H₀ most likely false and H₁ most likely true) to n₁n₂ (little evidence in support of H₁). In every test, U₁ + U₂ is always equal to n₁n₂.
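A hedged sketch of the test in Python, using scipy.stats.mannwhitneyu together with a manual rank-sum computation of U₁ and U₂ from the formulas above; the two groups' scores are invented. Note that SciPy reports U for the first sample under a slightly different (but equivalent) convention, so what should agree is the pair {U₁, U₂} and the identity U₁ + U₂ = n₁n₂:

```python
import numpy as np
from scipy import stats

# Hypothetical scores from two independent groups (invented numbers).
group1 = np.array([7.1, 8.3, 6.5, 9.0, 7.8, 8.6])
group2 = np.array([5.2, 6.0, 6.8, 5.9, 7.0, 4.8])
n1, n2 = len(group1), len(group2)

# Rank all observations together, then apply the rank-sum formulas.
ranks = stats.rankdata(np.concatenate([group1, group2]))
r1, r2 = ranks[:n1].sum(), ranks[n1:].sum()
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
print(f"U1 = {u1}, U2 = {u2}, U1 + U2 = {u1 + u2} (n1*n2 = {n1 * n2})")
print(f"test statistic U = {min(u1, u2)}")

# SciPy's implementation also gives the p-value.
u_scipy, p_value = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(f"scipy statistic = {u_scipy}, p = {p_value:.4f}")
```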
Wilcoxon Matched Pairs Rank Test
The Wilcoxon signed-rank test is a non-parametric equivalent of the paired t-test. It is most commonly used to test for a difference in the mean (or median) of paired observations, whether measurements on pairs of units or before-and-after measurements on the same unit. It can also be used as a one-sample test of whether a particular sample came from a population with a specified median. The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used either to test the location of a population based on a sample of data, or to compare the locations of two populations using two matched samples. The Wilcoxon test can be a good alternative to the t-test when population means are not of interest; for example, when one wishes to test whether a population's median is nonzero, or whether there is a better than 50% chance that a sample from one population is greater than a sample from another population.
Two slightly different versions of the test exist: the Wilcoxon signed-rank test compares your sample median against a hypothetical median, while the Wilcoxon matched-pairs signed-rank test computes the difference between each set of matched pairs and then follows the same procedure as the signed-rank test to compare the sample against some median. The term "Wilcoxon" is often used for either test. This usually isn't confusing, as it should be obvious whether the data is matched or not. The null hypothesis for this test is that the medians of the two samples are equal. It is generally used as a non-parametric alternative to the one-sample t-test or the paired t-test.
The Wilcoxon matched-pairs signed-rank test is a nonparametric method to compare before-after measurements or matched subjects. It is sometimes called simply the Wilcoxon matched-pairs test.
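A minimal sketch of the matched-pairs version in Python with scipy.stats.wilcoxon; the before/after measurements on eight subjects are invented:

```python
import numpy as np
from scipy import stats

# Hypothetical before/after measurements on the same 8 subjects (invented).
before = np.array([140, 135, 150, 145, 160, 138, 152, 148])
after = np.array([132, 130, 144, 146, 155, 132, 148, 141])

# Wilcoxon matched-pairs signed-rank test on the paired differences.
# H0: the median of the paired differences is zero.
statistic, p_value = stats.wilcoxon(before, after)
print(f"W = {statistic}, p = {p_value:.4f}")
```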
Kruskal-Wallis Test
The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test that can be used to determine whether there are statistically significant differences between two or more groups of an independent variable on a continuous or ordinal dependent variable. It is considered the nonparametric alternative to the one-way ANOVA, and an extension of the Mann-Whitney U test that allows the comparison of more than two independent groups. In other words, the Kruskal-Wallis test is the nonparametric version of the ANOVA: ANOVA allows a comparison of more than two groups at the same time to determine whether a relationship exists between them.
For example, you could use a Kruskal-Wallis H test to understand whether exam performance, measured on a continuous scale from 0 to 100, differed based on test anxiety level (i.e., your dependent variable would be "exam performance" and your independent variable would be "test anxiety level", which has three independent groups: students with "low", "medium", and "high" test anxiety levels). Alternatively, you could use the Kruskal-Wallis H test to understand whether attitudes towards pay discrimination, where attitudes are measured on an ordinal scale, differed based on job position (i.e., your dependent variable would be "attitudes towards pay discrimination", measured on a 5-point scale from "strongly agree" to "strongly disagree", and your independent variable would be "job position", which has three independent groups: "shop floor", "middle management", and "boardroom").
The Kruskal-Wallis test is one of the nonparametric tests used as a generalized form of the Mann-Whitney U test. It is used to test the null hypothesis that k samples have been drawn from the same population, or from identical populations with the same median. If Sⱼ is the population median for the jth group or sample in the Kruskal-Wallis test, then the null hypothesis can be written as S₁ = S₂ = … = Sₖ. The alternative hypothesis is that Sᵢ ≠ Sⱼ for at least one pair (i, j); that is, at least one pair of groups or samples has different medians. To apply the Kruskal-Wallis test, one writes the data in a two-way format in such a manner that each column represents a successive sample. In the computation, each of the N observations is replaced by its rank: all the values from the k samples are combined together and ranked in a single series.
The Kruskal-Wallis test allows us to compare three or more groups. More precisely, it is used to compare three or more groups in terms of a quantitative variable. It can be seen as the extension of the Mann-Whitney test, which compares two groups under the non-normality assumption. For example: "Is the length of the flippers different between the 3 species of penguins?" The null and alternative hypotheses of the Kruskal-Wallis test here are: H₀: the 3 species are equal in terms of flipper length. H₁: at least one species is different from the other 2 species in terms of flipper length.
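A hedged sketch of the penguin comparison in Python with scipy.stats.kruskal; the flipper lengths below are invented for illustration, not real measurements:

```python
from scipy import stats

# Hypothetical flipper lengths (mm) for three penguin species (invented).
adelie = [186, 190, 181, 195, 188, 192]
chinstrap = [192, 196, 193, 199, 190, 195]
gentoo = [215, 222, 218, 210, 225, 217]

# H0: the three species are equal in terms of flipper length.
# H1: at least one species differs from the other two.
h_stat, p_value = stats.kruskal(adelie, chinstrap, gentoo)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```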
Kolmogorov-Smirnov Test
The Kolmogorov-Smirnov goodness-of-fit test (K-S test) compares your data with a known distribution and lets you know if they have the same distribution. Although the test is nonparametric (it doesn't assume any particular underlying distribution), it is commonly used as a test for normality, to see if your data is normally distributed. It is also used to check the assumption of normality in analysis of variance. More specifically, the test compares a known hypothetical probability distribution (e.g., the normal distribution) to the distribution generated by your data: the empirical distribution function.
The hypotheses for the test are: Null hypothesis (H₀): the data comes from the specified distribution. Alternate hypothesis (H₁): at least one value does not match the specified distribution. That is, H₀: P = P₀ versus H₁: P ≠ P₀, where P is the distribution of your sample and P₀ is the specified distribution. The Kolmogorov-Smirnov statistic quantifies the distance between the empirical distribution function of the sample and the cumulative distribution function of the reference distribution, or between the empirical distribution functions of two samples. The null distribution of this statistic is calculated under the null hypothesis that the sample is drawn from the reference distribution (in the one-sample case) or that the samples are drawn from the same distribution (in the two-sample case).
The Kolmogorov-Smirnov test can be modified to serve as a goodness-of-fit test. In the special case of testing for normality of the distribution, samples are standardized and compared with a standard normal distribution. This is equivalent to setting the mean and variance of the reference distribution equal to the sample estimates, and it is known that using these to define the specific reference distribution changes the null distribution of the test statistic.
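A minimal sketch of both uses in Python with scipy.stats.kstest, on synthetic normal data; the second call illustrates the caveat above, since standardizing with sample estimates makes the printed p-value only approximate (the Lilliefors test corrects for this):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=200)  # synthetic data

# One-sample K-S test against a fully specified N(10, 2) distribution.
d_stat, p_value = stats.kstest(sample, "norm", args=(10.0, 2.0))
print(f"D = {d_stat:.4f}, p = {p_value:.4f}")

# Testing for normality with estimated parameters: standardize with the
# sample mean and standard deviation, then compare against N(0, 1).
# As noted above, this changes the null distribution of D, so the
# p-value here is only approximate.
standardized = (sample - sample.mean()) / sample.std(ddof=1)
d2, p2 = stats.kstest(standardized, "norm")
print(f"D = {d2:.4f}, approximate p = {p2:.4f}")
```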