John Arbuthnot (1667-1735), a Scottish physician and mathematician, was the first to use a test of significance.
Outline of parametric tests. Z test: examines the null hypothesis that a sample comes from a normal distribution with a known variance and a specified mean, against the alternative hypothesis that it does not have that mean.
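As a rough illustration of the z test described above, the statistic can be sketched in Python as follows. The function name z_test and the numbers are hypothetical and chosen only for illustration; the two-sided p-value uses the standard normal tail via math.erfc.

```python
import math

def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for a one-sample z test.

    Assumes the population standard deviation pop_sd is known.
    """
    standard_error = pop_sd / math.sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # P(|Z| > |z|) for a standard normal variable
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers, not taken from the slides
z, p = z_test(sample_mean=52.0, pop_mean=50.0, pop_sd=5.0, n=25)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A small p-value suggests that the sample mean is unlikely under the specified population mean.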
Student's t test, one sample: investigates the null hypothesis that a sample comes from a normal distribution with unknown variance and a specified mean, against the alternative hypothesis that it does not have that mean.
Student's t test, two sample: examines the null hypothesis that two independent samples come from normal distributions with unknown but equal variances and the same mean, against the alternative hypothesis that the means are unequal.
Chi-square variance test: tests the null hypothesis that a sample comes from a normal distribution with a specified variance, against the alternative hypothesis that it comes from a normal distribution with a different variance.
F test: examines the null hypothesis that two independent samples come from normal distributions with the same variance, against the alternative hypothesis that they come from normal distributions with different variances.
Bartlett's test (multiple-sample test for equal variances): investigates the null hypothesis that multiple samples come from normal distributions with the same variance, against the alternative hypothesis that they come from normal distributions with different variances.
Testing of hypothesis: tests are of two types, parametric tests and non-parametric tests.
Parametric and non-parametric tests: When their assumptions hold, parametric tests are more powerful and for the most part require less data to reach a strong conclusion than non-parametric tests. Non-parametric tests are statistical procedures that do not assume the data follow a normal distribution.
Assumptions of parametric tests: Observations must be independent. Observations must be drawn from a normally distributed population. Populations must have the same variance. The Student t test is used when two independent groups are compared.
Student 't' test: It is a statistical test commonly used to compare the means of two groups of samples. It is one of the most widely used parametric tests. It is a method of testing hypotheses about the mean of a small sample drawn from a normally distributed population when the population standard deviation is unknown.
Student 't' test: The Student 't' test replaces the 'z' test whenever the standard deviation of the population of the variable is unknown.
History – Student ‘t’ test
William Sealy Gosset (1876-1937) was a British statistician. He worked at the Guinness brewery in Dublin. Guinness did not allow its staff to publish, so Gosset used the pen name 'Student'. The t-distribution was published in 1908.
He applied it in quality control to handle small samples in brewing. He also applied statistical techniques in agriculture to select the best-yielding varieties of barley.
Problems due to small samples: Estimates vary widely from sample to sample. When the sample size is small, i.e. less than 30, the difference between the population parameter and the sample statistic does not follow the Gaussian (normal) distribution.
Student 't' test: As the sample size increases, the t-distribution approaches the Gaussian distribution. When the sample size is 30 or more, the difference between these distributions is very small. The 't' score is used for testing statistical significance. The t curve is symmetrical but flatter than the normal curve, with heavier tails.
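The shape comparison above can be checked numerically. The following is a minimal sketch that evaluates the standard textbook density formulas for the t and normal distributions; the function names t_pdf and normal_pdf are illustrative, and math.lgamma is used so that large degrees of freedom do not overflow.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom at x."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

# Peak height (x = 0) and tail height (x = 3) for increasing df
for df in (2, 5, 30, 1000):
    print(f"df={df:5d}  peak={t_pdf(0.0, df):.4f}  tail={t_pdf(3.0, df):.4f}")
print(f"normal    peak={normal_pdf(0.0):.4f}  tail={normal_pdf(3.0):.4f}")
```

For small degrees of freedom the t curve has a lower peak and a higher tail than the normal curve, and the two become nearly indistinguishable as the degrees of freedom grow.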
Degrees of freedom: the number of values that can be chosen independently. For example, once the mean of n values is fixed, only n - 1 of them are free to vary.
Student 't' test: The 't' test assesses whether the means of two groups are statistically different from each other.
Two general research strategies. Between-subjects design: the two sets of data come from two independent populations. Within-subjects design: the two sets of data come from related populations.
The figure shows where the control and treatment group means are located
The question that the t test addresses is whether the means are statistically different.
The difference between the means is the same in all three situations.
Yet the three situations do not look the same.
In the low variability case, there is relatively little overlap between the two bell-shaped curves.
In the high variability case, the group difference appears least striking because the two bell-shaped distributions overlap so much.
The medium variability case falls between the two.
This leads us to an important conclusion: when we look at the differences between scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores. The t test does just this.
‘t’ score
The formula for the t test is a ratio. The top part of the ratio is just the difference between the two means or averages. The bottom part is a measure of the variability or dispersion of the scores.
The formula is essentially another example of the signal-to-noise idea: the difference between the means is the signal, and the bottom part is a measure of variability that is essentially noise.
The t value will be positive if the first mean is larger than the second and negative if it is smaller.
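The signal-to-noise reading of the t score can be sketched as below. This is an illustration only: it assumes the noise term is the standard error of the difference, sqrt(sd1^2/n1 + sd2^2/n2), which the slides do not spell out, and the group means, standard deviations and sizes are hypothetical.

```python
import math

def t_score(mean1, mean2, sd1, sd2, n1, n2):
    """Ratio of the mean difference (signal) to its standard error (noise)."""
    signal = mean1 - mean2
    # Assumed form of the noise term: standard error of the difference
    noise = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return signal / noise

# Same difference between means, three hypothetical levels of variability
for label, sd in (("low", 2.0), ("medium", 5.0), ("high", 10.0)):
    t = t_score(55.0, 50.0, sd, sd, 20, 20)
    print(f"{label:>6} variability: t = {t:.2f}")

# Swapping the groups flips the sign of t
print(t_score(50.0, 55.0, 5.0, 5.0, 20, 20))
```

With the difference between means held fixed, the t score shrinks as the variability grows, which matches the three overlapping-curve situations described earlier.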
Problems: In a population, the average weight of males is 55 kg with a standard deviation of 3 kg. A sample of 14 males was found to have a mean weight of 60 kg. Test at the 5% level of significance whether the sample mean is consistent with the population mean.
Hypothesis. Null hypothesis: there is no difference between the sample mean and the population mean, i.e. the population mean is 55 kg. Alternative hypothesis: the population mean is not 55 kg.
Given: sample mean = 60 kg, population mean under the null hypothesis = 55 kg, standard deviation = 3 kg, sample size = 14.
Solution
For a two-tailed test with df = 13 at the 5% level of significance, the table value of t is t(0.05, 13) = 2.160. The calculated t score, t = (60 - 55) / (3 / sqrt(14)), is approximately 6.24, which is greater than 2.160. Hence the null hypothesis H0 is rejected. The inference is that the sample mean differs significantly from the population mean at the 5% level of significance.
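The arithmetic of this problem can be reproduced with a short sketch. It follows the slides in using the stated standard deviation of 3 kg in the t formula and takes the critical value 2.160 from the table as quoted in the text; the variable names are illustrative. Without intermediate rounding the score comes out at about 6.236, slightly different from the 6.2352 quoted on the slide.

```python
import math

sample_mean = 60.0   # kg, from the problem statement
pop_mean = 55.0      # kg, mean under the null hypothesis
sd = 3.0             # kg, standard deviation given in the problem
n = 14               # sample size

standard_error = sd / math.sqrt(n)
t = (sample_mean - pop_mean) / standard_error
df = n - 1

t_critical = 2.160   # two-tailed table value for df = 13 at the 5% level (from the text)
reject_h0 = abs(t) > t_critical

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```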
Problem two, unpaired t test: The body weights of males and females having the same height are depicted. Is there a statistically significant gender difference in body weight? Test at the 5% and 1% levels of significance. Null hypothesis: there is no difference between the two sample means.
Formula
Types of 't' test: one-sample t test; two-sample (unpaired) t test; paired t test.
One-sample t test: It is used to determine whether the mean of a single variable differs from a specified constant. Example: measurements of a manufactured item are compared against the required standard. The variable used in this test is known as the test variable.
Degrees of freedom = n1 + n2 - 2 = 7 + 9 - 2 = 14. At df = 14 and the 5% level of significance, the table value is 2.145. The t score, 2.205, is greater than 2.145, so H0 is rejected.
Inference at the 1% level of significance: At df = 14 and the 1% level of significance, the table value of t is 2.977. The t score, 2.205, is less than 2.977, so H0 is not rejected. The difference between the two sample means is not statistically significant at the 1% level of significance.
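The same t score can lead to different decisions at different significance levels. A minimal sketch of the decision rule follows, using only the values quoted in the text (t score 2.205, sample sizes 7 and 9, table values 2.145 and 2.977); the raw body-weight data are not reproduced here, and the variable names are illustrative.

```python
t_score = 2.205                      # calculated t score quoted in the text
n1, n2 = 7, 9                        # group sizes quoted in the text
df = n1 + n2 - 2                     # degrees of freedom for the unpaired test

# Two-tailed table values for df = 14, as quoted in the text
critical_values = {0.05: 2.145, 0.01: 2.977}

# H0 is rejected only when |t| exceeds the critical value
decisions = {alpha: abs(t_score) > crit for alpha, crit in critical_values.items()}

for alpha, rejected in decisions.items():
    verdict = "reject H0" if rejected else "do not reject H0"
    print(f"df = {df}, alpha = {alpha}: {verdict}")
```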
One sample t test The one sample t test compares a sample mean to a hypothesized population mean to determine whether the two means are significantly different.
Data requirements for the one-sample t test: The test variable should be continuous, and observations should be independent of one another. The test variable should be normally distributed in the sample and the population.
Two types of hypothesis: null hypothesis and alternative hypothesis.
t = (x̄ - μ) / (s / √n), where x̄ is the sample mean, μ is the proposed constant for the population mean, s is the sample standard deviation, and n is the sample size.
Result: The calculated t value is compared to the critical value from the t-distribution table with degrees of freedom df = n - 1 and the chosen significance level. If the absolute value of the calculated t is greater than the critical value, we reject the null hypothesis.
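A one-sample t test on raw data might be sketched as follows. The function name one_sample_t, the measurements and the required standard of 10.0 are hypothetical; the sample standard deviation uses the usual n - 1 denominator, as computed by statistics.stdev.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for testing whether the sample mean differs from mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)          # sample standard deviation (n - 1)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements of a manufactured item; required standard is 10.0
measurements = [10.2, 9.9, 10.4, 10.1, 10.3, 9.8, 10.2, 10.0]
t, df = one_sample_t(measurements, mu0=10.0)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the table value for the computed degrees of freedom at the chosen significance level.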
Unpaired t test: The unpaired t test is used to compare the means of two independent groups. In pharmaceutical research, half of the subjects are randomly assigned to the treatment group and the remaining half to the control group. It is also commonly used in research studies that compare two independent groups, e.g. women and men. It is one of the most widely used tests in statistics.
Data requirements: The independent variable must consist of two independent groups. Null hypothesis H0: there is no significant difference between the means of the two groups. Alternative hypothesis H1: there is a significant difference between the two population means; this difference is unlikely to be caused by sampling error or chance.
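An unpaired t test might be sketched as below. This version assumes equal population variances and therefore pools the two sample variances, which is consistent with the degrees of freedom n1 + n2 - 2 used in the worked problem; the function name unpaired_t and the data are hypothetical.

```python
import math
import statistics

def unpaired_t(group1, group2):
    """Return (t, df) for a pooled-variance two-sample t test."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)    # sample variances (n - 1 denominator)
    var2 = statistics.variance(group2)
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, n1 + n2 - 2

# Hypothetical responses for a treatment group and a control group
treatment = [12.1, 13.4, 11.8, 14.0, 12.9, 13.1]
control = [11.0, 11.9, 10.7, 12.2, 11.4, 11.6]
t, df = unpaired_t(treatment, control)
print(f"t = {t:.3f}, df = {df}")
```

If the equal-variance assumption is doubtful, a different standard error and degrees of freedom would be needed; that variant is not shown here.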
Paired t test: It is used to compare two population means where each observation in one sample can be paired with an observation in the other sample. It is also called the repeated-measures t test. Examples: the before-and-after effect of a pharmaceutical treatment on the same group of people, or the change in blood pressure before and after treatment of hypertension.
Assumption: the differences between the before and after values are normally distributed.
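A paired t test reduces to a one-sample t test on the differences, which can be sketched as follows. The function name paired_t and the blood pressure readings are hypothetical; the test is against a mean difference of zero.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t test on after - before differences."""
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)     # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical systolic blood pressure (mmHg) for the same six patients
before = [150, 162, 148, 171, 158, 165]
after = [141, 155, 146, 160, 150, 159]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

A negative t here means the readings were lower after treatment than before.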