short description of ANOVA, ANCOVA and T test with examples in SAS
Added: Jan 31, 2017
Slides: 30 pages
Slide Content
Glimpse of T test, ANOVA and ANCOVA
Prachi Bari, Sanjana Ghadigaonkar, Samrat Parage, Vaibhav Deshmukh, Pragya Sinha, Aritra Das
T test
A statistical examination of two population means
T-Test
A t-test is any statistical hypothesis test in which the test statistic follows a Student's t-distribution under the null hypothesis. It can be used to determine whether two sets of data are significantly different from each other.
In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population when the sample size is small and the population standard deviation is unknown.
Let $X_1, \ldots, X_n$ be independent and identically distributed as $N(\mu, \sigma^2)$, i.e. a sample of size $n$ from a normally distributed population with expected value $\mu$ and unknown variance $\sigma^2$.
Assumptions
- Interval or ratio scale of measurement.
- X follows a normal distribution with mean $\mu$ and variance $\sigma^2$.
- Z and s are independent, where $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is a standard normal variable and $s = S/\sigma$ is the ratio of the sample standard deviation to the population standard deviation.
- Generally used when the sample size is small (n < 30).
One-sample t-test statistic
The random variable
$T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$
where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is the sample mean and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$ is the sample variance, has a Student's t-distribution with $n - 1$ degrees of freedom.
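SAS Code for One-Sample T Test
Example:
  title 'One-Sample t Test';
  /* Read the 20 observations; @@ allows several values per input line */
  data time;
    input time @@;
    datalines;
  43 90 84 87 116 95 86 99 93 92
  121 71 66 98 79 102 60 112 105 98
  ;
  run;
  /* Test H0: mean = 80 at the 5% significance level */
  proc ttest data=time h0=80 alpha=0.05;
    var time;
  run;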
Output
One-Sample t Test:
Here the p-value (0.0329) is less than α (0.05), so we reject $H_0: \mu = 80$. We conclude that the population mean time is not equal to 80.
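As a cross-check on the reported p-value, the statistic can be recomputed by hand from the 20 observations in the DATA step. This is only a sketch of the arithmetic; the symbols $\bar{x}$, $S$ and $\mu_0$ follow the one-sample formula and the rounding is approximate.

```latex
\[
\bar{x} = \frac{1797}{20} = 89.85, \qquad
S = \sqrt{\frac{6964.55}{19}} \approx 19.146, \qquad
\frac{S}{\sqrt{n}} \approx \frac{19.146}{\sqrt{20}} \approx 4.281
\]
\[
t = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{89.85 - 80}{4.281} \approx 2.30,
\qquad \text{df} = n - 1 = 19
\]
```

Since $|t| \approx 2.30$ exceeds the two-sided 5% critical value $t_{0.975,19} \approx 2.093$, the hand calculation agrees with the decision to reject $H_0$.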
Types of Two-Sample T-Test
- Independent two-sample t-test: used when two separate sets of independent and identically distributed samples are obtained, one from each of the two populations being compared.
- Dependent (paired) two-sample t-test: used when the two samples are matched pairs or repeated measurements on the same subjects. A typical example is where subjects are tested prior to a treatment, say for high blood pressure, and the same subjects are tested again after treatment with a blood-pressure-lowering medication.
Assumptions of the Independent Two-Sample t-Test
Given two groups 1 and 2, the pooled test applies when:
- The observations in each group are approximately normally distributed.
- The two samples have the same variance.
- The samples have been randomly drawn, independently of each other.
Equal sample sizes (the number n of participants in each group) are not strictly required, but they simplify the test statistic and make the test more robust to unequal variances.
Testing Assumptions
- Normality: use the Shapiro-Wilk or Kolmogorov-Smirnov test, or assess it graphically with a normal quantile plot.
- If both groups are normally distributed, check equality of variances with an F-test.
- If normality is rejected, use the Wilcoxon rank-sum test instead.
- If both assumptions are satisfied, use the two-sample t-test.
- If the variances of the two groups are not equal, use Welch's test.
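Test statistic
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
where $s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$ is the pooled sample variance, and $t$ has $n_1 + n_2 - 2$ degrees of freedom under the null hypothesis.
Hypothesis
Here the hypothesis to be tested is: $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \neq \mu_2$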
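Independent Two-Sample T-Test in SAS
Example:
  /* group: country code (Ind/Aus); wt: weight */
  data wt;
    input group $ wt @@;
    datalines;
  Ind 85 Ind 70 Ind 64 Ind 87 Ind 76 Ind 95 Ind 67 Ind 87 Ind 93 Ind 82
  Aus 90 Aus 95 Aus 103 Aus 107 Aus 95 Aus 112 Aus 98 Aus 92 Aus 98 Aus 115
  ;
  run;
  /* Test H0: difference in group means = 0 at the 5% level */
  proc ttest data=wt h0=0 alpha=0.05;
    class group;
    var wt;
  run;
Output
[PROC TTEST results for the independent two-sample test; output table not reproduced]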
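The assumption checks listed earlier can be sketched for the same `wt` dataset. This is a minimal illustration rather than the authors' own code: it assumes the `wt` dataset from the independent-samples example is still in the WORK library. PROC TTEST itself already prints a folded F test ("Equality of Variances") and both the Pooled and Satterthwaite (Welch-type) rows, so only normality and the rank-based fallback are shown here.

```sas
/* Normality check per group: the NORMAL option requests tests for
   normality (including Shapiro-Wilk and Kolmogorov-Smirnov);
   QQPLOT draws a normal quantile plot for each group. */
proc univariate data=wt normal;
  class group;
  var wt;
  qqplot wt / normal(mu=est sigma=est);
run;

/* Fallback if normality is rejected: Wilcoxon rank-sum test. */
proc npar1way data=wt wilcoxon;
  class group;
  var wt;
run;
```

If the folded F test in the PROC TTEST output is significant, read the Satterthwaite row instead of the Pooled row.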
Paired Two-Sample T-Test in SAS
Example:
  /* Each pair of values is one subject's Before and After measurement */
  data wt;
    input Before After @@;
    datalines;
  120 128 140 132 126 118 124 131 128 125 130 132
  130 131 140 141 126 129 118 127 135 137 127 135
  ;
  run;
  /* Two-sided test of H0: mean(Before - After) = 0 at the 5% level */
  proc ttest data=wt sides=2 alpha=0.05 h0=0;
    title "Paired sample t-test";
    paired Before * After;
  run;
Output
The TTEST Procedure, Difference: Before - After
[output table not reproduced]
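Because the output table is not available here, the paired statistic can be worked out by hand from the 12 Before/After pairs in the DATA step. This is a sketch of the arithmetic only; $d_i$ denotes Before minus After for subject $i$ (notation assumed for this example), and the values are rounded.

```latex
\[
d = (-8,\ 8,\ 8,\ -7,\ 3,\ -2,\ -1,\ -1,\ -3,\ -9,\ -2,\ -8), \qquad
\bar{d} = \frac{-22}{12} \approx -1.833
\]
\[
S_d = \sqrt{\frac{\sum d_i^2 - n\bar{d}^2}{n-1}}
    = \sqrt{\frac{414 - 40.333}{11}} \approx 5.828, \qquad
t = \frac{\bar{d}}{S_d/\sqrt{n}} \approx \frac{-1.833}{1.682} \approx -1.09,
\qquad \text{df} = 11
\]
```

Since $|t| \approx 1.09$ is below the two-sided 5% critical value $t_{0.975,11} \approx 2.201$, this hand calculation does not reject $H_0$ of zero mean difference.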
ANOVA
ANOVA - Analysis of Variance
Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means.
One-Way ANOVA
The mathematical model that describes the relationship between the response and the treatment for one-way ANOVA is
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, k, \quad j = 1, 2, \ldots, n_i$
where
- $y_{ij}$: the $j$th observation on the $i$th treatment,
- $\mu$: common effect for the whole experiment,
- $\alpha_i$: the $i$th treatment effect,
- $\varepsilon_{ij}$: random error,
- $n_i$: the number of observations on the $i$th treatment.
Hypothesis of one-way ANOVA
$H_0$: The means of all the groups are equal.
vs
$H_1$: Not all of the group means are the same.
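The hypothesis above is assessed with an F ratio. The following is a sketch of the standard decomposition, writing $n_i$ for the number of observations in group $i$, $N = \sum_i n_i$, and $SS_B$, $SS_W$ for the between-group and within-group sums of squares (this notation is assumed here and is not taken from the slides).

```latex
\[
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{..})^2}_{SS_T}
= \underbrace{\sum_{i=1}^{k} n_i (\bar{y}_{i.} - \bar{y}_{..})^2}_{SS_B}
+ \underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i.})^2}_{SS_W}
\]
\[
F = \frac{SS_B/(k-1)}{SS_W/(N-k)} \sim F_{k-1,\;N-k} \quad \text{under } H_0
\]
```

For a design with $k = 3$ groups of 10 subjects each ($N = 30$), the F statistic would have 2 and 27 degrees of freedom.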
Assumptions of one-way ANOVA
- The normality assumption: the dependent variable should be approximately normally distributed.
- The homogeneity of variance assumption: the variance of each group should be approximately equal.
- The independence assumption: observations should be independent of each other.
Example
Suppose we are testing a new drug to see if it helps reduce the time to recover from a fever. We decide to test the drug on three different races (Caucasian, African American, and Hispanic). We randomly select 10 test subjects from each of those races, so altogether we have 30 test subjects. The response variable is the time in minutes after taking the medicine before the fever is reduced.
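SAS code for one-way ANOVA
  /* One factor only: race. Assumes the dataset time holds the
     variables time (response) and race (group). */
  proc anova data=time;
    class race;
    model time = race;
  run;
  quit;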
Two-Way ANOVA
Two-way ANOVA is a type of study design with one numerical outcome variable and two categorical explanatory variables.
The mathematical model of two-way ANOVA is
$y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \varepsilon_{ijk}$
where
- $\mu$ is the overall mean effect,
- $\alpha_i$ is the effect due to the $i$th level of the first factor,
- $\beta_j$ is the effect due to the $j$th level of the second factor,
- $\gamma_{ij}$ is the effect due to the interaction between the $i$th level of the first factor and the $j$th level of the second factor,
- $\varepsilon_{ijk}$ is the random error.
Assumptions:
- The populations from which the samples were obtained must be normally or approximately normally distributed.
- The samples must be independent.
- The variances of the populations must be equal.
- The groups must have the same sample size.
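SAS code for two-way ANOVA
  /* Two factors: gender and race. This MODEL statement fits main
     effects only; add gender*race to also fit the interaction term. */
  proc anova data=time;
    class gender race;
    model time = gender race;
  run;
  quit;
Output of two-way ANOVA
[output table not reproduced]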
ANCOVA
Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression.
ANCOVA
ANCOVA by definition is a general linear model that includes both ANOVA (categorical) predictors and regression (continuous) predictors. ANCOVA examines the influence of an independent variable on a dependent variable while removing the effect of the covariate. Conceptually, ANCOVA first regresses the dependent variable on the covariate; the residuals (the variance left unexplained by the regression) are then subjected to an ANOVA.
ANCOVA in other words…
- Analysis of covariance (ANCOVA) is a statistical test related to ANOVA.
- It tests whether there is a significant difference between groups after controlling for the variance explained by a covariate.
- A covariate (CV) is a continuous variable that correlates with the dependent variable (DV).
- This is one way to run a statistical test with both categorical and continuous independent variables.
Purposes of ANCOVA
- Increases the sensitivity of the F test.
- Removes predictable variance from the error term.
- Improves the power of the analysis.
Adjustment of Covariate Effect
[Figure not reproduced. Panels: partitioning variance in ANOVA; partitioning variance in ANCOVA]
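In place of the diagrams, the two partitions can be sketched symbolically. The labels below (sums of squares for treatment, covariate and error, with $k$ groups, $N$ observations and one covariate) are assumed notation rather than the slides' own, and the ANCOVA line is the sequential decomposition with the covariate entered first.

```latex
\[
\text{ANOVA:}\quad SS_T = SS_{\text{treatment}} + SS_{\text{error}}
\]
\[
\text{ANCOVA:}\quad SS_T = SS_{\text{covariate}}
  + SS_{\text{treatment (adjusted)}} + SS_{\text{error (adjusted)}}
\]
\[
F = \frac{SS_{\text{treatment (adjusted)}}/(k-1)}
         {SS_{\text{error (adjusted)}}/(N-k-1)}
\]
```

The variance that the covariate predicts is moved out of the error term, which is why the denominator shrinks and the F test becomes more sensitive, at the cost of one error degree of freedom.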
Relationship between CV and DV
[Figure not reproduced]
Hypotheses for ANCOVA
$H_0$: the group means are equal after controlling for the covariate
vs
$H_1$: the group means are not equal after controlling for the covariate
Assumptions
- Linearity of regression: the regression relationship between the dependent variable and the concomitant variables must be linear.
- Homogeneity of error variances: the error is a random variable with zero mean and equal variance across treatment classes and observations.
- Independence of error terms: the errors are uncorrelated, that is, the error covariance matrix is diagonal.
- Normality of error terms: the residuals (error terms) should be normally distributed.
- Homogeneity of regression slopes: the slopes of the different regression lines should be equivalent, i.e. the regression lines should be parallel among groups.
Choosing Covariates
Variables that affect or have the potential to affect the dependent variable:
- Demographic information
- Inherent characteristics
- Differences in group characteristics due to sampling
The number of covariates depends on:
- Known relationships or previous research
- The number of independent variables or groups
- The total number of subjects
Example
Suppose we want to compare the effect of drugs on the weights of a particular group of patients (homogeneous among themselves). We can analyze the data by performing an ANCOVA, regarding:
- the final weight of the patients taking the drugs, after a specified period, as the response variable;
- the initial weight of the patients at the start of the experiment as the covariate.
Model
So, our model becomes:
$y_{ij} = \mu + \alpha_i + \beta (x_{ij} - \bar{x}_{..}) + \varepsilon_{ij}$
where
- $\mu$ is the general mean effect,
- $\alpha_i$ is the (fixed) additional effect due to the $i$th treatment ($i = 1, 2, \ldots, p$),
- $\varepsilon_{ij}$ is the random error effect ($j = 1, 2, \ldots, n_i$),
- $\beta$ is the coefficient of regression of y on x,
- $x_{ij}$ is the value of the covariate corresponding to the response variable $y_{ij}$, and $\bar{x}_{..}$ is its overall mean.
SAS Codes in ANCOVA
  PROC MIXED data=<name of the dataset>;
    CLASS variables;
    MODEL dependent = <fixed-effects> <covariates> </ options>;
    LSMEANS fixed-effects </ options>;
  run;
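To make the template concrete, here is a minimal sketch for the drug and weight example. Everything named here is an assumption for illustration: the dataset `drugwt`, the variables `drug`, `initial` and `final`, and the data values themselves are invented only so that the code runs. The PROC GLM step is one common way to check the homogeneity-of-slopes assumption before fitting the ANCOVA.

```sas
/* Hypothetical data, invented only to make the sketch runnable:
   drug = treatment group, initial = covariate, final = response. */
data drugwt;
  input drug $ initial final @@;
  datalines;
A 62 60  A 70 67  A 58 57  A 66 62
B 64 60  B 72 66  B 59 54  B 68 63
C 61 61  C 69 68  C 57 56  C 65 65
;
run;

/* Homogeneity of regression slopes: a non-significant drug*initial
   term is consistent with parallel regression lines across groups. */
proc glm data=drugwt;
  class drug;
  model final = drug initial drug*initial;
run;
quit;

/* ANCOVA: treatment effect adjusted for initial weight.
   SOLUTION prints the fixed-effect estimates, including the slope;
   LSMEANS with DIFF compares covariate-adjusted group means. */
proc mixed data=drugwt;
  class drug;
  model final = drug initial / solution;
  lsmeans drug / diff;
run;
```

If the interaction term in the first step is significant, the common-slope model in the second step is not appropriate and the adjusted means should not be compared directly.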