Regression SPSS Report 4 By Maryam Bolouri & Tahereh Soleimani
The main types of research questions that multiple regression can be used to address are:
• how well a set of variables is able to predict a particular outcome
• which variable in a set of variables is the best predictor of an outcome
• whether a particular predictor variable is still able to predict an outcome when the effects of another variable are controlled for
To recap: Multiple regression explores the relationship between one continuous dependent variable and a number of independent variables or predictors (usually continuous). It is based on correlation, but allows a more sophisticated exploration of the interrelationship among a set of variables. Multiple regression will provide you with information about the model as a whole (all subscales) and the relative contribution of each of the variables that make up the model (individual subscales).
Homework 5: To investigate the relationship between anxiety, motivation and writing performance, a researcher conducted a study with 50 learners. Anxiety and motivation were measured on separate questionnaires (20 questions, 5-point Likert scale). The index for writing (out of 25) was the average of two raters' scores for an essay written under timed conditions.
Predictors: anxiety and motivation
Dependent variable: writing performance
Relationship between them: predictive power
Research questions and research hypothesis:
• How well can anxiety and motivation levels predict writing performance? How much variance in writing performance can be explained by scores on the anxiety and motivation scales?
• Which variable is the best predictor of writing performance?
H0: There is no significant predictive relationship between the independent variables (anxiety, motivation) and the dependent variable of the study (writing performance).
Step one: checking the assumptions: sample size
It is recommended that ‘for social science research, about 15 participants per predictor are needed for a reliable equation’. In this study there are 2 independent variables, so the required sample is at least 30 (2 × 15). The sample of 50 in this study is therefore quite acceptable.
Step one: checking the assumptions: multicollinearity
Multicollinearity exists when the independent variables are highly correlated (r = .9 and above). Singularity occurs when one independent variable is actually a combination of other independent variables. The correlation between the independent variables should be smaller than .7. The correlation between each independent variable and the dependent variable should be larger than .3.
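The Tolerance and VIF columns that SPSS prints are two views of the same quantity, and a short sketch can make the link to the .7 guideline concrete. The value .949 is the Tolerance reported in the coefficient table of this report. Recovering the correlation between the predictors from it is an assumption that holds only because the model has exactly two predictors, and it gives the size of the correlation, not its sign.

```python
import math

# Tolerance reported by SPSS for both predictors (coefficient table).
tolerance = 0.949

# VIF is the reciprocal of tolerance.
vif = 1 / tolerance

# With exactly two predictors, tolerance = 1 - r**2, where r is the
# correlation between them, so |r| can be recovered from tolerance.
abs_r_between_predictors = math.sqrt(1 - tolerance)

print(f"VIF = {vif:.3f}")
print(f"|r| between predictors = {abs_r_between_predictors:.3f}")
print("below the .7 guideline:", abs_r_between_predictors < 0.7)
```

The recovered correlation is far below .7, which agrees with the high Tolerance and low VIF in the table.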
Step one: checking the assumptions: outliers, normality, linearity, homoscedasticity
Multiple regression is very sensitive to outliers (very high or very low scores). Tabachnick and Fidell (2007, p. 128) define outliers as cases with standardised residual values above about 3.3 (or less than –3.3). Residuals are the differences between the obtained and the predicted dependent variable (DV) scores. The residuals scatterplots allow you to check:
• normality: the residuals should be normally distributed about the predicted DV scores
• linearity: the residuals should have a straight-line relationship with predicted DV scores
• homoscedasticity: the variance of the residuals about predicted DV scores should be the same for all predicted scores.
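The ±3.3 rule above can be sketched in a few lines. This is a minimal illustration, not the SPSS procedure itself: it assumes a standardised residual is the raw residual divided by the standard error of the estimate, and the function names and the 20 scores are hypothetical, made up only to show one case being flagged. They are not the data of this study.

```python
import math

def standardized_residuals(observed, predicted, n_predictors):
    """Raw residuals divided by the standard error of the estimate."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    df_residual = len(residuals) - n_predictors - 1
    std_error = math.sqrt(sum(r * r for r in residuals) / df_residual)
    return [r / std_error for r in residuals]

def flag_outliers(z_residuals, cutoff=3.3):
    """Indices of cases whose standardised residual exceeds the cutoff."""
    return [i for i, z in enumerate(z_residuals) if abs(z) > cutoff]

# Hypothetical scores: 19 cases close to their predicted value and one
# case (the last) whose obtained score is 10 points above prediction.
predicted = [15.0] * 20
observed = [15.5, 14.5] * 9 + [15.5, 25.0]

z = standardized_residuals(observed, predicted, n_predictors=2)
flagged = flag_outliers(z)
print("flagged cases:", flagged)
```

In this made-up example only the last case is flagged. In the study itself no case exceeded the cutoff.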
Step one: checking the assumptions: scatterplot and normal probability plot
No outliers were found.
Step two: evaluating the model
The Model Summary table shows how much of the variance in the dependent variable is explained by the model. In this study R square is .152, so the model explains about 15 per cent of the variance in writing performance. This is not a respectable result. The Adjusted R square statistic ‘corrects’ this value to provide a better estimate of the true population value. As the sample of the study is fairly small, it is better to report the Adjusted R square (.116) at the interpretation stage. To assess the statistical significance of the result, look at the table labelled ANOVA. This tests the null hypothesis that multiple R in the population equals 0. The model in this example reaches statistical significance (Sig. = .021, i.e. p < .05).
Model summary table and ANOVA

ANOVA (b)
Model           Sum of Squares   df   Mean Square   F       Sig.
1  Regression   146.967           2   73.483        4.225   .021 (a)
   Residual     817.513          47   17.394
   Total        964.480          49
a. Predictors: (Constant), motivation, anxiety
b. Dependent Variable: writing performance

Model Summary (b)
Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .390 (a)   .152       .116                4.17060
a. Predictors: (Constant), motivation, anxiety
b. Dependent Variable: writing performance
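The Model Summary figures can be reproduced from the ANOVA table, which is a useful check on how the two tables fit together. The sketch below uses only the sums of squares and degrees of freedom printed above, with n = 50 learners and k = 2 predictors as stated in the study description. The variable names are illustrative.

```python
import math

# Values from the ANOVA table.
ss_regression = 146.967
ss_residual = 817.513
n = 50  # learners in the study
k = 2   # predictors: anxiety and motivation

ss_total = ss_regression + ss_residual
df_residual = n - k - 1

# R square: share of total variance explained by the model.
r_square = ss_regression / ss_total

# Adjusted R square penalises R square for the number of predictors.
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual

# F is the ratio of the regression and residual mean squares.
f_ratio = (ss_regression / k) / (ss_residual / df_residual)

# Standard error of the estimate is the root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

print(f"R square = {r_square:.3f}")
print(f"Adjusted R square = {adj_r_square:.3f}")
print(f"F({k}, {df_residual}) = {f_ratio:.3f}")
print(f"Std. error of the estimate = {std_error_estimate:.4f}")
```

The results agree with the SPSS output to the precision shown in the tables.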
Step three: evaluating each of the independent variables
We are interested in comparing the contribution of each independent variable, so we use the beta values. In this case the largest beta coefficient is .36, which is for Anxiety. This means that this variable makes the strongest unique contribution to explaining the dependent variable, when the variance explained by all other variables in the model is controlled for. The beta value for Motivation was lower in absolute size (–.25), indicating that it made less of a unique contribution. Only Anxiety made a unique, statistically significant contribution to the prediction of writing performance (Sig. = .012). The contribution of Motivation did not reach significance (Sig. = .076).
Step three: evaluating each of the independent variables
In this example, Anxiety has a part correlation coefficient of .35. If we square this (multiply it by itself) we get .12, indicating that Anxiety uniquely explains 12 per cent of the variance in writing performance. For Motivation the value is –.24, which squared gives .06, indicating a unique contribution of 6 per cent. Together these unique contributions come to about 18 per cent, slightly more than the 15 per cent given by the total R square. Squared part correlations add up to R square only when the predictors are uncorrelated with each other. Here anxiety and motivation are correlated (Tolerance = .949), and the beta for Anxiety (.362) is larger than its zero-order correlation (.305), a pattern consistent with mild suppression rather than with a violation of the assumptions.
Coefficient table

Coefficients (a)
B and Std. Error are the Unstandardized Coefficients; Beta is the Standardized Coefficient.
Model            B        Std. Error   Beta    t        Sig.   95% CI for B: Lower   Upper
1  (Constant)    19.158   2.326                8.236    .000   14.478                23.837
   anxiety       .056     .021         .362    2.623    .012   .013                  .098
   motivation    -.058    .032         -.250   -1.815   .076   -.122                 .006

Correlations and Collinearity Statistics
Model            Zero-order   Partial   Part    Tolerance   VIF
1  anxiety       .305         .357      .352    .949        1.054
   motivation    -.168        -.256     -.244   .949        1.054
a. Dependent Variable: writing performance
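The figures in the coefficient table can be tied back to R square with a short calculation. The sketch below squares the part correlations to get each predictor's unique contribution. It also rebuilds R square as the sum of beta times zero-order correlation, a standard identity for standardised regression coefficients. All numbers are read from the table above, and the dictionary names are illustrative.

```python
# Values from the coefficient table.
beta = {"anxiety": 0.362, "motivation": -0.250}
zero_order = {"anxiety": 0.305, "motivation": -0.168}
part = {"anxiety": 0.352, "motivation": -0.244}

# Unique contribution of each predictor: squared part correlation.
unique = {name: part[name] ** 2 for name in part}
sum_unique = sum(unique.values())

# R square rebuilt from standardised coefficients:
# sum over predictors of beta * zero-order correlation with the DV.
r_square_from_betas = sum(beta[name] * zero_order[name] for name in beta)

for name, value in unique.items():
    print(f"{name}: unique contribution = {value:.3f}")
print(f"sum of unique contributions = {sum_unique:.3f}")
print(f"R square from betas = {r_square_from_betas:.3f}")
```

The rebuilt R square matches the Model Summary value of .152, while the squared part correlations sum to a slightly larger figure. The two totals need not agree when the predictors are correlated with each other.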
How to report and present the outcome
Our model, which includes anxiety and motivation, explains 15 per cent of the variance in writing performance, and the model is statistically significant, F(2, 47) = 4.225, p = .021 (Question 1). Of these two variables, anxiety makes the largest unique contribution (beta = .36, p = .012); the contribution of motivation was not statistically significant (beta = –.25, p = .076) (Question 2).