Objectives: to learn (1) correlation and regression analysis, and (2) the difference between correlation and linear regression.
Regression analysis is primarily used to build models (equations) that predict a key response, Y, from a set of predictor (X) variables.
Step 1: Find the slope (b) of the line.
Step 2: Find the y-intercept (a).
The regression equation then has the form Y = a + bX.
COPIER SALES OF AMERICA: What is the expected number of copiers sold by a representative who made 20 calls?
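The two steps above can be sketched in Python. This is a minimal illustration, assuming the usual least-squares formulas $b = \sum (x - \bar{x})(y - \bar{y}) / \sum (x - \bar{x})^2$ and $a = \bar{y} - b\bar{x}$. The function name `fit_line` and the `calls`/`sales` numbers are invented for illustration; they are not the Copier Sales of America data.

```python
from statistics import mean


def fit_line(x, y):
    """Return (a, b) for the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    # Step 1: slope b = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    # Step 2: intercept a = mean(y) - b * mean(x)
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data (NOT the figures from the slides)
calls = [10, 20, 30, 40]
sales = [30, 40, 40, 60]

a, b = fit_line(calls, sales)
predicted_at_20 = a + b * 20  # expected sales for a representative making 20 calls
print(f"Y = {a:.2f} + {b:.2f} X; prediction at X = 20: {predicted_at_20:.2f}")
```

Once a and b are known, answering a question such as "how many copiers for 20 calls?" amounts to substituting X = 20 into the fitted equation.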
Correlation Analysis: we estimate a sample correlation coefficient, more specifically the Pearson product-moment correlation coefficient. A rank-based alternative is Spearman's rho.
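To make the distinction between the two coefficients concrete, here is a small sketch that treats Spearman's rho as the Pearson coefficient computed on ranks (ties receive the average rank). The helper names `average_ranks`, `pearson_r`, and `spearman_rho`, and the sample values, are assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(pearson_r(x, y), spearman_rho(x, y))
```

For data that rise monotonically but not linearly, as here, rho reaches 1 while Pearson's r stays below 1, because r measures linear association only.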
Direction: a correlation may be positive or negative.
Strength: the closer the coefficient is to -1 or 1, the stronger the linear relationship.
EXAMPLE: CORRELATION OF GESTATIONAL AGE AND BIRTH WEIGHT A small study is conducted involving 17 infants to investigate the association between gestational age at birth, measured in weeks, and birth weight, measured in grams.
We wish to estimate the association between gestational age and infant birth weight. The data are displayed in a scatter diagram in the figure below.
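The formula for the sample correlation coefficient is

$$ r = \frac{\mathrm{Cov}(x, y)}{\sqrt{s_x^2 \, s_y^2}} $$

where $\mathrm{Cov}(x, y)$ is the covariance of x and y, defined as

$$ \mathrm{Cov}(x, y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1} $$

and $s_x^2$ and $s_y^2$ are the sample variances of x and y. The variances of x and y measure the variability of the x scores and y scores around their respective sample means ($\bar{X}$ and $\bar{Y}$, considered separately). The covariance measures the variability of the (x, y) pairs around the mean of x and the mean of y, considered simultaneously. The sample variance of x is defined as

$$ s_x^2 = \frac{\sum (X - \bar{X})^2}{n - 1} $$

and the sample variance of y is defined analogously.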
The mean gestational age is computed as $\bar{X} = \sum X / n$, with $n = 17$. The computations are summarized below.
Not surprisingly, the sample correlation coefficient indicates a strong positive correlation.
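The computation described above (two sample variances, one covariance, then their ratio) can be sketched as follows. The function names and the four (weeks, grams) pairs are hypothetical stand-ins; the 17 observations from the study are not reproduced here.

```python
from math import sqrt
from statistics import mean


def sample_variance(v):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / (len(v) - 1)


def sample_covariance(x, y):
    """Sum of cross-deviations from the two means, divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)


def pearson_r(x, y):
    """r = Cov(x, y) / sqrt(s_x^2 * s_y^2)."""
    return sample_covariance(x, y) / sqrt(sample_variance(x) * sample_variance(y))


# Hypothetical gestational ages (weeks) and birth weights (grams)
weeks = [35.0, 37.0, 39.0, 41.0]
grams = [2400.0, 2900.0, 3100.0, 3600.0]

r = pearson_r(weeks, grams)
print(f"r = {r:.3f}")
```

Because both the covariance and the variances divide by n - 1, that factor cancels in the ratio, so r depends only on the sums of deviations.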
Difference between Correlation and Linear Regression
Correlation: quantifies the direction and strength of the relationship between two numeric variables, X and Y, and always lies between -1.0 and 1.0.
Linear Regression: relates X to Y through an equation of the form Y = a + bX.
Significance test
To test whether the association is merely apparent and might have arisen by chance, use the t test in the following calculation:

$$ t = r \sqrt{\frac{n - 2}{1 - r^2}} $$

with $n - 2$ degrees of freedom, where r is the sample correlation coefficient and n is the number of pairs.

More advanced methods
More than one independent variable is possible; in such a case the method is known as multiple regression (3,4). This is the most versatile of statistical methods and can be used in many situations.
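For the significance test of the correlation coefficient described above, a minimal sketch is given here, assuming the standard statistic $t = r\sqrt{(n - 2)/(1 - r^2)}$ with $n - 2$ degrees of freedom. The function name `t_statistic` and the values r = 0.6, n = 18 are invented for illustration.

```python
from math import sqrt


def t_statistic(r, n):
    """t statistic for testing whether a sample correlation r differs from zero."""
    if n < 3:
        raise ValueError("need at least 3 pairs (n - 2 degrees of freedom)")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * sqrt((n - 2) / (1 - r ** 2))


# Hypothetical example: r = 0.6 observed on n = 18 pairs
r, n = 0.6, 18
t = t_statistic(r, n)
df = n - 2
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The resulting t is then compared with the tabulated critical value of the t distribution for n - 2 degrees of freedom at the chosen significance level.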
THANK YOU
ACTIVITY
Find the correlation coefficient between x and y for the following data.

x: 60 34 40 50 45 41 22 43
y: 75 32 34 40 45 33 12 30