Statistical correlation and linear and logistic regression.pptx

HAIDARHANTOSH2 12 views 25 slides Jul 17, 2024

About This Presentation

Exploring the correlation model


Slide Content

THE CORRELATION MODEL

OBJECTIVE
Obtain a measure of the relationship between two random variables (X and Y).

Pearson’s Correlation Coefficient (r)
A measure of the linear (straight-line) relationship between two interval-level variables.

Pearson’s Correlation Coefficient (r)
Its value lies between -1 and +1:
  -1   perfect inverse linear correlation
  +1   perfect positive linear correlation
   0   no linear correlation

Pearson’s Correlation Coefficient (r)
The absolute value of r indicates the strength of the relationship:
  < 0.2          very weak
  0.2 to < 0.4   weak
  0.4 to < 0.7   moderate
  0.7 to < 0.9   strong
  ≥ 0.9          very strong
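The strength bands above can be expressed as a small lookup. This is a minimal sketch: the function name `describe_strength` is illustrative, and it assumes the bands apply to the absolute value of r, since the sign only carries the direction.

```python
def describe_strength(r: float) -> str:
    """Return the verbal strength label for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # assumption: strength is judged on |r|, ignoring direction
    if size < 0.2:
        return "very weak"
    if size < 0.4:
        return "weak"
    if size < 0.7:
        return "moderate"
    if size < 0.9:
        return "strong"
    return "very strong"


print(describe_strength(0.955))  # very strong
print(describe_strength(-0.35))  # weak
```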

Pearson’s Correlation Coefficient (r)
The sign of r indicates the direction of the relationship.
Positive correlation: high scores on one variable are associated with high scores on the second variable.
Negative correlation: high scores on one variable are associated with low scores on the second variable.

Pearson’s Correlation Coefficient (r)

\[
r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}
\]
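As a rough illustration, the computational formula for r can be coded directly from the five sums it uses. The function name `pearson_r` and the toy data are made up for this sketch and do not come from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from paired observations, using the sums-based formula."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two pairs are needed")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    spread = (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    if spread <= 0:
        raise ValueError("r is undefined when a variable is constant")
    return numerator / math.sqrt(spread)


# Toy data (hypothetical): a perfect positive and a perfect inverse line.
toy_x = [1, 2, 3, 4, 5]
toy_y_up = [2, 4, 6, 8, 10]
toy_y_down = [10, 8, 6, 4, 2]
print(pearson_r(toy_x, toy_y_up))    # 1.0
print(pearson_r(toy_x, toy_y_down))  # -1.0
```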

Testing significance of (r)
The r value is a sample value and can be used to test the hypothesis:
\(H_0: \rho = 0\)
\(H_A: \rho \neq 0\)

\[
t = r\sqrt{\frac{n-2}{1-r^2}}, \qquad df = n-2
\]
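A minimal sketch of the test statistic follows. The names `t_statistic` and `is_significant` are illustrative, the inputs (r = 0.5, n = 27) are hypothetical, and the critical value is assumed to be read from a t table by the user (2.0595 is the two-sided 5% value for df = 25).

```python
import math


def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("n must be at least 3 so that df = n - 2 is positive")
    if abs(r) >= 1:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


def is_significant(t: float, critical_value: float) -> bool:
    """Two-sided decision: reject H0 when |t| exceeds the tabulated value."""
    return abs(t) > critical_value


t = t_statistic(0.5, 27)          # hypothetical sample: r = 0.5, n = 27
print(round(t, 3))                # 2.887
print(is_significant(t, 2.0595))  # True (df = 25, two-sided alpha = 0.05)
```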

Scatter Diagram
A scatter diagram is a graphic device that visually summarizes the form of the relationship between two variables.

Scatter Diagram
The X-axis is traditionally the horizontal axis and represents the independent variable.
The Y-axis is the vertical axis and represents the dependent variable.

Simple Linear Regression
It is helpful in:
Ascertaining the probable form of the relationship between variables
Predicting or estimating the value of one variable corresponding to a given value of another variable

Simple Linear Regression
The independent variable (X) is pre-selected and is called a non-random or mathematical variable.

Simple Linear Regression
The least-squares line summarizes the relationship between X and Y:
Y = a + bX
a = intercept: the point where the line crosses the vertical axis (i.e., the value of Y when X = 0)
b = slope: the amount by which Y changes for each one-unit change in X
X = independent variable
Y = dependent variable

Simple Linear Regression

\[
b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2}
\]

\[
a = \frac{\sum Y - b\sum X}{n}
\]
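The two formulas translate into a few lines of code. The sketch below is illustrative only: `least_squares_line` is a made-up name, and the toy points are chosen to lie exactly on Y = 1 + 2X so the result is easy to check by eye.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("slope is undefined when all X values are equal")
    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = (sum_y - b * sum_x) / n                     # intercept
    return a, b


# Toy data (hypothetical): points lying exactly on Y = 1 + 2X.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)  # 1.0 2.0
```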

Correlation Exercise

Systolic Blood Pressure Readings (mmHg) by Two Methods in 25 Patients with Essential Hypertension

Patient No.   Method I   Method II
 1            132        130
 2            138        134
 3            144        132
 4            146        140
 5            148        150
 6            152        144
 7            158        150
 8            130        122
 9            162        160
10            168        150
11            172        160
12            174        178
13            180        168
14            180        174
15            188        186
16            194        172
17            194        182
18            200        178
19            200        196
20            204        188
21            210        180
22            210        196
23            216        210
24            220        190
25            220        202

[Figure: systolic blood pressure readings (mmHg), 25 patients with essential hypertension; axes: Method I and Method II]

Patient No.   Method I (X)   Method II (Y)   X²       Y²       XY
 1            132            130             17424    16900    17160
 2            138            134             19044    17956    18492
 3            144            132             20736    17424    19008
 4            146            140             21316    19600    20440
 5            148            150             21904    22500    22200
 6            152            144             23104    20736    21888
 7            158            150             24964    22500    23700
 8            130            122             16900    14884    15860
 9            162            160             26244    25600    25920
10            168            150             28224    22500    25200
11            172            160             29584    25600    27520
12            174            178             30276    31684    30972
13            180            168             32400    28224    30240
14            180            174             32400    30276    31320
15            188            186             35344    34596    34968
16            194            172             37636    29584    33368
17            194            182             37636    33124    35308
18            200            178             40000    31684    35600
19            200            196             40000    38416    39200
20            204            188             41616    35344    38352
21            210            180             44100    32400    37800
22            210            196             44100    38416    41160
23            216            210             46656    44100    45360
24            220            190             48400    36100    41800
25            220            202             48400    40804    44440
Total         4440           4172            808408   710952   757276

\[
r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}
\]

\[
r = \frac{(25)(757276) - (4440)(4172)}{\sqrt{\left[(25)(808408) - (4440)^2\right]\left[(25)(710952) - (4172)^2\right]}} = \frac{408220}{427616.73} = 0.955
\]

\(H_0: \rho = 0\), \(H_A: \rho \neq 0\)

\[
t = r\sqrt{\frac{n-2}{1-r^2}} = 0.955\sqrt{\frac{25-2}{1-(0.955)^2}} = 0.955 \times 16.17 = 15.44, \qquad df = 23
\]

\(t_{0.975} = 2.0687\)
Since 15.44 > 2.0687, we reject \(H_0\).
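The hand calculation can be cross-checked from the raw readings. The sketch below rebuilds the column totals, then r, then t with r rounded to three decimals (an assumption made here to mirror a hand calculation). The variable names are illustrative, and the critical value 2.0687 is the tabulated figure quoted for df = 23.

```python
import math

# Systolic blood pressure readings (mmHg) for the 25 patients.
method_1 = [132, 138, 144, 146, 148, 152, 158, 130, 162, 168, 172, 174, 180,
            180, 188, 194, 194, 200, 200, 204, 210, 210, 216, 220, 220]
method_2 = [130, 134, 132, 140, 150, 144, 150, 122, 160, 150, 160, 178, 168,
            174, 186, 172, 182, 178, 196, 188, 180, 196, 210, 190, 202]

n = len(method_1)
sum_x = sum(method_1)
sum_y = sum(method_2)
sum_x2 = sum(x * x for x in method_1)
sum_y2 = sum(y * y for y in method_2)
sum_xy = sum(x * y for x, y in zip(method_1, method_2))
print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# 4440 4172 808408 710952 757276

numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(numerator, round(r, 3))  # 408220 0.955

# Test statistic with r rounded to three decimals, as in a hand calculation.
r_rounded = round(r, 3)
t = r_rounded * math.sqrt((n - 2) / (1 - r_rounded ** 2))
print(round(t, 2))  # 15.44
print(t > 2.0687)   # True, so H0 (rho = 0) is rejected
```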

\[
b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{(25)(757276) - (4440)(4172)}{(25)(808408) - (4440)^2} = 0.822
\]

\[
a = \frac{\sum Y - b\sum X}{n} = \frac{4172 - 0.822(4440)}{25} = 20.89
\]

Y = a + bX
Y = 20.89 + 0.822 X
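To show how the fitted line is used for prediction, the sketch below recomputes b and a from the column totals and then estimates a Method II reading for a given Method I reading. The input of 150 mmHg and the name `predict_method_2` are hypothetical, chosen only for illustration.

```python
# Column totals for the blood pressure data (n = 25).
n = 25
sum_x, sum_y = 4440, 4172
sum_x2, sum_xy = 808408, 757276

b = round((n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2), 3)
a = round((sum_y - b * sum_x) / n, 2)
print(a, b)  # 20.89 0.822


def predict_method_2(method_1_reading: float) -> float:
    """Estimate the Method II reading from the fitted line Y = a + bX."""
    return a + b * method_1_reading


# Hypothetical input: a Method I reading of 150 mmHg.
print(round(predict_method_2(150), 2))  # 144.19
```

The prediction is only meaningful for X values inside the observed range (130 to 220 mmHg in this data set).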

Patients’ Scores on Standardized Test and New Test

Patient No.   Score on New Test (X)   Score on Standardized Test (Y)
 1            50                      61
 2            55                      61
 3            60                      59
 4            65                      71
 5            70                      80
 6            75                      76
 7            80                      90
 8            85                      106
 9            90                      98
10            95                      100
11            100                     114

Patient No.   Score on New Test (X)   Score on Standardized Test (Y)   X²      Y²      XY
 1            50                      61                               2500    3721    3050
 2            55                      61                               3025    3721    3355
 3            60                      59                               3600    3481    3540
 4            65                      71                               4225    5041    4615
 5            70                      80                               4900    6400    5600
 6            75                      76                               5625    5776    5700
 7            80                      90                               6400    8100    7200
 8            85                      106                              7225    11236   9010
 9            90                      98                               8100    9604    8820
10            95                      100                              9025    10000   9500
11            100                     114                              10000   12996   11400
Total         825                     916                              64625   80076   71790

\[
b = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{(11)(71790) - (825)(916)}{(11)(64625) - (825)^2} = 1.1236
\]

\[
a = \frac{\sum Y - b\sum X}{n} = \frac{916 - 1.1236(825)}{11} = -0.9973
\]

Y = a + bX
Y = -0.9973 + 1.1236 X
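The second example can be recomputed from the raw scores as a check on the sign of the intercept. In this sketch (variable names are illustrative) the slope is kept at full precision, so the intercept comes out at about -1.0 rather than the -0.9973 obtained when the slope is first rounded to 1.1236.

```python
# Scores for the 11 patients: new test (X) and standardized test (Y).
new_test = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
standardized = [61, 61, 59, 71, 80, 76, 90, 106, 98, 100, 114]

n = len(new_test)
sum_x = sum(new_test)
sum_y = sum(standardized)
sum_x2 = sum(x * x for x in new_test)
sum_xy = sum(x * y for x, y in zip(new_test, standardized))
print(sum_x, sum_y, sum_x2, sum_xy)  # 825 916 64625 71790

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
a = (sum_y - b * sum_x) / n                                   # intercept
print(round(b, 4))  # 1.1236
print(round(a, 4))  # -1.0 (negative intercept)

# Same intercept computed with the slope rounded first, as done by hand.
a_hand = (sum_y - 1.1236 * sum_x) / n
print(round(a_hand, 4))  # -0.9973
```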

[Figure: original data and least-squares line for Example 2; axes: scores on new test and scores on standardized test]