Assumptions of OLS.pptx

1,678 views · 27 slides · Oct 27, 2023

About This Presentation

In this presentation, you can learn more about the assumptions of Ordinary Least Squares (OLS). Many of us don't know the basic ideas behind regression, so read this material to clarify them for yourself.
Assumptions of OLS:
1. Linear regression model
2. X values are fixed in repeated sampling
...


Slide Content

Assumptions of Ordinary Least Squares

Intro: Ordinary Least Squares (OLS) is a method used in econometrics, the branch of economics that applies statistical techniques to analyze economic data. OLS is a technique for estimating the parameters of a linear regression model.

Assumption 1: Linear regression model. The regression model is linear in the parameters, as in

Yi = β1 + β2Xi + ui

Note that the regressand Y and the regressor X themselves may be nonlinear: models such as Yi = β1 + β2Xi² + ui or Yi = β1 + β2(1/Xi) + ui are still linear in the parameters β1 and β2, and therefore still linear regression models in this sense.
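To make this concrete, here is a minimal sketch (added here, not from the slides; the data are simulated) of how the same least-squares machinery fits a model that is nonlinear in X but linear in the parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
u = rng.normal(0, 1, size=100)
y = 2.0 + 0.5 * x**2 + u           # linear in β1, β2; nonlinear in X

# Design matrix [1, X²]: the model Y = β1 + β2·X² + u is still "linear"
# in the sense of Assumption 1, because it is linear in the parameters.
X = np.column_stack([np.ones_like(x), x**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)                        # approximately [2.0, 0.5]
```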

Assumption 2: X values are fixed in repeated sampling. Values taken by the regressor X are considered fixed in repeated samples. More technically, X is assumed to be non-stochastic. It is very important to understand the concept of "fixed values in repeated sampling." The conclusion is that our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.

X ($)   Weekly family consumption expenditure Y ($)   Total   E(Y | X)
 80     55   60   65   70   75                          325      65
100     65   70   74   80   85   88                     462      77
120     79   84   90   94   98                          445      89
140     80   93   95  103  108  113  115                707     101
160    102  107  110  116  118  125                     678     113
180    110  115  120  130  135  140                     750     125
200    120  136  140  144  145                          685     137
220    135  137  140  152  157  160  162               1043     149
240    137  145  155  165  175  189                     966     161
260    150  152  175  178  180  185  191               1211     173
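As a quick check (added here, not part of the slides), a few lines of Python reproduce the totals and conditional means E(Y | X) shown in the table:

```python
import numpy as np

# Weekly consumption expenditure Y ($) observed at each income level X ($),
# taken directly from the table above.
data = {
    80:  [55, 60, 65, 70, 75],
    100: [65, 70, 74, 80, 85, 88],
    120: [79, 84, 90, 94, 98],
    140: [80, 93, 95, 103, 108, 113, 115],
    160: [102, 107, 110, 116, 118, 125],
    180: [110, 115, 120, 130, 135, 140],
    200: [120, 136, 140, 144, 145],
    220: [135, 137, 140, 152, 157, 160, 162],
    240: [137, 145, 155, 165, 175, 189],
    260: [150, 152, 175, 178, 180, 185, 191],
}
for x_val, ys in data.items():
    # Total and conditional mean E(Y | X) for each income level.
    print(x_val, sum(ys), np.mean(ys))
```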

Assumption 3: Zero mean value of the disturbance ui. Given the value of X, the mean, or expected, value of the random disturbance term ui is zero. Technically, the conditional mean value of ui is zero. Symbolically,

E(ui | Xi) = 0

Each Y population corresponding to a given X is distributed around its mean value (shown by the circled points on the PRF), with some Y values above the mean and some below it. All this assumption requires is that the average or mean value of these deviations corresponding to any given X be zero: the positive ui values cancel out the negative ui values, so that their average or mean effect on Y is zero.

[Figure: conditional distributions of the disturbances ui around the PRF Yi = β1 + β2Xi at X1, X2, X3, X4, with mean deviation zero at each X]
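A small numerical check (added here, with simulated data): when the model includes an intercept, the OLS residuals always average to zero in the sample, the sample analogue of E(ui | Xi) = 0.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 3.0 + 1.5 * x + rng.normal(0, 2, size=200)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# With an intercept in the model, the residuals sum to zero by construction.
print(residuals.mean())            # ~0 up to floating-point error
```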

Assumption 4: Homoscedasticity or equal variance of ui. This assumption states that the variance of ui for each Xi (i.e., the conditional variance of ui) is some positive constant number equal to σ²:

var(ui | Xi) = σ²

This is the assumption of homoscedasticity, or equal (homo) spread (scedasticity), that is, equal variance.

[Figure: homoscedasticity, where the probability density f(u) of ui has the same spread around the PRF Yi = β1 + β2Xi at X1, X2, ..., Xi]

Continued: In contrast, consider the figure below, where the conditional variance of the Y population varies with X. This situation is known, appropriately, as heteroscedasticity, or unequal spread or variance. Symbolically, it can be written as

var(ui | Xi) = σi²

Notice the subscript i on σ², which indicates that the variance of the Y population is no longer constant. As the figure shows, var(u | X1) < var(u | X2) < ... < var(u | Xi).

[Figure: heteroscedasticity, where the spread of f(u) increases with X]
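As an illustration (added here, not from the original deck; the data are simulated), comparing the spread of the disturbances at small and large X distinguishes the two cases:

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(1, 10, size=500))

# Homoscedastic: constant error spread. Heteroscedastic: spread grows with X.
u_homo = rng.normal(0, 1.0, size=500)
u_hetero = rng.normal(0, 0.3 * x)          # standard deviation proportional to X

for name, u in [("homoscedastic", u_homo), ("heteroscedastic", u_hetero)]:
    lo = u[x < 5.5].std()                  # spread at small X values
    hi = u[x >= 5.5].std()                 # spread at large X values
    print(name, round(lo, 2), round(hi, 2))
# Homoscedastic: the two spreads are similar.
# Heteroscedastic: var(u | X) rises with X, violating Assumption 4.
```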

Assumption 5: No autocorrelation between the disturbances. Given any two X values, Xi and Xj (i ≠ j), the correlation between any two disturbances ui and uj (i ≠ j) is zero; that is, for any two different observations i and j, the disturbances ui and uj are uncorrelated. Technically, this is the assumption of no serial correlation, or no autocorrelation.

In Figure (a), we see that the u's are positively correlated: a positive u followed by a positive u, or a negative u followed by a negative u.

In Figure (b), the u's are negatively correlated: a positive u followed by a negative u, and vice versa.

Continued: If the disturbances (deviations) follow systematic patterns, such as those shown in Figures (a) and (b), there is auto- or serial correlation; all that Assumption 5 requires is that such correlations be absent. Figure (c) shows that there is no systematic pattern to the u's, indicating zero correlation.

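A small simulation sketch (assumed, not from the slides) generates AR(1)-type disturbances like those in Figures (a) through (c) and measures their serial correlation, including the Durbin-Watson statistic commonly used to detect it:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
eps = rng.normal(0, 1, size=n)

def ar1(rho):
    # AR(1) disturbances: u_t = rho * u_{t-1} + eps_t.
    # rho > 0 gives the pattern of Figure (a), rho < 0 that of Figure (b),
    # and rho = 0 the patternless Figure (c).
    u = np.zeros(n)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t]
    return u

for rho in (0.8, -0.8, 0.0):
    u = ar1(rho)
    lag1 = np.corrcoef(u[:-1], u[1:])[0, 1]        # sample corr(u_t, u_{t-1})
    dw = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)  # Durbin-Watson statistic,
                                                   # here applied to u itself
    print(rho, round(lag1, 2), round(dw, 2))
# DW near 2 indicates no autocorrelation; near 0, positive; near 4, negative.
```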

Assumption 6: Zero covariance between ui and Xi, that is, E(uiXi) = 0. Formally,

cov(ui, Xi) = E{[ui − E(ui)][Xi − E(Xi)]}
            = E[ui(Xi − E(Xi))]            since E(ui) = 0
            = E(uiXi) − E(Xi)E(ui)         since E(Xi) is nonstochastic
            = E(uiXi)                      since E(ui) = 0
            = 0                            by assumption

But if X and u are correlated, it is not possible to assess their individual effects on Y.

Continued: Thus, if X and u are positively correlated, X increases when u increases and decreases when u decreases. Similarly, if X and u are negatively correlated, X increases when u decreases and decreases when u increases. In either case, it is difficult to isolate the influence of X and of u on Y. Assumption 6 is automatically fulfilled if the X variable is nonrandom or non-stochastic and Assumption 3 holds, for in that case cov(ui, Xi) = [Xi − E(Xi)]E[ui − E(ui)] = 0. Since we have assumed that our X variable is not only non-stochastic but also takes fixed values in repeated samples, the assumption is satisfied here.
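A simulation sketch (added here; the names and numbers are illustrative) shows what goes wrong when Assumption 6 fails: with cov(X, u) > 0, OLS attributes part of u's effect to X and overestimates the slope.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

# Build X and u with positive covariance via a shared component z,
# deliberately violating Assumption 6.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
u = 0.8 * z + rng.normal(size=n)   # cov(x, u) = 0.8 > 0
y = 1.0 + 2.0 * x + u              # true slope is 2.0

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta[1])   # ~2.4, noticeably above 2.0: X "absorbs" part of u's effect
```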

Assumption 7: The number of observations n must be greater than the number of parameters to be estimated. Alternatively, the number of observations must be greater than the number of explanatory variables. From a single observation there is no way to estimate the two unknowns β1 and β2; we need at least two pairs of observations to estimate the two unknowns.

Assumption 8: Variability in X values. The X values in a given sample must not all be the same; if they were, var(X) would be zero and the slope parameter could not be estimated.
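A minimal sketch (assumed data) of why Assumptions 7 and 8 matter: with a single observation the two-parameter model is underdetermined, and with no variability in X the slope is not identified.

```python
import numpy as np

# Assumption 7: one observation cannot pin down two parameters.
X1 = np.array([[1.0, 4.0]])        # a single (intercept, X) row
print(np.linalg.matrix_rank(X1))   # rank 1 < 2: infinitely many (b1, b2) fit

# Assumption 8: identical X values leave the slope unidentified.
x = np.full(50, 4.0)               # var(X) = 0
X2 = np.column_stack([np.ones(50), x])
print(np.linalg.matrix_rank(X2))   # rank 1 again: the X column is just a
                                   # multiple of the intercept column
```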

Assumption 9: The regression model is correctly specified. What is the functional form of the model? Is the model linear in the parameters, in the variables, or in both? What probabilistic assumptions are made about the Y, the X, and the u entering the model? For example, consider the two models

Yi = α1 + α2Xi + ui          (1)
Yi = β1 + β2(1/Xi) + ui      (2)

Continued: Model (1) is linear both in the parameters and in the variables, whereas model (2) is linear in the parameters but nonlinear in the variable X. If model (2) is the true model but model (1) is fitted to the data, it gives wrong predictions: between points A and B, for any given X, model (1) will overestimate the true mean value of Y, whereas to the left of A it will underestimate the true mean value of Y.
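As a hedged sketch (simulated data; the true process is taken to follow form (2) above), fitting both forms shows the misspecified linear model leaving much larger residuals:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(1, 10, size=300)
# Suppose the true process follows model (2): Y = β1 + β2·(1/X) + u.
y = 2.0 + 8.0 / x + rng.normal(0, 0.3, size=300)

X_lin = np.column_stack([np.ones_like(x), x])        # model (1): wrong form
X_inv = np.column_stack([np.ones_like(x), 1.0 / x])  # model (2): correct form

for name, X in [("linear in X", X_lin), ("linear in 1/X", X_inv)]:
    beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    # The misspecified model leaves a much larger residual sum of squares.
    print(name, float(rss[0]))
```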

Assumption 10: There is no perfect multicollinearity. Perfect multicollinearity refers to a situation in which two or more independent variables in a regression model have a perfect linear relationship, meaning that one independent variable can be expressed as a linear combination of the others. In other words, there is a perfect correlation between these variables.
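A short sketch (simulated data) of what perfect multicollinearity does to the OLS algebra: the design matrix loses full column rank, so X'X cannot be inverted and the usual formula (X'X)⁻¹X'y has no unique solution.

```python
import numpy as np

rng = np.random.default_rng(6)
x1 = rng.uniform(0, 10, size=100)
x2 = 3.0 * x1 + 5.0                 # x2 is an exact linear function of x1

X = np.column_stack([np.ones(100), x1, x2])
XtX = X.T @ X
print(np.linalg.matrix_rank(X))     # 2, not 3: the columns are linearly dependent
print(np.linalg.cond(XtX))          # astronomically large: X'X is (near-)singular,
                                    # so no unique OLS estimates exist
```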