Blue property assumptions.

11 slides, Nov 16, 2021

About This Presentation

This presentation briefly discusses the Best Linear Unbiased Estimator (BLUE) property assumptions of linear regression in data analytics.


Slide Content

Linear Regression: Best Linear Unbiased Estimator (BLUE) property assumptions
Dr. C.V. Suresh Babu

In simple linear regression or multiple linear regression, we make some basic assumptions about the error term:
1. The error has zero mean.
2. The error has constant variance.
3. The errors are uncorrelated.
4. The errors are normally distributed.
https://www.youtube.com/watch?v=iIUq0SqBSH0
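As a minimal sketch of these assumptions (using simulated data, so all names and parameter values here are illustrative, not from the slides), we can generate errors that satisfy them, fit an ordinary least squares line, and inspect the residuals:

```python
import numpy as np

# Illustrative simulation: errors are N(0, sigma^2), so they have zero
# mean, constant variance, are uncorrelated, and normally distributed.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 10.0, n)
eps = rng.normal(loc=0.0, scale=2.0, size=n)
y = 3.0 + 1.5 * x + eps          # true model, chosen arbitrarily

# Fit y = b0 + b1*x by ordinary least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Assumption 1: residual mean is (numerically) zero once an intercept
# is included in the model.
print("mean residual:", residuals.mean())

# Assumption 2: residual spread is roughly constant across the range
# of x (compare the lower and upper halves).
lo, hi = residuals[x < 5].std(), residuals[x >= 5].std()
print("std (low x) vs std (high x):", lo, hi)
```

With errors drawn this way, both halves show a similar spread, close to the true error standard deviation.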

Introduction to Properties of OLS Estimators
Linear regression models have several applications in real life. In econometrics, the Ordinary Least Squares (OLS) method is widely used to estimate the parameters of a linear regression model. For the validity of OLS estimates, the following assumptions are made while running linear regression models:
A1. The linear regression model is "linear in parameters."
A2. There is random sampling of observations.
A3. The conditional mean of the errors is zero.
A4. There is no multicollinearity (or perfect collinearity).
A5. Spherical errors: there is homoscedasticity and no autocorrelation.
A6. Optional assumption: the error terms are normally distributed.
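Assumption A4 in particular can be checked numerically. A common diagnostic (not taken from the slides, so the variable names and threshold are illustrative) is the condition number of the design matrix, which blows up when one regressor is nearly a linear combination of the others:

```python
import numpy as np

# Illustrative check of A4 (no perfect multicollinearity).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                     # independent regressor: fine
x3 = 2.0 * x1 + 1e-9 * rng.normal(size=n)   # nearly a copy of x1: trouble

X_ok = np.column_stack([np.ones(n), x1, x2])
X_bad = np.column_stack([np.ones(n), x1, x3])

# A very large condition number signals (near-)perfect collinearity,
# which makes X'X nearly singular and the OLS estimates unstable.
print("cond(X_ok): ", np.linalg.cond(X_ok))
print("cond(X_bad):", np.linalg.cond(X_bad))
```

When A4 is violated, the OLS normal equations cannot be solved stably, which is why the assumption is required before estimation.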

The constant-variance assumption (assumption 2 above) is known as homoscedasticity, and its violation is known as heteroscedasticity. In simple terms, heteroscedasticity is the condition in which the variance of the error (residual) term in a regression model varies across observations. [Diagram: homoscedasticity vs heteroscedasticity.] Under homoscedasticity the data points are evenly scattered around the regression line, while under heteroscedasticity they are not.

Possible reasons for heteroscedasticity arising:
- It often occurs in data sets that have a large range between the largest and smallest observed values, i.e. when there are outliers.
- The model is not correctly specified.
- Observations are mixed with different measures of scale.
- An incorrect transformation of the data is used to perform the regression.
- Skewness in the distribution of a regressor, among other sources.

Effects of heteroscedasticity: As mentioned above, one of the assumptions of linear regression (assumption 2) is that there is no heteroscedasticity. Breaking this assumption means that the OLS (Ordinary Least Squares) estimators are no longer the Best Linear Unbiased Estimator (BLUE): their variance is not the lowest among all unbiased estimators, so they are no longer best/efficient. Hypothesis tests (such as the t-test and F-test) are no longer valid, because the covariance matrix of the estimated regression coefficients is inconsistent.
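One standard way to detect this violation is a Breusch-Pagan-style test: regress the squared OLS residuals on the regressors and compute the LM statistic n * R² of that auxiliary regression. The sketch below uses simulated data (all names and parameter values are illustrative), with errors whose variance deliberately grows with x:

```python
import numpy as np

# Illustrative Breusch-Pagan-style diagnostic. Under homoscedasticity
# the auxiliary R^2 is near zero; LM = n * R^2 is asymptotically
# chi-squared, so a large LM indicates heteroscedasticity.
rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(1.0, 10.0, n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n) * x   # heteroscedastic errors

# Main OLS regression.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
u2 = (y - X @ beta) ** 2                           # squared residuals

# Auxiliary regression: squared residuals on the same regressors.
g, *_ = np.linalg.lstsq(X, u2, rcond=None)
fitted = X @ g
r2 = 1.0 - ((u2 - fitted) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
lm = n * r2
print("auxiliary R^2:", r2, " LM statistic:", lm)
```

Because the simulated errors are strongly heteroscedastic, the LM statistic comes out far above any conventional chi-squared critical value.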

Applications and How it Relates to Study of Econometrics OLS estimators, because of such desirable properties discussed above, are widely used and find several applications in real life.

Example: Consider a bank that wants to predict a customer's exposure at default. The bank can take the exposure at default as the dependent variable and use several independent variables such as customer-level characteristics, credit history, type of loan, mortgage, etc. The bank can simply run an OLS regression and obtain the estimates to see which factors are important in determining a customer's exposure at default. OLS estimators are easy to use and understand, and they are available in most statistical software packages.

OLS regressions form the building blocks of econometrics. Any econometrics class will start with the assumptions of OLS regression, and it is one of the favorite interview questions for jobs and university admissions. By building on OLS and relaxing its assumptions, several different models have emerged, such as GLMs (generalized linear models), general linear models, heteroscedastic models, multi-level regression models, etc.

Research in economics and finance is highly driven by econometrics, and OLS is the building block of econometrics. In real life, however, there are issues, such as reverse causality, that render OLS irrelevant or inappropriate. Even so, OLS can still be used to investigate the issues that exist in cross-sectional data: when the OLS method cannot be used for the regression itself, it is still used to identify the problems, the issues, and the potential fixes.

Summary
Linear regression is important and widely used, and OLS estimation is the most prevalent technique. In this session, the properties of OLS estimators were discussed because OLS is the most widely used estimation technique. OLS estimators are BLUE (i.e. they are linear, unbiased, and have the least variance among the class of all linear and unbiased estimators). Amidst all this, one should not forget that the Gauss-Markov theorem (i.e. that the estimators of the OLS model are BLUE) holds only if the assumptions of OLS are satisfied. Each assumption made while studying OLS adds restrictions to the model, but at the same time also allows us to make stronger statements about OLS. So, whenever you are planning to fit a linear regression model using OLS, always check the OLS assumptions. If they are satisfied, life becomes simpler, for you can directly use OLS for the best results - thanks to the Gauss-Markov theorem! https://www.youtube.com/watch?v=OCwZyYH14uw