Chapter8_Final.pptx




Slide Content

Chapter 8: Multicollinearity

Perfect Multicollinearity Perfect multicollinearity is a violation of Classical Assumption VI. It is the case where the variation in one explanatory variable can be completely explained by movements in another explanatory variable. Such a case between two independent variables would be X1i = α0 + α1X2i (with no error term), where the X's are independent variables in Yi = β0 + β1X1i + β2X2i + εi.

Perfect Multicollinearity (continued) Other examples of perfect linear relationships. Real-world examples? -Distance between two cities. -Percent of voters voting in favor of and percent voting against a proposition (the two sum to 100, so each is an exact linear function of the other).

Perfect Multicollinearity (continued)

Perfect Multicollinearity (continued) OLS is incapable of generating estimates of the regression coefficients when perfect multicollinearity is present: you cannot “hold all the other independent variables in the equation constant.” A special case related to perfect multicollinearity is a dominant variable. A dominant variable is so highly correlated with the dependent variable that it masks the effects of the other independent variables. Don’t confuse dominant variables with highly significant variables.
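To make that concrete, here is a minimal numerical sketch (not from the slides; the data and the exact 3-to-1 relationship are invented for illustration) showing that an exact linear relationship between two regressors makes X'X singular, so OLS has no unique solution for the coefficients:

```python
# Perfect multicollinearity: x2 is an exact linear function of x1,
# so the columns of X are linearly dependent and X'X cannot be inverted.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = 3.0 * x1                        # exact linear function of x1
X = np.column_stack([np.ones(n), x1, x2])

print(np.linalg.matrix_rank(X))      # 2, not 3: one column is redundant
print(np.linalg.det(X.T @ X))        # numerically zero, so (X'X)^(-1) does not exist
```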

Imperfect Multicollinearity Imperfect multicollinearity: a linear functional relationship between two or more independent variables so strong that it can significantly affect the estimation of the coefficients. It occurs when two (or more) independent variables are imperfectly linearly related, as in X1i = α0 + α1X2i + ui (Equation 8.7). Note ui, the stochastic error term in Equation (8.7): unlike the perfect case, the relationship does not hold exactly.

Imperfect Multicollinearity (continued)

The Consequences of Multicollinearity The major consequences of multicollinearity are:
1. Estimates will remain unbiased.
2. The variances and standard errors of the estimates will increase.
3. The computed t-scores will fall.
4. Estimates will become sensitive to changes in specification.
5. The overall fit of the equation and the estimation of the coefficients of nonmulticollinear variables will be largely unaffected.

The Consequences of Multicollinearity (continued) 1. Estimates will remain unbiased. Even if an equation has significant multicollinearity, the estimates of β will be unbiased as long as the first six Classical Assumptions hold. 2. The variances and standard errors of the estimates will increase. With multicollinearity, it becomes difficult to precisely identify the separate effects of the multicollinear variables. OLS is still BLUE with multicollinearity, but the “minimum variances” can be fairly large.
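A minimal simulation sketch of consequence 2 (the data and the 0.98 correlation are invented for illustration): applying the OLS variance expression σ²(X'X)⁻¹ to nearly uncorrelated versus highly correlated regressors shows how much the “minimum variances” grow under multicollinearity.

```python
# Variance inflation under multicollinearity: same error variance, same sample
# size, but far larger slope variances when the two regressors move together.
import numpy as np

rng = np.random.default_rng(1)
n, sigma2 = 200, 1.0

def slope_variances(rho):
    # Draw two regressors with correlation rho, then apply Var(beta_hat) = sigma^2 (X'X)^-1.
    cov = np.array([[1.0, rho], [rho, 1.0]])
    Xraw = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    X = np.column_stack([np.ones(n), Xraw])
    var_beta = sigma2 * np.linalg.inv(X.T @ X)
    return np.diag(var_beta)[1:]     # variances of the two slope estimators

print(slope_variances(0.0))          # modest variances
print(slope_variances(0.98))         # much larger "minimum variances"
```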

The Consequences of Multicollinearity (continued)

The Consequences of Multicollinearity (continued) 3. The computed t-scores will fall. Multicollinearity tends to decrease t-scores mainly because of the formula for the t-statistic: if the standard error increases, the t-score must fall. Confidence intervals also widen because the standard errors increase.
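Written out (this is the usual t-ratio; the slide's own equation is not reproduced in this text dump):

t_k = \frac{\hat{\beta}_k - \beta_{H_0}}{SE(\hat{\beta}_k)}

Holding the estimate \hat{\beta}_k fixed, a larger SE(\hat{\beta}_k) mechanically lowers |t_k| and widens the confidence interval \hat{\beta}_k \pm t_c \cdot SE(\hat{\beta}_k).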

The Consequences of Multicollinearity (continued) 4. Estimates will become sensitive to changes in specification. Adding or dropping variables and/or observations will often cause major changes in the β estimates when significant multicollinearity exists. This occurs because, with severe multicollinearity, OLS is forced to emphasize small differences between the variables in order to distinguish the effect of one multicollinear variable from the others.

The Consequences of Multicollinearity (continued) 5. The overall fit of the equation and the estimation of the coefficients of nonmulticollinear variables will be largely unaffected. R̄² will not fall much, if at all, with significant multicollinearity. A combination of a high R̄² with no statistically significant individual variables is an indication of multicollinearity. It is possible for an F-test of overall significance to reject the null hypothesis even though none of the individual t-tests do.

Two Examples of the Consequences of Multicollinearity Example: Student consumption function, COi = β0 + β1Ydi + β2LAi + εi (Equation 8.9), where: COi = annual consumption expenditures of the ith student on items other than tuition and room and board; Ydi = annual disposable income (including gifts) of that student; LAi = liquid assets (savings, etc.) of the ith student; εi = stochastic error term.

Two Examples of the Consequences of Multicollinearity (continued) Estimate Equation 8.9 with OLS: Including only disposable income:

Two Examples of the Consequences of Multicollinearity (continued) Example: Demand for gasoline by state, PCONi = β0 + β1UHMi + β2TAXi + β3REGi + εi (Equation 8.12), where: PCONi = petroleum consumption in the ith state (trillions of BTUs); UHMi = urban highway miles within the ith state; TAXi = gasoline tax in the ith state (cents per gallon); REGi = motor vehicle registrations in the ith state (thousands).

Two Examples of the Consequences of Multicollinearity (continued) Estimate Equation 8.12 with OLS: If you drop UHM:

The Detection of Multicollinearity Multicollinearity exists in every equation; the important question is how much exists. The severity can change from sample to sample. There are no generally accepted, true statistical tests for multicollinearity. Researchers develop a general feeling for the severity of multicollinearity by examining a number of characteristics. Two common ones are: 1. high simple correlation coefficients and 2. high variance inflation factors.

High Simple Correlation Coefficients The simple correlation coefficient, r, is a measure of the strength and direction of the linear relationship between two variables. r ranges from −1 to +1, and its sign indicates the direction of the correlation. If r is high in absolute value, then the two variables are quite correlated and multicollinearity is a potential problem.

High Simple Correlation Coefficients (continued) How high is high? Some researchers select an arbitrary cutoff, such as 0.80. A better answer might be that r is high if it causes unacceptably large variances. The use of r to detect multicollinearity has a major limitation: groups of variables acting together can cause multicollinearity without any single simple correlation coefficient being high.
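A minimal sketch (with invented variables) of this detection device, the matrix of simple correlation coefficients among the explanatory variables:

```python
# Simple correlation coefficients among explanatory variables: a high |r|
# between any pair flags potential multicollinearity, but the matrix can
# miss collinearity created by groups of variables acting together.
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)   # deliberately close to x1
x3 = rng.normal(size=n)                     # unrelated to the others

r = np.corrcoef(np.column_stack([x1, x2, x3]), rowvar=False)
print(np.round(r, 2))   # r(x1, x2) is high; a common (arbitrary) flag is |r| > 0.80
```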

High Variance Inflation Factors (VIFs) The variance inflation factor (VIF) is a method of detecting the severity of multicollinearity by looking at the extent to which a given explanatory variable can be explained by all the other explanatory variables in the equation. Suppose the following model with K independent variables: Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi. We need to calculate a VIF for each of the K independent variables.

High Variance Inflation Factors (VIFs) (continued) To calculate VIFs: 1. Run an OLS regression that has Xi as a function of all the other explanatory variables in the equation, and obtain its R²i. 2. Calculate the variance inflation factor for the estimated coefficient of Xi: VIF(β̂i) = 1/(1 − R²i).
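A minimal sketch (invented data, not the textbook's software) of the two-step calculation just described: regress each explanatory variable on all the others, take that auxiliary R², and form VIF = 1/(1 − R²).

```python
# Variance inflation factors: one auxiliary OLS regression per explanatory
# variable, each regressing that variable on all of the others.
import numpy as np

def vif(X):
    """X is an (n, K) array whose columns are the K explanatory variables."""
    n, K = X.shape
    factors = []
    for i in range(K):
        y_aux = X[:, i]
        X_aux = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        beta, *_ = np.linalg.lstsq(X_aux, y_aux, rcond=None)
        resid = y_aux - X_aux @ beta
        r2 = 1.0 - resid @ resid / np.sum((y_aux - y_aux.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=100)   # nearly collinear with x1
x3 = rng.normal(size=100)
print(vif(np.column_stack([x1, x2, x3])))      # x1 and x2 show VIFs well above 5
```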

High Variance Inflation Factors (VIFs) (continued) The higher the VIF, the more severe the effects of multicollinearity. But there are no formal critical VIF values. A common rule of thumb: if VIF > 5, multicollinearity is severe. It's possible to have large multicollinearity effects without having a large VIF.

Remedies for Multicollinearity Remedy 1: Do nothing. The existence of multicollinearity might not mean anything (i.e., the coefficients may still be significant and meet expectations). If you delete a multicollinear variable that belongs in the model, you cause specification bias. Every time a regression is rerun, we risk encountering a specification that accidentally works on the specific sample.

Remedies for Multicollinearity (continued) Remedy 2: Drop a redundant variable. Two or more variables in an equation that measure essentially the same thing might be called redundant. Dropping a redundant variable is nothing more than making up for a specification error. In a case of severe multicollinearity, it makes no statistical difference which variable is dropped, so the theoretical underpinnings of the model should be the basis for choosing which redundant variable to drop.

Remedies for Multicollinearity (continued) Example: Student consumption function:

Remedies for Multicollinearity (continued) Remedy 3: Increase the size of the sample. Normally, a larger sample will reduce the variance of the estimated coefficients, diminishing the impact of multicollinearity. Unfortunately, while this is a useful alternative to consider, obtaining more observations may be impossible.
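One way to see why more observations help (this is the standard expression for the variance of an OLS slope estimator; the slide itself does not show it):

\operatorname{Var}(\hat{\beta}_k) = \frac{\sigma^2}{\left(1 - R_k^2\right)\sum_{i=1}^{n}\left(X_{ki} - \bar{X}_k\right)^2}

Holding the degree of multicollinearity R_k^2 fixed, the summation in the denominator grows as observations are added, so the variance falls even though the collinearity itself is unchanged.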

An Example of Why Multicollinearity Often Is Best Left Unadjusted Example: Impact of marketing on soft drink sales, St = β0 + β1Pt + β2At + β3Bt + εt, where: St = sales of the soft drink in year t; Pt = average relative price of the drink in year t; At = advertising expenditures for the company in year t; Bt = advertising expenditures for the company's main competitor in year t.

An Example of Why Multicollinearity Often Is Best Left Unadjusted (continued) If variable Bt is dropped, note the expected bias in the estimated coefficient of At:
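For reference, the standard omitted-variable bias expression that underlies that note (the slide's own expression is not reproduced in this text dump; α1 denotes the coefficient of At in an auxiliary regression of the omitted Bt on the included explanatory variables):

\operatorname{E}[\hat{\beta}_A] - \beta_A = \beta_B \cdot \alpha_1

With competitor advertising expected to lower sales (βB < 0) and the two firms' advertising expenditures likely moving together (α1 > 0), the expected bias in the coefficient of At is negative.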

CHAPTER 8: The End