Dummyvariable1

1,372 views 34 slides Sep 21, 2021

Dummy Variable
Regression Models

What is a Dummy Variable?
•Variables that are essentially qualitative in nature, or variables that are not readily quantifiable
•Examples: gender, marital status, race, colour, religion, nationality, geographical location, political/policy changes, party affiliation

Other Names for Dummy Variable
Indicator variables
Binary variables
Categorical variables
Dichotomous variables
Qualitative variables

Why Dummy Variable Regression?
•To include qualitative variables as explanatory variables in the regression model
•Example: to see whether gender discrimination has any influence on earnings, apart from other factors

How to quantify a qualitative aspect?
•By constructing artificial variables that take on values of 1 or 0 (zero)
•1 indicates presence of that attribute
•0 indicates absence of that attribute
•Example:
(1) Gender = 1 if the respondent is female
           = 0 if the respondent is male
(2) Time = 1 if wartime; 0 if peacetime
•Here, variables with values 1 and 0 are called dummy variables
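
The 1/0 coding described above can be sketched in a few lines of Python. The respondent labels and the helper name `make_dummy` are hypothetical, chosen only for illustration.

```python
def make_dummy(values, category):
    """Return 1 where a value equals `category` (attribute present), else 0."""
    return [1 if v == category else 0 for v in values]


# Hypothetical respondents, for illustration only
gender = ["female", "male", "female", "male"]
period = ["wartime", "peacetime", "peacetime", "wartime"]

gender_dummy = make_dummy(gender, "female")  # 1 = female, 0 = male
time_dummy = make_dummy(period, "wartime")   # 1 = wartime, 0 = peacetime

print(gender_dummy)
print(time_dummy)
```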

Types of Dummy Variable Models
(1) Analysis of Variance (ANOVA) Model: All explanatory variables are dummy variables
(2) Analysis of Covariance (ANCOVA) Model: Mix of quantitative and qualitative explanatory variables

ANOVA Model
•Suppose we want to measure the impact of GENDER on wages/employee compensation
•In particular, we are interested to know whether female employees are discriminated against relative to their male counterparts
•Gender is not strictly quantifiable

•Hence, we describe gender using a dummy variable
D = 1 if male respondent
  = 0 if female respondent [reference group]
Let the regression model be
$Y_i = \alpha + \beta D_i + u_i$    (1)
(where Y is monthly salary)

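•This specification helps us to see whether gender makes a difference in salary.
•Interpretation of model (1):
•Taking the expectation of (1) on both sides, with $E(u_i \mid D_i) = 0$, we get
•Mean salary of male as
$E(Y_i \mid D_i = 1) = \alpha + \beta$
•Mean salary of female as
$E(Y_i \mid D_i = 0) = \alpha$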

•Note, the mean salary of female is given by the intercept $\alpha$
•Coefficient $\beta$ tells by how much the mean salary of male workers differs from the mean salary of female workers, or simply the difference in average salary between men and women; it is called the differential intercept coefficient
•$\beta$ is attached to the category which is assigned the dummy variable value of 1 (here male)

•Intercept ($\alpha$) belongs to the category for which the dummy variable value of zero is assigned (here female)
•The category which is assigned the zero dummy is known as the benchmark/control/reference category
•Intercept value represents the mean value of the benchmark category
•All comparisons (with $\beta$) are made in relation to the benchmark category

•Hypothesis testing:
•Done in the usual way
•$H_0: \beta = 0$ [No gender discrimination in salary determination / no statistically significant difference in salaries between males and females]
•$H_1: \beta \neq 0$ [Gender discrimination is present in salary determination]
•Use the t-statistic
•If the estimate of $\beta$ is significantly different from zero, we reject the null hypothesis in favour of the alternative hypothesis
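
As a sketch of what "the usual way" involves for model (1): the test statistic is the estimated coefficient divided by its standard error. With a single dummy regressor this coincides with the pooled two-sample t-test for a difference in means. The symbols $n_1$, $n_0$, $\bar{Y}_1$, $\bar{Y}_0$ (group sizes and sample means for D = 1 and D = 0), $\hat{u}_i$ (residuals) and $s^2$ are notation introduced here for illustration, not taken from the slides.

```latex
\hat{\beta} = \bar{Y}_1 - \bar{Y}_0, \qquad
t = \frac{\hat{\beta}}{\operatorname{se}(\hat{\beta})}
  = \frac{\bar{Y}_1 - \bar{Y}_0}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_0}}}, \qquad
s^2 = \frac{\sum_i \hat{u}_i^2}{n_1 + n_0 - 2}
```

Under the classical assumptions and $H_0$, this statistic follows a t distribution with $n_1 + n_0 - 2$ degrees of freedom.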

Example: Cross Section Data on Monthly Wages and Gender
(Y = monthly wage; D = 1 if male, 0 if female)
   Y  D      Y  D      Y  D
1345  0   1566  0   2533  1
2435  1   1187  0   1602  0
1715  1   1345  0   1839  0
1461  1   1345  0   2218  1
1639  1   2167  1   1529  0
1345  0   1402  1   1461  1
1602  0   2115  1   3307  1
1144  0   2218  1   3833  1
1566  1   3575  1   1839  1
1496  1   1972  1   1461  0
1234  0   1234  0   1433  1
1345  0   1926  1   2115  0
1345  0   2165  0   1839  1
3389  1   2365  0   1288  1
1839  1   1345  0   1288  0
 981  1   1839  0
1345  0   2613  1
Male: 26; Female: 23

Regression Results:
•$\hat{Y} = 1518.696 + 568.227\,D$
t: (12.394) (3.378); $R^2 = 0.195$; F = 11.410
[Figure: Y plotted against Nos. Female: $\alpha = 1518.696$; Male: $\alpha + \beta = 2086.923$; difference $\beta = 568.23$]

•Results show the mean salary of female workers is about Rs.1519
•Mean salary of male workers is higher by Rs.568 (i.e. 1519 + 568 = 2087)
•The t statistic reveals that the mean salary of male workers is statistically significantly higher, by about Rs.568
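
The reported intercept and differential intercept can be checked against the 49 observations in the data slide. The following is a minimal sketch using NumPy least squares, not necessarily the software used for the slides; the variable names are illustrative. It also prints the t statistic, $R^2$ and F so they can be compared with the reported figures.

```python
import numpy as np

# (monthly wage Y, dummy D) pairs from the data slide; D = 1 male, 0 female
data = [
    (1345, 0), (2435, 1), (1715, 1), (1461, 1), (1639, 1), (1345, 0),
    (1602, 0), (1144, 0), (1566, 1), (1496, 1), (1234, 0), (1345, 0),
    (1345, 0), (3389, 1), (1839, 1), (981, 1), (1345, 0),
    (1566, 0), (1187, 0), (1345, 0), (1345, 0), (2167, 1), (1402, 1),
    (2115, 1), (2218, 1), (3575, 1), (1972, 1), (1234, 0), (1926, 1),
    (2165, 0), (2365, 0), (1345, 0), (1839, 0), (2613, 1),
    (2533, 1), (1602, 0), (1839, 0), (2218, 1), (1529, 0), (1461, 1),
    (3307, 1), (3833, 1), (1839, 1), (1461, 0), (1433, 1), (2115, 0),
    (1839, 1), (1288, 1), (1288, 0),
]
y = np.array([wage for wage, _ in data], dtype=float)
d = np.array([dummy for _, dummy in data], dtype=float)

# OLS of Y on a constant and D
X = np.column_stack([np.ones_like(d), d])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha_hat, beta_hat = coef

# Residual variance, standard error and t statistic of the dummy coefficient
resid = y - X @ coef
n, k = X.shape
s2 = resid @ resid / (n - k)
cov = s2 * np.linalg.inv(X.T @ X)
t_beta = beta_hat / np.sqrt(cov[1, 1])

# R-squared and overall F statistic
tss = ((y - y.mean()) ** 2).sum()
r2 = 1 - (resid @ resid) / tss
f_stat = (r2 / (k - 1)) / ((1 - r2) / (n - k))

# Group means: the intercept is the female mean, intercept + slope the male mean
female_mean = y[d == 0].mean()
male_mean = y[d == 1].mean()

print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
print(f"female mean = {female_mean:.3f}, male mean = {male_mean:.3f}")
print(f"t(beta) = {t_beta:.3f}, R2 = {r2:.3f}, F = {f_stat:.3f}")
```

With one dummy regressor, the F statistic is the square of the t statistic on the dummy, which is why the two tests lead to the same conclusion.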

Does the conclusion of the model change if we interchange dummy values?
Suppose $Y_i = \alpha + \beta D_i + u_i$
where Y = Hourly wage
D = Gender (1 = Male; 0 = Female)
Now, if we interchange the dummy values as (1 = Female; 0 = Male), it will not change the overall conclusion of the original model (see figure)
The only change is that the "otherwise" category has now become the benchmark category and all comparisons are made in relation to this category: the intercept becomes the male mean and $\beta$ changes sign
•Hence, the choice of benchmark category ($\alpha$) is strictly up to the researcher
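
A small sketch of the interchange, assuming hypothetical hourly wages (six made-up observations) and an illustrative helper `fit_dummy_regression`. Swapping the coding moves the intercept to the other group's mean and flips the sign of the dummy coefficient, while the estimated gap keeps the same size.

```python
import numpy as np


def fit_dummy_regression(y, d):
    """OLS of y on a constant and a 0/1 dummy; returns (intercept, slope)."""
    X = np.column_stack([np.ones(len(d)), d])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(coef[0]), float(coef[1])


# Hypothetical hourly wages, for illustration only
wage = np.array([10.0, 12.0, 14.0, 20.0, 22.0, 24.0])
male = np.array([0, 0, 0, 1, 1, 1])  # 1 = male, 0 = female
female = 1 - male                    # interchanged coding: 1 = female, 0 = male

a_m, b_m = fit_dummy_regression(wage, male)    # benchmark category: female
a_f, b_f = fit_dummy_regression(wage, female)  # benchmark category: male

print(f"D = male:   intercept {a_m:.2f}, slope {b_m:.2f}")
print(f"D = female: intercept {a_f:.2f}, slope {b_f:.2f}")
```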

Extension of ANOVA Model
•Can be extended to include more than one qualitative variable
$Y_i = \alpha + \beta_1 D_{1i} + \beta_2 D_{2i} + u_i$
where Y = Hourly wage
$D_1$ = Marital status (1 = married; 0 = otherwise)
$D_2$ = Region of residence (1 = south; 0 = otherwise)
Which is the benchmark category here?
Unmarried, non-south residence

•Mean hourly wage of the benchmark category is $\alpha$
•Mean wage of those who are married (and live outside the south) is $\alpha + \beta_1$
•Mean wage of those who live in the south (and are unmarried) is $\alpha + \beta_2$
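
Writing the intercept as $\alpha$ and the two dummy coefficients as $\beta_1$ and $\beta_2$ (an assumed notation), the model implies one mean for each of the four combinations of marital status and region. The following lists them, including the married-and-south case, under the model's assumption that the two effects are additive:

```latex
\begin{aligned}
E(Y_i \mid D_{1i}=0,\, D_{2i}=0) &= \alpha \\
E(Y_i \mid D_{1i}=1,\, D_{2i}=0) &= \alpha + \beta_1 \\
E(Y_i \mid D_{1i}=0,\, D_{2i}=1) &= \alpha + \beta_2 \\
E(Y_i \mid D_{1i}=1,\, D_{2i}=1) &= \alpha + \beta_1 + \beta_2
\end{aligned}
```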

ANCOVA Model
•Consists of a mixture of qualitative and quantitative explanatory variables
•Suppose, in our original model (1), we include number of years of experience as an additional variable
•Now we can raise one more question: between 2 employees with the same experience, is there a gender difference in wages?

•We can express the regression model as
$Y_i = \alpha_1 + \alpha_2 D_i + \beta X_i + u_i$
where D is the dummy; $X_i$ is the experience variable
•Now, mean salary of male is
$E(Y_i \mid D_i = 1, X_i) = \alpha_1 + \alpha_2 + \beta X_i$
•Mean salary of female is
$E(Y_i \mid D_i = 0, X_i) = \alpha_1 + \beta X_i$
•Slope ($\beta$) is the same for both categories (male & female); only the intercept differs

What does common slope mean?
•$\alpha_2$ measures the average difference in salary between male and female, given the same level of experience
•If we take a female and a male with the same level of experience, $\alpha_1 + \alpha_2$ is the intercept of the male salary line and $\alpha_1$ that of the female line, so their average salaries differ by $\alpha_2$
•Note that since we controlled for experience in the regression, the wage differential can't be explained by different average levels of experience between male and female

•Hence, we can conclude that the wage differential is due to the gender factor
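
The common-slope idea can be illustrated with a minimal sketch on hypothetical, noise-free data (the numbers 1000, 500 and 20 are made up). Because the data are generated exactly from the model, least squares recovers the base intercept, the differential intercept and the common slope, and the male-female gap is the same at every experience level.

```python
import numpy as np

# Hypothetical, noise-free data: salary = 1000 + 500*D + 20*X
experience = np.array([1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0])  # X, in years
male = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)            # D
salary = 1000.0 + 500.0 * male + 20.0 * experience

# OLS of salary on a constant, the dummy and experience
X = np.column_stack([np.ones_like(male), male, experience])
coef, *_ = np.linalg.lstsq(X, salary, rcond=None)
a1, a2, b = coef  # base intercept, differential intercept, common slope


def fitted(d, x):
    """Fitted salary for dummy value d and experience x."""
    return a1 + a2 * d + b * x


# The gap between the two parallel lines does not depend on experience
gap_at_2 = fitted(1, 2.0) - fitted(0, 2.0)
gap_at_10 = fitted(1, 10.0) - fitted(0, 10.0)
print(f"a1 = {a1:.2f}, a2 = {a2:.2f}, b = {b:.2f}")
print(f"gap at 2 years = {gap_at_2:.2f}, gap at 10 years = {gap_at_10:.2f}")
```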

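Diagrammatic Explanation
[Figure: Y plotted against X, showing two parallel lines with common slope $\beta$: $\alpha_1 + \beta X_i$ for the base group (female) and $\alpha_1 + \alpha_2 + \beta X_i$ for male]
Constant term $\alpha_1$: intercept for base group;
$\alpha_1 + \alpha_2$: intercept for male; and
$\alpha_2$ measures the difference in intercept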

Regression Estimation Results:
$\hat{Y} = 1366.267 + 525.632\,D + 19.807\,X$
(8.534) (3.114) (1.456)
$R^2 = 0.48$; F = 6.901
Intercept for female (base) group, $\alpha_1$ = 1366.27. It measures the mean salary of female at zero years of experience
Intercept for male group, $\alpha_1 + \alpha_2$ = 1891.90. It measures the mean salary of male at zero years of experience, of which 525.63 ($\alpha_2$) is the average difference in salary between male and female
$\beta$ (i.e. 19.81): as the number of years of experience goes up by 1 year, on average, a worker's (male or female) salary goes up by Rs.19.81


$\alpha_2$, the difference in intercept, is 525.63 and is statistically significant at the 5% level.
Therefore, we can reject the null hypothesis of no gender differential
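
To make the reported estimates concrete, the fitted equation can be evaluated for a male and a female worker with the same experience. This is a sketch: the coefficients are the ones reported above, while the experience level of 5 years and the function name `predicted_salary` are hypothetical.

```python
# Coefficients as reported in the estimation results
ALPHA_1 = 1366.267  # intercept for the base group (female)
ALPHA_2 = 525.632   # differential intercept for male
BETA = 19.807       # common slope on years of experience


def predicted_salary(male, experience):
    """Fitted monthly salary for dummy value `male` (1 or 0) and years of experience."""
    return ALPHA_1 + ALPHA_2 * male + BETA * experience


years = 5  # hypothetical experience level
female_fit = predicted_salary(0, years)
male_fit = predicted_salary(1, years)

print(f"female: {female_fit:.3f}")
print(f"male:   {male_fit:.3f}")
print(f"gap:    {male_fit - female_fit:.3f}")
```

Whatever experience level is chosen, the gap between the two fitted salaries equals the differential intercept.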

Example: Several qualitative variables, with some having more than two categories:
•Example: Consumption function analysis.
•Suppose there are three qualitative factors: gender, age of household head and education level of head.
•Define dummy variables as:
$D_1$ = 1 if male and = 0 otherwise
$D_2$ = 1 if age < 25 and = 0 otherwise
$D_3$ = 1 if age between 25 and 50 and = 0 otherwise
$D_4$ = 1 if high school education and = 0 otherwise
$D_5$ = 1 if H.Sc., degree and above and = 0 otherwise
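
Note that each factor with m categories gets m - 1 dummies (one for gender, two for age, two for education); the omitted category of each factor is absorbed by the intercept, which avoids perfect collinearity with the constant term. A minimal sketch of this coding follows. The function name, the education labels and the treatment of the boundary ages 25 and 50 are assumptions made for illustration.

```python
def encode_household_head(gender, age, education):
    """Return (D1, D2, D3, D4, D5) for one household head.

    `education` is assumed to be one of the hypothetical labels
    "below_high_school", "high_school" or "above_high_school".
    """
    d1 = 1 if gender == "male" else 0
    d2 = 1 if age < 25 else 0
    d3 = 1 if 25 <= age <= 50 else 0  # boundary handling is an assumption
    d4 = 1 if education == "high_school" else 0
    d5 = 1 if education == "above_high_school" else 0
    return (d1, d2, d3, d4, d5)


# Base group: female, aged above 50, below high school education
print(encode_household_head("female", 60, "below_high_school"))
# Male head aged 40 with high school education
print(encode_household_head("male", 40, "high_school"))
```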

Base or Reference Groups:
Regression Model:
$C_t = \alpha + \beta Y_t + \gamma_1 D_1 + \gamma_2 D_2 + \gamma_3 D_3 + \gamma_4 D_4 + \gamma_5 D_5 + u_t$
$\alpha$ - intercept for female head of household
$\alpha$ - intercept term if age of head is above 50 years
$\alpha$ - intercept term if head's education is below high school
In short, $\alpha$ represents a female head of household aged above 50 years and with below high school education

Differential intercepts or mean consumption for other groups:
$\alpha + \gamma_1$ - for male household head
$\alpha + \gamma_2$ - for age less than 25 years
$\alpha + \gamma_3$ - for age between 25 and 50 years
$\alpha + \gamma_4$ - for high school education
$\alpha + \gamma_5$ - for above high school education
If the household head is male with age 40 years and high school education, what is the intercept?
$\alpha + \gamma_1 + \gamma_3 + \gamma_4$
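
The answer follows from switching on only the dummies that apply to this household head. Writing the intercept as $\alpha$, the coefficient on income $Y_t$ as $\beta$ and the dummy coefficients as $\gamma_1, \dots, \gamma_5$ (an assumed notation), the conditional mean of consumption is, as a sketch:

```latex
E(C_t \mid D_1 = 1,\, D_2 = 0,\, D_3 = 1,\, D_4 = 1,\, D_5 = 0,\, Y_t)
  = (\alpha + \gamma_1 + \gamma_3 + \gamma_4) + \beta Y_t
```

For the base group all five dummies are zero, so the same expression reduces to $\alpha + \beta Y_t$.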

Interactions Involving Dummy Variables
•Consider the following model:
$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \beta X_i + u_i$    (1)
$Y_i$ = Hourly wage
$X_i$ = Education (years of schooling)
$D_2$ = 1 if female, 0 if male [GENDER]
$D_3$ = 1 if black, 0 if white [RACE]
Note that the dummy variables in this model may be interactive in nature. How?

•Here, if mean salary is higher for female than for male, this is so whether they (female) are black or white
•Similarly, if mean salary is lower for black, this is so whether they (black) are male or female
•Implication: The effect of $D_2$ and $D_3$ on Y may not be simply additive as in (1) but multiplicative as below
[Figure: two tree diagrams. Gender: Male and Female, each split into Black and White. Race: Black and White, each split into Male and Female]

$Y_i = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 (D_{2i} D_{3i}) + \beta X_i + u_i$    (2)
Eq (2) explicitly includes the interaction between GENDER & RACE, i.e. $D_{2i} D_{3i}$
$\alpha_2$ - differential effect of being a female (gender alone)
$\alpha_3$ - differential effect of being a black (race alone)
$\alpha_4$ - differential effect of being a black female (gender & race)
$\alpha_1$ - Male white (base category)
Note: While running (2), simply multiply the $D_{2i}$ and $D_{3i}$ values to obtain the interaction term
Eq (2) is a different way of finding wage differentials across all gender-race combinations
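
Constructing the interaction regressor is a one-line multiplication. The sketch below uses a hypothetical, noise-free data set (eight made-up observations and made-up coefficient values) so that least squares recovers exactly the coefficients used to generate the wages; the variable names are illustrative.

```python
import numpy as np

# Hypothetical design: every gender-race group observed at two schooling levels
female = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)  # D2
black = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)   # D3
school = np.array([10.0, 12.0, 10.0, 12.0, 10.0, 12.0, 10.0, 12.0])  # X

# Interaction term D2 * D3: equals 1 only for black females
interaction = female * black

# Hypothetical coefficients: alpha1, alpha2, alpha3, alpha4, beta
true_coef = np.array([10.0, -2.0, -3.0, 1.5, 0.5])

# Design matrix of the interactive model and noise-free wages
X = np.column_stack([np.ones_like(school), female, black, interaction, school])
wage = X @ true_coef

est, *_ = np.linalg.lstsq(X, wage, rcond=None)
print(np.round(est, 6))
```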

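In other words, the interactive model (eq. 2) allows us to obtain the estimated wage differential among all 4 groups (black female, black male, white male, white female). How?
(i) Black Female, $E(Y_i \mid D_{2i} = 1, D_{3i} = 1, X_i)$: intercept $\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4$
(ii) Black Male, $E(Y_i \mid D_{2i} = 0, D_{3i} = 1, X_i)$: intercept $\alpha_1 + \alpha_3$
(iii) White Male, $E(Y_i \mid D_{2i} = 0, D_{3i} = 0, X_i)$: intercept $\alpha_1$
(iv) White Female, $E(Y_i \mid D_{2i} = 1, D_{3i} = 0, X_i)$: intercept $\alpha_1 + \alpha_2$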
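
Subtracting these intercepts pairwise gives the wage differentials at a given level of education. As a sketch, writing the dummy coefficients of the interactive model as $\alpha_2$, $\alpha_3$ and $\alpha_4$ (an assumed notation):

```latex
\begin{aligned}
\text{white female} - \text{white male} &= \alpha_2 \\
\text{black male} - \text{white male} &= \alpha_3 \\
\text{black female} - \text{white male} &= \alpha_2 + \alpha_3 + \alpha_4 \\
\text{black female} - \text{black male} &= \alpha_2 + \alpha_4 \\
\text{black female} - \text{white female} &= \alpha_3 + \alpha_4
\end{aligned}
```

The gender gap therefore differs by race, and the race gap by gender, whenever $\alpha_4 \neq 0$.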

[Figure: two tree diagrams. Gender: Male and Female, each split into Married (M) and Unmarried (UM). Marital status: Married and Unmarried, each split into Male and Female]