The Case for Generalized Estimating Equations in State-level Analysis

tuck4prez 55 views 15 slides Apr 28, 2024

About This Presentation

Presentation on the benefits of using GEE in state-level studies.


Slide Content

The Case for Generalized Estimating Equations
in State-level Analysis
Tucker Staley
Department of Political Science
University of Central Arkansas
Prepared for the Annual Meeting of the Southern Political Science Association, Jan. 9-11, 2014.
New Orleans, LA.

Correlated Data
Standard models assume data are independent and identically
distributed.
However, this is often not the case.
Panel studies
Cross-sectional time-series
Dyadic studies
Decision making environments
Specifically concerned with intra-class
correlations resulting from grouped
observations.

Dealing with Intraclass Correlations
Adjust standard errors of GLM
Ignores impact on coefficient estimates
Better Options
Generalized Linear Mixed Models
Generalized Estimating Equations

GLMM
Most common
Vectors for both fixed and random effects
accounted for
Generalized for non-normal responses with a
known link function

Issues with GLMM
Designed specifically for exchangeable
correlation within groups (clusters).
For the most part, allows only an
individual-level (subject-specific) interpretation.

GEE
Marginal model
population-averaged expectation of the dependent
variable as a function of the covariates
No individual effects included
intracluster variation accounted for by adjusting the
covariance matrix
Average effect across entire sub-population
Note: We get this interpretation with independent observations
in any model and when there is an identity link in hierarchical
models.
Flexible correlation structures (“working
correlation”)
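
The "working correlation" choices can be made concrete with a small sketch. The three structures below (independence, exchangeable, AR(1)) are common options in GEE software; the function name `working_correlation` and the parameter `rho` are illustrative assumptions, not taken from the slides.

```python
def working_correlation(structure, size, rho=0.0):
    """Return a size x size working correlation matrix as a list of lists.

    Hypothetical helper for illustration only.
    """
    if structure == "independence":
        # No within-cluster correlation: identity matrix.
        return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    if structure == "exchangeable":
        # Every pair of observations in a cluster shares the same correlation.
        return [[1.0 if i == j else rho for j in range(size)] for i in range(size)]
    if structure == "ar1":
        # Correlation decays with the distance between time points.
        return [[rho ** abs(i - j) for j in range(size)] for i in range(size)]
    raise ValueError(f"unknown structure: {structure}")


for name in ("independence", "exchangeable", "ar1"):
    print(name)
    for row in working_correlation(name, 3, rho=0.5):
        print("  ", row)
```

For state-level panels, AR(1) would suit repeated yearly observations of a state, while exchangeable would suit counties nested within a state.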

GEE Model Overview
Extension of a generalized linear model (GLM)
postulates a relationship between the DV and the IVs, and
between the conditional mean and variance of the DV
GEE reduces to GLM when T=1
estimates are solutions to a set of “quasi-score”
differential equations
residuals from Fisher scoring used to consistently
estimate the unknown parameters of the correlation structure

Model Specification
Most Simply
Goal: minimize this objective function
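
For reference, the quantity GEE works with is usually written as an estimating equation set to zero. The form below uses standard Liang and Zeger (1986) notation, which is an assumption here and may differ from the notation on the slide:

```latex
U(\beta) \;=\; \sum_{i=1}^{N} D_i^{\top} V_i^{-1} \bigl( y_i - \mu_i(\beta) \bigr) \;=\; 0,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}},
\qquad
V_i = \phi \, A_i^{1/2} \, R_i(\alpha) \, A_i^{1/2}
```

Here $y_i$ is the response vector for cluster $i$, $\mu_i$ its marginal mean, $A_i$ the diagonal matrix of variance functions, $R_i(\alpha)$ the working correlation matrix, and $\phi$ a scale parameter.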

Process
Estimate coefficients iteratively.
Estimate regression coefficients.
Use residuals from these to estimate the covariance
term.
Repeat until convergence.
Does not require as many parametric
assumptions as MLE.
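
The iterative process above can be sketched for the simplest case. The code below is a minimal illustration, assuming an identity link, one covariate plus an intercept, an exchangeable working correlation, and a moment estimator for the correlation with no degrees-of-freedom correction. The function name `fit_linear_gee` and the toy data are hypothetical, and there are no safeguards for degenerate cases such as a correlation of exactly 1.

```python
def fit_linear_gee(clusters, tol=1e-10, max_iter=100):
    """Fit E[y] = b0 + b1 * x with an exchangeable working correlation.

    clusters: list of clusters, each a list of (x, y) pairs.
    Returns (b0, b1, rho).
    """
    b0 = b1 = rho = 0.0  # rho = 0 makes the first pass ordinary least squares
    for _ in range(max_iter):
        # Step 1: estimate regression coefficients given the current rho.
        # For exchangeable R of size n, R^{-1} is proportional to I - c * J
        # with c = rho / (1 + (n - 1) * rho); the common factor 1 / (1 - rho)
        # cancels when solving for the coefficients.
        a00 = a01 = a11 = g0 = g1 = 0.0
        for obs in clusters:
            n = len(obs)
            c = rho / (1.0 + (n - 1) * rho)
            sx = sum(x for x, _ in obs)
            sy = sum(y for _, y in obs)
            sxx = sum(x * x for x, _ in obs)
            sxy = sum(x * y for x, y in obs)
            a00 += n - c * n * n
            a01 += sx - c * n * sx
            a11 += sxx - c * sx * sx
            g0 += sy - c * n * sy
            g1 += sxy - c * sx * sy
        det = a00 * a11 - a01 * a01
        new_b0 = (a11 * g0 - a01 * g1) / det
        new_b1 = (a00 * g1 - a01 * g0) / det

        # Step 2: use the residuals to re-estimate the correlation parameter.
        ss = cross = 0.0
        n_obs = n_pairs = 0
        for obs in clusters:
            res = [y - new_b0 - new_b1 * x for x, y in obs]
            s = sum(res)
            q = sum(r * r for r in res)
            ss += q
            cross += (s * s - q) / 2.0  # sum of r_j * r_k over pairs j < k
            n_obs += len(res)
            n_pairs += len(res) * (len(res) - 1) // 2
        phi = ss / n_obs  # scale (residual variance) estimate
        new_rho = cross / (phi * n_pairs) if phi > 0 and n_pairs > 0 else 0.0

        # Step 3: repeat until the estimates stop changing.
        change = max(abs(new_b0 - b0), abs(new_b1 - b1), abs(new_rho - rho))
        b0, b1, rho = new_b0, new_b1, new_rho
        if change < tol:
            break
    return b0, b1, rho


# Toy data: three clusters observed at x = 0, 1, 2 with cluster-level shifts.
toy = [
    [(0, 0.5), (1, 1.0), (2, 4.5)],
    [(0, 1.5), (1, 2.0), (2, 5.5)],
    [(0, 2.5), (1, 3.0), (2, 6.5)],
]
b0, b1, rho = fit_linear_gee(toy)
print(round(b0, 4), round(b1, 4), round(rho, 4))
```

In applied work one would normally rely on a library implementation (for example the GEE class in statsmodels) rather than hand-rolled code; the sketch only makes the alternating structure of the algorithm visible.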

Issues w/ GEE
Specify the correct correlation structure
Should be based on theory. However, estimates remain consistent (though less efficient) if it is incorrect.
Sample size
Smaller N may have issues with convergence.
Missing Data
Important to specify true correlation matrix.
Goodness-of-fit
Correlated residuals, so can't use standard statistics.
Uncertainty
Need to adjust s.e. (robust, bootstrapped)
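
The "robust" adjustment mentioned above usually refers to the sandwich (empirical) variance estimator. In standard GEE notation, which is assumed here rather than taken from the slides, with $D_i$ the derivative of the cluster mean with respect to $\beta$ and $V_i$ the working covariance:

```latex
\widehat{\mathrm{Var}}(\hat{\beta}) \;=\; B^{-1} M B^{-1},
\qquad
B = \sum_{i=1}^{N} D_i^{\top} V_i^{-1} D_i,
\qquad
M = \sum_{i=1}^{N} D_i^{\top} V_i^{-1}
    \bigl( y_i - \hat{\mu}_i \bigr) \bigl( y_i - \hat{\mu}_i \bigr)^{\top}
    V_i^{-1} D_i
```

Because $M$ is built from the observed residuals, the resulting standard errors remain valid in large samples even when the working correlation is misspecified.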

Initiative and Party Power
Justin Phillips (2008)
“Direct democracy alters the ability of partisan
legislative majorities...to shape the size...of the
public sector.”
DV: Tax effort
IVs:
Partisan Control
State-level characteristics
Interactions

100 Miles of Dry
John Frendreis & Raymond Tatalovich (2010)
What factors identified as important for
Prohibition remain important today?
DV: Dry county
IVs:
Religion
Demographics
Partisan voting

Conclusions
We often deal with correlated data in the real world.
GEE allows us to deal with intraclass
correlations and produces efficient coefficient
estimates.
More flexible than GLMM: correlation
structures, interpretation
Estimates may differ once correlated data are
accounted for.
Add GEE to the methodological toolbox.