5 Longitudinal Data Analysis of ttf.pptx

AhmedAlhadiAbduselam 11 views 35 slides Oct 30, 2025


Longitudinal Data Analysis
For MPH Students
By Adisu B.
Sep 2025, Harar, Ethiopia

Outline
1. Definition and peculiar features of longitudinal data (LD)
2. Examples of longitudinal data
3. Structure of longitudinal data
4. Exploratory data analysis
5. Statistical models
6. Practical examples using software

Longitudinal Data Analysis
• Longitudinal data analysis refers to a study in which the outcome variable is measured repeatedly over time on the same subjects.
• Data from the same subjects are taken at multiple time points.
• Different subjects may have the same or different numbers of observations, which may be taken at different time points (balanced vs. unbalanced designs).
• Observations made on the same person are likely to be correlated.
• Such data therefore require special statistical techniques for valid analysis and inference.

Failure to account for correlation
• Repeated observations on the same individual are wrongly treated as independent.
• Ignoring dependence may lead to incorrect inference.
• Precision is estimated incorrectly: standard errors are typically too small.
• Confidence intervals are too narrow and exclude the true value too often.
• The result is incorrect p-values and incorrect conclusions.
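A worked sketch of how large the effect can be, under simplifying assumptions that are not from the slides: one subject contributes n measurements, each with variance σ² and a common pairwise correlation ρ. The symbols n, σ², and ρ and the numeric values below are chosen only for illustration.

```latex
% Variance of the mean of n equally correlated measurements
\[
\operatorname{Var}(\bar{y}) = \frac{\sigma^2}{n}\,\bigl[1 + (n-1)\rho\bigr]
\]
% Illustration with assumed values n = 7 and \rho = 0.5
\[
1 + (7-1)(0.5) = 4
\quad\Longrightarrow\quad
\mathrm{SE}_{\text{true}} = \sqrt{4}\;\mathrm{SE}_{\text{naive}} = 2\,\mathrm{SE}_{\text{naive}}
\]
```

With these assumed values, an analysis that treats the seven measurements as independent would report a standard error for the mean that is half its correct size.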

Components of variability in longitudinal data
• Random effects: due to inter-individual variability, or heterogeneity between individuals.
• Serial correlation: residuals close together in time are more correlated than residuals far apart.
• Measurement error: in some delicate measurements, even immediate replication does not reproduce exactly the same value.
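One common way to write these three components in a single model is sketched below. The notation is an assumption for illustration and does not appear in the slides: y_ij is the j-th measurement on subject i at time t_ij, x_ij holds the covariates, b_i is the random effect, W_i(t) is a serially correlated process, and ε_ij is measurement error.

```latex
\[
y_{ij} = x_{ij}^{\top}\beta
       + \underbrace{b_i}_{\text{random effect}}
       + \underbrace{W_i(t_{ij})}_{\text{serial correlation}}
       + \underbrace{\varepsilon_{ij}}_{\text{measurement error}}
\]
```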

Importance of LDA
• Records incident events.
• Distinguishes changes over time within individuals from differences between individuals.
• Allows control for individual heterogeneity.
• More informative data: more variability and greater efficiency.

Importance of LDA (continued)
• Can identify and measure effects that are not detectable in pure cross-sections or pure time series.
• Avoids aggregation bias.

Challenges/problems of LDA
• Missing data/attrition.
• Repeated observations on the same individual are likely to be positively correlated. If this is ignored, estimated standard errors tend to be too low, leading to test statistics that are too high and p-values that are too low.
• Specialized methods that account for longitudinal correlation are required.
• Determining causality is difficult when covariates vary over time.

Methods to correct for dependence/correlation
• Models with robust standard errors
• Generalized estimating equations (GEE)
• Fixed effects models
• Mixed effects models
Many of these methods can also be used for clustered data that are not longitudinal, e.g., students within classrooms or people within neighborhoods.
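The four approaches listed above can be sketched in Stata as follows. This is only an illustration: it assumes Stata's example dataset pig.dta (loaded with webuse, with variables id, week, and weight), which is not the data used in these slides.

```stata
* Sketch only: pig.dta is a Stata example dataset, not the slides' data
webuse pig, clear
xtset id week

* 1. Ordinary regression with cluster-robust standard errors
regress weight week, vce(cluster id)

* 2. GEE with an exchangeable working correlation
xtgee weight week, family(gaussian) link(identity) corr(exchangeable)

* 3. Fixed effects model (a separate intercept per pig is absorbed)
xtreg weight week, fe

* 4. Mixed effects model with a random intercept per pig
mixed weight week || id:
```

Comparing the standard error of week across the four outputs gives a feel for how each method handles the within-subject correlation.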

Data structure of LD-long form

Restructuring data
• Data should be in long format for modeling.
• If the data are initially in wide format, use the reshape command in Stata.
• From wide to long format: reshape long weight, i(id) j(time_month)
• From long to wide format: reshape wide bp, i(patient) j(when)
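A minimal sketch of the wide-to-long round trip, using a tiny made-up dataset. The two infants and their weights (in grams, at months 0, 2, and 4) are hypothetical values entered only to show the mechanics; the variable names follow the reshape command on the slide.

```stata
* Hypothetical wide-format data: one row per infant
clear
input id weight0 weight2 weight4
1 3200 4900 6100
2 2900 4500 5800
end

* Wide to long: one row per infant per visit;
* the suffixes 0, 2, 4 become the values of time_month
reshape long weight, i(id) j(time_month)
list, sepby(id)

* Long back to wide
reshape wide weight, i(id) j(time_month)
list
```

After the first reshape the data have six rows (two infants times three visits) and the variables id, time_month, and weight, which is the layout the modeling commands expect.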

Mixed effect model

Examples (1): The Multicenter AIDS Cohort Study (MACS)
• More than 3,000 men who were at risk of acquiring HIV-1 were enrolled (Kaslow et al. 1987) (N = 479).
• Aim: to characterize the biological changes associated with disease onset.
• The study has demonstrated the effect of HIV-1 infection on indicators of immunologic function such as CD4 cell counts.
• Scientific question: are baseline characteristics, such as viral load measured immediately after seroconversion, associated with a poor patient prognosis, as indicated by a greater rate of decline in CD4 cell counts?

Examples (2): Jimma Infant Data
• Follow-up study of newborn infants in Southwest Ethiopia.
• A wide range of data was collected, including basic demographic information and anthropometric measurements.
• Infants were followed for 12 months.
• Measurements were taken from each child at seven time points, one every two months.
• Weight was one of the variables recorded at each visit.
• Research question: how does weight change over time?

Exploratory Data Analysis
• Exploratory analysis comprises techniques to visualize patterns in the data.
• It addresses the relationship of the response with explanatory variables, including time.
• Data exploration is very helpful in selecting appropriate models.
• Main tools: mean profile over time, individual profile plots, and the correlation structure.

Figure: Subject specific profiles of CD4 cell counts.
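The three exploratory tools (individual profiles, mean profile, and correlation structure) can be sketched in Stata as below. The example assumes Stata's pig.dta example dataset (variables id, week, weight; nine weekly measurements per pig), not the CD4 or Jimma data shown in the slides.

```stata
* Sketch only: pig.dta is a Stata example dataset, not the slides' data
webuse pig, clear
xtset id week

* Individual profile plot: one line per subject
xtline weight, overlay legend(off)

* Mean profile over time
preserve
collapse (mean) mean_weight = weight, by(week)
twoway connected mean_weight week
restore

* Empirical correlations between the nine weekly measurements
keep id week weight
reshape wide weight, i(id) j(week)
correlate weight1-weight9
```

How the correlations in the last table change with the distance between weeks is a useful guide when choosing a correlation model.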

Correlation structure
• Useful for understanding components of variation and for identifying a correlation model for regression methods.
• Exchangeable/compound symmetry: the correlation between any two measurements on a given subject is assumed to be equal. With three time points, for example, there is a single variance (σ²) shared by all three time points and a single covariance (σ₁) shared by every pair of time points.
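Written out for three time points, the exchangeable structure described above gives the covariance matrix sketched below, using the slide's symbols σ² and σ₁. The symbol ρ for the implied common correlation is an added assumption.

```latex
\[
\Sigma_{\text{CS}} =
\begin{pmatrix}
\sigma^2 & \sigma_1 & \sigma_1 \\
\sigma_1 & \sigma^2 & \sigma_1 \\
\sigma_1 & \sigma_1 & \sigma^2
\end{pmatrix},
\qquad
\rho = \frac{\sigma_1}{\sigma^2}
\]
```

Only two parameters are needed no matter how many time points there are.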

Unstructured
• No systematic pattern of variance and correlation is assumed.
• Each time point has its own variance (e.g., σ₁² is the variance at time 1) and each pair of time points has its own covariance (e.g., σ₂₁ is the covariance of time 1 and time 2).
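For three time points, the unstructured covariance matrix can be sketched as follows, extending the slide's notation (σ₁², σ₂₁) to the remaining entries.

```latex
\[
\Sigma_{\text{UN}} =
\begin{pmatrix}
\sigma_1^2  & \sigma_{21} & \sigma_{31} \\
\sigma_{21} & \sigma_2^2  & \sigma_{32} \\
\sigma_{31} & \sigma_{32} & \sigma_3^2
\end{pmatrix}
\]
```

With T time points this structure has T(T+1)/2 free parameters (6 for T = 3, 28 for T = 7), so it becomes costly as the number of visits grows.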

Autoregressive
• The magnitude of the correlation among observations "decays" as they become farther apart in time.
• Observations that are closer together are more correlated than observations that are farther apart.
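A first-order autoregressive, AR(1), structure is one common form of this decay. The sketch below shows it for three equally spaced time points; the common variance σ², the lag-one correlation ρ, and the value ρ = 0.8 are assumptions used only for illustration.

```latex
\[
\Sigma_{\text{AR(1)}} = \sigma^2
\begin{pmatrix}
1      & \rho & \rho^2 \\
\rho   & 1    & \rho   \\
\rho^2 & \rho & 1
\end{pmatrix},
\qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \rho^{|j-k|}
\]
% Illustration with an assumed \rho = 0.8:
% lag 1: 0.8, lag 2: 0.8^2 = 0.64
```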

Statistical Models
Two popular types:
1. Population-averaged models/GEE
• Questions of interest are about the average or mean.
• Model population behavior by modeling the population average.
• Observations are assumed independent across subjects.
• Observations may be correlated within subjects.
2. Mixed effects models (subject-specific)
• Model individual behavior.
• Consider both between-subject variability and within-subject correlation.
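A short sketch of a population-averaged (GEE) fit in Stata under two different working correlation structures. It assumes Stata's pig.dta example dataset (variables id, week, weight), which is not the data from the slides.

```stata
* Sketch only: pig.dta is a Stata example dataset, not the slides' data
webuse pig, clear
xtset id week

* GEE with an exchangeable working correlation and robust standard errors
xtgee weight week, corr(exchangeable) vce(robust)
estat wcorrelation

* GEE with a first-order autoregressive working correlation
xtgee weight week, corr(ar 1) vce(robust)
estat wcorrelation
```

estat wcorrelation displays the estimated working correlation matrix, which makes the difference between the two assumptions visible. The coefficient on week is interpreted as the average change in weight per week in the population.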

Ignoring the correlation among the observations distorts the standard errors, which in turn distorts the confidence intervals and p-values, and ultimately the conclusion of the test. We therefore have to account for the correlation in our analysis.

Mixed Effect Models
• A mixed effects model for longitudinal data can be obtained from the corresponding model for cross-sectional data by introducing random effects.
Fixed effects:
• Usually of interest for interpretation of results.
• All the levels of interest are included in the data.

Random effects:
• Handle additional heterogeneity due to correlated data.
• Usually not of interest for interpretation of the results.
• Used to describe the different "blocks" in the data.
• Only a random sample of levels is included in the data.
• In longitudinal data, the subject/individual can be treated as a random effect.

Linear Mixed Model (LMM)
• Used to analyze repeated continuous data.
• Contains both fixed and random effects.
• Accounts for the correlation in the data by including subject-specific random effects.
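To see how a subject-specific random effect induces correlation, consider the simplest case, a random-intercept model. The notation is an assumption for illustration: y_ij is the response of subject i at time t_ij, b_i is the random intercept with variance σ_b², and ε_ij is the residual with variance σ_e².

```latex
\[
y_{ij} = \beta_0 + \beta_1 t_{ij} + b_i + \varepsilon_{ij},
\qquad
b_i \sim N(0, \sigma_b^2),
\quad
\varepsilon_{ij} \sim N(0, \sigma_e^2)
\]
% Two measurements on the same subject share b_i, so for j \neq k:
\[
\operatorname{Cov}(y_{ij}, y_{ik}) = \sigma_b^2,
\qquad
\operatorname{Corr}(y_{ij}, y_{ik}) = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_e^2}
\]
```

A random intercept alone therefore implies an exchangeable correlation within each subject; adding a random slope for time lets the correlation vary with the measurement times.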

Estimation technique
• Maximum likelihood estimation (MLE): estimates the parameters by maximizing a likelihood function, so that under the assumed statistical model the observed data are most probable. ML estimates the fixed effects and the variance components of the random effects simultaneously.
• Restricted maximum likelihood (REML): similar to MLE, but it accounts for the degrees of freedom used in estimating the fixed effects, so the variance estimates are less biased when the sample size is small and the model is complex.
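In Stata the estimation method can be requested explicitly with the mle or reml option of mixed. A minimal sketch, assuming Stata's pig.dta example dataset (variables id, week, weight) rather than the data from the slides:

```stata
* Sketch only: pig.dta is a Stata example dataset, not the slides' data
webuse pig, clear

* Random intercept and random slope for week, fitted by maximum likelihood
mixed weight week || id: week, covariance(unstructured) mle

* Same model fitted by restricted maximum likelihood
mixed weight week || id: week, covariance(unstructured) reml
```

The fixed-effect estimates are usually very close between the two fits; the difference shows up mainly in the estimated variance components.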

Model Comparison

• Both AIC and BIC can be used to compare non-nested models.
• The model with the smaller AIC or BIC is preferred.
• The Stata command to compute AIC and BIC after fitting a model is: estat ic
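A sketch of a model comparison in Stata, assuming the pig.dta example dataset (variables id, week, weight) rather than the slides' data. For reference, AIC = -2 ln L + 2k and BIC = -2 ln L + k ln N, where ln L is the maximized log likelihood, k the number of estimated parameters, and N the number of observations. The stored names ri and rs are arbitrary labels.

```stata
* Sketch only: pig.dta is a Stata example dataset, not the slides' data
webuse pig, clear

* Model 1: random intercept only
mixed weight week || id:, mle
estat ic
estimates store ri

* Model 2: random intercept and random slope for week
mixed weight week || id: week, covariance(unstructured) mle
estat ic
estimates store rs

* AIC and BIC of both models side by side
estimates stats ri rs
```

Both models are fitted with mle here so that their likelihoods, and hence their AIC and BIC values, are computed on the same basis.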

Time for the practical session. Open the software: Stata or R.

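STATA CODE:
* Rescale weight by 1,000 (e.g., grams to kilograms)
gen wt = weight/1000
* Linear mixed model: fixed effects for sex, age, age squared, and sex-by-age;
* random intercept and random slope for age per infant (ind), unstructured covariance.
* (xtmixed is the older name of the command; from Stata 13 on it is called mixed.)
xtmixed wt i.sex age c.age#c.age i.sex#c.age || ind: age, cov(un)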
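The model fitted by this command can be written out as below. The subscripts and Greek symbols are assumptions added for illustration: i indexes infants (ind), j indexes visits, and G is the unstructured 2×2 covariance matrix requested by cov(un).

```latex
\[
\text{wt}_{ij} = \beta_0 + \beta_1\,\text{sex}_i + \beta_2\,\text{age}_{ij}
               + \beta_3\,\text{age}_{ij}^2
               + \beta_4\,(\text{sex}_i \times \text{age}_{ij})
               + b_{0i} + b_{1i}\,\text{age}_{ij} + \varepsilon_{ij}
\]
\[
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},\;
G = \begin{pmatrix} \sigma_{b0}^2 & \sigma_{b01} \\ \sigma_{b01} & \sigma_{b1}^2 \end{pmatrix}
\right),
\qquad
\varepsilon_{ij} \sim N(0, \sigma_e^2)
\]
```

Here β₄ captures how the linear growth rate differs by sex, while b_0i and b_1i let each infant have its own starting weight and its own growth rate.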

Conclusion

THANKS!!!