Master's thesis project on landslide prediction
Added: Aug 29, 2025
Slides: 22 pages
LANDSLIDE SUSCEPTIBILITY MAPPING
IN ASSAM, INDIA: A GIS-BASED APPROACH
USING FREQUENCY RATIO AND LOGISTIC
REGRESSION
Submitted by
Somdarshan Atre
20EX20031
Under the guidance of
Prof. Paresh Nath Singha Roy
UNDERSTANDING THE PROBLEM
1. Heavy rains
2. Steep hills
3. Weak rocks
4. Human activities
Introduction | Research Area | Literature Review | Methodology | Data Collection and Preprocessing | Model Development | Model Validation | Results & Discussions
INTRODUCTION
Landslides in Assam, India, are like giant mudslides that destroy homes, roads, and farms. They happen a lot near the Brahmaputra River and the Himalayan foothills.
Monsoon rains dump over 2500 mm of water yearly (like 10 bathtubs of rain!)
Slopes steeper than a playground slide (30°+) are common
Some areas have soft shale rock that crumbles easily
Cutting down forests and building without planning makes things worse
In 2022, landslides in Dima Hasao wrecked train tracks and left 10,000 people homeless. Assam therefore needs better ways to predict and prevent these disasters.
Objectives
Terrain
Height of land
Steepness
Direction slopes face
Land shape
Rocks and faults
Types of rocks (shale = weak)
Closeness to cracks in the Earth
Plants
Less greenery (measured by satellites) means higher landslide risk
Water
Distance to rivers
How hard the rain falls
What Did the Study Do?
Scientists used maps and satellite photos to figure out where landslides might happen. They looked at 9 key factors
Math Tools Used
Frequency Ratio (FR):
Checks how often landslides happen in certain conditions (e.g., "Do landslides occur more on steep slopes?")
Logistic Regression (LR):
Predicts landslide chances like a weather forecast ("80% chance here!").
Findings
High-risk areas:
Dima Hasao and Karbi Anglong districts are most dangerous because of steep slopes, weak shale rocks, and heavy rain.
Best model:
The LR model (AUC = 0.87) worked better than FR (AUC = 0.79). AUC is like a report card grade: higher is better (max = 1).
Significance
Why does it matter?
Saves lives and money
Maps show where to focus on fixing slopes, planting trees, or moving people
Better planning
Assam’s government can use this to build safer roads/homes and warn people earlier
Useful for other areas
The method can help landslide-prone regions worldwide
RESEARCH AREA
Study Area – Assam’s Landslide Zones
Districts: Dima Hasao, Karbi Anglong, Cachar, Karimganj, Kamrup
Location: Near Brahmaputra & Barak rivers
Elevation: From flat plains (50 m) to steep hills (1,960 m)
Climate & Geology
Rainfall: Heavy during monsoon, 2,500 to 3,500 mm (June–Sept)
Temperature: Winter ~15°C; Summer ~35°C
Geology of Area: Disang shale, Barail sandstone, Tipam formation
Past Landslide Events
Dima Hasao (2022): 50+ landslides along NH-27
Cachar & Karimganj (2018): Deforestation led to landslides
Kamrup (2021): Landslides in Guwahati (Fatasil, Sarania Hills)
Karbi Anglong (2019): Heavy rain (450 mm in 3 days)
LITERATURE REVIEW
1. Lee & Pradhan (2007)
Introduced Frequency Ratio (FR) model for landslide susceptibility mapping.
Used bivariate statistical method with GIS integration.
Demonstrated simplicity and effectiveness in terrain-based hazard assessment.
2. Ayalew & Yamagishi (2005)
Applied Logistic Regression (LR) to landslide susceptibility analysis.
Highlighted its predictive ability using landslide and non-landslide locations.
Emphasized the importance of selecting relevant causative factors.
3. Pachauri & Pant (1992)
Early use of remote sensing and GIS for landslide mapping in the Indian Himalayas.
Used terrain parameters like slope, aspect, and lithology.
Set foundation for later susceptibility modeling approaches.
4. Lee et al. (2004)
Showed LR provides higher predictive accuracy than FR.
Used ROC curves to evaluate model performance.
5. Recent Advances
Use of remote sensing indices (NDVI, Topographic Wetness Index).
Emphasis on model validation through AUC and success rate curves.
METHODOLOGY
1. Data Collection & Inventory Preparation
Landslide Inventory:
- 285 landslide points.
- 70% used for training, 30% for validation.
2. Data Processing & Thematic Layer Preparation
- Topography: Slope, elevation.
- Environmental: NDVI (vegetation), rainfall.
- Geology: Lithology, distance to roads/faults.
3. Model Development
a. Frequency Ratio (FR) Model
- Assess individual factor influence.
b. Logistic Regression (LR) Model
- Evaluate combined effects of factors.
4. Model Validation & Susceptibility Mapping
ROC-AUC:
- FR model: AUC = 0.88–0.90.
- LR model: AUC = 0.92–0.94.
Accuracy: LR achieved 94% overall accuracy.
Susceptibility Maps:
- Classification: Five zones (Very Low to Very High).
5. Final Results & Implications
Key Outcomes:
- LR model outperformed FR due to multivariate integration.
- Major triggers: Steep slopes (>30°), weak lithology, high rainfall, deforestation, and road proximity.
DATA COLLECTION & PREPROCESSING
Landslide Inventory & Key Factors
Landslide Inventory
135 locations mapped → 87 with landslides | 48 without
Sources: Govt reports, journals, satellite images
Clusters: Dima Hasao, Karbi Anglong
Topography
30 m DEM (USGS) → Slope, Aspect, Elevation maps
Geology
Rock type maps from Bhuvan Portal & DIVA-GIS
Vegetation Health (NDVI)
NDVI from USGS → Measures vegetation density & health
Land Use/Land Cover (LULC)
ESRI satellite images → Forests, agriculture, cities
Climate Data
Rainfall: ECMWF NetCDF files
Soil Moisture: NASA SMAP mission
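NDVI is conventionally computed from near-infrared (NIR) and red reflectance as (NIR − Red) / (NIR + Red). The sketch below shows that formula for a single pixel. The function name, the zero-sum handling, and the sample reflectance values are illustrative assumptions, not data or code from the study.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # assumed convention for no-data pixels (avoids division by zero)
    return (nir - red) / total

# Hypothetical reflectance values, for illustration only
dense_forest = ndvi(nir=0.50, red=0.08)
bare_slope = ndvi(nir=0.25, red=0.20)
print(round(dense_forest, 2), round(bare_slope, 2))
```

Values near 1 indicate dense, healthy vegetation; values near 0 indicate sparse or bare ground.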
12 Landslide Causative Factors
Factor           Impact
Slope            Steeper (>30°) = More instability
Aspect           North-facing = More moisture
Curvature        Concave = Water accumulates; Convex = Runoff
Elevation        Higher altitudes = Weathered rocks, thin soil
NDVI             Low vegetation (<0.2) = Less slope stability
Land Use         Urban areas & deforestation = High risk
Relief           >500 m elevation change = Steep terrain
Drainage         <100 m to rivers = Erosion-prone
Road Proximity   <50 m to roads = Cuts & vibrations = Slides
Lithology        Weak rocks (e.g., shale) = More prone to failure
Rainfall         >150 cm/year = Waterlogging, loss of shear strength
Soil Moisture    >35% moisture = High pore pressure = Landslides
MODEL DEVELOPMENT
Frequency Ratio (FR) Model:
1. Bivariate Statistical Approach: Each factor's class is compared individually with landslide occurrence.
2. FR Calculation: FR = (Landslides in class / Total landslides) ÷ (Area of class / Total area).
3. Weight Assignment: FR values used as weights indicate each class's landslide favorability.
4. Layer Integration: Weighted layers of all factors overlaid using the raster calculator to produce the final FR index.
5. Map Classification: The FR index map was categorized into five susceptibility zones using natural breaks.
Logistic Regression (LR) Model:
1. Binary Dependent Variable: Landslide presence (1) or absence (0) used as the target variable.
2. Multivariate Analysis: All ten conditioning factors were used as independent variables in the regression.
3. SPSS Implementation: The logistic regression model was built, and the coefficients were calculated using SPSS software.
4. Predictive Equation: A probability value (0 to 1) for landslide occurrence was computed using the logistic function: P = 1 / (1 + e^(–z)), where z = β0 + β1x1 + β2x2 + … + βnxn.
5. Output Mapping: The probability values were reclassified into susceptibility zones.
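The predictive-equation and output-mapping steps could be sketched as follows. This is a minimal illustration only: the coefficients, the factor values, and the equal-interval class breaks are hypothetical placeholders, since the slides do not state the breaks used for the LR map.

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Logistic function: P = 1 / (1 + exp(-z)), z = b0 + sum(b_i * x_i)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-z))

ZONES = ["Very Low", "Low", "Moderate", "High", "Very High"]

def susceptibility_zone(p):
    """Map a probability to one of five zones using equal 0.2-wide intervals (assumed)."""
    index = min(int(p * 5), 4)  # p == 1.0 falls in the top zone
    return ZONES[index]

# Hypothetical cell: two standardized factors with made-up coefficients
p = landslide_probability(-1.0, [0.8, -0.5], [2.0, 0.4])
print(round(p, 3), susceptibility_zone(p))
```

In a GIS workflow the same function would be applied to every raster cell, with one coefficient per conditioning factor.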
Code snippet: Logistic regression
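The study built its model in SPSS. As a rough stand-in, the sketch below fits a two-factor logistic regression with plain batch gradient descent in Python. The toy data, learning rate, epoch count, and function names are assumptions for illustration; this is not the author's code or data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=5000):
    """Fit intercept and coefficients by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    b0, b = 0.0, [0.0] * k
    for _ in range(epochs):
        grad0, grad = 0.0, [0.0] * k
        for xi, yi in zip(X, y):
            err = sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, xi))) - yi
            grad0 += err
            for j in range(k):
                grad[j] += err * xi[j]
        b0 -= lr * grad0 / n
        b = [bj - lr * gj / n for bj, gj in zip(b, grad)]
    return b0, b

# Toy data (hypothetical): [slope scaled to 0-1, NDVI]; 1 = landslide, 0 = no landslide
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.2],
     [0.3, 0.7], [0.2, 0.8], [0.1, 0.6], [0.4, 0.9]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

b0, (b_slope, b_ndvi) = fit_logistic(X, y)
p_steep_bare = sigmoid(b0 + b_slope * 0.85 + b_ndvi * 0.15)
p_gentle_forest = sigmoid(b0 + b_slope * 0.15 + b_ndvi * 0.85)
print(b_slope > 0, b_ndvi < 0, p_steep_bare > p_gentle_forest)
```

On this toy data the slope coefficient comes out positive and the NDVI coefficient negative, the same signs as the slope and NDVI coefficients reported in the study.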
Predictor               β (Coefficient)   Odds Ratio (exp(β))   p-value
Intercept               –3.20             0.04                  0.001
Slope (degrees)         0.082             1.085                 <0.001
Mean Rainfall (mm)      0.014             1.014                 0.002
NDVI (0–1 index)        –1.250            0.287                 <0.001
Distance to Road (km)   –0.600            0.549                 <0.001
(Additional factors)    –                 –                     –
Logistic regression results for landslide occurrence (training data). β = coefficient (log-odds). Odds Ratio = exp(β). p-values indicate significance (p<0.01 for all).
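The relationship Odds Ratio = exp(β) can be checked directly. The sketch below assumes the coefficients read as 0.082, 0.014, –1.250, and –0.600, a decimal placement chosen because it reproduces the tabulated odds ratios; the variable names are illustrative.

```python
import math

# Coefficients as read from the table above (decimal placement is an assumption)
betas = {
    "Slope (degrees)": 0.082,
    "Mean Rainfall (mm)": 0.014,
    "NDVI (0-1 index)": -1.250,
    "Distance to Road (km)": -0.600,
}

odds_ratios = {name: math.exp(beta) for name, beta in betas.items()}
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

Read this way, each additional degree of slope multiplies the odds of a landslide by about 1.085 (roughly an 8.5% increase), while a full-unit increase in NDVI multiplies the odds by about 0.287.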
Code snippet: Frequency Ratio
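A minimal sketch of the frequency ratio calculation, FR = (landslides in class / total landslides) ÷ (area of class / total area), is given here. The class boundaries, landslide counts, and areas are hypothetical; only the formula comes from the study.

```python
def frequency_ratio(landslides_in_class, total_landslides, class_area, total_area):
    """FR = (landslides in class / total landslides) / (class area / total area)."""
    return (landslides_in_class / total_landslides) / (class_area / total_area)

# Hypothetical slope classes: (landslide count, area in km^2)
slope_classes = {
    "0-15": (10, 500.0),
    "15-30": (30, 300.0),
    "> 30": (60, 200.0),
}
total_landslides = sum(count for count, _ in slope_classes.values())
total_area = sum(area for _, area in slope_classes.values())

fr = {
    name: frequency_ratio(count, total_landslides, area, total_area)
    for name, (count, area) in slope_classes.items()
}
for name, value in fr.items():
    print(f"{name}: FR = {value:.2f}")
```

A class with FR above 1 holds a larger share of landslides than its share of the area; a class with FR below 1 holds a smaller share.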
Factor                 Class                    FR     Interpretation
Slope (°)              0–5                      0.30   Gentle slopes (rare slides)
                       5–15                     0.90   Moderate slopes
                       15–30                    1.20   Increasing slide probability
                       30–45                    1.80   Steep slopes (high slide likelihood)
                       >45                      2.50   Very steep (very high slide risk)
Lithology              Alluvium/Soil (soft)     0.80   Cohesive sediments (low risk)
                       Sandstone/Shale          2.30   Weak bedrock (high risk)
                       Coal measure             1.40   Intermediate rock
Vegetation (NDVI)      High (dense forest)      0.40   Rooted forest (stabilizing)
                       Medium (scrub)           1.10   Partial cover
                       Low (barren/cropland)    1.80   Sparse cover (erosion-prone)
Annual Rainfall (mm)   <1800                    0.70   Relatively dry (few triggers)
                       1800–2500                1.00   Moderate rain
                       >2500                    1.60   Very wet (frequent triggers)
Distance to Road (m)   0–500                    1.70   Road-cuts on slopes (high risk)
                       500–1000                 1.10   Near roads
                       >1000                    0.55   Distant (less disturbance)
Frequency Ratio (FR) values for representative classes of conditioning factors. FR = (landslide area ratio)/(class area ratio); values >1 denote above-average landslide occurrence.
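To illustrate how per-class FR values combine into an FR index for one map cell, the sketch below adds the FR values of the classes a cell falls into. Two things are assumptions: the tabulated values are read as decimals (for example 1.80 for slopes of 30–45°), and the layers are combined by a simple sum, which the slides do not spell out.

```python
def fr_index(class_fr_values):
    """FR index for one cell: sum of the FR values of its factor classes (assumed rule)."""
    return sum(class_fr_values)

# Hypothetical steep, sparsely vegetated cell close to a road
steep_cell = {
    "Slope 30-45": 1.80,
    "Lithology: Sandstone/Shale": 2.30,
    "NDVI: Low": 1.80,
    "Rainfall > 2500 mm": 1.60,
    "Distance to road 0-500 m": 1.70,
}
# Hypothetical gentle, forested cell far from roads
gentle_cell = {
    "Slope 0-5": 0.30,
    "Lithology: Alluvium/Soil": 0.80,
    "NDVI: High": 0.40,
    "Rainfall < 1800 mm": 0.70,
    "Distance to road > 1000 m": 0.55,
}

steep_index = fr_index(steep_cell.values())
gentle_index = fr_index(gentle_cell.values())
print(f"{steep_index:.2f} {gentle_index:.2f}")
```

The higher index of the first cell would place it in a higher susceptibility zone once the index map is classified.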
MODEL VALIDATION
Data Split:
70% landslide points → Model Training
30% landslide points → Model Testing
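A 70/30 split of this kind could be sketched as below. The random seed, the function name, and the integer point IDs are assumptions; the slides do not say how points were assigned to each subset.

```python
import random

def split_points(points, train_fraction=0.7, seed=42):
    """Shuffle points reproducibly and split them into training and testing sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical point IDs standing in for inventory locations
point_ids = list(range(100))
train, test = split_points(point_ids)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, so both models are trained and tested on the same points.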
Validation Metric:
ROC–AUC (Receiver Operating Characteristic – Area Under Curve)
Measures how well the model separates landslide from non-landslide locations
AUC Range: 0.5 (no skill) to 1.0 (perfect prediction)
Results:
Frequency Ratio Model: AUC = 0.791
Logistic Regression Model: AUC = 0.847
Conclusion:
Both models show good predictive ability
Logistic Regression performs better and is more reliable
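AUC can be read as the probability that a randomly chosen landslide point receives a higher score than a randomly chosen non-landslide point. The sketch below computes it that way by comparing every such pair; the labels and scores are hypothetical, and the study's own values came from its ROC analysis rather than from this code.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.
    Ties count as half (Mann-Whitney formulation)."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical test points: 1 = landslide, 0 = no landslide, with model scores
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(round(roc_auc(labels, scores), 3))
```

Here eight of the nine landslide/non-landslide pairs are ranked correctly, so the AUC is 8/9.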
ROC-AUC curve used for accuracy assessment.
AUC:
FR Model: 0.77
LR Model: 0.82
Indicates good predictive performance, with LR outperforming FR.
RESULTS AND CONCLUSIONS
Landslide Distribution (2010–2024)
135 georeferenced landslides (esp. Dima Hasao & Karbi Anglong)
Clusters in tectonically active, steep, deforested slopes
2022 event: >5,000 landslides after >150 mm/day rainfall
Common near: Faults, roads, weak lithology (shale/sandstone), and low vegetation
Susceptibility Maps Developed Using Two Models
Frequency Ratio (FR) Model:
Simpler, bivariate model
Highlights discrete high-risk zones
~15% of area marked High/Very High Risk
Logistic Regression (LR) Model:
Advanced, multivariate model
Captures complex terrain interactions
~12% of area flagged High/Very High Risk
Risk Zones
Classified Very Low → Very High
High-risk zones mostly in Southern Karbi Anglong & Dima Hasao
Landslide susceptibility map of Dima Hasao district using LR and FR
RESULTS AND CONCLUSIONS
Model Validation & Risk Management Insights
Model   Training AUC   Validation AUC
FR      0.91           0.89
LR      0.96           0.92
✔ LR outperformed FR
✔ Captured interdependencies (e.g., slope ↑, NDVI ↓ = risk ↑)
✔ Greater predictive accuracy for regional landslide modeling
Infrastructure Planning: Reroute roads/rails away from high-risk zones; slope stabilization for safety
Forest Conservation: Low NDVI = Target areas for afforestation
Early Warning Systems: Combine rainfall data with susceptibility zones for alerts
Policy Integration: Support Assam State Disaster Management Plan (2023–2028)
Limitations
Incomplete landslide inventory
DEM resolution limits micro-slope detail
Static model: Can’t account for sudden triggers (e.g., earthquakes)
Remote terrain = Field validation challenges