Accuracy Assessment of Machine Learning Models in Species.pptx
darkevil23
8 slides
Oct 06, 2025
About This Presentation
Size: 1.34 MB
Language: en

Slide Content
Accuracy Assessment of Machine Learning Models in Species Distribution Modelling
What is Accuracy Assessment?
Accuracy assessment is the evaluation of how well a species distribution model predicts actual species occurrence.
Purpose: to ensure reliable predictions for conservation, climate-change studies, and biodiversity planning. A good model gives us:
- Better information about areas of interest.
- The likely future distribution of species in the context of climate change and similar challenges.
- Better allocation of management resources (manpower and expenses) by concentrating effort where it is required.
[Figure: plots generated by the MaxEnt model to visualise model accuracy]
Methods of Accuracy Assessment
- AUC, Kappa and TSS: widely used, but with known limitations.
- Spatial validation, ensemble methods and process-based integration: newer approaches with complex algorithms requiring specialised methods.

1) Area Under the ROC Curve (AUC)
The ROC curve plots the true positive rate (correctly identified presences) against the false positive rate (absences incorrectly flagged as presences) across all thresholds, so AUC is a threshold-independent measure. Range: 0 (classifier performing perfectly in reverse), 0.5 (random) to 1 (perfect discrimination).
Limitation: a high AUC does not guarantee good spatial prediction, as it treats pseudo-absences as true absences and does not penalise complex models.

2) True Skill Statistic (TSS)
TSS = sensitivity + specificity - 1. It is considered more realistic than Kappa for rare species because it is prevalence-independent. Its limitation is that it depends on the threshold classifier.
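The two headline metrics can be computed from first principles. A minimal sketch, using made-up presence/absence labels and suitability scores (illustrative only, not data from this study):

```python
# Hypothetical labels (1 = presence, 0 = pseudo-absence) and model scores;
# invented for illustration, not the presentation's data.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

def auc(labels, scores):
    """Threshold-independent AUC via the Mann-Whitney rank formulation:
    the probability that a random presence outscores a random absence."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tss(labels, scores, threshold=0.5):
    """TSS = sensitivity + specificity - 1; unlike AUC it needs a threshold."""
    tp = sum(y == 1 and s >= threshold for y, s in zip(labels, scores))
    fn = sum(y == 1 and s < threshold for y, s in zip(labels, scores))
    tn = sum(y == 0 and s < threshold for y, s in zip(labels, scores))
    fp = sum(y == 0 and s >= threshold for y, s in zip(labels, scores))
    return tp / (tp + fn) + tn / (tn + fp) - 1

print(round(auc(labels, scores), 3))  # 0.933
print(round(tss(labels, scores), 3))  # 0.467
```

Note how AUC sweeps all thresholds implicitly, while TSS changes if the 0.5 cut-off is moved, which is exactly the threshold dependence the slide mentions.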
Prevalence, Threshold and Classifier
- Prevalence is an inherent trait of the dataset: how common or rare the occurrence of the event being studied is.
- Prevalence-independent means the classifier is not affected by prevalence and depends only on correctly identifying positive and negative cases.
- Threshold refers to the cut-off point that is chosen.
- Classifier: any method that acts as a decision boundary for the threshold, which by default is 0.5.
I used the maximum training sensitivity plus specificity threshold, as it is prevalence-independent; pseudo-absences therefore have no effect on it, which makes it well suited to models relying on presence-only data (Liu et al., 2013).
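The maximum sensitivity plus specificity rule can be sketched as a scan over candidate cut-offs. The labels and scores below are hypothetical, and this is only one simple way to implement the rule:

```python
# Hypothetical calibration data (1 = presence, 0 = pseudo-absence).
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

def max_sens_spec_threshold(labels, scores):
    """Try every observed score as a candidate cut-off and keep the one
    maximising sensitivity + specificity (a prevalence-independent rule)."""
    best_t, best_sum = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(y == 1 and s >= t for y, s in zip(labels, scores))
        fn = sum(y == 1 and s < t for y, s in zip(labels, scores))
        tn = sum(y == 0 and s < t for y, s in zip(labels, scores))
        fp = sum(y == 0 and s >= t for y, s in zip(labels, scores))
        total = tp / (tp + fn) + tn / (tn + fp)
        if total > best_sum:
            best_t, best_sum = t, total
    return best_t

print(max_sens_spec_threshold(labels, scores))  # 0.4
```

Because sensitivity and specificity are each computed within their own class, the chosen cut-off does not shift when more pseudo-absences are added, which is why the rule suits presence-only data.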
The precision-recall trade-off is the inverse relationship whereby improving precision (the share of positive predictions that are correct) often decreases recall (the share of actual positives that are captured), and vice versa. The chosen threshold classifier is the point where the proportions of correctly predicted presences and pseudo-absences are jointly maximised (Rose and Wall, 2011). It therefore increases the predictive ability of the model for real-world scenarios, because false positives are minimised.

Types of Errors
- Overfitting: the model learns from noise in the data rather than true ecological relationships. Conversely, a good fit means the model makes ecological sense.
- Overparameterization: too many variables are used relative to the sample size, leading to unrealistic ecological interpretations.
- Omission errors (false negatives): the model fails to predict species presence where it actually occurs, leading to missed areas that require attention.
- Commission errors (false positives): the model predicts presence where the species is actually absent, leading to overestimation of the area that needs focus, for example overestimation of suitable habitat.
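The trade-off and the two error types can be seen by evaluating the same hypothetical model at a low and a high cut-off; in SDM terms, false positives are commission errors and false negatives are omission errors:

```python
# Hypothetical data (1 = presence, 0 = pseudo-absence); not from this study.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

def confusion(labels, scores, threshold):
    """Confusion counts at a cut-off: fp = commission, fn = omission errors."""
    tp = sum(y == 1 and s >= threshold for y, s in zip(labels, scores))
    fp = sum(y == 0 and s >= threshold for y, s in zip(labels, scores))  # commission
    fn = sum(y == 1 and s < threshold for y, s in zip(labels, scores))   # omission
    tn = sum(y == 0 and s < threshold for y, s in zip(labels, scores))
    return tp, fp, fn, tn

# A low cut-off captures every presence (high recall) at the cost of more
# commission errors; a high cut-off flips the trade-off.
for t in (0.3, 0.8):
    tp, fp, fn, tn = confusion(labels, scores, t)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(f"threshold={t}: precision={precision:.2f}, recall={recall:.2f}")
```

At the low cut-off recall is perfect but precision suffers; at the high cut-off every predicted presence is correct but one real presence is missed.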
3) Kappa
Kappa is a prevalence-dependent metric and is therefore generally not considered good for SDMs because of the Kappa paradox. The paradox arises with unbalanced data, as in presence-only settings. Example: most of the points are "No" and only a few are "Yes". Because the model guesses "No" almost all the time, and the true answer is "No" most of the time, the agreement due to chance becomes very high, which makes the Kappa score plummet. Kappa is therefore not considered a good measure for SDMs, especially for large areas or rare species.

4) AIC
The Akaike Information Criterion (AIC) is a statistical measure used to compare different statistical or machine learning models and find the one that best fits the data. The goal is to strike a balance between how well the model explains the data (goodness of fit) and how simple it is (parsimony). AIC is a comparative metric and does not provide useful information on its own.
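Both points can be made concrete in a few lines. The confusion counts and log-likelihoods below are invented to illustrate the Kappa paradox and the comparative use of AIC; they are not values from this study:

```python
def cohens_kappa(tp, fp, fn, tn):
    """Kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = tp + fp + fn + tn
    po = (tp + tn) / n
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (po - pe) / (1 - pe)

# Kappa paradox with unbalanced data: 5 presences among 100 cells, and a
# model that predicts "absent" everywhere. Raw agreement is 95%, but
# chance agreement is also 95%, so Kappa collapses to 0.
print(cohens_kappa(tp=0, fp=0, fn=5, tn=95))  # 0.0

def aic(k, log_likelihood):
    """AIC = 2k - 2*ln(L): rewards fit, penalises the parameter count k."""
    return 2 * k - 2 * log_likelihood

# AIC is only meaningful comparatively: rank models by delta AIC
# (hypothetical log-likelihoods).
aics = {"simple": aic(3, -3325.0), "complex": aic(8, -3270.0)}
best = min(aics.values())
print({m: round(a - best, 2) for m, a in aics.items()})
```

The absolute AIC numbers say nothing on their own; only the differences between candidate models matter, which is why the results table below reports Delta AIC alongside AIC.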
Variables                                        Training AUC  Test AUC  TSS    Kappa  AIC      Delta AIC

a) Landscape Features
Elevation (DEM)                                  0.766         0.755     0.398  0.011  6662.69  106.72
Ruggedness (RGD)                                 0.749         0.739     0.384  0.010  6717.07  161.09
Drainage Density (Drain_D)                       0.562         0.504     –      0.004  7002.76  446.79
Distance to Protected Area (D2PA)                0.662         0.625     0.224  0.006  6894.19  338.22
Average NDVI (NDVI_avg)                          0.666         0.62      0.213  0.005  6776.09  220.11

b) Anthropogenic Pressures
Human Population Density (HPop_D)                0.746         0.733     0.396  0.017  6966.28  410.30
Road Density (Rd_D)                              0.744         0.784     0.402  0.010  6621.07  65.09
Distance to Road (D2Rd)                          0.721         0.735     0.350  0.008  6780.33  224.35
Nightlight Intensity (NI)                        0.745         0.782     0.410  0.012  6743.96  187.99
Livestock Density (Liv_D)                        0.714         0.707     0.287  0.013  7049.42  493.45

c)
D2PA+Drain_D+DEM+NDVI_Avg (Habitat Features)     0.787         0.754     0.446  0.014  6606.94  50.97
D2PA+Drain_D+DEM+NDVI_Avg+Liv_                   0.793         0.763     0.403  0.016  6599.72  43.74
DEM+NI+HPop_D+Rd_D+D2PA
  (Leopard risk prediction model)                0.844         0.801     0.481  0.018  6555.97  0.00
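As a sanity check on the table, Delta AIC is each model's AIC minus the lowest AIC in the comparison (6555.97, the leopard risk prediction model); occasional 0.01 differences against the table come from the AIC values themselves being rounded to two decimals:

```python
# AIC values from the table, in row order; Delta AIC = AIC - min(AIC).
aics = [6662.69, 6717.07, 7002.76, 6894.19, 6776.09,
        6966.28, 6621.07, 6780.33, 6743.96, 7049.42,
        6606.94, 6599.72, 6555.97]
best = min(aics)
deltas = [round(a - best, 2) for a in aics]
print(deltas[0], deltas[-1])  # 106.72 0.0
```

The best-supported model has Delta AIC = 0 by construction, which is consistent with the leopard risk prediction model also having the highest test AUC and TSS in the table.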