Introduction
Classification techniques such as logistic regression, LDA, and SVM have been applied in many fields of science. Evaluation metrics and their significance must be interpreted correctly when comparing different learning algorithms, in order to find the most suitable algorithm for a particular problem. There are several ways of evaluating classification algorithms; most of these measures are scalar metrics and some of them are graphical.
1- Confusion matrix
5- Predictive values
   Positive predictive value (PPV) = TP / (TP + FP)
6- Likelihood ratios
   Positive likelihood ratio (LR+) = TPR / (1 - TNR)
   Negative likelihood ratio (LR-) = (1 - TPR) / TNR
   where the true negative rate TNR = TN / (TN + FP) and the false positive rate FPR = 1 - TNR
   Diagnostic odds ratio (DOR) = (LR+) / (LR-)
7- Youden's index
   YI = TPR + TNR - 1
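For illustration, a minimal Python sketch of these formulas, using assumed example counts TP, FP, TN, FN rather than values from the demo:

# Minimal sketch: derive the metrics above from raw confusion-matrix counts.
# TP, FP, TN, FN are assumed example values, not results from the demo.
TP, FP, TN, FN = 50, 5, 90, 10

TPR = TP / (TP + FN)      # true positive rate (sensitivity / recall)
TNR = TN / (TN + FP)      # true negative rate (specificity)
FPR = 1 - TNR             # false positive rate
PPV = TP / (TP + FP)      # positive predictive value (precision)

LR_pos = TPR / (1 - TNR)  # positive likelihood ratio (LR+)
LR_neg = (1 - TPR) / TNR  # negative likelihood ratio (LR-)
DOR = LR_pos / LR_neg     # diagnostic odds ratio
YI = TPR + TNR - 1        # Youden's index

print(PPV, LR_pos, LR_neg, DOR, YI)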
1- Confusion matrix (cont.)
8- Matthews correlation coefficient (MCC) (sensitive to imbalanced data)
   MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
9- Discriminant power (DP)
10- F1-score
   F1 = 2 * (PPV * TPR) / (PPV + TPR)
11- Markedness (MK) (sensitive to imbalanced data)
   MK = PPV + NPV - 1
1- Confusion matrix (cont.)
12- Balanced classification rate or balanced accuracy (BCR)
   BCR = (TPR + TNR) / 2
13- Geometric mean (GM)
   GM = sqrt(TPR * TNR)
14- Optimization precision (OP) (sensitive to imbalanced data)
15- Jaccard index (sensitive to imbalanced data)
   Jaccard = TP / (TP + FP + FN)
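A similar sketch for the remaining scalar metrics, again with assumed example counts; DP and OP are left out since their formulas are not reproduced above:

# Minimal sketch: scalar metrics derived from the same assumed counts.
import math

TP, FP, TN, FN = 50, 5, 90, 10
TPR = TP / (TP + FN)
TNR = TN / (TN + FP)
PPV = TP / (TP + FP)
NPV = TN / (TN + FN)

MCC = (TP * TN - FP * FN) / math.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
F1 = 2 * PPV * TPR / (PPV + TPR)
MK = PPV + NPV - 1             # markedness
BCR = (TPR + TNR) / 2          # balanced accuracy
GM = math.sqrt(TPR * TNR)      # geometric mean
jaccard = TP / (TP + FP + FN)  # Jaccard index

print(MCC, F1, MK, BCR, GM, jaccard)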
2- Receiver operating characteristic (ROC) curve
It is used to balance the benefits (true positives) against the costs (false positives). In multi-class classification, ROC becomes more complex than in the binary case; one solution to this problem is to produce one ROC curve for each class. ROC is insensitive to imbalanced data.
ROC
The steps of generating a ROC curve:
- Sort the samples according to their scores.
- Change the threshold value from maximum to minimum, processing one sample at a time and updating the values of TP and FP each time.
- At each step, calculate the values of TPR and FPR and push them onto the ROC curve.
- When the threshold becomes very low, all samples are classified as positive, and hence the values of both TPR and FPR are one.
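A minimal NumPy sketch of these steps; roc_points is a hypothetical helper name, and ties between scores are ignored for simplicity:

import numpy as np

def roc_points(scores, labels):
    # Sort samples by score in descending order, then lower the threshold
    # one sample at a time, updating the TP and FP counts as we go.
    order = np.argsort(scores)[::-1]
    labels = np.asarray(labels)[order]
    P, N = labels.sum(), (1 - labels).sum()
    tp = fp = 0
    fpr_list, tpr_list = [0.0], [0.0]
    for y in labels:
        if y == 1:
            tp += 1
        else:
            fp += 1
        tpr_list.append(tp / P)
        fpr_list.append(fp / N)
    # The last point is (FPR, TPR) = (1, 1): every sample classified positive.
    return np.array(fpr_list), np.array(tpr_list)

# Example: fpr, tpr = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])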
3- Area under the ROC curve (AUC)
A scalar value that summarizes the expected performance shown by the ROC curve. It is obtained by summing the area (base * average height) of the trapezoids between consecutive ROC points:
AUC = sum over i of (FPR_{i+1} - FPR_i) * (TPR_i + TPR_{i+1}) / 2
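A small sketch of this computation with the trapezoidal rule; auc_trapezoid is a hypothetical helper name, and its inputs are the points produced by roc_points above:

import numpy as np

def auc_trapezoid(fpr, tpr):
    # Sum the areas of the trapezoids between consecutive ROC points.
    fpr, tpr = np.asarray(fpr), np.asarray(tpr)
    bases = np.diff(fpr)                # FPR_{i+1} - FPR_i
    heights = (tpr[:-1] + tpr[1:]) / 2  # average of TPR_i and TPR_{i+1}
    return float(np.sum(bases * heights))

# Example: auc = auc_trapezoid(*roc_points(scores, labels))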
4- Precision-Recall (PR) curve
Shows the relationship between recall and precision. It is generated by changing the threshold, as with the ROC curve.
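The same threshold sweep used for ROC can produce the PR points; pr_points is a hypothetical helper name, not from the referenced paper:

import numpy as np

def pr_points(scores, labels):
    # Lower the threshold one sample at a time and record (recall, precision).
    order = np.argsort(scores)[::-1]
    labels = np.asarray(labels)[order]
    P = labels.sum()
    tp = fp = 0
    recall, precision = [], []
    for y in labels:
        tp += y
        fp += 1 - y
        recall.append(tp / P)
        precision.append(tp / (tp + fp))
    return np.array(recall), np.array(precision)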
Demo
I used the Wisconsin Diagnostic Breast Cancer (WDBC) dataset [1][2] to build models and apply the classification assessment methods. The data has dimensions (569, 32) and two classes: Benign (B, 357 samples) and Malignant (M, 212 samples), so it is not balanced.
Three classification techniques: logistic regression, decision tree, linear discriminant.
Three assessment methods: confusion matrix (recall, accuracy, F1-score, precision), ROC, AUC.
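The original demo code is not shown in the slides; the sketch below shows how such a demo could be reproduced with scikit-learn. Its bundled copy of WDBC keeps the 30 feature columns and stores the diagnosis as the target, dropping the ID column; the train/test split, feature scaling, and random seed are my assumptions:

# Sketch of reproducing the demo with scikit-learn (split, scaling, seed assumed).
from sklearn.datasets import load_breast_cancer
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # WDBC: 569 samples, 30 features
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

models = {
    "Logistic regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Linear discriminant": LinearDiscriminantAnalysis(),
}

for name, model in models.items():
    model.fit(X_tr, y_tr)
    y_pred = model.predict(X_te)
    y_score = model.predict_proba(X_te)[:, 1]
    print(name)
    print(classification_report(y_te, y_pred, digits=3))  # precision, recall, F1
    print("AUC:", roc_auc_score(y_te, y_score))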
Demo: Confusion Matrix Result
Demo: ROC & AUC
Demo
Finally, after assessment we can choose the best classifier based on the confusion matrix, ROC, and AUC results. The best classifier is the linear discriminant, as it has the highest recall, accuracy, F1-score, precision, and AUC. The ROC curves also suggest that the linear discriminant is the best one, as it is closest to the ideal curve (nearest the top-left corner).
Conclusion
The paper [1] gives a detailed overview of classification assessment measures. It explains the relations between these measures and the robustness of each of them against imbalanced data. I implemented a real problem (breast cancer diagnosis) to show the importance of classification evaluation in choosing the best classifier with the highest accuracy, which I think is more illustrative than the arbitrary numerical values used in the articles.
References
[1] Tharwat, A., 2018. Classification assessment methods. Applied Computing and Informatics.
[2] Breast Cancer Wisconsin (Diagnostic) Data Set, accessed March 28, 2020. http://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(diagnostic)
[3] Omondiagbe, D.A., Veeramani, S. and Sidhu, A.S., 2019. Machine Learning Classification Techniques for Breast Cancer Diagnosis. IOP Conference Series: Materials Science and Engineering, Vol. 495, No. 1, p. 012033. IOP Publishing.