Model Performance Metrics. Accuracy, Precision, Recall
MeghaSharma504
Slide Content
Data Science Model Performance Metrics (Accuracy, Precision, Recall)
Model Performance Metrics
Model performance metrics are measurements used to evaluate the effectiveness and efficiency of a predictive model or machine learning algorithm. Common metrics for evaluating a predictive model include:
- Accuracy
- Precision
- Recall (Sensitivity)
- F1-Score
- Confusion Matrix
- ROC Curve and AUC
TP, TN, FP, FN
A true positive (TP) is an outcome where the model correctly predicts the positive class. Similarly, a true negative (TN) is an outcome where the model correctly predicts the negative class. A false positive (FP) is an outcome where the model incorrectly predicts the positive class, and a false negative (FN) is an outcome where the model incorrectly predicts the negative class.

                    Predicted Positive      Predicted Negative
Actual Positive     True Positive (TP)      False Negative (FN)
Actual Negative     False Positive (FP)     True Negative (TN)
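These four counts can be tallied directly from pairs of actual and predicted labels. The sketch below is an illustrative Python example (not from the original slides), with made-up labels where 1 denotes the positive class and 0 the negative class.

```python
# Minimal sketch: counting TP, TN, FP, FN from actual/predicted label pairs.
# The labels are made up for illustration (1 = positive class, 0 = negative class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # correctly predicted positive
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # correctly predicted negative
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # incorrectly predicted positive
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # incorrectly predicted negative

print(tp, tn, fp, fn)  # 3 3 1 1
```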
Accuracy
Accuracy is the ratio of the number of correct predictions to the total number of predictions. It is calculated as:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Accuracy is useful in binary classification with balanced classes, and it can also be used for evaluating a multiclass classification model when the classes are balanced. When the classes in the dataset are highly imbalanced, meaning there is a significant disparity in the number of instances between classes, accuracy can be misleading. A model may achieve high accuracy by simply predicting the majority class for every instance, ignoring the minority class entirely.
Example
Let's consider a medical diagnosis scenario where we are developing a model to predict whether a patient has a rare disease. Suppose we have a dataset of 100 patients, of which only 2 have the disease. This dataset represents a highly imbalanced scenario. Now let's say we develop a simple classifier that always predicts that a patient does not have the disease. Despite its high accuracy of 98%, this classifier is not useful because it fails to identify any patients with the disease; it simply predicts that every patient is disease-free. In such cases, evaluation metrics like precision, recall, or F1-score provide more insightful information about the model's performance, especially concerning its ability to correctly identify the minority class (patients with the disease).
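To make the pitfall concrete, here is a small Python sketch (not part of the original slides) that scores an always-negative classifier on the 100-patient dataset described above, using the same counts assumed in the example.

```python
# Rare-disease example: 100 patients, only 2 with the disease (label 1).
y_true = [1] * 2 + [0] * 98

# Trivial classifier that always predicts "no disease" (label 0).
y_pred = [0] * 100

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.98 -- looks impressive
print(tp)        # 0    -- yet not a single diseased patient is identified
```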
Precision
Precision is a measure of a model's performance that tells us how many of the positive predictions made by the model are actually correct. It is calculated as:
Precision = TP / (TP + FP)
Precision is particularly useful in scenarios where the cost of false positives is high. It matters in music or video recommendation systems, e-commerce websites, etc., where wrong results could lead to customer churn and harm the business. It gives us insight into the model's ability to avoid false positives; a higher precision indicates fewer false positives.
Example
Suppose we have a dataset of 1000 emails, of which 200 are spam (positive class) and 800 are not spam (negative class). After training our spam detection model, it predicts that 250 emails are spam.
True Positives (TP): 150 (correctly identified spam emails)
False Positives (FP): 100 (non-spam emails incorrectly classified as spam)
Using these numbers, let's calculate precision:
Precision = 150 / (150 + 100) = 150 / 250 = 0.6
So, the precision of the model is 0.6, or 60%. This means that out of all the emails predicted as spam, 60% of them are actually spam.
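The arithmetic above can be reproduced in a couple of lines; this is just an illustrative Python sketch applying the precision formula to the counts stated in the example.

```python
# Spam-detection example: precision from the stated counts.
tp = 150  # spam emails correctly classified as spam
fp = 100  # non-spam emails incorrectly classified as spam

precision = tp / (tp + fp)
print(precision)  # 0.6 -> 60% of the emails predicted as spam are actually spam
```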
Recall (Sensitivity)
Also known as sensitivity or true positive rate, recall measures the proportion of true positive predictions among all actual positive instances in the dataset. It is calculated as:
Recall = TP / (TP + FN)
Recall is particularly useful in scenarios where capturing all positive instances is crucial, even if it means accepting a higher rate of false positives. In medical diagnosis, missing a positive instance (a false negative) can have severe consequences for the patient's health or even lead to loss of life. High recall ensures that the model identifies as many positive cases as possible, reducing the likelihood of missing critical diagnoses. It gives us insight into the model's ability to avoid false negatives, which are cases where patients with the disease are incorrectly diagnosed as not having it.
Example
Suppose we have a dataset of 100 patients who were tested for a specific disease, where 20 patients actually have the disease (positive class) and 80 patients do not (negative class). After training our diagnostic model:
True Positives (TP): 15 (patients correctly diagnosed with the disease)
False Positives (FP): 5 (patients incorrectly diagnosed with the disease)
False Negatives (FN): 5 (patients with the disease incorrectly diagnosed as not having it)
True Negatives (TN): 75 (patients correctly diagnosed as not having the disease)
Recall = 15 / (15 + 5) = 15 / 20 = 0.75
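As before, a short Python sketch (illustrative only) applies the recall formula to the counts stated in this example.

```python
# Diagnostic example: recall from the stated counts.
tp = 15  # patients with the disease who were correctly diagnosed
fn = 5   # patients with the disease who were missed by the model

recall = tp / (tp + fn)
print(recall)  # 0.75 -> 75% of the actual disease cases are detected
```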
Precision vs Recall
Precision can be seen as a measure of quality: higher precision means that an algorithm returns more relevant results than irrelevant ones. Precision measures the accuracy of positive predictions and is important when the cost of false positives is high (e.g. spam detection). Recall can be seen as a measure of quantity: higher recall means that an algorithm returns most of the relevant results (whether or not irrelevant ones are also returned). Recall measures the completeness of positive predictions and is important when the cost of false negatives is high (e.g. disease diagnosis).
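To see both views on the same predictions, here is an optional sketch that assumes scikit-learn is installed; the labels are made up for illustration and are not from the original slides.

```python
# Sketch: the same predictions scored for precision (quality) and recall (quantity).
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # 4 actual positives, 6 actual negatives
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]   # TP=2, FN=2, FP=1, TN=5

print(precision_score(y_true, y_pred))  # ~0.67 -- 2 of the 3 positive predictions are correct
print(recall_score(y_true, y_pred))     # 0.5  -- only 2 of the 4 actual positives are found
```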
Thanks for Watching! Please check the description box for the link to Machine Learning videos.