Lecture 10: Naive Bayes Classifier



Slide Content

Lecture 10: Naïve Bayes Classifier

Contents
What is a Naïve Bayes Classifier?
Bayes Theorem
Why it is called Naïve Bayes
Conditional Independence
How Naïve Bayes Works? – Examples
Different SKLEARN Implementations of Naïve Bayes (Types)
Data Preparation
Advantages and Disadvantages
Where to Use / Not to Use Naïve Bayes
Use Cases

Classification – Process Flow

Predict the probability of buying a computer.
Classes: C1: buys_computer = ‘yes’; C2: buys_computer = ‘no’
Data to be classified (test instance): X = (age <= 30, Income = medium, Student = yes, Credit_rating = Fair)
(Training data table and test instance shown on the slide.)

What is a Naive Bayes Classifier?

Bayes Theorem
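For reference, Bayes' theorem relates the posterior probability of a class C given evidence X to the likelihood, the class prior, and the evidence probability:

P(C | X) = P(X | C) x P(C) / P(X)

where P(C | X) is the posterior probability, P(X | C) is the likelihood, P(C) is the prior probability of the class, and P(X) is the prior probability of the evidence.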

Bayes Theorem - Example

Bayes Theorem - Example Solved

Why is it called Naïve? The Naïve Bayes Classifier is called naïve because of its conditional independence assumption: the features in the dataset are assumed to be mutually independent given the class.
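Under this assumption, the joint likelihood of the features factorizes into a product of per-feature likelihoods, which is what makes the classifier cheap to train and evaluate:

P(x1, x2, ..., xn | C) = P(x1 | C) x P(x2 | C) x ... x P(xn | C)

so the posterior is proportional to P(C) x P(x1 | C) x ... x P(xn | C).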


Conditional Independence

How does the Naive Bayes Classifier work? Example: given records of weather conditions and whether sport was played, you need to calculate the probability of playing and classify whether players will play or not, based on the weather condition.
The Naive Bayes classifier calculates the probability of an event in the following steps:
Step 1: Calculate the prior probability for the given class labels.
Step 2: Find the likelihood probability of each attribute for each class.
Step 3: Put these values into the Bayes formula and calculate the posterior probability.
Step 4: See which class has the higher probability for the given input.
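A minimal Python sketch of these four steps on a tiny, made-up weather/play table (the records below are illustrative, not the dataset from the slides):

from collections import Counter, defaultdict

# Hypothetical training records: (weather, play) pairs -- illustrative only.
data = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"), ("Rainy", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "No"), ("Overcast", "Yes"), ("Sunny", "Yes")]

# Step 1: prior probability of each class label.
class_counts = Counter(label for _, label in data)
total = len(data)
priors = {c: n / total for c, n in class_counts.items()}

# Step 2: likelihood of each attribute value given each class.
value_counts = defaultdict(Counter)
for weather, label in data:
    value_counts[label][weather] += 1
likelihoods = {c: {v: n / class_counts[c] for v, n in counts.items()}
               for c, counts in value_counts.items()}

# Steps 3-4: unnormalized posterior for a new observation; predict the larger one.
x = "Overcast"
posterior = {c: priors[c] * likelihoods[c].get(x, 0.0) for c in priors}
print(max(posterior, key=posterior.get), posterior)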

To simplify the prior and posterior probability calculations, you can use two kinds of tables: frequency and likelihood tables. Both help you calculate the prior and posterior probabilities. The frequency table contains the occurrence of labels for all features. There are two likelihood tables: Likelihood Table 1 shows the prior probabilities of the labels, and Likelihood Table 2 shows the posterior probabilities.
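One way to build such tables in Python is with pandas crosstabs; this is a sketch using hypothetical column names ("Weather", "Play"), not the exact table from the slides:

import pandas as pd

# Hypothetical weather/play records; column names and values are assumptions.
df = pd.DataFrame({
    "Weather": ["Sunny", "Sunny", "Overcast", "Rainy", "Rainy", "Overcast"],
    "Play":    ["No",    "Yes",   "Yes",      "Yes",   "No",    "Yes"],
})

# Frequency table: occurrence of each label for each feature value.
freq = pd.crosstab(df["Weather"], df["Play"])

# Likelihood table 1: prior probability of each label.
priors = df["Play"].value_counts(normalize=True)

# Likelihood table 2: P(Weather = w | Play = c) for every (w, c) pair.
conditionals = pd.crosstab(df["Weather"], df["Play"], normalize="columns")

print(freq, priors, conditionals, sep="\n\n")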

Now suppose you want to calculate the probability of playing when the weather is overcast.
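This is a direct application of Bayes' theorem to a single feature (the counts come from the frequency and likelihood tables above):

P(Yes | Overcast) = P(Overcast | Yes) x P(Yes) / P(Overcast)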


Probability of not playing
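The same formula is used for the 'No' class:

P(No | Overcast) = P(Overcast | No) x P(No) / P(Overcast)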

The probability of the 'Yes' class is higher. So you can conclude that if the weather is overcast, players will play the sport.

Two or More Features: Second Approach (in case of multiple features)

Now suppose you want to calculate the probability of playing when the weather is overcast and the temperature is mild.
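With the naïve independence assumption, the per-feature likelihoods multiply, and the denominator P(Overcast, Mild) is the same for both classes, so the classes can be compared on the numerators alone:

P(Yes | Overcast, Mild) is proportional to P(Overcast | Yes) x P(Mild | Yes) x P(Yes)
P(No | Overcast, Mild) is proportional to P(Overcast | No) x P(Mild | No) x P(No)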


Different SKLEARN Implementations
Based on the type of attributes, there are three well-known implementations of Naïve Bayes:
GaussianNB: used when the data is numeric; assumes the features follow a normal distribution.
BernoulliNB: used when all the features are binary-valued.
MultinomialNB: works well with categorical data with more than two categories.
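A minimal usage sketch of the three scikit-learn estimators; the tiny arrays below are made up purely for illustration:

import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB, MultinomialNB

y = np.array([0, 1, 0, 1])

# GaussianNB: continuous numeric features, assumed normally distributed.
X_num = np.array([[5.1, 3.5], [6.2, 2.9], [4.7, 3.2], [6.9, 3.1]])
print(GaussianNB().fit(X_num, y).predict([[5.0, 3.4]]))

# BernoulliNB: binary-valued (0/1) features.
X_bin = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 1], [0, 0, 1]])
print(BernoulliNB().fit(X_bin, y).predict([[1, 0, 0]]))

# MultinomialNB: count-style features, e.g. word counts in text.
X_cnt = np.array([[2, 0, 3], [0, 4, 1], [3, 1, 0], [0, 2, 2]])
print(MultinomialNB().fit(X_cnt, y).predict([[1, 1, 1]]))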

Advantages
It is not only a simple approach but also a fast and accurate method for prediction.
Naive Bayes has a very low computation cost and can work efficiently on large datasets.
It performs better with discrete response variables than with continuous ones.
It can be used for multi-class prediction problems.
It also performs well on text analytics problems.
When the independence assumption holds, a Naive Bayes classifier performs better than other models such as logistic regression.

Disadvantages
The assumption of independent features: in practice, it is almost impossible for the model to get a set of predictors that are entirely independent.
If a particular attribute value never occurs with a class in the training data, its likelihood is zero, which makes the whole posterior probability zero. In this case, the model is unable to make a prediction. This is known as the Zero Probability/Frequency problem, and it can be handled using Laplace correction.

Laplace Correction
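The standard add-one (Laplace) correction adds 1 to every count when estimating the likelihoods, so that no probability is exactly zero:

P(xi | C) = (count(xi, C) + 1) / (count(C) + k)

where count(xi, C) is the number of training tuples of class C with attribute value xi, count(C) is the number of training tuples of class C, and k is the number of distinct values the attribute can take.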


Use Cases of Naive Bayes
Spam Filtering
Text Classification
Sentiment Analysis
Product Recommendation Systems
Bank Marketing Response Prediction
Loan Default Prediction

When to Use / Not to Use Naïve Bayes
If you think the conditional independence assumption is very weak and the data contains numeric features that are not normally distributed, then Naïve Bayes will not give good results. The independence of categorical attributes can be tested with the chi-square (χ²) test for independence.
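A sketch of such a chi-square test of independence with scipy, on a made-up 2x2 contingency table of two categorical attributes:

from scipy.stats import chi2_contingency

# Hypothetical contingency table of two categorical attributes.
table = [[30, 10],
         [15, 45]]

chi2, p_value, dof, expected = chi2_contingency(table)
# A small p-value (e.g. < 0.05) suggests the attributes are NOT independent,
# which weakens the Naive Bayes assumption.
print(chi2, p_value)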

Data Pre-Processing Requirements
Handle data quality problems: incorrect data, missing values.
Apply discretization to numeric columns.
Apply standardization / normalization to numeric columns.
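A sketch of the two numeric-column steps with scikit-learn; the sample matrix, bin count, and binning strategy are assumptions, not values from the slides:

import numpy as np
from sklearn.preprocessing import KBinsDiscretizer, StandardScaler

# Hypothetical numeric columns (e.g. age, income).
X = np.array([[25.0, 40000.0], [47.0, 72000.0], [35.0, 58000.0], [52.0, 91000.0]])

# Standardization / normalization of numeric columns.
X_scaled = StandardScaler().fit_transform(X)

# Discretization of numeric columns into ordinal bins.
X_binned = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform").fit_transform(X)

print(X_scaled)
print(X_binned)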

Predict the probability of buying a computer.
Classes: C1: buys_computer = ‘yes’; C2: buys_computer = ‘no’
Data to be classified (test instance): X = (age <= 30, Income = medium, Student = yes, Credit_rating = Fair)
(Training data table and test instance shown on the slide.)

Solution
P(Ci):
P(buys_computer = "yes") = 9/14 = 0.643
P(buys_computer = "no") = 5/14 = 0.357
Compute P(X|Ci) for each class:
P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222
P(age = "<=30" | buys_computer = "no") = 3/5 = 0.6
P(income = "medium" | buys_computer = "yes") = 4/9 = 0.444
P(income = "medium" | buys_computer = "no") = 2/5 = 0.4
P(student = "yes" | buys_computer = "yes") = 6/9 = 0.667
P(student = "yes" | buys_computer = "no") = 1/5 = 0.2
P(credit_rating = "fair" | buys_computer = "yes") = 6/9 = 0.667
P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.4
X = (age <= 30, income = medium, student = yes, credit_rating = fair)
P(X|Ci):
P(X | buys_computer = "yes") = 0.222 x 0.444 x 0.667 x 0.667 = 0.044
P(X | buys_computer = "no") = 0.6 x 0.4 x 0.2 x 0.4 = 0.019
P(X|Ci) x P(Ci):
P(X | buys_computer = "yes") x P(buys_computer = "yes") = 0.044 x 0.643 = 0.028
P(X | buys_computer = "no") x P(buys_computer = "no") = 0.019 x 0.357 = 0.007
Therefore, X belongs to the class buys_computer = "yes".
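A quick Python check of the arithmetic above (values copied from the solution; results rounded to three decimals):

# Priors
p_yes, p_no = 9/14, 5/14

# Likelihoods for X = (age <= 30, income = medium, student = yes, credit_rating = fair)
like_yes = (2/9) * (4/9) * (6/9) * (6/9)   # ~0.044
like_no  = (3/5) * (2/5) * (1/5) * (2/5)   # ~0.019

print(round(like_yes * p_yes, 3))  # 0.028 -> class buys_computer = "yes"
print(round(like_no * p_no, 3))    # 0.007 -> class buys_computer = "no"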