Naive Bayes_1.pptx: Slides on Naive Bayes (NB) in classical machine learning

AmgadAbdallah2 · 14 slides · Jun 25, 2024

About This Presentation

Slides introducing the Naive Bayes (NB) classifier in classical machine learning: classification basics, Bayes' rule, the naive independence assumption, and a worked "Play Tennis" example.


Slide Content

Classification
- Predicts categorical class labels.
- Classifies data (using a model) based on instance attributes to predict class labels.
- The model is induced from a training set.
- The model is used to classify (predict) the class of new data.
Typical applications of classification:
- Credit approval
- Target marketing
- Medical diagnosis/prognosis
- Treatment effectiveness analysis

Classification: A Two-Step Process
(1) Model construction: describing a set of predetermined classes.
- Each instance/example is assumed to belong to a predefined class, as determined by its class label.
- The set of instances used for model construction is the training set.
- The model is represented as classification rules, decision trees, or mathematical formulae.
(2) Model usage: classifying future or unknown objects (see the sketch below).
- Model evaluation: estimate the accuracy of the model.
- The known label of each test sample is compared with the model's classification.
- The accuracy rate is the percentage of test-set samples that are correctly classified by the model.
- The test set must be independent of the training set; otherwise over-fitting will occur.
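A minimal code sketch of the two steps, assuming scikit-learn and its bundled Iris data (an illustration, not part of the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Keep the test set independent of the training set so the accuracy
# estimate is not inflated by over-fitting.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 1: model construction -- induce the classifier from the training set.
model = GaussianNB().fit(X_train, y_train)

# Step 2: model usage -- classify the held-out data and estimate accuracy
# by comparing known test labels with the model's predictions.
y_pred = model.predict(X_test)
print(f"accuracy: {accuracy_score(y_test, y_pred):.2f}")
```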

Using a Classifier for Prediction
Data to be classified -> Classifier -> Decision on class assignment
Using the hypothesis for prediction: the classifier can classify any example described in the same manner as the data used in training the system (i.e., the same set of features).

Classification Overview
Training set (data with known classes) -> Classification technique -> Classifier
Data with unknown classes -> Classifier -> Class assignment

Naive Bayes
- Naive Bayes is a simple probabilistic classifier based on applying Bayes' theorem (or Bayes's rule) with strong (naive) independence assumptions.
- Allows us to combine observed data and prior knowledge.
- Provides practical learning algorithms.

Bayes' Rule: Who Is Who
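In symbols, with hypothesis h and data (evidence) d, as in Example 1 below:

```latex
\underbrace{P(h \mid d)}_{\text{posterior}}
\;=\;
\frac{\overbrace{P(d \mid h)}^{\text{likelihood}}\;\;\overbrace{P(h)}^{\text{prior}}}
     {\underbrace{P(d)}_{\text{evidence}}}
```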

Example 1
Let h be the event "raining" and d be the evidence "dark cloud". Bayes' rule gives P(raining | dark cloud) = P(dark cloud | raining) P(raining) / P(dark cloud), where:
- P(dark cloud | raining) is the likelihood of observing a dark cloud when it rains. A dark cloud can also occur in many other events, such as an overcast day or a forest fire. This probability can be obtained from historical data.
- P(raining) is the prior probability of raining. This probability can be obtained from statistical records, for example, the number of rainy days throughout a year.
- P(dark cloud) is the probability of the evidence "dark cloud" occurring.
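A toy numeric illustration (these probabilities are hypothetical, chosen only to show the arithmetic): with P(raining) = 0.3, P(dark cloud | raining) = 0.9, and P(dark cloud) = 0.4,

```latex
P(\text{raining} \mid \text{dark cloud})
= \frac{P(\text{dark cloud} \mid \text{raining})\,P(\text{raining})}{P(\text{dark cloud})}
= \frac{0.9 \times 0.3}{0.4} = 0.675 .
```

Observing the dark cloud raises the probability of rain from the prior 0.3 to the posterior 0.675.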

More on Naive Bayes
- Generally, it is better to have more than one piece of evidence to support the prediction of an event. Typically, the more evidence we can gather, the better the classification accuracy. However, the evidence must relate to the event (it must make sense).
- When we have more than one piece of evidence for building our NB model, we can run into a problem of dependencies, i.e., some evidence may depend on one or more other pieces of evidence. For example, the evidence "dark cloud" directly depends on the evidence "high humidity". Dependencies can make the model very complicated.
- Assuming there are no dependencies is what makes the model "naive".

The Naive Bayes Classifier
- What can we do if our data d has several attributes?
- Naive Bayes assumption: the attributes that describe data instances are conditionally independent given the classification hypothesis (formalized below).
- This is a simplifying assumption, and it may obviously be violated in reality; in spite of that, it works well in practice.
- The Bayesian classifier that uses the naive Bayes assumption and computes the MAP (maximum a posteriori) hypothesis is called the naive Bayes classifier.
- It is one of the most practical learning methods.
- Successful applications: medical diagnosis, text classification.
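In symbols, with class hypothesis v and attributes a1, ..., an (notation following Mitchell, Chapter 6, the textbook cited under Resources), the conditional-independence assumption reads:

```latex
P(a_1, a_2, \ldots, a_n \mid v) \;=\; \prod_{i=1}^{n} P(a_i \mid v)
```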

Example: "Play Tennis" data. This is the standard 14-example training set from Mitchell's Machine Learning, Chapter 6 (cited under Resources), which the working below assumes:

Day  Outlook   Temp  Humidity  Wind    PlayTennis
D1   Sunny     Hot   High      Weak    No
D2   Sunny     Hot   High      Strong  No
D3   Overcast  Hot   High      Weak    Yes
D4   Rain      Mild  High      Weak    Yes
D5   Rain      Cool  Normal    Weak    Yes
D6   Rain      Cool  Normal    Strong  No
D7   Overcast  Cool  Normal    Strong  Yes
D8   Sunny     Mild  High      Weak    No
D9   Sunny     Cool  Normal    Weak    Yes
D10  Rain      Mild  Normal    Weak    Yes
D11  Sunny     Mild  Normal    Strong  Yes
D12  Overcast  Mild  High      Strong  Yes
D13  Overcast  Hot   Normal    Weak    Yes
D14  Rain      Mild  High      Strong  No

The Probabilistic Model
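In the notation of Mitchell, Chapter 6, the naive Bayes classifier outputs the MAP class:

```latex
v_{NB}
= \arg\max_{v_j \in V} P(v_j \mid a_1, \ldots, a_n)
= \arg\max_{v_j \in V} \frac{P(a_1, \ldots, a_n \mid v_j)\,P(v_j)}{P(a_1, \ldots, a_n)}
= \arg\max_{v_j \in V} P(v_j) \prod_{i} P(a_i \mid v_j)
```

The denominator is dropped because it does not depend on the class, and the last step uses the naive independence assumption.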

Based on the examples in the table, classify the following datum x: x = (Outl=Sunny, Temp=Cool, Hum=High, Wind=Strong). That is: play tennis or not? Working (a sketch follows):
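A minimal from-scratch sketch of the working: it computes the unnormalized score P(c) * prod_i P(a_i | c) for each class from the table above, using raw counts and no smoothing.

```python
from collections import Counter

# The 14 "Play Tennis" examples: (Outlook, Temp, Humidity, Wind, PlayTennis)
data = [
    ("Sunny", "Hot", "High", "Weak", "No"),       ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),   ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),   ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"), ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"), ("Rain", "Mild", "High", "Strong", "No"),
]

x = ("Sunny", "Cool", "High", "Strong")  # the datum to classify

class_counts = Counter(row[-1] for row in data)
scores = {}
for c, n_c in class_counts.items():
    score = n_c / len(data)                       # prior P(c): 9/14 Yes, 5/14 No
    rows_c = [row for row in data if row[-1] == c]
    for i, value in enumerate(x):                 # product of likelihoods P(a_i | c)
        score *= sum(1 for row in rows_c if row[i] == value) / n_c
    scores[c] = score

print(scores)                        # approx {'No': 0.0206, 'Yes': 0.0053}
print(max(scores, key=scores.get))   # -> 'No': do not play tennis
```

Since 0.0206 > 0.0053, the classifier predicts PlayTennis = No for x.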

A Closer Look
We can ignore P(E), the probability of the evidence, because it is the same for every class; we only need to compare the classes' relative scores.
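Concretely, using the scores from the sketch above, normalizing would give

```latex
P(\text{No} \mid x) \;\approx\; \frac{0.0206}{0.0206 + 0.0053} \;\approx\; 0.795 ,
```

but the ranking of the classes is unchanged, so the P(E) term can safely be dropped.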

Resources
Textbook reading (contains details about using naive Bayes for text classification): Tom Mitchell, Machine Learning, McGraw-Hill, 1997, Chapter 6.