Decision Tree machine learning classification .pptx

asmaashalma456 21 views 30 slides Sep 21, 2024

About This Presentation

Machine learning


Slide Content

Decision Tree

Agenda: Introduction to classification. Introduction to decision tree. Design issues. References.

Introduction To Classification

Introduction To Classification
Classification is the task of assigning objects to one of several predefined classes. The set of records available for developing classification methods is divided into two subsets: a training set and a test set. The training set is used to build the model, and the test set is used to validate it.
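The training/test division described above can be sketched as follows; the 70/30 ratio, function name, and seed are illustrative assumptions, not part of the slides:

```python
import random

# Divide the available records into a training set (to build the model)
# and a test set (to validate it). The 70/30 ratio is an assumption.
def train_test_split(records, train_fraction=0.7, seed=42):
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # shuffle a copy, keep input intact
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

records = list(range(10))
train, test = train_test_split(records)
print(len(train), len(test))  # 7 3
```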

Introduction To Classification Training Phase

Introduction To Classification Classification Phase

Introduction To Classification
Evaluation of classification models: count the test records that are correctly (or incorrectly) predicted by the classification model. These counts are tabulated in a confusion matrix:

                        Predicted Class = 1   Predicted Class = 0
  Actual Class = 1             f11                   f10
  Actual Class = 0             f01                   f00
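The four counts f11, f10, f01, f00 above can be sketched as follows, assuming binary class labels encoded as 1 and 0 (the function name is illustrative):

```python
# Tally each (actual, predicted) pair into the four confusion-matrix cells.
def confusion_matrix(actual, predicted):
    counts = {"f11": 0, "f10": 0, "f01": 0, "f00": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["f11"] += 1   # correctly predicted class 1
        elif a == 1 and p == 0:
            counts["f10"] += 1   # class 1 mispredicted as 0
        elif a == 0 and p == 1:
            counts["f01"] += 1   # class 0 mispredicted as 1
        else:
            counts["f00"] += 1   # correctly predicted class 0
    return counts

actual    = [1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 1]
print(confusion_matrix(actual, predicted))
# {'f11': 2, 'f10': 1, 'f01': 1, 'f00': 1}
```

Accuracy then follows as (f11 + f00) divided by the total number of test records.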

Decision Tree

Introduction To Decision Tree
A decision tree is a flowchart-like tree structure, where each internal (non-leaf) node denotes a test on an attribute, each branch represents an outcome of the test, and each leaf (terminal) node holds a class label. Construction of a decision tree follows a top-down strategy.
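The flowchart-like structure described above can be sketched as a small data structure; the attribute names, outcomes, and tree shape below are purely illustrative:

```python
# Internal nodes test an attribute, branches are test outcomes,
# and leaf nodes hold a class label.
class Node:
    def __init__(self, attribute=None, branches=None, label=None):
        self.attribute = attribute      # attribute tested at this node
        self.branches = branches or {}  # maps test outcome -> child Node
        self.label = label              # class label (leaf nodes only)

def classify(node, record):
    # Follow the branch matching the record's attribute value until a leaf.
    while node.label is None:
        node = node.branches[record[node.attribute]]
    return node.label

# A tiny illustrative tree with one internal node and two leaves.
tree = Node("outlook", {
    "sunny": Node(label="no"),
    "overcast": Node(label="yes"),
})
print(classify(tree, {"outlook": "overcast"}))  # yes
```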

Example of decision tree

Design issues
- How should the training records be split?
- When should the splitting procedure stop?

Methods for expressing attribute test conditions:
a. Binary attribute: generates two possible outcomes (binary split).
b. Nominal attribute: multiway split, or binary split (e.g., CART).
c. Ordinal attribute: multiway split, or binary split (the grouping must respect the order of the values).
d. Continuous attribute: multiway split, or binary split.
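The attribute types above can be sketched as test conditions; all attribute value sets, orderings, and thresholds below are illustrative assumptions:

```python
# a. Binary attribute: exactly two outcomes.
def binary_test(value):
    return value == "yes"

# b. Nominal attribute, multiway split: one branch per distinct value.
def nominal_multiway(value):
    return value  # e.g., "single", "married", "divorced" each get a branch

# b. Nominal attribute, binary split (CART-style grouping into two sets).
def nominal_binary(value):
    return value in {"single", "divorced"}

# c. Ordinal attribute, binary split: the grouping must respect the order,
#    e.g., {small, medium} vs {large}, never {small, large} vs {medium}.
def ordinal_binary(value):
    order = ["small", "medium", "large"]
    return order.index(value) <= order.index("medium")

# d. Continuous attribute, binary split: value <= threshold vs value > threshold.
def continuous_binary(value, threshold=80_000):
    return value <= threshold

print(ordinal_binary("small"), continuous_binary(90_000))  # True False
```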

Classify the following attributes:
- Time in terms of AM or PM: binary, ordinal.
- Angles as measured in degrees between 0 and 360: continuous.
- Bronze, Silver, and Gold medals as awarded at the Olympics: discrete, ordinal.
- Number of patients in a hospital: discrete.

Attribute Selection Measures
The attribute selection measure provides a ranking for each attribute describing the given training tuples. The attribute having the best score for the measure is chosen as the splitting attribute for the given tuples. Examples of attribute selection measures are information gain, gain ratio, and the Gini index.

Attribute Selection Measures (Information Gain) ID3 uses information gain as its attribute selection measure. The attribute with the highest information gain is chosen as the splitting attribute. This attribute minimizes the information needed to classify the tuples in the resulting partitions and reflects the least randomness or “impurity” in these partitions.

Attribute Selection Measures (Information Gain)
The expected information (in bits) needed to classify a tuple in D (the entropy) is:

  Info(D) = -Σ_{i=1}^{m} p_i log2(p_i)

where p_i is the probability that a tuple in D belongs to class C_i. After partitioning D on attribute A into v subsets D_1, ..., D_v:

  Info_A(D) = Σ_{j=1}^{v} (|D_j| / |D|) × Info(D_j)

The final information gain is the difference:

  Gain(A) = Info(D) - Info_A(D)
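The three quantities above can be sketched directly from their definitions; partitions are passed as lists of per-class counts, and the function names are illustrative:

```python
import math

# Info(D): entropy of a node given its per-class record counts.
def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Gain(A) = Info(D) - Info_A(D), where Info_A(D) is the weighted
# average entropy of the partitions produced by splitting on A.
def info_gain(parent_counts, partitions):
    total = sum(parent_counts)
    weighted = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(parent_counts) - weighted

# 4 positive and 6 negative records, split into partitions (4,3) and (0,3).
print(round(entropy([4, 6]), 3))                      # 0.971
print(round(info_gain([4, 6], [[4, 3], [0, 3]]), 3))  # 0.281
```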

Examples:

Splitting binary attributes (using Information Gain): example
The data set D contains 10 records: 6 of class C0 (-) and 4 of class C1 (+).

  Info(D) = -(4/10) log2(4/10) - (6/10) log2(6/10) = 0.971

Suppose there are two ways (A and B) to split the data into smaller subsets.

Splitting binary attributes (using Information Gain): split A
Branch T (7 records): 4 (+), 3 (-). Branch F (3 records): 0 (+), 3 (-).

  Gain(A) = Info(D) - [(7/10) × Info(4,3) + (3/10) × Info(0,3)]
          = 0.971 - (0.7 × 0.985 + 0.3 × 0) = 0.971 - 0.690 = 0.281

Splitting binary attributes (using Information Gain): split B
Branch T (4 records): 3 (+), 1 (-). Branch F (6 records): 1 (+), 5 (-).

  Gain(B) = Info(D) - [(4/10) × Info(3,1) + (6/10) × Info(1,5)]
          = 0.971 - (0.4 × 0.811 + 0.6 × 0.650) = 0.971 - 0.714 = 0.256

Since Gain(A) > Gain(B), split A is preferred.
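As a check, the two gains can be recomputed with exact arithmetic, assuming the partition counts read off the slides (split A: T = 4+/3-, F = 0+/3-; split B: T = 3+/1-, F = 1+/5-); function names are illustrative:

```python
import math

# Entropy of a node from its per-class record counts.
def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Information gain of a candidate split, given the parent counts
# and the per-class counts of each resulting partition.
def gain(parent, partitions):
    total = sum(parent)
    return entropy(parent) - sum(sum(p) / total * entropy(p) for p in partitions)

gain_a = gain([4, 6], [[4, 3], [0, 3]])  # split A
gain_b = gain([4, 6], [[3, 1], [1, 5]])  # split B
print(round(gain_a, 3), round(gain_b, 3))  # 0.281 0.256
```

Split A yields the higher gain, so it would be chosen as the splitting attribute.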

References
1. Jerzy W. Grzymala-Busse, "Selected Algorithms of Machine Learning from Examples", Fundamenta Informaticae 18 (1993), 193–207.
2. Thair Nu Phyu, "Survey of Classification Techniques in Data Mining", Proceedings of the International MultiConference of Engineers and Computer Scientists, Vol. I, IMECS 2009, 18–20 March 2009, Hong Kong.
3. J. Han and M. Kamber, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2011.