RANDOM FOREST for machine and deep learning for computer science

ssemwogerere_rajab 0 views 21 slides Oct 13, 2025

About This Presentation

RANDOM FOREST for machine learning


Slide Content

RANDOM FOREST
GONZAGA EDWARD OKANE 2019/HD05/25243U
EGOR ELEAZAR 2019/HD05/25247U

AREAS OF DISCUSSION
What is a decision tree and how does it work
Key terms
How does a decision tree work
What is Random Forest
Why use Random Forest
How Random Forest works
Applications of Random Forest
Use case

WHAT IS A DECISION TREE AND HOW DOES IT WORK?
A decision tree is a tree-shaped diagram used to determine a course of action.
Each branch of the tree represents a possible decision, occurrence, or reaction.

KEY TERMS
1. Entropy – a measure of the randomness or unpredictability in the dataset
2. Information gain – the decrease in entropy after the dataset is split
3. Leaf node – carries the classification or the decision
4. Decision node – has two or more branches
5. Root node – the topmost decision node
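
The first two terms can be written out as formulas. This is a sketch in the standard Shannon form; the symbols S, p_i, A, and S_v are notation assumed here, not taken from the slides.

```latex
H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i
\qquad
\mathrm{Gain}(S, A) = H(S) - \sum_{v} \frac{|S_v|}{|S|}\, H(S_v)
```

Here S is the dataset at a node, p_i is the fraction of S belonging to class i, and the S_v are the subsets produced by splitting S on condition A. For example, a bowl holding two apples and two lemons has H(S) = 1 bit. A split that puts the apples on one side and the lemons on the other leaves both subsets with entropy 0, so the gain of that split is 1 bit.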

HOW DOES A DECISION TREE WORK?
Problem statement: classify the different types of fruit in the bowl based on their features.
The dataset (the bowl) is quite messy and has high entropy.
To split the data, we frame conditions that split it in such a way that the information gain is highest.
NB: Gain is the measure of the decrease in entropy after splitting.

We try to choose the condition that gives the highest gain.
We do that by splitting the data using each condition in turn and checking the gain each one produces.
NB: The condition that gives the highest gain is used to make the first split.

After splitting based on diameter, the entropy has reduced

We then split the right node further based on color.
After this split, a lemon can be predicted with 100% accuracy in this example, and so can an apple.
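
The split selection described above can be sketched in a few lines of Python. The bowl contents, the three candidate conditions, and the function names are hypothetical, chosen only to mirror the diameter-then-color walkthrough; the slides do not give the actual data.

```python
from collections import Counter
from math import log2

# Hypothetical bowl: the slides do not list the fruits, so these rows are assumed.
bowl = (
    [{"diameter": 3, "color": "red", "label": "apple"}] * 2
    + [{"diameter": 3, "color": "yellow", "label": "lemon"}] * 2
    + [{"diameter": 1, "color": "purple", "label": "grape"}] * 4
)

def entropy(rows):
    """Shannon entropy (in bits) of the labels in rows."""
    counts = Counter(r["label"] for r in rows)
    total = len(rows)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def information_gain(rows, condition):
    """Decrease in entropy after splitting rows by a yes/no condition."""
    yes = [r for r in rows if condition(r)]
    no = [r for r in rows if not condition(r)]
    if not yes or not no:
        return 0.0  # the condition does not actually split the data
    weighted = (len(yes) * entropy(yes) + len(no) * entropy(no)) / len(rows)
    return entropy(rows) - weighted

# Candidate conditions to try (assumed for illustration).
conditions = {
    "diameter >= 3": lambda r: r["diameter"] >= 3,
    "color == yellow": lambda r: r["color"] == "yellow",
    "color == red": lambda r: r["color"] == "red",
}

def best_condition(rows, conditions):
    """Name of the condition with the highest gain (first one wins a tie)."""
    return max(conditions, key=lambda name: information_gain(rows, conditions[name]))

print(best_condition(bowl, conditions))  # diameter >= 3

# Second split: only the fruits that passed the diameter test.
right_node = [r for r in bowl if r["diameter"] >= 3]
print(best_condition(right_node, conditions))  # color == yellow
```

With this assumed bowl, the diameter condition wins the first split, and a color condition then separates the apples from the lemons, leaving every leaf with zero entropy.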

WHAT IS RANDOM FOREST?
A method that operates by constructing multiple decision trees: a bunch of decision trees bundled together.
Based on the idea of "the wisdom of the crowd".
Gets a prediction from each tree and selects the best solution by voting.
The decision of the majority of the trees is chosen by the random forest as the final decision.
Example: getting recommendations from friends for vacation destinations.
Can be used for both classification and regression (for regression, the trees' predictions are typically averaged rather than voted on).
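
The way the trees' outputs are combined can be sketched as follows. The function names and sample predictions are hypothetical, and averaging for regression is the usual convention rather than something the slides state.

```python
from collections import Counter
from statistics import mean

def majority_vote(predictions):
    """Classification: return the label predicted by the most trees.
    If two labels tie, the one seen first is returned."""
    return Counter(predictions).most_common(1)[0][0]

def average_prediction(predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return mean(predictions)

# Hypothetical outputs from three trees.
print(majority_vote(["apple", "lemon", "apple"]))  # apple
print(average_prediction([2.0, 4.0, 3.0]))         # 3.0
```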

WHY DO WE USE RANDOM FOREST?
1. Less overfitting
Combining many trees reduces the risk of overfitting compared with a single decision tree.
In overfitting, the model learns "too much" from the training dataset.
Overfitting is the case where the overall cost on the training data is very small, but the generalization of the model is unreliable.
What use is a model that has learned very well from the training data but still can't make reliable predictions for new inputs?
We always want to find the trend, not fit the line to every data point.
Training time is less.

2. High accuracy
Random forest runs efficiently on large databases.
It produces highly accurate predictions for large datasets.
3. Estimates missing data
Maintains accuracy when a large proportion of the data is missing.
E.g. different sets of demographic statistics come in from various areas, where one set is missing the number of children in the house and another is missing the size of the house. Random forest looks at the sets differently, builds two different trees, and then guesses which one fits better.

HOW RANDOM FOREST WORKS
Step 1: Select random samples from the given dataset
Step 2: Construct a decision tree from each sample and get a prediction result from each decision tree
Step 3: Perform a vote for each predicted result
Step 4: Select the prediction result with the most votes as the final prediction
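
The four steps above can be sketched from scratch in Python. This is a minimal illustration under stated assumptions, not a production implementation: the training rows, the feature meanings (diameter and weight), the tree depth limit, and the number of trees are all hypothetical. It also omits the per-split random feature selection that full random forest implementations usually add.

```python
import random
from collections import Counter
from math import log2

def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]

def entropy(rows):
    counts = Counter(label for _, label in rows)
    return sum(-(c / len(rows)) * log2(c / len(rows)) for c in counts.values())

def build_tree(rows, depth=0, max_depth=3):
    """Step 2: grow a small tree. A leaf is a label (assumed to be a string);
    an internal node is a tuple (feature, threshold, left, right)."""
    labels = Counter(label for _, label in rows)
    majority = labels.most_common(1)[0][0]
    if len(labels) == 1 or depth == max_depth:
        return majority
    best = None
    for feature in range(len(rows[0][0])):
        for threshold in sorted({x[feature] for x, _ in rows}):
            left = [r for r in rows if r[0][feature] < threshold]
            right = [r for r in rows if r[0][feature] >= threshold]
            if not left or not right:
                continue
            # Lowest weighted child entropy is the same as highest information gain.
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, feature, threshold, left, right)
    if best is None:
        return majority
    _, feature, threshold, left, right = best
    return (feature, threshold,
            build_tree(left, depth + 1, max_depth),
            build_tree(right, depth + 1, max_depth))

def predict_tree(tree, x):
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if x[feature] < threshold else right
    return tree

def fit_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)
    return [build_tree(bootstrap_sample(rows, rng)) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Steps 3 and 4: collect one vote per tree and return the most common label."""
    votes = Counter(predict_tree(tree, x) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical training rows: ((diameter_cm, weight_g), label).
rows = [
    ((1.0, 5.0), "grape"), ((1.2, 6.0), "grape"),
    ((0.9, 4.0), "grape"), ((1.1, 5.5), "grape"),
    ((7.0, 150.0), "apple"), ((7.5, 170.0), "apple"),
    ((6.8, 140.0), "apple"), ((7.2, 160.0), "apple"),
]

forest = fit_forest(rows)
print(predict_forest(forest, (1.0, 5.0)))
print(predict_forest(forest, (8.0, 200.0)))
```

Because every tree sees a different bootstrap sample, individual trees can disagree; the vote in predict_forest is what smooths those differences out.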

Let's take this blackened fruit and try to classify it.
This is an example where random forest works really well when data is missing.
Diameter = 3
Colour = Orange
Grows in summer = Yes
Shape = Circle

APPLICATIONS
1. Kinect
A motion-sensing device developed by Microsoft for its game consoles.
Uses infrared to track body movements and recreates them in the game.

2. Remote sensing
Used with Enhanced Thematic Mapper (ETM) sensors on satellites to acquire high-resolution imaging information of the Earth's surface.
Less training time
Higher accuracy

3. Object detection
Multiclass object detection, e.g. in traffic, where the algorithm is used to sort out different types of vehicles such as buses and lorries.
Provides better detection in complicated environments.