machine _learning_introductionand python.pptx

ChandrakalaV15 5 views 54 slides Mar 10, 2025

About This Presentation

contains the theory of machine learning


Slide Content

Introduction to Machine Learning

What is Machine Learning?

Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

Why do we need Machine Learning?
It is applied in healthcare, government systems, marketing and sales, e-commerce and social media, transportation, and other fields.

Application

Application

How machine learning works

Types of Machine learning

Supervised learning is the type of machine learning that learns a function mapping an input to an output from example input-output pairs. It infers the function from labeled training data consisting of a set of training examples.
Example algorithms: Linear Regression, Logistic Regression, Decision Trees, Support Vector Machines (SVM).

Unsupervised learning is a type of machine learning used to draw inferences from datasets consisting of input data without labeled responses. The most common unsupervised learning method is cluster analysis, which is used in exploratory data analysis to find hidden patterns or groupings in data.
Example techniques: K-Means Clustering, Hierarchical Clustering, Principal Component Analysis (PCA).

Steps involved in building a machine learning model
1. Import the required packages.
2. Load the dataset.
3. Identify and group the features and the target attribute.
4. Visualize the data in order to choose a suitable algorithm.
5. Split the dataset into a training set and a testing set.
6. Load the model.
7. Train the model by providing the training data.
8. Make predictions on the test data.
9. Score the model.
10. Determine the error and accuracy.
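The steps above can be sketched end to end with only the Python standard library. The toy dataset, the `fit_line` helper and the 80/20 split are illustrative assumptions rather than anything taken from the slides, and the visualization step is left out; in practice a library such as scikit-learn would usually handle loading, splitting and training.

```python
import random

# Import packages and load a dataset (a made-up toy dataset: y is roughly 2x + 1).
random.seed(0)
data = [(x, 2.0 * x + 1.0 + random.uniform(-0.5, 0.5)) for x in range(50)]

# Each row is (feature, target). Split into training and testing sets
# (80/20 is an assumed ratio).
random.shuffle(data)
split = int(0.8 * len(data))
train, test = data[:split], data[split:]


def fit_line(rows):
    """Train the model: fit y = intercept + slope * x by least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train)

# Predict on the held-out test data.
predictions = [intercept + slope * x for x, _ in test]

# Score the model (R^2) and determine the error (mean squared error).
actual = [y for _, y in test]
mean_actual = sum(actual) / len(actual)
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predictions))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
mse = ss_res / len(actual)
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.2f} intercept={intercept:.2f} MSE={mse:.3f} R2={r2:.3f}")
```

The test set is never used for fitting, so the score reflects how the model behaves on data it has not seen.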

Evaluation Metrics for Regression  

 

 

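Evaluation Metrics for Classification

Accuracy
Measures the percentage of correctly predicted instances. Formula: Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP, TN, FP and FN are the counts of true positives, true negatives, false positives and false negatives.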

Precision
Measures the correctness of positive predictions. Formula: Precision = TP / (TP + FP)

Recall (Sensitivity)
Measures the model's ability to capture actual positive cases. Formula: Recall = TP / (TP + FN)

F1-Score
The harmonic mean of precision and recall; useful for imbalanced datasets. Formula: F1-Score = 2 * (Precision * Recall) / (Precision + Recall)

ROC-AUC Score
Measures model performance across different classification thresholds.
ROC Curve: plots True Positive Rate vs. False Positive Rate.
AUC Score: 1 = perfect, 0.5 = random guessing.
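One way to see what the AUC measures: it equals the fraction of (positive, negative) pairs in which the positive example receives the higher score. The sketch below computes it directly from that definition; the function name `roc_auc` and the sample labels and scores are illustrative assumptions. Library implementations typically compute the same value from the ROC curve.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75: three of the four pairs are ordered correctly
```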

Confusion Matrix
Summarizes model predictions against actual values as counts of true positives, true negatives, false positives and false negatives.
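The classification formulas above can be checked with a small sketch that first counts the confusion-matrix cells and then derives each metric from them. The helper names and the sample labels are illustrative assumptions; returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Apply the accuracy, precision, recall and F1 formulas."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
print(tp, tn, fp, fn)  # 3 5 1 1
print(classification_metrics(tp, tn, fp, fn))
```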

Supervised Machine Learning: Regression
Regression analysis involves a set of independent variables and a dependent variable. It is applied when we need to model the relationship between the dependent and independent variables, and the dependent variable is continuous (numerical).

Linear Regression
The dependent variable is continuous in nature. Linear regression is applied when there is a linear relationship between the independent and dependent variables.
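For a single independent variable, the linear relationship can be written as follows. The symbols are notation chosen here rather than taken from the slides, and the estimates shown are the standard ordinary least squares solution.

```latex
\hat{y} = \beta_0 + \beta_1 x, \qquad
\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
\beta_0 = \bar{y} - \beta_1 \bar{x}
```

Here \bar{x} and \bar{y} are the sample means of the independent and dependent variables.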

 

Logistic Regression
Logistic regression is similar to linear regression but is used when the dependent variable is categorical rather than numeric (for example, a Yes/No response). It is called regression, but it performs classification: based on the fitted regression, it assigns the dependent variable to one of the classes.

Logistic regression is used to predict a binary output, for example when a credit card company builds a model to decide whether or not to issue a credit card to a customer. First, a linear model of the relationship between the variables is fitted. The threshold for the classification line is assumed to be 0.5.
(Figure: linear regression line compared with the logistic sigmoid function.)

The logistic function is applied to the regression output to get the probability of belonging to each class. The log odds is the logarithm of the ratio of the probability of the event occurring to the probability of it not occurring. In the end, the variable is assigned to the class with the higher probability.
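A minimal sketch of these pieces: the sigmoid turns a linear score into a probability, the log odds inverts it, and a 0.5 threshold picks the class. The coefficients `weight` and `bias` are hypothetical values chosen for illustration; a real model would learn them from training data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    """Log of the ratio P(event) / P(no event); the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))


def classify(x, weight, bias, threshold=0.5):
    """Return (label, probability) for a single input value."""
    probability = sigmoid(weight * x + bias)
    label = 1 if probability >= threshold else 0
    return label, probability


weight, bias = 0.8, -4.0  # hypothetical coefficients, not fitted to any data
for x in (2.0, 5.0, 8.0):
    label, probability = classify(x, weight, bias)
    print(f"x={x}: p={probability:.3f} -> class {label}")
```

Because the log odds of the predicted probability equals the linear score `weight * x + bias`, the 0.5 threshold corresponds to a linear score of zero.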

K-Nearest Neighbours (K-NN)
K-NN is one of the simplest classification algorithms. It uses data points that are already separated into several classes to predict the class of a new sample point. K-NN is a non-parametric, lazy learning algorithm: it does not build a model during training but simply memorizes the data, and it classifies new cases based on a similarity measure (e.g. distance functions). K-NN works well with a small number of input variables (p), but struggles when the number of inputs is very large.
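The idea can be sketched in a few lines: store the training points, measure the distance from the query to each of them, and take a majority vote among the k closest. The function name `knn_predict`, the Euclidean distance and the toy points are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # "Training" is just keeping the data; all the work happens at query time.
    distances = sorted(
        ((math.dist(point, query), label)
         for point, label in zip(train_points, train_labels)),
        key=lambda pair: pair[0],
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2, 2)))  # a
print(knn_predict(points, labels, (6, 5)))  # b
```

An odd k avoids tied votes in two-class problems; the best value of k depends on the data.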

Decision Tree Classification
A decision tree builds classification or regression models in the form of a tree structure. It breaks down a dataset into smaller and smaller subsets while an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. A decision tree uses entropy and information gain to construct the tree.

Information Gain
It measures the reduction in entropy of the target obtained by splitting on an independent attribute. Constructing a decision tree is all about finding the attribute that returns the highest information gain (i.e., the most homogeneous branches). The gain of an attribute is the information of the target attribute minus the entropy of that attribute.

Entropy
Entropy is the degree of uncertainty or randomness in a set of elements; in other words, it is a measure of impurity. The entropy of an attribute is the sum, over the attribute's values, of the probability of each value multiplied by the information (the entropy of the target) within that value. Out of all the feature attributes, one will become the root node.

age   competition   type   profit
old   yes           s/w    down
old   no            s/w    down
old   no            h/w    down
mid   yes           s/w    down
mid   yes           h/w    down
mid   no            h/w    up
mid   no            s/w    up
new   yes           s/w    up
new   no            h/w    up
new   no            s/w    up

First, find the entropy of Age:

Age   Down   Up
old   3      0
mid   2      2
new   0      3

P(old) = 3/10, P(mid) = 4/10, P(new) = 3/10
I(target) = 1 (5 down, 5 up)
E(Age) = P(old)*I(old) + P(mid)*I(mid) + P(new)*I(new)
E(Age) = 3/10*0 + 4/10*1 + 3/10*0 = 0.4
Gain(Age) = I(target) - E(Age) = 1 - 0.4 = 0.6

Gain(Age) = 0.6
Gain(Competition) = 0.124
Gain(Type) = 0

Age has the highest gain, so it becomes the root node.
Resulting tree: Age = old -> down; Age = new -> up; Age = mid -> split on Competition (yes -> down, no -> up).
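The gains above can be reproduced with a short script over the same ten rows. The function names (`information`, `attribute_entropy`, `gain`) are chosen here for illustration and follow the slide's usage, where I is the entropy of the target within a subset and E is the weighted sum over an attribute's values.

```python
import math
from collections import Counter

# (age, competition, type, profit) rows from the example table.
ROWS = [
    ("old", "yes", "s/w", "down"),
    ("old", "no", "s/w", "down"),
    ("old", "no", "h/w", "down"),
    ("mid", "yes", "s/w", "down"),
    ("mid", "yes", "h/w", "down"),
    ("mid", "no", "h/w", "up"),
    ("mid", "no", "s/w", "up"),
    ("new", "yes", "s/w", "up"),
    ("new", "no", "h/w", "up"),
    ("new", "no", "s/w", "up"),
]
COLUMNS = {"age": 0, "competition": 1, "type": 2}
TARGET = 3  # index of the profit column


def information(labels):
    """Entropy (in bits) of a list of target labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def attribute_entropy(rows, column):
    """Weighted entropy of the target after splitting on one attribute."""
    total = len(rows)
    result = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[TARGET] for row in rows if row[column] == value]
        result += len(subset) / total * information(subset)
    return result


def gain(rows, column):
    """Information gain: target entropy minus the attribute's weighted entropy."""
    target_labels = [row[TARGET] for row in rows]
    return information(target_labels) - attribute_entropy(rows, column)


gains = {name: gain(ROWS, index) for name, index in COLUMNS.items()}
root = max(gains, key=gains.get)
print(gains)
print("root attribute:", root)
```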

Why Python?

Integrated Development Environment (IDE)

Installation https://docs.anaconda.com/anaconda/install/windows/

Data types
Data types are the classification or categorization of data items.

Variables
Variables are containers for storing data values.
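A short illustration of both ideas, using Python's common built-in types. The variable names and values are made up for the example.

```python
# Each variable is a name bound to a value; the value carries the data type.
count = 54                                      # int
ratio = 0.5                                     # float
title = "Machine Learning"                      # str
is_labeled = True                               # bool
metrics = ["accuracy", "precision", "recall"]   # list (mutable sequence)
point = (3, 4)                                  # tuple (immutable sequence)
row = {"age": "old", "profit": "down"}          # dict (key-value mapping)
classes = {"up", "down"}                        # set (unique items)

for value in (count, ratio, title, is_labeled, metrics, point, row, classes):
    print(type(value).__name__, value)

# Python is dynamically typed: a name can later be bound to a different type.
count = "fifty-four"
print(type(count).__name__)  # str
```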

Operators
Arithmetic operators, assignment operators, comparison operators, logical operators, identity operators, membership operators, bitwise operators.

Arithmetic operator

Assignment operators

Comparison Operators

Bitwise operators

Identity operator

Membership operator
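The operator groups listed above can each be shown in a line or two. The values of `a`, `b` and the lists are arbitrary choices for illustration.

```python
a, b = 7, 3

# Arithmetic operators
print(a + b, a - b, a * b)   # 10 4 21
print(a / b)                 # true division gives a float
print(a // b, a % b, a ** b) # 2 1 343

# Assignment operators
total = 10
total += 5    # same as total = total + 5
total *= 2    # same as total = total * 2
print(total)  # 30

# Comparison operators
print(a > b, a == b, a != b)  # True False True

# Logical operators
print(a > 5 and b > 5, a > 5 or b > 5, not a > 5)  # False True False

# Bitwise operators (7 is 0b111, 3 is 0b011)
print(a & b, a | b, a ^ b, a << 1, a >> 1, ~a)  # 3 7 4 14 3 -8

# Identity operators compare object identity, not value
x = [1, 2]
y = x
z = [1, 2]
print(x is y, x is z, x == z)  # True False True

# Membership operators
print(2 in x, 5 not in x)  # True True
```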

Decision Making
Decision-making is the anticipation of conditions occurring during the execution of a program and the actions taken according to those conditions.
if statements: An if statement consists of a Boolean expression followed by one or more statements.
if...else statements: An if statement can be followed by an optional else statement, which executes when the Boolean expression is FALSE.
nested if statements: You can use one if or elif statement inside another if or elif statement.
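A small sketch of these statement forms in Python, where "else if" is spelled `elif`. The function name and the accuracy thresholds are arbitrary choices for illustration.

```python
def describe_score(accuracy):
    """Turn an accuracy value between 0 and 1 into a rough label."""
    # Plain if statement: runs its body only when the condition is true.
    if accuracy < 0 or accuracy > 1:
        return "invalid"

    # if ... elif ... else: exactly one branch runs.
    if accuracy >= 0.9:
        label = "high"
    elif accuracy >= 0.7:
        label = "moderate"
    else:
        label = "low"

    # Nested if: an if statement inside another if statement.
    if label == "high":
        if accuracy == 1.0:
            label = "perfect"

    return label


for value in (1.0, 0.95, 0.75, 0.2, 1.5):
    print(value, describe_score(value))
```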

Loops
A loop statement allows us to execute a statement or group of statements multiple times.
while loop: Repeats a statement or group of statements while a given condition is TRUE. It tests the condition before executing the loop body.
for loop: Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.
nested loops: You can use one or more loops inside another while or for loop.
Infinite loop: A loop becomes infinite if its condition never becomes FALSE.
The range() function: The built-in range() function is the usual way to iterate over a sequence of numbers. It produces an arithmetic progression of integers.
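The loop forms described above, in a minimal sketch. The variable names and ranges are illustrative.

```python
# while loop: the condition is tested before every pass through the body.
countdown = 3
ticks = []
while countdown > 0:
    ticks.append(countdown)
    countdown -= 1  # without this update the condition would never become False
print(ticks)  # [3, 2, 1]

# for loop over range(start, stop): stop is excluded.
squares = []
for n in range(1, 6):
    squares.append(n * n)
print(squares)  # [1, 4, 9, 16, 25]

# range(start, stop, step) yields an arithmetic progression.
print(list(range(0, 10, 2)))  # [0, 2, 4, 6, 8]

# Nested loops: the inner loop runs fully for each pass of the outer loop.
pairs = []
for i in range(2):
    for j in range(3):
        pairs.append((i, j))
print(len(pairs))  # 6
```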

Loop Control Statements
break statement: Terminates the loop and transfers execution to the statement immediately following the loop.
continue statement: Causes the loop to skip the remainder of its body and immediately retest its condition before the next iteration.
pass statement: Used when a statement is required syntactically but you do not want any command or code to execute.
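Each control statement is shown below in a small function. The function names and inputs are made up for the example.

```python
def first_negative(values):
    """Return the first negative number, or None if there is none."""
    found = None
    for value in values:
        if value < 0:
            found = value
            break  # leave the loop as soon as a match is found
    return found


def sum_of_odds(values):
    """Add up only the odd numbers."""
    total = 0
    for value in values:
        if value % 2 == 0:
            continue  # skip the rest of the body for even numbers
        total += value
    return total


def not_implemented_yet():
    pass  # placeholder: a body is required syntactically, but nothing runs


print(first_negative([3, 1, -2, -5]))  # -2
print(sum_of_odds([1, 2, 3, 4, 5]))    # 9
print(not_implemented_yet())           # None
```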

Functions
A function is a block of organized, reusable code that is used to perform a single, related action. Functions provide better modularity for your application and a high degree of code reuse.
Topics: defining a function, calling a function.

Function Arguments
Required arguments, keyword arguments, default arguments, variable-length arguments.
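A single hypothetical function can demonstrate defining, calling and all four argument kinds. The name `describe_model` and its parameters are invented for illustration.

```python
def describe_model(name, algorithm="linear regression", *metrics, **options):
    """Collect a model description.

    name      -- required argument
    algorithm -- default argument (used when the caller omits it)
    *metrics  -- variable-length positional arguments, gathered into a tuple
    **options -- variable-length keyword arguments, gathered into a dict
    """
    return {
        "name": name,
        "algorithm": algorithm,
        "metrics": metrics,
        "options": options,
    }


# Calling with only the required argument; the default fills in the rest.
print(describe_model("baseline"))

# Keyword arguments can be passed in any order.
print(describe_model(algorithm="k-NN", name="neighbours"))

# Extra positional and keyword arguments are collected automatically.
print(describe_model("tree", "decision tree", "accuracy", "f1", max_depth=3))
```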

File handling

Limitations of Traditional Machine Learning
Feature engineering is manual; scalability issues; poor performance on unstructured data; inability to handle high-dimensional data; lack of hierarchical representations.

What are Deep Neural Networks (DNNs)?
A DNN is a type of Artificial Neural Network (ANN) with multiple hidden layers. DNNs are designed to automatically learn features and representations from data. They are used in fields like computer vision, speech recognition, and natural language processing (NLP).
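To make "multiple hidden layers" concrete, the sketch below runs a forward pass through a tiny network with two hidden layers. The weights, the layer sizes and the ReLU activation are all hypothetical choices made for illustration; a real network would learn its weights from data, which is not shown here.

```python
def relu(values):
    """ReLU activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the weights of output unit j."""
    return [
        sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        for unit_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer, applying ReLU on hidden layers only."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical fixed weights: 2 inputs -> 3 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, 0.0]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.1]], [0.05, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1.0, 2.0], layers)
print(output)
```

Each hidden layer transforms the previous layer's output, which is how deeper layers can build on features computed by earlier ones.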