Machine learning for beginners students.

digitalindia1231 15 views 69 slides Mar 02, 2025
Slide 1
Slide 1 of 69
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69

About This Presentation

This is machine learning ppt .


Slide Content

Ramanujan College , University of Delhi New Delhi-110019, India. Topic: Machine Learning

What is Machine Learning? “Learning is any process by which a system improves performance from experience.” - Herbert Simon Definition by Tom Mitchell (1998): Machine Learning is the study of algorithms that improve their performance P at some task T with experience E . A well- defined learning task is given by < P , T , E >. 3

Traditional Programming Machine Learning Computer Data Program Output Computer Data Output Program Slide credit: Pedro Domingos 4

4 Machine Learning Traditional Programming Program/Rules Data Output/ Answer Machine Learning Output/Answers Data Program/Rules/model Makes sense. Use Machine Learning Sort these numbers in decreasing order 2, 4, 18, 1, 77, 0, 85 Does not make sense. Do not use Machine Learning (what are the risks if you do?) Rock paper scissors

Introduction distinguish two approaches knowledge-based: a computer program whose logic encodes a large number of properties of the world, usually developed by a team of experts over many years. machine learning: extract information directly from historical data and extrapolate to make predictions

When Do We Use Machine Learning? ML is used when: Human expertise does not exist (navigating on Mars) Humans can’t explain their expertise (speech recognition) Models must be customized (personalized medicine) Models are based on huge amounts of data (genomics) Learning isn’t always useful: There is no need to “learn” to calculate payroll Based on slide by E. Alpaydin 5

7 Slide credit: Geoffrey Hinton Some more examples of tasks that are best solved by using a learning algorithm Recognizing patterns: Facial identities or facial expressions Handwritten or spoken words Medical images Generating patterns: Generating images or motion sequences Recognizing anomalies: Unusual credit card transactions Unusual patterns of sensor readings in a nuclear power plant Prediction: Future stock prices or currency exchange rates

8 Slide credit: Pedro Domingos Sample Applications Web search Computational biology Finance E- commerce Space exploration Robotics Information extraction Social networks Debugging software [Your favorite area]

Samuel’s Checkers- Player “Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.” - Arthur Samuel (1959) 9

Defining the Learning Task Improve on task T, with respect to performance metric P, based on experience E T: Playing checkers P: Percentage of games won against an arbitrary opponent E: Playing practice games against itself T: Recognizing hand- written words P: Percentage of words correctly classified E: Database of human- labeled images of handwritten words T: Driving on four- lane highways using vision sensors P: Average distance traveled before a human- judged error E: A sequence of images and steering commands recorded while observing a human driver. T: Categorize email messages as spam or legitimate. P: Percentage of email messages correctly classified. E: Database of emails, some with human- given labels Slide credit: Ray Mooney 10

Types of Learning 23

Types of Learning Supervised (inductive) learning Given: training data + desired outputs (labels) Unsupervised learning Given: training data (without desired outputs) Semi- supervised learning Given: training data + a few desired outputs Reinforcement learning Rewards from sequence of actions Based on slide by Pedro Domingos 24

23 Supervised Machine Learning Given data like this, how can we learn to predict the prices of other houses? as a function of the size of their living areas? House price = F (living area) What is an instance space? To make our housing example more interesting, let’s consider a slightly richer dataset in which we also know the number of bedrooms in each house: House price = F (living area, # bedrooms) What is an instance space?

Output An item drawn from an output space   Input An item drawn from an input space   System   14 Supervised Learning We consider systems that apply a function to input items x and return an output .   Dan Roth +/-

15 Supervised Learning In (supervised) machine learning, we deal with systems whose is learned from examples.   Output An item drawn from an output space   Input An item drawn from an input space   System  

16 Why use learning? We typically use machine learning when the function we want the system to apply is unknown to us, and we cannot “think” about it. The function could actually be simple.  

Output An item drawn from a label space   Input An item drawn from an instance space   Learned Model   17 Supervised Learning Target function   The space of all functions our algorithm “considers” is called the Hypothesis space.

18 Supervised learning: Training Give the learner examples in The learner returns a model   Labeled Training Data …   Learned model   Learning Algorithm is the model we’ll use in our application   ( Dan Roth, +) If (the…character of the …token is..) AND (the …. i s…. ) then Negative. Otherwise, Positive . An input example An element in the instance Space

19 Supervised learning: Testing Reserve some labeled data for testing Labeled Test Data …  

20 Supervised learning: Testing Labeled Test Data   Test Labels ...   Raw Test Data  

Apply the model to the raw test data Evaluate by comparing predicted labels against the test labels Test Labels   Raw Test Data   21 Supervised learning: Testing Learned model   Predicted Labels  

Data In, Model Out D A T A 𝒟 MACHINE LEARNING M O D EL 𝑓 ⋅ acceleration 𝑎 = Force 𝐹 m ass 𝑚 We would like to recover a model like this!

Data In, Model Out MACHINE LEARNING M O D EL 𝑓 ⋅ acceleration 𝑎 = Force 𝐹 mass 𝑚 ML Design (hypothesis class, loss function, optimizer, hyperparameters, features, …) 𝑚 𝑎 = 𝑤 + 𝑤 1 𝐹 + 𝑤 2 𝑚 + 𝑤 3 𝐹 ∗ 𝑚 + 𝑤 4 ( 𝐹 ) Example hypothesis class: (for varying values of 𝑤 , … 𝑤 4 ) 𝑚 𝑎 = + 𝐹 + 𝑚 + 𝐹 ∗ 𝑚 + 1 ( 𝐹 ) L e ar n i ng = f i nd i ng “ g oo d” v a l u e s f o r t he weig h t s 𝑤 , 𝑤 1 , … , 𝑤 # D A T A 𝒟

24 Key Issues in Machine Learning Modeling How to formulate application problems as machine learning problems ? How to represent the data? Learning Protocols (where is the data & labels coming from?) Representation What functions should we learn (hypothesis spaces) ? How to map raw input to an instance space? Any rigorous way to find these? Any general approach? Algorithms What are good algorithms? How do we define success? Generalization vs. over fitting The computational problem

25 Using supervised learning What is our instance space? Gloss: What kind of features are we using? What is our label space? Gloss: What kind of learning task are we dealing with? What is our hypothesis space? Gloss: What kind of functions (models) are we learning? What learning algorithm do we use? Gloss: How do we learn the model from the labeled data? What is our loss function/evaluation metric? Gloss: How do we measure success? What drives learning?

Output An item drawn from a label space   Input An item drawn from an instance space X   Learned Model   26 1. The instance space   Designing an appropriate instance space is crucial for how well we can predict .  

27 1. The instance space   When we apply machine learning to a task, we first need to define the instance space . Instances are defined by features: Boolean features: Is there a folder named after the sender? Does this email contain the word ‘class’? Does this email contain the word ‘waiting’? Does this email contain the word ‘class’ and the word ‘waiting’? Numerical features: How often does ‘learning’ occur in this email? How long is the email? How many emails have I seen from this sender over the last day/week/month? Bag of tokens Just list all the tokens in the input   Does it add anything if you already have the previous two features?

28 as a vector space   is an N-dimensional vector space (e.g. {0,1} N , ) Each dimension = one feature. Each is a feature vector (hence the boldface ). Think of = [ … ] as a point in :       Patient Age Clump thickness Tumor Color Distance from optic nerve Cell type

29 Good features are essential The choice of features is crucial for how well a task can be learned. In many application areas (language, vision, etc.), a lot of work goes into designing suitable features. This requires domain expertise. We can’t teach you what specific features to use for your task. But we will touch on some general principles

Output An item drawn from a label space   Input An item drawn from an instance space   Learned Model   30 2. The label space   The label space determines what kind of supervised learning task we are dealing with  

31 Supervised learning tasks I Output labels are categorical: Binary classification: Two possible labels Multiclass classification: possible labels Output labels are structured objects (sequences of labels, parse trees, graphs, etc.) Structure learning: multiple labels that are related (thus constrained)   Three events. When classifying the temporal relations between them we need to account for the relations between them.

32 Supervised learning tasks II Output labels are numerical: Regression (linear/polynomial): Labels are continuous-valued Learn a linear/polynomial function Ranking: Labels are ordinal Learn an ordering over input  

23 Supervised Learning

Output An item drawn from a label space   Input An item drawn from an instance space   Learned Model   34 3. The model   We need to choose what kind of model we want to learn

Types of Learning Supervised learning Input: Examples of inputs and desired outputs Output: Model that predicts output given a new input Unsupervised learning Input: Examples of some data (no “outputs”) Output: Representation of structure in the data Reinforcement learning Input: Sequence of agent interactions with an environment Output: Policy that maps agent’s observations to actions

23 Supervised Learning

23 Supervised Learning

38 Unsupervised learning: ?! In unsupervised learning the training data is unlabeled. The system tries to learn without a teacher. Unsupervised methods must rely solely on the intrinsic structure of data points to learn a good hypothesis. Thus, unsupervised methods do not need a teacher or domain expert who provides labels for data points. Two large families of unsupervised methods Clustering Feature Learning Two important applications of feature learning are dimensionality reduction and data visualization. Notation: In the clustering problem, we are given a training set, {x (1) , x (2) , ……. X (n) }, and want to group the data into a few cohesive “clusters.” but no labels y ( i ) are given. Unsupervised Learning

Unsupervised Learning Given x 1 , x 2 , ..., x n (without labels) The goal is to find groups of similar observations. Output hidden structure behind the x ’s – E.g., clustering 31

23 Unsupervised Learning The goal is to find groups of similar observations (Clustering).

[Source: Daphne Koller] Genes Individuals Unsupervised Learning Genomics application: group individuals by genetic similarity 32

Organize computing clusters Social network analysis Image credit: NASA/JPL- Caltech/E. Churchwell (Univ. of Wisconsin, Madison) Astronomical data analysis Market segmentation Slide credit: Andrew Ng Unsupervised Learning 33

Unsupervised Learning Independent component analysis – separate a combined signal into its original sources 34 Image credit: statsoft.com Audio from http://www.ism.ac.jp/~shiro/research/blindsep.html

Unsupervised Learning Independent component analysis – separate a combined signal into its original sources 35 Image credit: statsoft.com Audio from http://www.ism.ac.jp/~shiro/research/blindsep.html

Here are some of the most important Unsupervised learning algorithms . Clustering K-means DBSCAN Hierarchical Cluster Analysis (HCA) Visualization and dimensionality reduction Principal Component Analysis (PCA) Kernel PCA Association rule learning Apriori Eclat Anomaly detection and novelty detection One-class SVM Isolation Forest Unsupervised Learning

In supervised learning, we saw algorithms that tried to make their outputs mimic the labels y given in the training set. In that setting, the labels gave an unambiguous “right answer” for each of the inputs x. In contrast, for many sequential decision-making and control problems, it is very difficult to provide this type of explicit supervision to a learning algorithm. Example: if we have just built a four-legged robot and are trying to program it to walk, then initially, we have no idea what the “correct” actions to take are to make it walk, and so we do not know how to provide explicit supervision for a learning algorithm to try to mimic. In the reinforcement learning framework, The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards). It must then learn by itself what is the best strategy, called a policy, to get the most reward over time A policy defines what action the agent should choose when it is in a given situation. Reinforcement Learning

Reinforcement Learning

Reinforcement Learning Examples: Credit assignment problem Game playing Robot in a maze Balance a pole on your hand 36

Machine learning is Programming 2.0 Traditional Programming Machine learning (ML) Revision:

Task specification in ML: programs → examples def compute_force(m, a): ‘’’ returns force (in N) needed to move mass m (in kg) at acceleration a (in m/s^2) ‘’’ F = m * a return F Mass m (kg) Acceleration a (m/s^2) Force F (N) 2.5 4 10 5 2 10 20 0.5 10 40 0.25 10 40 2.5 100 20 5 100 50 2 100 Here is a program to implement Newton’s second law of motion Here are some examples. Try to imitate them.

Task specification in ML: programs → examples def cow_or_turtle(image): ??? Here is a program to recognize an image as a cow or a turtle Here are some examples. Try to imitate them. “ c o w s ” “turtles”

Putting a trained ML system to use Here are some examples. Try to imitate them. “ c o w s ” “turtles” “ c o w ”

Putting a trained ML system to use Here are some examples. Try to imitate them. “ c o w s ” “turtles” “ t u rt le ”

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , E) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , E) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , E) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow Main focus of this class

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , D) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow Main focus of this class

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , D) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow Main focus of this class P r o j e ct

F r am in g an ML p r o ble m ( Mi t chell ’ s P , T , D) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data ML Workflow Main focus of this class P r o j e ct

23 Supervised Learning:

ML for Social Good Applying AI for social good | McKinsey Elizabeth Bondi et al., SPOT poachers in action: Augmenting conservation drones with automatic detection in near real time, AAAI 2018

40 Framing a Learning Problem

Designing a Learning System Choose the training experience Choose exactly what is to be learned – i.e. the target function Choose how to represent the target function Choose a learning algorithm to infer the target function from the experience Environment/ Experience Learner Knowledge Performance Element Based on slide by Ray Mooney Training data Testing data 41

Training vs. Test Distribution We generally assume that the training and test examples are independently drawn from the same overall distribution of data – We call this “i.i.d” which stands for “independent and identically distributed” If examples are not independent, requires collective classification If test distribution is different, requires transfer learning Slide credit: Ray Mooney 42

ML in a Nutshell Tens of thousands of machine learning algorithms Hundreds new every year Every ML algorithm has three components: Representation Optimization Evaluation Slide credit: Pedro Domingos 43

44 Slide credit: Ray Mooney Various Function Representations Numerical functions Linear regression Neural networks Support vector machines Symbolic functions Decision trees Rules in propositional logic Rules in first- order predicate logic Instance- based functions Nearest- neighbor Case- based Probabilistic Graphical Models Naïve Bayes Bayesian networks Hidden- Markov Models (HMMs) Probabilistic Context Free Grammars (PCFGs) Markov networks

45 Slide credit: Ray Mooney Various Search/Optimization Algorithms Gradient descent Perceptron Backpropagation Dynamic Programming HMM Learning PCFG Learning Divide and Conquer Decision tree induction Rule learning Evolutionary Computation Genetic Algorithms (GAs) Genetic Programming (GP) Neuro- evolution

47 Slide credit: Pedro Domingos Evaluation Accuracy Precision and recall Squared error Likelihood Posterior probability Cost / Utility Margin Entropy K- L divergence etc.

47
Tags