Learning paradigms Syed Ali Raza Lecturer Senior Research Member Cybernetic Intelligence Research Lab GC University Lahore
Why “Learn”? Machine learning is programming computers to optimize a performance criterion using example data or past experience. Learning is used when: humans are unable to explain their expertise (speech recognition); the solution changes over time (routing on a computer network); the solution needs to be adapted to particular cases (user biometrics).
What We Talk About When We Talk About “Learning” Given a data set D, a task T, and a performance measure M, a computer system is said to learn from D to perform the task T if, after learning, the system’s performance on T improves as measured by M. In other words, the learned model helps the system perform T better than it would with no learning.
An example application An emergency room in a hospital measures n variables (e.g., blood pressure, age, etc.) for each newly admitted patient. A decision is needed: whether to put a new patient in an intensive-care unit (ICU). Due to the high cost of the ICU, those patients who may survive less than a month are given higher priority. Problem: to predict high-risk patients and discriminate them from low-risk patients.
Another example A credit card company receives thousands of applications for new cards. Each application contains information about an applicant: age, marital status, annual salary, outstanding debts, credit rating. Problem: to decide whether an application should be approved, or to classify applications into two categories, approved and not approved.
Supervised learning algorithm Learning (training): learn a model using the training data. Testing: test the model using unseen test data to assess the model accuracy. Step 1 (Training): Training Data → Learning Algorithm → Model. Step 2 (Testing): Test Data → Model → Accuracy.
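The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set (one feature, a noisy label), the helper names `make_data`, `train_stump` and `accuracy`, and the choice of a one-threshold classifier are all hypothetical and do not come from the slides.

```python
import random

def make_data(n, seed):
    """Hypothetical data: feature x in [0, 10); label 1 when x > 5, with 10% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = int(x > 5)
        if rng.random() < 0.1:   # flip the label occasionally
            y = 1 - y
        data.append((x, y))
    return data

def train_stump(train):
    """Step 1 (training): pick the threshold that makes the fewest training errors."""
    best_t, best_err = None, None
    for t, _ in train:
        err = sum(int(x > t) != y for x, y in train)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def accuracy(threshold, data):
    """Fraction of examples whose label the threshold rule predicts correctly."""
    return sum(int(x > threshold) == y for x, y in data) / len(data)

data = make_data(200, seed=0)
train, test = data[:150], data[150:]        # the test data stay unseen during training
threshold = train_stump(train)              # Step 1: training
test_accuracy = accuracy(threshold, test)   # Step 2: testing
print(f"threshold={threshold:.2f} test accuracy={test_accuracy:.2f}")
```

The accuracy reported in Step 2 is measured only on examples the learning step never saw, which is what makes it an estimate of performance on future cases.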
Classification Predicts categorical class labels. Constructs a model from the training set and the values (class labels) of a classifying attribute, then uses that model to classify new data. Example: credit scoring, i.e., differentiating between low-risk and high-risk customers from their income and savings.
Classification Process: Training Data → Classification Algorithms → Classifier (Model), e.g., IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Classification Process: the Classifier is checked on Testing Data and then applied to Unseen Data, e.g., (Jafri, Professor, 4): Tenured?
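The rule on the slide can be written directly as a function and applied to the unseen record. In the sketch below the function name `is_tenured` and the four labelled test records are hypothetical; only the rule itself and the (Jafri, Professor, 4) record come from the slides.

```python
def is_tenured(rank: str, years: int) -> str:
    """Rule from the slide: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"

# Hypothetical labelled testing data: (name, rank, years, true label)
testing_data = [
    ("A", "Assistant Prof", 2, "no"),
    ("B", "Associate Prof", 7, "no"),    # the rule says 'yes' here, so this one is misclassified
    ("C", "Professor", 5, "yes"),
    ("D", "Assistant Prof", 7, "yes"),
]

correct = sum(is_tenured(rank, years) == label for _, rank, years, label in testing_data)
test_accuracy = correct / len(testing_data)

# Unseen data from the slide
prediction = is_tenured("Professor", 4)
print(test_accuracy, prediction)   # prints: 0.75 yes
```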
Classification: Applications Pattern recognition. Face detection and recognition. Character recognition. Speech recognition: use of a dictionary or the syntax of the language. Sensor fusion: combine multiple modalities, e.g., visual (lip image) and acoustic for speech. Medical diagnosis.
Regression Example: price of a used car. x: car attributes, y: price. y = g(x | θ), where g(·) is the model and θ are its parameters.
Regression applications Navigating a car: angle of the steering wheel. Kinematics of a robot arm: given the target position (x, y), find the joint angles α1 = g1(x, y) and α2 = g2(x, y). Response surface design.
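As a minimal sketch of fitting y = g(x | θ), assume g is a straight line, g(x | θ) = θ1·x + θ0, and fit it by ordinary least squares. The linear form, the single attribute (age in years) and the five data points are assumptions made for illustration; the slides do not specify the model or the data.

```python
def fit_line(xs, ys):
    """Least-squares estimate of theta = (slope, intercept) for g(x | theta) = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical used-car data: age in years (x) and price in thousands (y)
ages = [1, 2, 3, 4, 5]
prices = [18.5, 15.5, 14.0, 12.5, 9.5]

slope, intercept = fit_line(ages, prices)
predicted_price = slope * 6 + intercept   # predicted price of a six-year-old car
print(f"g(x) = {slope:.2f} * x + {intercept:.2f}; g(6) = {predicted_price:.2f}")
```

With these five points the fitted line is g(x) = -2.10·x + 20.30, so the predicted price at x = 6 is 7.70.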
Supervised learning Prediction of future cases: use the rule to predict the output for future inputs. Knowledge extraction: learning a rule from data. Compression: finding a rule simpler than the data it explains. Outlier detection: exceptions that are not covered by the rule, e.g., fraud.
Unsupervised Learning Learning “what normally happens”. No predefined output. Clustering: grouping similar instances. Example applications: customer segmentation in CRM; image compression (color quantization); bioinformatics (learning motifs).
Clustering Clustering is a technique for finding similarity groups in data, called clusters. That is, it groups data instances that are similar to (near) each other into one cluster and data instances that are very different (far away) from each other into different clusters. Clustering is an unsupervised learning task because no class values denoting an a priori grouping of the data instances are given, as they are in supervised learning.
An illustration The data set has three natural groups of data points, i.e., 3 natural clusters.
What is clustering for? Let us see some real-life examples. Example 1: group people of similar sizes together to make “small”, “medium” and “large” T-shirts. Tailor-made for each person: too expensive. One-size-fits-all: does not fit all. Example 2: in marketing, segment customers according to their similarities, to do targeted marketing.
What is clustering for? (cont…) Example 3: given a collection of text documents, we want to organize them according to their content similarities, to produce a topic hierarchy. In fact, clustering is one of the most utilized data mining techniques. It has a long history and is used in almost every field, e.g., medicine, psychology, botany, sociology, biology, archeology, marketing, insurance, libraries, etc. In recent years, due to the rapid increase of online documents, text clustering has become important.
Aspects of clustering A clustering algorithm Partitioning clustering Hierarchical clustering A distance (similarity, or dissimilarity) function Clustering quality The quality of a clustering result depends on the algorithm, the distance function, and the application.
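The three aspects can be made concrete with a small partitioning example. The sketch below assumes k-means as the partitioning algorithm, Euclidean distance as the distance function, and the sum of squared errors (SSE) as the quality measure; the slides name none of these, and the nine points and starting centroids are made up.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Distance function: Euclidean distance to each centroid
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; an empty cluster keeps its old centroid
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

def sse(centroids, clusters):
    """Clustering quality: sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(p, c) ** 2 for c, cluster in zip(centroids, clusters) for p in cluster)

# Hypothetical data with three natural groups
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8), (1, 8), (2, 9), (1, 9)]
initial = [(1, 1), (8, 8), (1, 8)]

centroids, clusters = kmeans(points, initial)
quality = sse(centroids, clusters)
print([len(c) for c in clusters], round(quality, 3))
```

Swapping `math.dist` for another distance function, or choosing different starting centroids, can change both the clusters and the SSE, which is why the quality of a result depends on all three choices.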
Reinforcement Learning Reinforcement learning is a form of learning in which only limited information about the desired outputs is known. Complete knowledge of the environment is not available; only basic benefit or reward information is. In other words, a critic rather than a teacher guides the learning process. Reinforcement learning has roots in experimental studies of animal learning, e.g., training a dog by positive (“good dog”, something to eat) and negative (“bad dog”, nothing to eat) reinforcement.
Reinforcement Learning Associative: associating actions with stimuli. In other words, developing an action-stimulus mapping from reinforcement information received from the environment. Non-associative: selecting one action instead of associating actions with stimuli. The only input received from the environment is reinforcement information. Examples include genetic algorithms.
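A small illustration of the non-associative case is a multi-armed bandit: there are no stimuli, the learner picks one of a few actions, and the only feedback is a reward. The slides do not mention bandits or any particular algorithm; the epsilon-greedy strategy, the function name `run_bandit`, the reward means and the Gaussian noise below are all assumptions chosen for this sketch.

```python
import random

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy action selection driven only by reward feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)   # running average reward per action
    counts = [0] * len(true_means)        # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(true_means))   # explore: try a random action
        else:
            action = max(range(len(true_means)), key=lambda a: estimates[a])   # exploit
        reward = rng.gauss(true_means[action], 1.0)   # noisy reinforcement signal
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical expected rewards of three actions; the learner never sees these directly
estimates, counts = run_bandit([0.2, 0.5, 1.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

The exploration step is what lets the learner compare the consequences of alternative actions; with `epsilon=0` it could lock onto the first action that happened to pay off.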
Maze example (states: Start, S2, S3, S4, S5, Goal, S7, S8). Arrows indicate the strength between two problem states (associative strength = line width). Start maze …
The first response leads to S2 … The next state is chosen by randomly sampling from the possible next states, weighted by their associative strength.
Suppose the randomly sampled response leads to S3 …
At S3, the choices lead to S2, S4, or S7. S7 was picked (randomly).
By chance, S3 was picked next …
The next response is S4.
And S5 was chosen next (randomly).
And the goal is reached …
The goal is reached: strengthen the associative connection between the goal state and the last response. Next time S5 is reached, part of the associative strength is passed back to S4 ...
Start maze again …
Let’s suppose that after a couple of moves, we end up at S5 again.
S5 is likely to lead to the Goal through the strengthened route. In reinforcement learning, strength is also passed back to the last state. This paves the way for the next time going through the maze.
The situation after lots of restarts …
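The maze walk-through can be simulated roughly as follows. Several things here are assumptions rather than content from the slides: the edge list in `NEIGHBORS` (the slides only show some of the connections), the reward of 10 at the goal, and the update rule, which is a Q-learning-style update swapped in because the slides describe the strengthening only qualitatively.

```python
import random

# Hypothetical maze graph; only some of these edges are confirmed by the slides
NEIGHBORS = {
    "Start": ["S2"],
    "S2": ["Start", "S3"],
    "S3": ["S2", "S4", "S7"],
    "S4": ["S3", "S5"],
    "S5": ["S4", "Goal", "S8"],
    "S7": ["S3", "S8"],
    "S8": ["S5", "S7"],
}

def run_trials(trials=200, rate=0.5, discount=0.5, goal_reward=10.0, seed=0):
    rng = random.Random(seed)
    # Every connection starts with the same associative strength
    strength = {(s, n): 1.0 for s, options in NEIGHBORS.items() for n in options}
    lengths = []
    for _ in range(trials):
        state, steps = "Start", 0
        while state != "Goal":
            options = NEIGHBORS[state]
            weights = [strength[(state, n)] for n in options]
            nxt = rng.choices(options, weights=weights)[0]   # sample weighted by strength
            if nxt == "Goal":
                target = goal_reward   # strengthen the link between the last response and the goal
            else:
                # pass back part of the strongest connection leaving the next state
                target = discount * max(strength[(nxt, n)] for n in NEIGHBORS[nxt])
            strength[(state, nxt)] += rate * (target - strength[(state, nxt)])
            state, steps = nxt, steps + 1
        lengths.append(steps)
    return strength, lengths

strength, lengths = run_trials()
print("S5 -> Goal strength:", round(strength[("S5", "Goal")], 2))
print("mean steps, first 20 trials:", sum(lengths[:20]) / 20)
print("mean steps, last 20 trials:", sum(lengths[-20:]) / 20)
```

Because every update moves a strength toward a positive target, the sampling weights stay positive, and after many restarts the connections along the route to the goal tend to be the widest lines in the picture.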
Key Observations The reinforcement signal can be any signal evaluating the learning system's actions, not just a success/failure signal. Often it takes on real values, and the objective of learning is to maximize its expected value. The critic does not directly tell the learning system how to change its actions. Reinforcement learning algorithms are selection processes. There must be variety in the action-generation process so that the consequences of alternative actions can be compared to select the best.