Applications of Machine Learning by Hayim Makabee - Predictive Analytics Expert at Pontis
Added: Oct 29, 2015
Slides: 18 pages
Slide Content
Slide 1: Applications of Machine Learning
Hayim Makabee, Predictive Analytics Expert
July 2015
Slide 2: Applications of Machine Learning
Main applications of Machine Learning, by type of problem:
- Clustering
- Classification
- Recommendation
Slide 3: Clustering
Goal: Cluster observations into meaningful groups.
Slide 4: Examples of Clustering Applications
- “How to group customers for targeted marketing purposes?”
- “Which neighborhoods in a country are most similar to each other?”
- “What groups of insurance policy holders have high claim costs?”
- “How to group the products in a store based on their attributes?”
- “How to group pictures based on their description?”
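As an illustration of the customer-grouping question, here is a minimal k-means sketch in plain Python. The slides do not name a clustering algorithm, so k-means is an assumption chosen for brevity, and the `customers` data (visits per month, average basket value) is hypothetical.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket value)
customers = [(1, 10), (2, 12), (1, 11), (20, 95), (22, 100), (21, 98)]
centroids, clusters = kmeans(customers, k=2)
```

On this toy data the two clusters separate occasional low spenders from frequent high spenders. In practice the features would usually be scaled first, and the number of groups `k` is a modelling choice.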
Slide 5: Classification
Goal: Predict class from observations.
Slide 6: Examples of Classification Applications
- “Which category of products is most interesting to this customer?”
- “Is this movie a romantic comedy, documentary, or thriller?”
- “Is this review written by a customer or a robot?”
- “Will the customer buy this product?”
- “Is this email spam or not spam?”
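The spam question can be sketched with a small naive Bayes text classifier. This is one possible technique, not one the slides prescribe; the training messages and the function names `train_naive_bayes` and `predict` are made up for illustration.

```python
import math
from collections import Counter

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts = model
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(label_counts[label] / total_examples)  # class prior
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labelled messages
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday with the team", "not spam"),
]
model = train_naive_bayes(training)
```

For example, `predict(model, "claim your free prize")` returns `"spam"` on this toy training set, because those words appear only in the spam examples.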
Slide 7: Recommendation
Goal: Personalized recommendation of items to users.
Slide 8: Examples of Recommendation Applications
- “Which movies should be recommended to a user?”
- “If the user just listened to a song, which song would he like now?”
- “Which news articles are relevant for a user in a particular context?”
- “Which advertisements should be displayed for a user on a mobile app?”
- “Which products are frequently bought together?”
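The "frequently bought together" question can be approximated by counting how often pairs of products share a basket. The sketch below assumes a simple co-occurrence count rather than any particular recommender from the talk; the `baskets` data and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def bought_together(baskets):
    """Count how many baskets contain each unordered pair of products."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted(set(...)) gives each pair a canonical order and ignores duplicates.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return pair_counts

def recommend(item, pair_counts, top_n=3):
    """Return the products most often bought together with `item`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            scores[b] += count
        elif b == item:
            scores[a] += count
    return [other for other, _ in scores.most_common(top_n)]

# Hypothetical shopping baskets
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk", "cereal"],
]
pair_counts = bought_together(baskets)
```

Here `recommend("bread", pair_counts, top_n=1)` returns `["butter"]`, since bread and butter share two baskets. Raw counts favour popular items, so real systems often normalise them.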
Slide 9: Failures of Machine Learning
- In May 2015, Flickr released an automatic image-tagging feature that mistakenly labeled a black man as an ape.
- Soon afterwards, Google released a photo-labeling tool similar to Flickr's, which made similar mistakes: black men were tagged as gorillas.
- A recent Carnegie Mellon University study showed that Google displayed ads in a way that discriminated based on the gender of the user.
Slide 10: ML Process
- Feature Engineering
- Learning (Training)
- Evaluation (Metrics)
- Deployment (Serving)
Slide 11: Feature Engineering
Derive new features from the initial data set:
- Aggregations: Count, Sum, Min, Max, Avg, Std
- Temporal: Elapsed time, Trends
- Continuous to Categorical: Converting real values to enumerations.
- Categorical to Binary: Converting enumerations to binary features.
- Domain-specific derived features.
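The feature types listed above can be sketched on a toy purchase history. Everything in this example is assumed for illustration: the `purchases` records, the reference date `today`, the spend-band cut-offs, and the list of channels.

```python
from datetime import date
from statistics import mean, pstdev

# Hypothetical purchase history for one customer, sorted by date:
# (purchase date, amount, channel)
purchases = [
    (date(2015, 5, 1), 20.0, "web"),
    (date(2015, 6, 1), 40.0, "store"),
    (date(2015, 7, 1), 60.0, "web"),
]
today = date(2015, 7, 31)  # assumed reference date
amounts = [amount for _, amount, _ in purchases]

features = {
    # Aggregations
    "count": len(amounts),
    "sum": sum(amounts),
    "min": min(amounts),
    "max": max(amounts),
    "avg": mean(amounts),
    "std": pstdev(amounts),  # population standard deviation
    # Temporal: elapsed time and a crude trend (last amount minus first)
    "days_since_last": (today - max(d for d, _, _ in purchases)).days,
    "trend": amounts[-1] - amounts[0],
}

def spend_band(avg_amount):
    """Continuous to categorical: bucket a real value (cut-offs are arbitrary)."""
    if avg_amount < 25:
        return "low"
    if avg_amount < 50:
        return "medium"
    return "high"

features["spend_band"] = spend_band(features["avg"])

# Categorical to binary: one indicator feature per known channel.
channels = ["web", "store", "phone"]
last_channel = purchases[-1][2]
for channel in channels:
    features[f"last_channel_{channel}"] = int(last_channel == channel)
```

Domain-specific features follow the same pattern: each is a small function of the raw records that encodes something an expert would expect to matter.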
Slide 12: Learning
Slide 13: Evaluation
Question: How to evaluate the predictive accuracy of your model?
Answer: Partition the data set into Train and Test sets. Train is like the “past” you learn from, and Test is like the “future” you predict.
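A minimal sketch of the partitioning step, assuming a random split with 25% of the rows held out for testing. The function name `train_test_split`, the fraction, and the seed are illustrative choices, not values from the talk.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

rows = list(range(100))  # stand-in for 100 labelled observations
train, test = train_test_split(rows)
```

The model is fitted on `train` only and scored on `test`, which it has never seen. With time-stamped data, splitting at a cut-off date instead of shuffling matches the “past” and “future” analogy more literally.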
Slide 14: Evaluation of Binary Classification
Slide 15: Binary Classification: Selecting the Threshold
Slide 16: Metrics: Accuracy, Precision and Recall
Accuracy = (True Positive + True Negative) / Total Population
Precision = True Positive / (True Positive + False Positive)
Recall = True Positive / (True Positive + False Negative)
Normally there is a trade-off between Precision & Recall.
Business decision: Precision vs. Recall.
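The three formulas, and the way the decision threshold trades precision against recall, can be checked on a small example. The `labels` and `scores` below are made-up model outputs, and the two thresholds are arbitrary.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def metrics(labels, scores, threshold):
    """Return (accuracy, precision, recall) at the given threshold."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical true labels (1 = positive) and model scores
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

low = metrics(labels, scores, threshold=0.35)   # (0.875, 0.8, 1.0)
high = metrics(labels, scores, threshold=0.75)  # (0.75, 1.0, 0.5)
```

Raising the threshold from 0.35 to 0.75 lifts precision from 0.8 to 1.0 but cuts recall from 1.0 to 0.5. Which point to pick depends on the relative cost of false positives and false negatives, which is the business decision the slide refers to.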