In this Lunch & Learn session, Chirag Jain gives us a friendly, gentle introduction to Machine Learning and walks through a high-level learning framework using linear classifiers.
Size: 12.08 MB
Language: en
Added: May 11, 2018
Slides: 35 pages
Slide Content
Introduction to Machine Learning - Chirag Jain, ML Engineer
About Haptik Chatbot platform for publishers, advertisers and enterprises AI-powered conversational interface to drive customer engagement Reach of 30 Million Users, processing 5 Million Chats per month One of the world’s largest chatbot platforms Started in 2013, global pioneers of chatbots
How this talk is divided Part 1: AI - introduction, applications, and new and old news about AI Part 2: ML - introduction and workflow Part 3: High-level learning framework - code (and some math) walkthrough of a linear classifier
What is AI? Demonstration of human-like intelligence by machines. A machine performing any task that needs human-level intelligence can be said to be “Artificially Intelligent”.
AI In everyday life today Email Categorization Web search Targeted (annoying) Ads
AI In everyday life today Maps & Navigation Computer Games Digital Assistants
Few ML success stories in the past 3 years
Few ML success stories in the past 3 years Neural Style Transfer Controllable Image Generation (Xianxu Hou et al.)
Major Goals of AI Reasoning and Problem Solving Knowledge Representation Autonomy and Planning Self Learning via Experiences ← Machine Learning is a part of this Natural language processing Sensory Perception
Major Goals of AI Motion and Manipulation Social Intelligence General/Super Intelligence ← Media tries to sell you this
Sciences involved in AI research Computer Science Mathematics Psychology Linguistics Philosophy Many Others
Philosophy around AI Is general/super intelligence possible? Do machines have to be similar to human systems to be as intelligent as us? Can intelligent machines be dangerous? Should we prefer more accurate systems over transparent systems?
The vagueness and the hype Real story: the task was to learn negotiation in natural language, not some efficient cryptic language. Researchers only reported a failed experiment trial.
The vagueness and the hype “Artificial Intelligence - The Revolution Hasn’t Happened Yet” - Michael I. Jordan [1] Artificial Intelligence - the “high-level” or “cognitive” capability of humans to “reason” and to “think.” Intelligence Augmentation - services that augment human intelligence and creativity. Intelligent Infrastructure - a web of computation, data and physical entities that makes human environments more supportive and safe. “The impossibility of intelligence explosion” - François Chollet [2] [1] https://medium.com/@mijordan3/artificial-intelligence-the-revolution-hasnt-happened-yet-5e1d5812e1e7 [2] https://medium.com/@francois.chollet/the-impossibility-of-intelligence-explosion-5be4a9eda6ec
A few words on “Deep Learning” Image credits: http://neuralnetworksanddeeplearning.com/chap5.html
AI, ML, NN, DL are not new! First programmable computer ≈ 1936 AI research began ≈ 1956 Neural Networks - base ideas as early as 1943, polished idea ≈ 1958, research active since the 1990s Deep Learning - first ideas proposed in 1965, early implementations ≈ 1965-1971, research active since the 1990s Large NNs were computationally infeasible to train back then, so NNs and DL went into “hibernation” for more than a decade
Resurgence of “AI” because of Deep Learning Training complex models has become feasible now Large datasets are available for some tasks Compute power has increased exponentially - we now have very powerful GPUs/TPUs Theoretical ideas in research have been polished over time Much better tools to work with! Theano, TensorFlow (Google), Keras (now Google), Torch/PyTorch (Facebook), CNTK (Microsoft), Caffe (UCB), MXNet (Apache, Amazon), sklearn, gensim, nltk
Resurgence of “AI” because of Deep Learning A DL model winning the Large Scale Image Recognition Competition in 2012 by huge margins [1] proved to be an inflection point. DL has continuously produced promising results in other areas since. [1] ImageNet Classification with Deep Convolutional Neural Networks (Krizhevsky et al.)
Machine Learning Blends ideas from statistics, computer science, operations research, pattern recognition, information theory, control theory and many other disciplines to design algorithms that find low-level patterns in data, make predictions and help make decisions (at scale).
Typical Machine Learning Pipeline
Common Taxonomy of ML methods Supervised Learning - some feedback is available Completely Supervised Learning Semi-Supervised Learning Active Learning Reinforcement Learning Unsupervised Learning - no explicit ground truths Meta Learning ...
Common Tasks for ML Classification (usually supervised) Regression (usually supervised) Clustering (unsupervised) Dimensionality Reduction ...
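To make these tasks concrete, here is a minimal sketch (not from the slides) with one scikit-learn call per task; the dataset and model choices are illustrative assumptions:

```python
# Illustrative sketch: one scikit-learn call per task listed above.
# The dataset (iris) and model choices are assumptions, not from the talk.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression, LinearRegression
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

clf = LogisticRegression(max_iter=1000).fit(X, y)        # classification: discrete labels
reg = LinearRegression().fit(X[:, :3], X[:, 3])          # regression: continuous target
labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)  # clustering: no labels used
X_2d = PCA(n_components=2).fit_transform(X)              # dimensionality reduction: 4 features -> 2
```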
Classification Task is to learn to categorize inputs into discrete classes E.g. Input: an image, Output: probabilities of the image containing {dog, cat, horse, zebra} Supervised task: we have true labels for each input Metrics: to keep things simple, we will use accuracy - the fraction of inputs the classifier labels correctly. Selecting a metric depends on the data + problem
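Accuracy as used here is just the fraction of correct predictions; a minimal sketch (the label arrays below are hypothetical):

```python
# Accuracy = fraction of inputs classified correctly.
# y_true / y_pred are hypothetical example arrays, not from the talk.
import numpy as np

y_true = np.array([0, 1, 1, 0, 1])  # ground-truth labels
y_pred = np.array([0, 1, 0, 0, 1])  # classifier predictions

accuracy = np.mean(y_true == y_pred)  # 4 of 5 correct -> 0.8
print(f"accuracy = {accuracy:.2f}")
```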
Logistic Regression - A simple linear classifier Notebook to follow along https://gist.github.com/chiragjn/24b548785d99a393fca9dccfe1439d4a
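The linked gist is the authoritative walkthrough; as a rough stand-in, here is a minimal logistic regression sketch assuming scikit-learn and its bundled breast cancer dataset (both assumptions, not necessarily what the notebook uses):

```python
# Minimal linear-classifier sketch; the gist above is the real walkthrough.
# Dataset and preprocessing choices here are assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Logistic regression learns a linear decision boundary; scaling the
# features helps the solver converge.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```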
Gradient Descent Your loss function may look something like this
Gradient Descent But let’s take a simpler example
Gradient Descent Optimum value is at the bottom
Gradient Descent You spawn at some random point
Gradient Descent The gradient at any point points in the direction of steepest change
Gradient Descent The learning rate is the scaling factor of the gradient step, i.e. how much to nudge each variable involved
Gradient Descent Keep the learning rate small, take smaller steps
Gradient Descent A large learning rate can cause overshoots
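Putting the steps above together, a minimal sketch (not from the slides) of gradient descent on the simple 1-D loss f(w) = (w - 3)^2, whose gradient is 2(w - 3):

```python
# Gradient descent on f(w) = (w - 3)^2; the optimum is at w = 3.
# The loss function and learning rates are illustrative assumptions.
import random

def grad(w):
    return 2 * (w - 3)  # gradient points in the direction of steepest increase

def descend(lr, steps=50):
    w = random.uniform(-10, 10)  # "spawn at some random point"
    for _ in range(steps):
        w -= lr * grad(w)        # step against the gradient, scaled by lr
    return w

print(descend(lr=0.1))  # small learning rate: converges close to w = 3
print(descend(lr=1.1))  # too large: every step overshoots and w diverges
```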
Other things that we don’t have time for Non-Linear classifiers Learning Methods that don’t use Gradient Descent Other Metrics: Precision, Recall, F1 Overfitting and underfitting And many more tricks of the trade