Deep Learning
Deep learning seeks to learn rich hierarchical representations (i.e. features) automatically through a multi-stage feature learning process: low-level features feed into mid-level and then high-level features, followed by a trainable classifier that produces the output.
[Figure: feature visualization of a convolutional net trained on ImageNet (Zeiler and Fergus, 2013)]
Learning Hierarchical Representations
Image recognition: pixel → edge → texton → motif → part → object
Text: character → word → word group → clause → sentence → story
In both cases, low-level features build up to mid-level and then high-level features, ending in a trainable classifier that produces the output, with an increasing level of abstraction at each stage (see the sketch below).
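As a rough illustration (not from the slides), the sketch below stacks three trainable feature stages and a linear classifier in plain NumPy. The layer sizes, the ReLU nonlinearity, and the 784-dimensional flattened-image input are assumptions chosen only to make the example small and runnable.

import numpy as np

rng = np.random.default_rng(0)

def stage(x, W, b):
    # One feature-learning stage: a linear map followed by a nonlinearity.
    return np.maximum(0.0, x @ W + b)   # ReLU

# Randomly initialised weights for three feature stages and a classifier.
W1, b1 = rng.normal(size=(784, 256)) * 0.01, np.zeros(256)
W2, b2 = rng.normal(size=(256, 128)) * 0.01, np.zeros(128)
W3, b3 = rng.normal(size=(128, 64)) * 0.01, np.zeros(64)
Wc, bc = rng.normal(size=(64, 10)) * 0.01, np.zeros(10)   # trainable classifier

x = rng.normal(size=(1, 784))      # e.g. one flattened 28x28 image
low = stage(x, W1, b1)             # low-level features (edges)
mid = stage(low, W2, b2)           # mid-level features (motifs, parts)
high = stage(mid, W3, b3)          # high-level features (objects)
scores = high @ Wc + bc            # class scores from the classifier
print(scores.shape)                # (1, 10)

All of the weight matrices above are what training adjusts; before training they are random, so the "features" are meaningless until the learning procedure described next has run.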
Example: Training the neural network
A dataset:

Fields          Class
1.4  2.7  1.9   0
3.8  3.4  3.2   0
6.4  2.8  1.7   1
4.1  0.1  0.2   0
etc …

Initialise the network with random weights.
Present a training pattern (e.g. the fields 1.4, 2.7, 1.9) and feed it through the network to get an output, say 0.8.
Compare the output with the target output (class 0): the error is 0.8.
Adjust the weights based on the error.
Present the next pattern (e.g. 6.4, 2.8, 1.7 with class 1): the output might be 0.9, giving an error of -0.1, and the weights are adjusted again.
Repeat this thousands, maybe millions of times, each time taking a random training instance and making slight weight adjustments. Algorithms for weight adjustment are designed to make changes that reduce the error. A minimal code sketch of this loop follows below.
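A minimal sketch of the training loop, assuming a single sigmoid output unit trained by stochastic gradient descent on the tiny dataset above; the learning rate, iteration count, and logistic-regression-style update rule are illustrative assumptions, not the exact algorithm from the slides.

import numpy as np

rng = np.random.default_rng(0)

# The training data shown above: three input fields and a binary class label.
X = np.array([[1.4, 2.7, 1.9],
              [3.8, 3.4, 3.2],
              [6.4, 2.8, 1.7],
              [4.1, 0.1, 0.2]])
y = np.array([0.0, 0.0, 1.0, 0.0])

w = rng.normal(size=3) * 0.1   # initialise with random weights
b = 0.0
lr = 0.05                      # small learning rate: slight adjustments only

def forward(x):
    # Feed the pattern through to get an output between 0 and 1.
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

# Repeat many times, each time taking a random training instance.
for step in range(20_000):
    i = rng.integers(len(X))
    out = forward(X[i])
    error = out - y[i]          # compare with the target output
    w -= lr * error * X[i]      # adjust weights in a direction that reduces the error
    b -= lr * error

print(np.round([forward(x) for x in X], 2))   # outputs should approach [0, 0, 1, 0]

A real deep network has many layers of weights rather than one, and the error is propagated back through all of them (backpropagation), but the loop structure is the same: present, compare, adjust, repeat.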
Deep Learning Training Explained
What features might you expect a good NN to learn, when trained with data like this?
But what about position invariance?
In the example above, the unit detectors were tied to specific parts of the image: early units detect lines in specific positions, and successive layers can learn higher-level detectors ("horizontal line", "RHS vertical line", "upper loop", etc.). A sketch of how sharing one detector across positions gives position invariance follows below.
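To make the position-invariance idea concrete, here is a minimal sketch (an illustration, not the slides' own example): one shared "edge detector" is slid across every position of a 1-D input and the responses are max-pooled, so the same pattern is detected wherever it appears.

import numpy as np

def detect_anywhere(signal, detector):
    # Slide one shared detector across all positions, then pool with max.
    responses = [signal[i:i + len(detector)] @ detector
                 for i in range(len(signal) - len(detector) + 1)]
    return max(responses)

edge = np.array([-1.0, 1.0])          # a tiny "rising edge" detector
a = np.array([0, 0, 1, 1, 0, 0, 0])   # edge near the start
b = np.array([0, 0, 0, 0, 0, 1, 1])   # same edge near the end

print(detect_anywhere(a, edge), detect_anywhere(b, edge))   # both 1.0

This weight sharing is exactly what convolutional layers do in two dimensions: the same small filter is applied at every image position, so later layers can respond to a feature regardless of where it occurs.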
Why Is Deep Learning Useful?
Manually designed features are often over-specified, incomplete, and take a long time to design and validate. Learned features are easy to adapt and fast to learn. Deep learning provides a very flexible, universal, learnable framework for representing world, visual, and linguistic information. It can learn from both unsupervised and supervised data and can utilize large amounts of training data.
How Deep Learning Is Useful
Deep learning is a machine learning technique that teaches computers to do what comes naturally to humans: learn by example. Deep learning is a key technology behind driverless cars, enabling them to recognize a stop sign or to distinguish a pedestrian from a lamppost. It is the key to voice control in consumer devices like phones, tablets, TVs, and hands-free speakers. In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Models are trained using a large set of labeled data and neural network architectures that contain many layers.
Performance vs Sample Size
[Figure: performance as a function of the size of the data, comparing traditional ML algorithms with deep learning algorithms.]
How Deep Learning Is Useful
ViSENZE develops commercial applications that use deep learning networks to power image recognition and tagging. Customers can use pictures rather than keywords to search a company's products for matching or similar items.
Skymind has built an open-source deep learning platform with applications in fraud detection, customer recommendations, customer relations management and more. They provide set-up, support and training services.
Atomwise applies deep learning networks to the problem of drug discovery. They use deep learning networks to explore the possibility of repurposing known and tested drugs for use against new diseases.
Descartes Labs is a spin-off from the Los Alamos National Laboratory. They analyze satellite imagery with deep learning networks to provide real-time insights into food production, energy infrastructure and more.
Cont.
Affectiva is an MIT Media Lab spinoff that uses deep learning networks to identify emotions from video or still images. Their software runs on Microsoft Windows, Android and iOS and works with both desktop and mobile platforms. Applications include real-time analysis of emotional engagement with broadcast or digital content, video games that respond to the player's emotions, and offline integration with a range of biosensors for use in research or commercial environments.
Gridspace uses deep learning networks to drive sophisticated speech recognition systems. Their networks begin with raw audio and build up to topic, keyword and speaker identification. Their Gridspace Memo software is designed to identify speakers, keywords, critical moments, and time spent talking, along with providing group takeaways from conference calls. Gridspace Sift provides similar information about customer service and contact calls.
Ditto Labs has built a detection system that uses deep learning networks to identify company brands and logos in photos posted to social media. Their software also identifies the environment in which brands appear and whether a brand is accompanied by a smiling face in the photo. Clients can use Ditto to track brand presence at events or locations, compare brand performance with competitors, and target advertising campaigns.
Drawbacks of Deep Learning
It requires very large amounts of data in order to perform better than other techniques. It is extremely computationally expensive to train due to its complex models; moreover, deep learning requires expensive GPUs and hundreds of machines, which increases the cost to users. Determining the topology, flavor, training method, and hyperparameters for deep learning is a black art with no theory to guide you. What is learned is not easy to comprehend; other classifiers (e.g. decision trees, logistic regression, etc.) make it much easier to understand what's going on.