Gradient-Based Learning Applied to Document Recognition

About This Presentation

This presentation is a review of the original paper 'Gradient-Based Learning Applied to Document Recognition' by LeCun et al. It compares the performance of LeNet-5 with LeNet-1, LeNet-4, and K-NN, and finds that LeNet-5 performs best among all the compared methods.


Slide Content

Gradient-Based Learning Applied to Document Recognition
Submitted by: Gyanendra Awasthi (Roll no. 201315)
Instructor: Prof. Nishchal K. Verma

Introduction
- Authored by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner
- Published in the Proceedings of the IEEE, November 1998
- Yann LeCun and his co-authors introduced the LeNet-5 architecture in this paper
- The main message of the paper is that better pattern recognition systems can be built by relying more on automatic learning than on hand-designed heuristics

Introduction
Traditional approach: a feature extraction module followed by a trainable classifier module.
Limitations of this approach:
- Feature extraction is hand-crafted
- Classifiers learn in low-dimensional spaces
Learning in high-dimensional spaces has been enabled by:
- The availability of low-cost machines relying on numerical methods
- The availability of large databases
- The availability of powerful machine learning techniques

Introduction
Why gradient-based learning?
- It is easier to minimize a smooth, continuous function than a discrete one (see the sketch below)
- Minimal preprocessing requirements
Gradient back-propagation:
- Applied to a multi-layer neural network with sigmoidal units, it can solve complicated learning tasks
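
To make the "smooth function" point concrete, here is a minimal, self-contained sketch (ours, not the paper's) of gradient descent on a one-dimensional smooth loss:

```python
# Illustration only: gradient descent on the smooth, continuous loss
# f(w) = (w - 3)^2. Because f is differentiable, the gradient
# f'(w) = 2(w - 3) always points away from the minimum, so repeatedly
# stepping against it converges -- the property that makes
# gradient-based learning tractable, unlike a discrete search.
w = 0.0    # initial parameter value
lr = 0.1   # learning rate
for _ in range(50):
    grad = 2.0 * (w - 3.0)  # analytic gradient of f at w
    w -= lr * grad          # gradient descent update
print(w)  # approaches the minimizer w = 3
```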

Convolutional Neural Network (CNN)
- The CNN is the standard form of neural network architecture for image recognition
- CNN architectures are among the most popular deep learning frameworks
- Applications include marketing, healthcare, retail, and automotive
- Characteristics of CNN architectures (illustrated in the sketch below):
  - Local receptive fields
  - Sub-sampling
  - Weight sharing
Fig. A typical CNN architecture
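
A small Keras sketch of these three characteristics (the layer sizes here are illustrative, not taken from the slides):

```python
import tensorflow as tf

# Illustrative sketch: a single Conv2D layer realizes local receptive
# fields (each output unit sees only a 5x5 patch of the input) and
# weight sharing (the same 5x5 kernels are reused across the whole
# image), while an average-pooling layer realizes sub-sampling.
inputs = tf.keras.Input(shape=(28, 28, 1))
conv = tf.keras.layers.Conv2D(6, kernel_size=5, activation="tanh")(inputs)
pooled = tf.keras.layers.AveragePooling2D(pool_size=2)(conv)
model = tf.keras.Model(inputs, pooled)
model.summary()  # note the small parameter count: the weights are shared
```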

LeNet-5 Convolutional Neural Network
The LeNet-5 architecture comprises 7 layers (not counting the input layer):
- 3 convolutional layers (Cx)
- 2 subsampling layers (Sx)
- 2 fully connected layers (Fx)
where x denotes the layer's index.

LeNet-5 Architecture
First layer (C1):
- Convolutional layer with 6 feature maps of size 28×28
- Each unit in each feature map is connected to a 5×5 neighborhood in the input
- Contains 156 trainable parameters and 122,304 connections
Second layer (S2):
- Subsampling layer with 6 feature maps of size 14×14
- Each unit in each feature map is connected to a 2×2 neighborhood in the corresponding feature map in C1
- Contains 12 trainable parameters and 5,880 connections
Third layer (C3):
- Convolutional layer with 16 feature maps of size 10×10
- Each unit in each feature map is connected to 5×5 neighborhoods at identical locations in a subset of S2's feature maps

LeNet-5 Architecture
Fourth layer (S4):
- Subsampling layer with 16 feature maps of size 5×5
- Each unit in each feature map is connected to a 2×2 neighborhood in the corresponding feature map in C3
- Contains 32 trainable parameters and 2,000 connections
Fifth layer (C5):
- Convolutional layer with 120 feature maps of size 1×1
- Each unit is connected to a 5×5 neighborhood on all 16 of S4's feature maps
- Has 48,120 trainable connections
Sixth layer (F6):
- Fully connected layer with 84 units, fully connected to C5
- Has 10,164 trainable parameters
A Keras sketch of the full architecture follows below.
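
Putting the layer walkthrough together, here is a minimal Keras sketch of LeNet-5. Two simplifications are assumed relative to the paper, as is common in modern re-implementations: the partial S2-to-C3 connection scheme is replaced by full connectivity, and the RBF output layer is replaced by a dense softmax layer.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of LeNet-5, assuming 32x32 inputs as in the paper. Conv2D
# stands in for the Cx layers, AveragePooling2D for the Sx layers,
# and Dense for F6 and the (simplified) output layer.
lenet5 = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 1)),
    layers.Conv2D(6, 5, activation="tanh"),    # C1: 6 feature maps, 28x28
    layers.AveragePooling2D(2),                # S2: 6 feature maps, 14x14
    layers.Conv2D(16, 5, activation="tanh"),   # C3: 16 feature maps, 10x10
    layers.AveragePooling2D(2),                # S4: 16 feature maps, 5x5
    layers.Conv2D(120, 5, activation="tanh"),  # C5: 120 feature maps, 1x1
    layers.Flatten(),
    layers.Dense(84, activation="tanh"),       # F6: 84 units
    layers.Dense(10, activation="softmax"),    # output: 10 digit classes
])
lenet5.summary()  # per-layer output shapes match the walkthrough above
```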

LeNet-5 Architecture
Fig. LeNet-5 architecture

LeNet-1 Architecture
Consists of:
- 2 convolutional layers
- 2 subsampling layers
- About 3,000 parameters
The architecture is as follows (a Keras sketch follows below):
- 28×28 input image
- Convolutional layer (5×5 kernels) producing four 24×24 feature maps
- Average pooling layer (2×2)
- Convolutional layer (5×5 kernels) producing eight feature maps
- Average pooling layer (2×2)
- Directly fully connected to the output
Fig. LeNet-1 architecture
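
A corresponding Keras sketch of LeNet-1 under the same assumptions (the softmax output is added for training; the 5×5 valid convolutions give 8×8 maps after the second convolutional layer):

```python
import tensorflow as tf
from tensorflow.keras import layers

# Sketch of LeNet-1 following the layer listing above.
lenet1 = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(4, 5, activation="tanh"),   # 4 feature maps, 24x24
    layers.AveragePooling2D(2),               # sub-sampled to 12x12
    layers.Conv2D(8, 5, activation="tanh"),   # 8 feature maps, 8x8
    layers.AveragePooling2D(2),               # sub-sampled to 4x4
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # directly connected to the output
])
print(lenet1.count_params())  # a few thousand parameters, as stated above
```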

LeNet-4 Architecture
Consists of:
- 3 convolutional layers
- 2 subsampling layers
- 1 fully connected layer
Contains about 260,000 connections and 17,000 free parameters.
In LeNet-4, the input is a 32×32 layer in which 20×20 images (not deslanted) are centered by their center of mass.

Database: Modified NIST (MNIST) Dataset
- NIST: National Institute of Standards and Technology
- The MNIST database consists of handwritten images of the digits 0 to 9
- A subset of the famous NIST dataset
- Grayscale images centered in 28×28 pixel fields
- 70,000 images in total: 60,000 for training and the remaining 10,000 for the test set (loading sketch below)
Fig. MNIST dataset split: training set 60,000, test set 10,000
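
The split described above can be loaded directly through Keras (a modern convenience shown here; the original experiments predate this API):

```python
from tensorflow.keras.datasets import mnist

# MNIST via Keras: 60,000 training and 10,000 test images, each a
# 28x28 grayscale digit (0-9), matching the split on this slide.
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, x_test.shape)  # (60000, 28, 28) (10000, 28, 28)
x_train = x_train.astype("float32") / 255.0  # scale pixels to [0, 1]
x_test = x_test.astype("float32") / 255.0
```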

Some Training Images

Results: LeNet-1
The MNIST database is used, with 10% of the training data held out for validation and the rest used for training (a sketch of this setup follows below).
- Test loss: 0.0600
- Test accuracy: 0.9812
As the number of epochs increases, the training and validation losses decrease, and the training and validation accuracies increase.
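
A sketch of the evaluation setup that produces numbers of this form, reusing the lenet1 model and the MNIST arrays from the earlier sketches (the optimizer, epoch count, and other hyperparameters are assumptions, not taken from the slides):

```python
# Train with 10% of the training data held out for validation, then
# report test loss and accuracy, as in the results on these slides.
lenet1.compile(optimizer="adam",
               loss="sparse_categorical_crossentropy",
               metrics=["accuracy"])
lenet1.fit(x_train[..., None], y_train,     # add the channel axis
           epochs=10, validation_split=0.1)  # 10% held out for validation
test_loss, test_acc = lenet1.evaluate(x_test[..., None], y_test)
print(test_loss, test_acc)  # numbers of the kind reported above
```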

Results: LeNet-4
- Test loss: 0.0528
- Test accuracy: 0.9832
As the number of epochs increases, the training and validation losses decrease, and the training and validation accuracies increase.
LeNet-4's test accuracy is higher, and its test loss lower, than LeNet-1's.

Results: LeNet-5
- Test loss: 0.0387
- Test accuracy: 0.9867
As the number of epochs increases, the training and validation losses decrease, and the training and validation accuracies increase.
LeNet-5's test accuracy is higher, and its test loss lower, than both LeNet-1's and LeNet-4's.

Results: Comparison of various classifiers on MNIST Dataset

Thank You!