Introduction to Capsule Networks (CapsNets)

aureliengeron 10,933 views 55 slides Nov 22, 2017


About This Presentation

CapsNets are a hot new architecture for neural networks, invented by Geoffrey Hinton, one of the godfathers of deep learning.
You can view this presentation on YouTube at: https://youtu.be/pPN8d0E3900

NIPS 2017 Paper:
* Dynamic Routing Between Capsules,
* by Sara Sabour, Nicholas Frosst, Geoffrey E...


Slide Content

Capsule Networks Aurélien Géron, November 2017 https://youtu.be/pPN8d0E3900

NIPS 2017 Paper Dynamic Routing Between Capsules by Sara Sabour, Nicholas Frosst, Geoffrey E. Hinton October 2017: https://arxiv.org/abs/1710.09829

Computer Graphics: Instantiation parameters → Rendering → Image. Example parameters: Rectangle x=20 y=30 angle=16°, Triangle x=24 y=25 angle=-65°

Inverse Graphics: Image → Inverse rendering → Instantiation parameters. Example parameters: Rectangle x=20 y=30 angle=16°, Triangle x=24 y=25 angle=-65°

Capsules: Image → Inverse rendering → Capsule activations

Capsules: Activation vector. Length = estimated probability of presence. Orientation = object’s estimated pose parameters.

Capsules: Convolutional Layers → Reshape → Squash, where $\mathrm{squash}(\mathbf{u}) = \frac{\|\mathbf{u}\|^2}{1 + \|\mathbf{u}\|^2} \, \frac{\mathbf{u}}{\|\mathbf{u}\|}$
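The squash function above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the presentation: the helper names `squash` and `length` and the small `eps` term (added to avoid dividing by zero) are assumptions.

```python
import math

def length(u):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in u))

def squash(u, eps=1e-9):
    """Keep the direction of u, but map its length into [0, 1)."""
    sq_norm = sum(x * x for x in u)
    # ||u||^2 / (1 + ||u||^2) is the new length; dividing by ||u|| normalizes u.
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in u]

short_vector = squash([0.1, 0.0])    # short vectors shrink toward length 0
long_vector = squash([30.0, 40.0])   # long vectors approach length 1
```

Because the output length always stays below 1, it can be read as the estimated probability that the object is present.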

Equivariance

A hierarchy of parts: Boat x=22 y=28 angle=16°, made of Rectangle x=20 y=30 angle=16° and Triangle x=24 y=25 angle=-65°

A hierarchy of parts: House x=22 y=28 angle=-5°, made of Rectangle x=20 y=30 angle=-5° and Triangle x=26 y=31 angle=137°

Primary Capsules

Predict Next Layer’s Output: one transformation matrix $W_{i,j}$ per part/whole pair $(i, j)$, applied to the primary capsule output $\mathbf{u}_i$: $\hat{\mathbf{u}}_{j|i} = W_{i,j} \mathbf{u}_i$
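As a rough sketch of the prediction step, each primary capsule output is multiplied by the transformation matrix of its part/whole pair. The 2-D vectors, the matrix values and the helper `matvec` below are made up for illustration; in a real CapsNet the matrices are learned during training.

```python
def matvec(W, u):
    """Multiply matrix W (a list of rows) by vector u."""
    return [sum(w * x for w, x in zip(row, u)) for row in W]

# Hypothetical outputs of two primary capsules (the parts).
u = {"rectangle": [1.0, 0.0], "triangle": [0.0, 1.0]}

# Hypothetical transformation matrices, one per (part, whole) pair.
W = {
    ("rectangle", "boat"): [[1.0, 0.0], [0.0, 1.0]],
    ("triangle", "boat"): [[0.0, 1.0], [1.0, 0.0]],
}

# Predicted output of the whole j according to the part i.
u_hat = {(i, j): matvec(W_ij, u[i]) for (i, j), W_ij in W.items()}
```

With these values both parts predict the same boat vector, which is the situation the routing step rewards.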

Compute Next Layer’s Output: from the predicted outputs of the primary capsules

Routing by Agreement: Strong agreement! The rectangle and triangle capsules should be routed to the boat capsules.

Clusters of Agreement: Mean

Routing Weights: $b_{i,j} = 0$ for all $i, j$, and $\mathbf{c}_i = \mathrm{softmax}(\mathbf{b}_i)$, so every routing weight starts at 0.5 in this example (0.5, 0.5, 0.5, 0.5).
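A small sketch shows why the initial routing weights are all 0.5 here: with two candidate next-layer capsules and all raw weights at zero, the softmax splits each primary capsule's vote evenly. The `softmax` helper is written out by hand as an illustration.

```python
import math

def softmax(logits):
    """Turn raw routing weights b_i into routing weights c_i that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

b_i = [0.0, 0.0]        # one raw weight per next-layer capsule (house, boat)
c_i = softmax(b_i)      # equal split between the two capsules

b_i_later = [0.0, 2.0]  # hypothetical raw weights after some agreement with the boat
c_i_later = softmax(b_i_later)
```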

Compute Next Layer’s Output (routing weights 0.5, 0.5, 0.5, 0.5): $\mathbf{s}_j$ = weighted sum of the predicted outputs, then $\mathbf{v}_j = \mathrm{squash}(\mathbf{s}_j)$. These are the actual outputs of the next layer capsules (round #1).

Update Routing Weights: compare the predicted outputs with the actual outputs of the next layer capsules (round #1), then $b_{i,j} \leftarrow b_{i,j} + \hat{\mathbf{u}}_{j|i} \cdot \mathbf{v}_j$. Agreement: the scalar product is large. Disagreement: the scalar product is small.
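The update rule is just a scalar product added to the raw routing weight. In the sketch below, the vectors are invented for illustration: one prediction points the same way as the actual output, the other does not.

```python
def dot(a, b):
    """Scalar product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

v_boat = [0.9, 0.0]              # hypothetical actual output of the boat capsule
u_hat_agreeing = [1.0, 0.1]      # prediction close to the actual output
u_hat_disagreeing = [-0.2, 0.8]  # prediction pointing elsewhere

b_agreeing = 0.0 + dot(u_hat_agreeing, v_boat)        # raw weight grows
b_disagreeing = 0.0 + dot(u_hat_disagreeing, v_boat)  # raw weight stays small
```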

Compute Next Layer’s Output (routing weights 0.2, 0.1, 0.8, 0.9): $\mathbf{s}_j$ = weighted sum of the predicted outputs, then $\mathbf{v}_j = \mathrm{squash}(\mathbf{s}_j)$. These are the actual outputs of the next layer capsules (round #2).
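Putting the rounds together, routing by agreement can be sketched as a short loop. This is a simplified, dependency-free illustration under stated assumptions: the function name `route`, the default of three rounds and the toy predictions are not from the slides, and real implementations work on batched tensors.

```python
import math

def squash(s, eps=1e-9):
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / (math.sqrt(sq_norm) + eps)
    return [scale * x for x in s]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def route(u_hat, num_rounds=3):
    """u_hat[i][j] is the vector that primary capsule i predicts for capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # raw routing weights start at 0
    for _ in range(num_rounds):
        c = [softmax(b_i) for b_i in b]       # routing weights per primary capsule
        v = []
        for j in range(n_out):
            # s_j = weighted sum of the predictions for capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):
            for j in range(n_out):
                # Agreement between prediction and actual output raises b.
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return c, v

# Toy predictions: the parts disagree about the house but agree about the boat.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],   # rectangle: predictions for (house, boat)
    [[-1.0, 0.0], [1.0, 0.0]],  # triangle: predictions for (house, boat)
]
c, v = route(u_hat)
```

In this toy case the routing weights toward the boat rise above 0.5 for both parts, and the boat capsule ends up with the longer output vector.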

Handling Crowded Scenes: Is this an upside-down house?

Handling Crowded Scenes: House, Boat. Thanks to routing by agreement, the ambiguity is quickly resolved (explaining away).

Classification: CapsNet → length ($\ell_2$ norm) of each output vector → Estimated Class Probability
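Classification then amounts to measuring the length of each output capsule and picking the longest one. A minimal sketch, with a hypothetical helper `predict_class` and made-up output vectors:

```python
import math

def predict_class(output_capsules):
    """Return the index of the longest output vector, plus all the lengths."""
    lengths = [math.sqrt(sum(x * x for x in v)) for v in output_capsules]
    best = max(range(len(lengths)), key=lengths.__getitem__)
    return best, lengths

# Hypothetical output vectors for three classes.
predicted, lengths = predict_class([[0.1, 0.0], [0.6, 0.6], [0.0, 0.2]])
```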

Training: to allow multiple classes, minimize the margin loss:

$L_k = T_k \max(0, m^+ - \|\mathbf{v}_k\|)^2 + \lambda (1 - T_k) \max(0, \|\mathbf{v}_k\| - m^-)^2$

$T_k = 1$ iff class $k$ is present. In the paper: $m^- = 0.1$, $m^+ = 0.9$, $\lambda = 0.5$.

Translated to English: “If an object of class k is present, then $\|\mathbf{v}_k\|$ should be no less than 0.9. If not, then $\|\mathbf{v}_k\|$ should be no more than 0.1.”
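The margin loss can be sketched as follows, using the form given in the paper, where each hinge term is squared. The function name `margin_loss` and the example lengths are illustrative assumptions; the defaults are the paper's values quoted above.

```python
def margin_loss(lengths, targets, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Sum of per-class margin losses.

    lengths[k] is the length of output vector v_k, and targets[k] is 1 if
    class k is present, else 0.
    """
    total = 0.0
    for length_k, t_k in zip(lengths, targets):
        present_term = max(0.0, m_plus - length_k) ** 2   # penalty if too short
        absent_term = max(0.0, length_k - m_minus) ** 2   # penalty if too long
        total += t_k * present_term + lam * (1 - t_k) * absent_term
    return total

confident_and_right = margin_loss([0.95, 0.05], [1, 0])
confident_and_wrong = margin_loss([0.05, 0.95], [1, 0])
```

The first call returns zero because both lengths are on the correct side of their margins. The second is penalized on both classes.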

Regularization by Reconstruction: CapsNet → Decoder (Feedforward Neural Network) → Reconstruction

Loss = margin loss + α × reconstruction loss. The reconstruction loss is the squared difference between the reconstructed image and the input image. In the paper, α = 0.0005.
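A short sketch of how the two terms combine, with images flattened to lists of pixel intensities. The function names and the tiny two-pixel "images" are assumptions for illustration; the squared difference is summed over pixels here.

```python
def reconstruction_loss(reconstruction, image):
    """Sum of squared pixel differences between reconstruction and input."""
    return sum((r - x) ** 2 for r, x in zip(reconstruction, image))

def total_loss(margin, reconstruction, image, alpha=0.0005):
    """Margin loss plus the scaled-down reconstruction loss."""
    return margin + alpha * reconstruction_loss(reconstruction, image)

image = [1.0, 1.0]           # hypothetical flattened input image
reconstruction = [0.0, 1.0]  # hypothetical decoder output
loss = total_loss(0.2, reconstruction, image)
```

The small α keeps the reconstruction term from dominating the margin loss during training.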

A CapsNet for MNIST (Figure 1 from the paper)

A CapsNet for MNIST – Decoder (Figure 2 from the paper)

Interpretable Activation Vectors (Figure 4 from the paper)

Pros
* Reaches high accuracy on MNIST, and promising on CIFAR10
* Requires less training data
* Position and pose information are preserved (equivariance)
* This is promising for image segmentation and object detection
* Routing by agreement is great for overlapping objects (explaining away)
* Capsule activations nicely map the hierarchy of parts
* Offers robustness to affine transformations
* Activation vectors are easier to interpret (rotation, thickness, skew…)
* It’s Hinton! ;-)

Cons
* Not state of the art on CIFAR10 (but it’s a good start)
* Not tested yet on larger images (e.g., ImageNet): will it work well?
* Slow training, due to the inner loop (in the routing by agreement algorithm)
* A CapsNet cannot see two very close identical objects. This is called “crowding”, and it has been observed as well in human vision

Implementations
* Keras w/ TensorFlow backend: https://github.com/XifengGuo/CapsNet-Keras
* TensorFlow: https://github.com/naturomics/CapsNet-Tensorflow
* PyTorch: https://github.com/gram-ai/capsule-networks
