Mask R-CNN

JaehyunJun 740 views 21 slides Sep 07, 2017

About This Presentation

Mask R-CNN is a conceptually simple, flexible, and general framework for object instance segmentation. It efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method extends Faster R-CNN by...


Slide Content

Mask R-CNN
  CM Seminar 2017.09.01
  Jaehyun Jun
  Biointelligence Laboratory
  Interdisciplinary Program of Neuro Science, Seoul National University
  http://bi.snu.ac.kr

Overview
  Task
    object detection: classify objects and localize each one with a bounding box
    instance segmentation: classify each pixel into a fixed set of categories and separate the individual object instances
© 2017, SNU Biointelligence Lab., http://bi.snu.ac.kr

Main Idea
  Goal: develop a comparably enabling framework for instance segmentation
  Extension of Faster R-CNN
    RoIPool -> RoIAlign
    decouple mask and class prediction

Related Work
  R-CNN (Region-based CNN)
    Bounding box: Selective Search -> AlexNet -> linear regression
    Classification: Selective Search -> AlexNet -> SVM
  Faster R-CNN
    Region proposals: Region Proposal Network (RPN)
    Bounding box + classification: features extracted with RoIPool
    RoIPool: its misalignment has a large negative effect on predicting pixel-accurate masks

History
  R-CNN -> SPP-net
  SPP-net -> Fast R-CNN

History
  Fast R-CNN -> Faster R-CNN
  Faster R-CNN -> Mask R-CNN

Related Work
  ResNeXt
    increasing cardinality is more effective than increasing the depth or width of a network
    ResNeXt outperforms a ResNet with the same number of parameters
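To make the "same number of parameters" claim concrete, the sketch below counts the weights of one bottleneck block in each style. The channel sizes (256 in/out, width 64 for ResNet, width 128 split into 32 groups for ResNeXt) are the standard 32x4d example and are not stated on the slide, so treat them as assumptions; the helper names are made up for illustration and biases are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution (biases ignored)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def bottleneck_params(c_io, width, groups=1):
    """1x1 reduce -> 3x3 (optionally grouped) -> 1x1 expand."""
    return (conv_params(c_io, width, 1)
            + conv_params(width, width, 3, groups)
            + conv_params(width, c_io, 1))


# Assumed example sizes, not taken from the slide.
resnet_block = bottleneck_params(256, 64)                # cardinality 1
resnext_block = bottleneck_params(256, 128, groups=32)   # cardinality 32, 4 channels per group

print(resnet_block, resnext_block)
```

Both blocks come out at roughly 70k weights, so the ResNeXt block spends a near-identical budget on 32 parallel paths instead of one wider path.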

Mask R-CNN
  Two stages
    1. Region proposals: Region Proposal Network (RPN)
    2. Class and box offset
       In parallel, output a binary mask, one for each class
  Loss
    L = L_cls + L_box + L_mask
    L_mask: average binary cross-entropy loss, computed on positive RoIs
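A minimal NumPy sketch of the loss on this slide, for a single positive RoI. It assumes the mask branch emits one m x m logit map per class and that only the map of the ground-truth class enters L_mask (this is what decoupling mask and class prediction amounts to); the function names and array shapes are hypothetical.

```python
import numpy as np


def mask_loss(mask_logits, gt_mask, gt_class):
    """Average per-pixel binary cross-entropy for one positive RoI.

    mask_logits: (K, m, m) array, one logit map per class (assumed layout)
    gt_mask:     (m, m) array of 0/1 target values
    gt_class:    index of the RoI's ground-truth class
    """
    z = mask_logits[gt_class]  # other classes' masks do not contribute
    # Numerically stable sigmoid cross-entropy on logits.
    per_pixel = np.maximum(z, 0) - z * gt_mask + np.log1p(np.exp(-np.abs(z)))
    return float(per_pixel.mean())


def total_loss(l_cls, l_box, l_mask):
    """Multi-task loss: L = L_cls + L_box + L_mask."""
    return l_cls + l_box + l_mask


rng = np.random.default_rng(0)
logits = np.zeros((3, 28, 28))
target = rng.integers(0, 2, size=(28, 28)).astype(float)
loss_a = mask_loss(logits, target, gt_class=1)

# Changing the logits of a different class leaves the loss unchanged.
logits[2] += 5.0
loss_b = mask_loss(logits, target, gt_class=1)
```

Because each class has its own sigmoid mask, classes do not compete inside the mask branch; the class label comes from the classification branch.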

Region Proposal Networks
  1. Propose k reference boxes (anchors) at each sliding-window position
  2. Map each sliding window to a lower-dimensional feature
  3.1. Feed the feature into the box-regression layer
  3.2. Feed the feature into the box-classification layer
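Step 1 can be sketched as follows: every sliding-window position gets the same k reference boxes, centred on that position. The stride of 16 and the 3 scales x 3 aspect ratios (k = 9) are the usual Faster R-CNN settings, not values given on the slide, and `make_anchors` is a hypothetical helper.

```python
import numpy as np


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (feat_h * feat_w * k, 4) anchors as (x1, y1, x2, y2) in image pixels."""
    shapes = []
    for s in scales:
        for r in ratios:            # r = height / width; the area stays s * s
            shapes.append((s / np.sqrt(r), s * np.sqrt(r)))
    half = np.array(shapes) / 2.0   # (k, 2) half-widths and half-heights

    # Centre of each sliding-window position, mapped back to image coordinates.
    cx = (np.arange(feat_w) + 0.5) * stride
    cy = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(cx, cy)    # each of shape (feat_h, feat_w)
    centers = np.stack([cx.ravel(), cy.ravel()], axis=1)   # (H*W, 2)

    x1y1 = centers[:, None, :] - half[None, :, :]
    x2y2 = centers[:, None, :] + half[None, :, :]
    return np.concatenate([x1y1, x2y2], axis=2).reshape(-1, 4)


anchors = make_anchors(2, 3)   # a tiny 2 x 3 feature map
```

The box-regression layer then predicts offsets relative to these anchors, and the box-classification layer scores each anchor as object or background.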

RoIPooling

RoIPooling - Differentiable

RoIPooling - SGD step
  Region-wise sampling to make mini-batches

RoIAlign
  RoIPool: misalignment problem caused by quantizing RoI boundaries
  RoIAlign: keep fractional RoI boundaries and apply bilinear interpolation (e.g. [x/16] -> x/16)
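The difference between [x/16] and x/16 can be illustrated on a toy feature map. The sketch below is not a full RoIAlign layer: it samples a single point, assumes a stride of 16, and uses a hypothetical `bilinear` helper and a made-up ramp feature map so that the sampled value is easy to read.

```python
import numpy as np


def bilinear(feat, y, x):
    """Bilinearly interpolate a 2-D feature map at a real-valued (y, x)."""
    h, w = feat.shape
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * feat[y0, x0] + (1 - dy) * dx * feat[y0, x1]
            + dy * (1 - dx) * feat[y1, x0] + dy * dx * feat[y1, x1])


stride = 16
feat = np.tile(np.arange(10.0), (10, 1))   # toy map: value equals the column index

x_img = 100.0                              # an RoI boundary in image pixels
x_quantized = int(round(x_img / stride))   # RoIPool style: [x/16] = 6
x_aligned = x_img / stride                 # RoIAlign style:  x/16 = 6.25

pooled_value = feat[3, x_quantized]              # reads the cell at column 6
aligned_value = bilinear(feat, 3.0, x_aligned)   # interpolates at column 6.25
```

The quantized lookup is off by a quarter of a feature cell, which is 4 image pixels at stride 16; that shift matters little for classification but is visible in a pixel-accurate mask.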

Architecture
  Backbone: feature extraction
    ResNet-50 or ResNet-101
    ResNeXt-50 or ResNeXt-101
    Feature Pyramid Network (FPN)
    C4 or C5 (convolutional layer)
  RoIAlign: aligns the extracted features with the input
    no quantization -> bilinear interpolation
    [x/16] -> x/16
  Head: bounding-box recognition & mask prediction
    ResNet-C4 -> 9-layer ‘res5’
    FPN -> res5 + filter

Dataset
  MS COCO
    Object Instance Annotations
    Object Keypoint Annotations

Experiment
  Mask R-CNN vs. FCIS+++
  no artifacts

Experiment
  RoIPool vs. RoIWarp vs. RoIAlign

Experiment
  Mask R-CNN vs. Faster R-CNN
  Improvement of 3~6%

Experiment
  Human Pose Estimation (Keypoint Detection)

Experiment
  Cityscapes
  1st on the Instance-Level Semantic Labeling Task

Q & A