Tutorial on Object Detection (Faster R-CNN)

hpkim0512 8,101 views 30 slides Jan 23, 2018

About This Presentation

Presentation material for the CSE WinterSchool 2018 Tutorial on Machine Learning
https://2018winterschool.weebly.com/


Slide Content

Tutorial: Faster R-CNN
Object Detection: Localization & Classification
Hwa Pyung Kim, Department of Computational Science and Engineering, Yonsei University
[email protected]

Object Detection: Classification + Regression
Classification (recognition) answers "What?": dog, cat, or person. Bounding box regression (localization) answers "Where?". Together they yield a statement of the form "a dog at" a given bounding box. The image is first encoded into a feature map (conv & pool), and the features are then combined to predict the class and the box.
Bounding box information: top-left corner position, w = width, h = height.

CNN-based Object Detection
The convolutional feature map contains clues about the dog (What) at a local position (Where). A VGG16 network maps the input image to pool5 features, and the spatial size of the feature map depends on the number of pooling layers. Fully-connected layers then perform classification (dog, cat, person) and regression. In the figure, the red boxes contain clues of "a dog at the bounding box".
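The link between the number of pooling layers and the feature-map size can be checked with a few lines of Python. This is a minimal sketch, assuming every pooling layer is a 2x2, stride-2 max pool and that the convolutions preserve the spatial size (as in VGG16). The function name `feature_map_size` and the 224 x 224 input are illustrative choices, not taken from the slides.

```python
def feature_map_size(height, width, num_poolings):
    """Spatial size after `num_poolings` 2x2, stride-2 pooling layers.

    Assumes the convolutions in between keep height and width unchanged.
    """
    for _ in range(num_poolings):
        height //= 2  # each pooling halves the height (rounding down)
        width //= 2   # ... and the width
    return height, width


# VGG16 has five pooling layers up to and including pool5.
print(feature_map_size(224, 224, 5))  # (7, 7)
```

With four poolings instead of five, the same input gives a 14 x 14 map, which is why the number of poolings sets how coarse the "Where" information is.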

Multiple Object Detection
Localize and classify all objects appearing in the image (example from PASCAL VOC2007). How many objects are there? Classify the multiple, overlapping objects and identify their bounding boxes.

Region-based CNN (R-CNN) method
Extract "region proposals" using the selective search method. Affine image warping: compute a fixed-size CNN input from each region proposal, regardless of the region's shape. Each warped region passes through the ConvNet and a Classifier & Regressor, which labels it, for example, as background, person, or dining table.

Fast R-CNN
Extract "region proposals" using the selective search method. The ConvNet computes a feature map for the image, and RoI pooling feeds each proposal to the Classifier & Regressor. RoI pooling: convert the features inside a valid RoI into a small feature map with a fixed spatial extent.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
The ConvNet produces a feature map, a Region Proposal Network generates proposals from it, and RoI pooling passes them to the Classifier & Regressor. What is a Region Proposal Network?

Region Proposal Network (RPN)
The RPN takes the conv feature map, whose size is set by the number of pooling layers and the number of filters, and outputs a set of rectangular object proposals (region proposals), each with an objectness score. How?

Region Proposals & Anchor Boxes
Input: each sliding window on the conv feature map. Each sliding window (the red cuboid in the figure) is expressed as a vector and passed through fully-connected layers; the proposal is parametrized relative to an anchor.
Output: the proposal coordinates, and scores that estimate the probability of object or not object for each proposal.
Anchor box information: center position, width, height. For example, the width and height are fixed, and the center is determined by the position of the red box.

Region Proposals & Anchor Boxes (9 anchors)
9 anchor boxes = 3 ratios × 3 scales. For each anchor, the width and height are fixed, and the center is determined by the position of the red box.
Input: each sliding window. For each sliding window (red cuboid) expressed as a vector, the 9 proposals are parametrized relative to the 9 anchors.
Output: for each of the 9 anchors, the proposal coordinates and the scores that estimate the probability of object or not object.
Anchor box information: center position, width, height.
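The "3 ratios × 3 scales" construction can be sketched as follows. The slides do not list the actual values, so the scales (128, 256, 512 pixels) and the height-to-width ratios (0.5, 1, 2) below are assumptions borrowed from the original Faster R-CNN paper, and `make_anchors` is a hypothetical helper name. Each anchor is stored as (center x, center y, width, height), matching the anchor box information on the slide.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return 9 anchors (cx, cy, w, h) centered on one sliding-window position.

    `scales` are side lengths of the square anchor with the same area;
    `ratios` are height / width. Both are assumed values, not from the slides.
    """
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep the area at scale**2 while changing the aspect ratio.
            w = scale / math.sqrt(ratio)
            h = scale * math.sqrt(ratio)
            anchors.append((cx, cy, w, h))
    return anchors


for anchor in make_anchors(100.0, 50.0):
    print(anchor)
```

Only the center changes from one sliding window to the next; the nine (width, height) pairs are shared by all windows.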

Region Proposal Network: for each sliding window on the conv feature map, the fully-connected layers extract 9 proposals relative to the 9 anchor boxes.
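How a proposal is obtained "relative to" an anchor can be illustrated with the box parametrization used in the Faster R-CNN paper. The formulas on the slide did not survive extraction, so this is a sketch under that assumption: the network predicts four offsets (t_x, t_y, t_w, t_h), and the proposal is recovered from the anchor's center, width, and height. The name `decode` is hypothetical.

```python
import math


def decode(anchor, deltas):
    """Turn predicted offsets into a proposal, both as (cx, cy, w, h).

    Assumed parametrization (Faster R-CNN paper):
      x = x_a + t_x * w_a,   y = y_a + t_y * h_a,
      w = w_a * exp(t_w),    h = h_a * exp(t_h)
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    return (xa + tx * wa, ya + ty * ha, wa * math.exp(tw), ha * math.exp(th))


anchor = (100.0, 50.0, 128.0, 128.0)
# Zero offsets reproduce the anchor itself.
print(decode(anchor, (0.0, 0.0, 0.0, 0.0)))
# A shift of half an anchor width to the right, same size.
print(decode(anchor, (0.5, 0.0, 0.0, 0.0)))
```

Because the offsets are scaled by the anchor size, the same predicted values describe a small shift for a small anchor and a large shift for a large one.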

Generate Region Proposals
Total # of proposals = (total # of windows) × (# of proposals per window), where the number of windows is set by the spatial size of the conv feature map. The proposals overlap each other heavily, so the redundancy needs to be reduced.
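The proposal count is a simple product, sketched below. The deck later quotes 1,485 proposals, which with 9 proposals per window corresponds to 165 windows. The 11 x 15 feature-map size used here is one hypothetical grid that gives 165 windows and is an assumption, as is the helper name `count_proposals`.

```python
def count_proposals(feat_h, feat_w, proposals_per_window=9):
    """One sliding window per feature-map cell, one proposal per anchor."""
    total_windows = feat_h * feat_w
    return total_windows * proposals_per_window


# Hypothetical 11 x 15 conv feature map: 165 windows x 9 anchors each.
print(count_proposals(11, 15))  # 1485
```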

Reduce redundancy by Non-Maximum Suppression (NMS)
Proposal information: top-left corner position, width, height, and objectness probability.
Step 1. Take the most probable proposal from the 1,485 proposals.
Step 2. Compute the IoU between the most probable proposal and each of the other proposals, and discard the proposals having IoU > 0.7.
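The IoU (intersection over union) used in step 2 can be computed as in the following sketch. Boxes are written as (x, y, w, h) with (x, y) the top-left corner, following the proposal information on the slide; the function name `iou` is an illustrative choice.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (0, 0, 10, 10)))    # 1.0 (identical boxes)
print(iou((0, 0, 10, 10), (20, 20, 5, 5)))    # 0.0 (no overlap)
print(iou((0, 0, 10, 10), (1, 0, 10, 10)))    # 90 / 110, about 0.82
```

In the last example the two boxes share a 9 x 10 region out of a combined area of 110, so the second box would be discarded at the 0.7 threshold.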

Reduce redundancy by NMS: summary of steps 1-2
Given the most probable proposal, the blue proposals in the figure have IoU > 0.7 with it; these 30 proposals are discarded.
Step 3. Take the next most probable proposal among the remaining proposals and repeat the previous process. In the example, 36 proposals having IoU > 0.7 with it are discarded.

Reduce redundancy by NMS: result
The previous procedure is repeated until… Before NMS: 1,485 proposals. After NMS: 300 proposals.
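Steps 1 to 3 amount to a greedy loop, sketched below in plain Python. Proposals are (x, y, w, h, p) tuples with a top-left corner and an objectness probability p. The 0.7 threshold comes from the slides; treating 300 as a cap on the number of kept proposals (`max_keep`) is an assumption about the elided stopping rule, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def nms(proposals, iou_threshold=0.7, max_keep=300):
    """Greedy non-maximum suppression over (x, y, w, h, p) proposals."""
    # Most probable proposals first.
    remaining = sorted(proposals, key=lambda prop: prop[4], reverse=True)
    kept = []
    while remaining and len(kept) < max_keep:
        best = remaining.pop(0)          # step 1 / step 3: most probable one left
        kept.append(best)
        # Step 2: drop everything that overlaps `best` too much.
        remaining = [prop for prop in remaining
                     if iou(best[:4], prop[:4]) <= iou_threshold]
    return kept


proposals = [
    (0, 0, 10, 10, 0.9),   # most probable
    (1, 0, 10, 10, 0.8),   # IoU of about 0.82 with the first one: discarded
    (20, 20, 5, 5, 0.7),   # no overlap: kept
]
print(nms(proposals))  # [(0, 0, 10, 10, 0.9), (20, 20, 5, 5, 0.7)]
```

This loop is quadratic in the number of proposals, which is fine for a few thousand boxes; library implementations typically vectorize the IoU computation.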

Summary of RPN
Inputs: conv feature map.
Outputs: region proposal coordinates, and probabilities representing how likely each region proposal is to contain an object.

With the ConvNet, the feature map, the Region Proposal Network, the proposals, and RoI pooling in place, we are now ready to explain the Classifier & Regressor.

RoI pooling layer
Input for the Classifier & Regressor: a fixed-size conv feature map. Using bilinear interpolation and max pooling, the layer converts the features inside a valid RoI (a proposal) into a small feature map with a fixed spatial extent.
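The idea of mapping an arbitrary RoI to a fixed-size output can be sketched with the classic quantized RoI max pooling from Fast R-CNN. Note that this is a simpler stand-in, not the bilinear-interpolation variant named on the slide: the RoI is split into a fixed grid of bins and the maximum is taken in each bin. The function name, the single-channel 2D list, and the 2 x 2 output size are assumptions for illustration.

```python
def ceil_div(n, d):
    """Integer ceiling of n / d for non-negative n and positive d."""
    return -(-n // d)


def roi_max_pool(fmap, roi, out_h=2, out_w=2):
    """Max-pool the RoI of a 2D feature map into an out_h x out_w grid.

    fmap: list of rows (one channel); roi: (x0, y0, x1, y1) in feature-map
    cells, with x1 and y1 exclusive. The RoI must cover at least one cell.
    """
    x0, y0, x1, y1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Row range of bin i (neighboring bins may share a row).
        r0 = y0 + (i * roi_h) // out_h
        r1 = y0 + ceil_div((i + 1) * roi_h, out_h)
        row = []
        for j in range(out_w):
            c0 = x0 + (j * roi_w) // out_w
            c1 = x0 + ceil_div((j + 1) * roi_w, out_w)
            row.append(max(fmap[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
print(roi_max_pool(fmap, (0, 0, 4, 4)))  # [[5, 7], [13, 15]]
print(roi_max_pool(fmap, (1, 1, 4, 4)))  # [[10, 11], [14, 15]]
```

Both RoIs give a 2 x 2 output even though one covers 4 x 4 cells and the other 3 x 3, which is what lets the fully-connected layers that follow accept proposals of any shape.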

The RoI pooling layer generates the inputs for the Classifier & Regressor: 300 RoI-pooled feature maps.

Classification & Regression for each proposal
After RoI pooling, fully-connected layers perform proposal classification and bounding-box regression. Each of the 21 classes (background, person, TV monitor, and so on) gets its own refined bounding-box prediction and an estimated probability.

Summary of Classification & Regression
Regress and classify each class from the region proposals (for example background, person, TV monitor, dining table). Discard bounding boxes ( or background), then reduce redundancy by NMS.
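The discard step can be sketched as follows. For each proposal the classifier gives one probability per class and the regressor one refined box per class; the sketch keeps the most probable class and drops the proposal if that class is background or its probability is low. The 0.5 cut-off (`min_prob`) is an assumed value, since the slide's own criterion is not legible, and the four-class list is a shortened stand-in for the 21 classes. NMS on the surviving boxes would follow and is not repeated here.

```python
CLASS_NAMES = ["background", "person", "tv monitor", "dining table"]  # shortened list


def select_detections(class_probs, class_boxes, min_prob=0.5):
    """Keep (class name, probability, box) for confident non-background proposals.

    class_probs[i][k]: probability of class k for proposal i.
    class_boxes[i][k]: refined box of class k for proposal i.
    """
    detections = []
    for probs, boxes in zip(class_probs, class_boxes):
        best = max(range(len(probs)), key=lambda k: probs[k])
        if best == 0 or probs[best] < min_prob:
            continue  # background or too uncertain: discard
        detections.append((CLASS_NAMES[best], probs[best], boxes[best]))
    return detections


probs = [
    [0.90, 0.05, 0.03, 0.02],   # background
    [0.10, 0.80, 0.05, 0.05],   # person
    [0.30, 0.30, 0.40, 0.00],   # best class below the cut-off
]
boxes = [[(0, 0, 1, 1)] * 4, [(5, 5, 20, 40)] * 4, [(9, 9, 3, 3)] * 4]
print(select_detections(probs, boxes))  # [('person', 0.8, (5, 5, 20, 40))]
```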

Summary of Classifier & Regressor
Inputs: conv feature map and region proposals.
Outputs: bounding-box coordinates of the objects in the image, and the classification of each bounding box.

Training process for RPN
Inputs: the input image, the ground-truth bounding boxes with their classes, and the anchor boxes.
Ground-truth proposals associated with anchors: find the nearest bounding box for each anchor. This gives the ground-truth probability of objectness and the ground-truth proposal transformation.
Predicted proposals: the predicted probability of objectness and the predicted proposal transformation.
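A ground-truth proposal transformation can be computed as the offsets that would move an anchor onto its matched ground-truth box. The slide's formulas are not legible, so the sketch below assumes the parametrization from the Faster R-CNN paper, with boxes written as (center x, center y, width, height). The name `encode` is hypothetical, and the step that matches each anchor to a ground-truth box is not shown.

```python
import math


def encode(anchor, gt_box):
    """Regression targets (t_x, t_y, t_w, t_h) for one anchor / ground-truth pair.

    Assumed parametrization (Faster R-CNN paper):
      t_x = (x - x_a) / w_a,   t_y = (y - y_a) / h_a,
      t_w = log(w / w_a),      t_h = log(h / h_a)
    """
    xa, ya, wa, ha = anchor
    x, y, w, h = gt_box
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))


anchor = (10.0, 10.0, 20.0, 40.0)
# A ground-truth box equal to the anchor gives all-zero targets.
print(encode(anchor, (10.0, 10.0, 20.0, 40.0)))  # (0.0, 0.0, 0.0, 0.0)
# Shifted right by half an anchor width and twice as wide.
print(encode(anchor, (20.0, 10.0, 40.0, 40.0)))
```

During training, the predicted transformation for a positive anchor is compared against these targets, so the network learns corrections to the anchor rather than absolute coordinates.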

Training process for Classifier & Regressor
Inputs: the input image, the ground-truth bounding boxes with their classes c, and the region proposals associated with anchors.
Ground-truth classification & regression associated with proposals: find the nearest bounding box for each proposal. This gives the ground-truth classification and the ground-truth regression.
Predicted classification & regression: the predicted classification and the predicted regression.

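The history of object detection in deep learning
Detectors on the timeline: R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN, YOLO, YOLO v2, SSD, DSSD.
Classification networks on the timeline: AlexNet (2012.12), VggNet & InceptionNet (2014.9), ResNet (15.12.10).
Dates marked for the detectors: 2013.11.11, 2015.4.30, 2015.5.14, 15.6.8, 15.12.25, 15.12.08, 17.1.23, 17.3.20.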

Application to Ultrasound-based Fetal biometry


Thank you
E-mail: [email protected] / Homepage: https://hpkim0512.blogspot.com