ResNet_Presentation_paper explained_slide

About This Presentation

ResNet: the Deep Residual Learning for Image Recognition paper, explained slide by slide.


Slide Content

Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. ILSVRC 2015 winner; arXiv:1512.03385.

Problem & Motivation Problem: Deeper networks become harder to train (vanishing gradients, degradation). Observation: Plain networks beyond 20–30 layers often perform worse, with higher training error as well as test error, so the degradation is not caused by overfitting. Motivation: Enable very deep networks (50–152 layers) to train effectively. Significance: Depth is critical for hierarchical feature learning in images.

Method – Residual Learning Reformulated mapping: instead of fitting the desired mapping H(x) directly, the stacked layers learn the residual F(x) = H(x) – x. Final output: H(x) = F(x) + x. Shortcut (skip) connections pass x unchanged to deeper layers, adding no extra parameters or computation. Key idea: optimizing the residual is easier; if the identity mapping is close to optimal, the layers only need to push F(x) toward zero.
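A minimal sketch of this reformulation, assuming PyTorch (the paper does not prescribe a framework); ResidualWrapper and fn are illustrative names, not from the paper:

```python
import torch
from torch import nn

class ResidualWrapper(nn.Module):
    """Output H(x) = F(x) + x: the wrapped layers only learn the residual F."""
    def __init__(self, fn: nn.Module):
        super().__init__()
        self.fn = fn  # any sub-network whose output shape matches its input

    def forward(self, x):
        # Identity shortcut: add the input back to the learned residual.
        return self.fn(x) + x

# If F(x) is driven to zero, the layer reduces to the identity mapping.
layer = ResidualWrapper(nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 64)))
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```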

Method – Residual Block & Architecture Basic residual block: Conv–BN–ReLU–Conv–BN plus the identity skip, followed by a final ReLU. Enables extremely deep networks (e.g., 152 layers). Lower computational complexity than VGG while achieving better accuracy. Variants: ResNet-18, 34, 50, 101, 152; the deeper variants (50/101/152) use a bottleneck block (1×1–3×3–1×1 convolutions) to keep cost manageable.
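A sketch of the basic block described above, again assuming PyTorch; it fixes stride 1 and equal input/output channels so the identity shortcut needs no projection (the paper uses 1×1 convolutions or zero-padding when dimensions change):

```python
import torch
from torch import nn

class BasicBlock(nn.Module):
    """Conv-BN-ReLU-Conv-BN plus identity skip, followed by a final ReLU."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                                # shortcut path
        out = self.relu(self.bn1(self.conv1(x)))    # Conv-BN-ReLU
        out = self.bn2(self.conv2(out))             # Conv-BN
        return self.relu(out + identity)            # add skip, then ReLU

# Shape is preserved, so blocks can be stacked to arbitrary depth.
x = torch.randn(1, 64, 56, 56)
print(BasicBlock(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```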

Results – ImageNet An ensemble of ResNets achieved 3.57% top-5 error on the ImageNet test set. Outperformed VGG-16 and GoogLeNet. 1st place in the ILSVRC 2015 classification task.

Results – Other Benchmarks CIFAR-10: Plain nets degrade with depth; ResNets train successfully at over 1000 layers (a 1202-layer model). COCO detection: ~28% relative improvement from switching to a ResNet backbone. Winning entries at ILSVRC & COCO 2015: ImageNet detection and localization, COCO detection and segmentation.

Strengths Enables training of very deep networks (>100 layers). Significant accuracy improvements on ImageNet, CIFAR, COCO. Simple yet powerful architecture. Influence: Foundation for modern architectures including Transformers.

Limitations Resource-intensive: deeper variants require substantially more compute and memory to train. Diminishing returns at extreme depth: the 1202-layer CIFAR-10 model performs worse than the 110-layer one, likely due to overfitting. Initially designed for computer vision tasks only.

Summary Problem: Degradation in deep nets. Solution: Residual learning with skip connections. Impact: Enabled training of very deep networks (152 layers). Results: Record-breaking accuracy on ImageNet and COCO. Strengths: Scalable, effective, foundational. Limitations: Compute-heavy, diminishing returns.

Discussion Inspired later models: DenseNet, ResNeXt, EfficientNet. Residual connections used in Transformers (NLP, Vision). Residual learning is now a universal building block.