Human action recognition with CNN is a thesis presentation based on background reduction using Mask R-CNN. A 3D CNN is used, and the results are evaluated on two base models, ResNet50 and VGG16.

Research Paper Presentation Based On Human Action Recognition

Human Action Recognition with Background Subtraction and 3D CNN
Advisor: Dr. Md. Abu Layek, Associate Professor, Department of Computer Science and Engineering, Jagannath University

Presentation Agenda
- Introduction: Problem Statement, Motivation, Proposed Solution
- Literature Review
- Background Study
- CNN Architecture: VGG16, ResNet
- Materials
- Methodology: Tools, Proposed Methodology
- Evaluations and Results
- Conclusion & Possible Improvements: Summary, Limitations & Future Directions

Introduction
As described by the author, the reason for the lower accuracy in some classes is that they share the same background elements. Our goal is therefore to eliminate the background elements using pre-processing techniques.


How does deep learning help with human action recognition?
- Feature extraction: It automates the extraction of relevant features from raw data, which is crucial for recognizing human actions.
- Neural networks: It uses complex neural networks capable of processing large volumes of video data to identify intricate action patterns.
- Spatial-temporal analysis: It employs models such as CNNs and RNNs to capture spatial and temporal dependencies, thereby improving recognition accuracy.
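The spatial-temporal idea can be illustrated with a 3D convolution, which slides a kernel over time as well as over height and width. The sketch below is a minimal pure-Python version under stated assumptions: the clip, the kernel values, and the function name `conv3d` are illustrative only, and the source does not describe the actual 3D CNN configuration used in the thesis.

```python
def conv3d(clip, kernel):
    """Valid (no padding), stride-1 3D cross-correlation.

    clip   -- nested lists indexed as [time][row][col]
    kernel -- nested lists indexed as [time][row][col]
    """
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out_t = len(clip) - kt + 1
    out_h = len(clip[0]) - kh + 1
    out_w = len(clip[0][0]) - kw + 1
    return [
        [
            [
                sum(
                    clip[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                    for dt in range(kt)
                    for di in range(kh)
                    for dj in range(kw)
                )
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        for t in range(out_t)
    ]


# Illustrative clip: three 2x2 frames; one pixel lights up in the last frame.
clip = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
    [[0, 1], [0, 0]],
]

# Illustrative temporal kernel: (next frame) minus (current frame), 1x1 in space.
kernel = [[[-1]], [[1]]]

response = conv3d(clip, kernel)
# response[0] is all zeros (no change between frames 0 and 1);
# response[1] is [[0, 1], [0, 0]] (the change between frames 1 and 2).
```

Because the kernel spans two frames, the output responds only where something changes over time, which is the kind of motion cue a purely 2D convolution on single frames cannot see.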

Problem Statement
- Lower accuracy in a few classes (Biking, Swing, Walking with Dog) because of shared background elements.
- Low input resolution.
Solution
1. Clear the background noise as much as possible.
2. Develop an automatic background removal system to speed up the process.

Motivation
HAR is a significant challenge for various reasons:
- The usage of cameras has expanded.
- It can help identify any kind of crime or violence.

Proposed Solution
1. Data
2. Preprocessing data
3. Background noise reduction
4. Multiple CNN architectures
5. Result

Deep Learning
Deep learning is a subfield of machine learning based on artificial neural networks (ANNs). Neural networks come in two kinds:
- Shallow neural network: consists of an input layer, one hidden layer, and an output layer.
- Deep neural network: consists of an input layer, more than one hidden layer, and an output layer.

Deep Learning
In deep learning, the hidden units in the hidden layers act like biological neurons, and each hidden unit is called a neuron. A hidden layer takes inputs from the input layer, processes them in each of its hidden units to form a decision, and then passes its outputs on to the next hidden layer.
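The layer-to-layer flow described here can be sketched in a few lines of plain Python. This is an illustrative example only: the weights, biases, layer sizes, and the choice of a sigmoid activation are assumptions made for the sketch and do not come from the presentation.

```python
import math


def sigmoid(x):
    """Squash a value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense_layer(inputs, weights, biases):
    """One layer: each unit takes a weighted sum of the inputs, adds a bias,
    and applies the activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the activations from one layer to the next."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden units -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.2, -0.5, 0.3], [0.7, 0.1, -0.4]], [0.0, 0.0]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1.0, 2.0], layers)
```

With two hidden layers this counts as a deep network in the sense used above; dropping one of them would make it shallow.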

CNN (Convolutional Neural Network)
In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks most commonly applied to analyzing visual imagery. A CNN model consists of three types of layers:
- Convolutional layer
- Pooling layer
- Fully connected layer

CNN (Convolutional Neural Network)
Convolutional layer: Convolutional layers convolve the input and pass the result to the next layer. This layer extracts features using various kernels (filters). The objective of the convolution operation is to extract features such as edges from the input image. The first convolutional layer is responsible for capturing low-level features such as edges, color, and gradient orientation. With added layers, the architecture adapts to high-level features as well.
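As a rough illustration of how a kernel extracts an edge feature, the following sketch applies a single hand-written filter to a tiny image. The image, the kernel values, and the function name `conv2d` are made up for the example; in a real CNN the kernel values are learned during training.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D cross-correlation, the operation
    CNN layers conventionally call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Illustrative 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Illustrative vertical-edge filter: right column minus left column.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
# Every row of feature_map is [0, 2, 0]: the filter responds only at the edge.
```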


CNN (Convolutional Neural Network)
Pooling layer: The pooling layer is responsible for reducing the spatial size of the convolved feature. This dimensionality reduction decreases the computational power required to process the data. There are two types of pooling: max pooling and average pooling.
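The two pooling types can be compared on a small example. The sketch below assumes non-overlapping 2x2 windows and uses an invented feature map; the function name `pool2d` and its parameters are illustrative, not part of any library.

```python
def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window."""
    if mode not in ("max", "avg"):
        raise ValueError("mode must be 'max' or 'avg'")
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            if mode == "max":
                row.append(max(window))
            else:
                row.append(sum(window) / len(window))
        pooled.append(row)
    return pooled


# Illustrative 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 3, 9],
]

max_pooled = pool2d(feature_map, mode="max")  # [[6, 2], [2, 9]]
avg_pooled = pool2d(feature_map, mode="avg")  # [[3.5, 1.0], [1.0, 6.0]]
```

Both variants shrink the 4x4 map to 2x2, so the next layer has a quarter as many values to process.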


Literature Review
Reference: Performance Comparison of ResNet50V2 and VGG16 Models for Feature Extraction in Deep Learning
Contribution: The study aimed to compare the performance of ResNet50V2 and VGG16 for feature extraction in image classification tasks.
Drawback: The paper suggests that while both models are effective, VGG16 may be less efficient due to slower convergence and lower accuracy in certain tasks.
Key contribution: ResNet50V2 outperformed VGG16, exhibiting faster convergence and achieving higher accuracy in the context of masked face recognition.

Reference: Human Action Recognition from Various Data Modalities
Contribution: The paper reviews the use of various data modalities in HAR, including the application of ResNet and VGG16.
Drawback: The review does not provide a direct comparison between the models.
Key contribution: It highlights the importance of multimodal data for improving the accuracy of HAR systems.

Reference: Modern architectures convolutional neural networks in human activity recognition
Contribution: Discusses the role of modern CNN architectures like ResNet and VGG16 in HAR.
Drawback: Specific drawbacks of each model in the context of HAR are not detailed.
Key contribution: Emphasizes the advancements in CNN architectures that enhance HAR performance.


CNN Architecture
We used the following CNN architectures:
- VGG-16
- ResNet-50
These architectures were successful in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a competition that evaluates algorithms for object detection and image classification at large scale.

CNN Architecture
VGG16 (Visual Geometry Group): VGG16 was developed by the Visual Geometry Group at the University of Oxford and was among the top performers in the 2014 ILSVRC (ImageNet) competition, placing first in localization and second in classification. It has 16 weight layers.
Layer type               Quantity
Convolutional layer      13
Fully connected layer    3
Total                    16

CNN Architecture
ResNet-50: ResNet won the ImageNet challenge in 2015. ResNet-50 contains 50 layers.




Methodology
Tools: Hardware and Software Requirements
CPU: 64-bit
RAM: 32 GB
Operating system: Windows 11
Programming language: Python

Data Collection
The data were collected from Kaggle's data repository. We use the UCF101 dataset, which has 101 classes of human action, each containing more than 100 videos on average. Frames will be extracted from the dataset, and any background elements will be removed before we begin processing the data. Furthermore, we will keep the images at 224×224 resolution.
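A minimal sketch of the frame preparation step follows, written in plain Python so it runs without extra libraries. It assumes frames are sampled at evenly spaced positions and resized with nearest-neighbour interpolation; the source specifies neither, and the function names, the clip length of 100 frames, and the 5 samples are hypothetical. A real pipeline would typically read and resize frames with a video library instead.

```python
def sample_frame_indices(total_frames, num_samples):
    """Pick evenly spaced frame indices across a clip (assumed strategy)."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def resize_nearest(frame, out_h=224, out_w=224):
    """Nearest-neighbour resize of a frame stored as a list of pixel rows."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical clip of 100 frames, keeping 5 of them.
indices = sample_frame_indices(total_frames=100, num_samples=5)

# Synthetic 240x320 grayscale frame standing in for a decoded video frame.
frame = [[(r + c) % 256 for c in range(320)] for r in range(240)]
resized = resize_nearest(frame)
```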

Model Working Process
1. Extracting frames
2. Background subtraction by Mask R-CNN
3. Training on the frames with the ResNet CNN

Background Subtraction
[Figure: Background subtraction using Mask R-CNN]

ResNet Model
[Figure: ResNet-50 model]

Model Development
One of the first things we did after gathering the data was to extract images from each video. After that, we removed the background, taking into account only the most crucial components that were required for the detection of a certain object.
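Once a segmentation model has produced masks for the objects of interest, removing the background amounts to keeping only the masked pixels. The sketch below shows that step in isolation. The masks here are hand-written stand-ins for what a model such as Mask R-CNN would output, and the function names, the fill value of 0, and the "person" and "bike" labels are assumptions for illustration.

```python
def union_masks(masks):
    """Combine several per-object binary masks into one foreground mask."""
    height, width = len(masks[0]), len(masks[0][0])
    return [
        [int(any(mask[r][c] for mask in masks)) for c in range(width)]
        for r in range(height)
    ]


def remove_background(frame, mask, fill=0):
    """Keep pixels where the mask is set and replace the rest with `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(frame_row, mask_row)]
        for frame_row, mask_row in zip(frame, mask)
    ]


# Tiny grayscale frame standing in for an extracted video frame.
frame = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# Hypothetical instance masks, e.g. one for a person and one for a bike.
person_mask = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]
bike_mask = [[0, 0, 0], [0, 1, 1], [0, 0, 0]]

foreground = union_masks([person_mask, bike_mask])
cleaned = remove_background(frame, foreground)
# cleaned == [[0, 20, 0], [0, 50, 60], [0, 0, 0]]
```

Taking the union of the masks keeps every detected object that matters for the action while zeroing out everything else in the frame.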


Evaluation
- 80% training/testing accuracy
- More than 90% accuracy on new videos
- Background elements were the issue
[Figure: Training accuracy vs. testing accuracy and training loss vs. testing loss of VGG16]


Evaluation
[Figure: Training accuracy vs. testing accuracy and training loss vs. testing loss of ResNet50]


Evaluation
Used Model    Accuracy    Precision    Recall    F1 Score
ResNet        93.93%      95%          93%       94%
VGG-16        51.68%      47%          56%       52%
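For readers less familiar with these metrics, the sketch below shows how precision, recall, and F1 are computed from prediction counts, and checks that the ResNet row is internally consistent. The counts passed to `precision_recall_f1` are hypothetical, and the function names are illustrative. When metrics are averaged over many classes, the reported F1 can differ slightly from the harmonic mean of the reported precision and recall.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true positives, false positives,
    and false negatives for a single class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = f1_score(precision, recall)
    return precision, recall, f1


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one class: 8 correct detections,
# 2 false alarms, 2 misses.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)

# Precision and recall taken from the ResNet row of the table.
resnet_f1 = f1_score(0.95, 0.93)
# round(resnet_f1, 2) is 0.94, matching the reported 94%.
```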


Conclusion
The same approach can be applied to various video classification problems.
Limitations
- Lack of an original large dataset with a variety of subjects.
- The study depends only on built-in CNN architectures.

Conclusion
Future Directions
- Custom object detection is needed.
- A CNN+LSTM model can be implemented in future work.
- Pose estimation values can be added to the model.


THANKS!
