identifying and classifying precise hand gestures, limited to a predefined set of distinct movements.

wajdibellil · 31 slides · Sep 16, 2025

Slide Content

Hand Gesture Recognition and Speech Conversion for Deaf and Dumb People
THE CAIN PROJECT
Supervisor: Dr. Faris Saleh Almehmadi
Mohammed Alziyadi 421004889
Omar Awad Alatawi 421003066
Abdulrahman Marzouq Albalawi 421005937
Abdullah Najy Alsubhi 391003129

Content

Project Scope
•Hand gesture recognition: the project focuses on identifying and classifying precise hand gestures, limited to a predefined set of distinct movements.
•Speech conversion: after gesture recognition, the system converts the recognized gestures to text and then to speech using Text-to-Speech (TTS) technology.
•Target users: the proposed solution is designed for deaf and dumb people who can use basic hand gestures.
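As a minimal sketch of the scope above, the gesture-to-speech path reduces to a label-to-phrase lookup followed by a TTS call. The gesture labels, the phrase table, and the `speak` placeholder below are illustrative assumptions, not the project's actual code; a real system would call a TTS engine such as pyttsx3 in `speak`.

```python
# Sketch of the gesture -> text -> speech path.
# Labels and the speak() placeholder are illustrative assumptions.

GESTURE_PHRASES = {
    "gesture_1": "hello",
    "gesture_2": "good morning",
    "gesture_3": "what time is it",
    "gesture_4": "thank you",
}

def gesture_to_text(label: str) -> str:
    """Map a recognized gesture label to its output phrase."""
    return GESTURE_PHRASES.get(label, "")

def speak(text: str) -> None:
    """Placeholder for the TTS step; a real system would hand the
    text to a TTS engine here (e.g. pyttsx3 on the Raspberry Pi)."""
    print(text)

if __name__ == "__main__":
    speak(gesture_to_text("gesture_2"))  # prints "good morning"
```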

Problem Statement

Deaf and dumb people face major difficulties in their daily lives, as oral communication is hard or impossible for them.

These complications affect their professional
prospects, their private relationships, and especially
their access to public services.

Most existing communication devices have limitations: they are either costly or bulky and hard to handle, which makes them unsuitable for fast or spontaneous communication.

Proposed solution
The proposed solution is summarized as follows:
•Raspberry Pi: offers sufficient performance for data processing and for running the recognition algorithm.
•USB camera or Raspberry Pi Camera Module: captures hand gestures in real time.
•Microphone: records voice or ambient sounds for voice interactions.
•Speaker: plays the synthesized speech, delivering the messages converted into speech.

Project Benefits

The main benefits of this project are:
Improved quality of life: the ability to turn gestures into speech and use them to express needs and opinions improves users' quality of life.
Social integration: by facilitating communication, the device helps users participate more in social events, which lessens their feeling of loneliness.
Personalized solution: because it uses natural motions, this gesture recognition technology is easier to use than other assistive communication devices.
Accessibility: computer vision and speech synthesis technologies are becoming cheaper and easier to use.

Related Works: Types
Of Gesture

Communicative gestures are broken down into active gestures and symbolic gestures.

Symbolic gestures are not directly understandable; one has to be initiated to understand their meaning.

These are, for example, the gestures of sign languages.

Related Works: Static and
dynamic gestures

There are two other categories of gestures: static
gestures, or postures, and dynamic gestures.

By combining these two aspects, the classification proposed by Harling and Edwards is obtained:
•static position, static configuration (postures);
•static position, dynamic configuration;
•dynamic position, static configuration (e.g., pointing gestures);
•dynamic position, dynamic configuration (e.g., sign language).

Related Works

Data gloves are equipped with sensors providing the
hand position and finger joint angles.

Data gloves have long been used for sign language
recognition, because they provide accurate and
reliable positions of the hand joints.

Unfortunately, these gloves are expensive and cumbersome, and their use is restrictive for the user.

System Design

Components: Raspberry Pi 4

The main features of this product include a high-performance 64-bit quad-core processor, dual-display support at resolutions up to 4K through a pair of micro-HDMI ports, a hardware video decoder supporting up to 4Kp60, up to 4 GB of RAM, dual-band 2.4/5.0 GHz wireless LAN, Bluetooth 5.0, Gigabit Ethernet, USB 3.0, and PoE capability (via a separate PoE HAT add-on).

Components: Raspberry Pi 4

Components: Raspberry Pi
HD Camera

The Raspberry Pi High Quality Camera (figure 3.5) is an affordable, high-quality digital camera from Raspberry Pi.

It offers 12-megapixel resolution and a 7.9 mm-diagonal sensor for outstanding low-light performance.

Components: Raspberry Pi
Robotic Hand

Powered by a Raspberry Pi, uHandPi is an AI smart-vision robotic hand.

Apart from image processing, uHandPi can perform machine learning and train various models with the help of TensorFlow + Keras.

Components: Raspberry Pi
Microphone

The USB microphone is a plug-and-play microphone that can be connected to the USB port of the Raspberry Pi.

It works with all Raspberry Pi models from the Model B+ onward.

Components: Raspberry Pi
Speaker

This speaker is designed for the Raspberry Pi 40-pin GPIO extension header.

It integrates the WM8960 low-power stereo CODEC and communicates through the I2S interface.

Flowchart diagram

Software Language and tools

OpenCV is a huge open-source library for computer vision, machine learning, and image processing.

Software Language and tools

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics.

Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development, as well as for use as a scripting or glue language to connect existing components.

Results & Discussion

The MediaPipe pipeline implementation consists of the following steps:

After processing the acquired image, a palm detector model rotates the image using the hand's oriented bounding box.

The hand landmark model provides 3D hand key points after processing the cropped bounding-box image.

A gesture recognizer generates a distinct set of gestures by classifying the 3D hand key points.
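The final classification step above can be sketched as follows. This is a hypothetical nearest-centroid classifier over the 21 three-dimensional key points, standing in for MediaPipe's actual gesture recognizer; the gesture names and templates are made up for illustration.

```python
import numpy as np

N_LANDMARKS = 21  # the hand landmark model outputs 21 key points

def classify_gesture(landmarks, centroids):
    """Classify a (21, 3) landmark array against per-gesture
    (21, 3) template poses by nearest Euclidean distance."""
    flat = np.asarray(landmarks).reshape(-1)
    best, best_dist = None, np.inf
    for name, template in centroids.items():
        d = np.linalg.norm(flat - np.asarray(template).reshape(-1))
        if d < best_dist:
            best, best_dist = name, d
    return best

# Toy templates: random poses stand in for recorded gesture centroids.
rng = np.random.default_rng(0)
templates = {"hello": rng.random((N_LANDMARKS, 3)),
             "thank_you": rng.random((N_LANDMARKS, 3))}
sample = templates["hello"] + 0.01  # a slightly perturbed "hello" pose
print(classify_gesture(sample, templates))  # "hello"
```

A real recognizer would be trained on recorded landmark data rather than fixed templates, but the interface (key points in, gesture label out) is the same.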

Results & Discussion

BlazePalm is the first palm detector incorporated into the MediaPipe architecture.

The non-maximum suppression approach is applied to the palm candidates to keep only the best detection for each hand.

Focal loss is used to reduce the loss during training, with support for the large number of anchors arising from the high scale variance.

Regression is then used to precisely identify the 21 essential points, as 3D hand-knuckle coordinates inside the detected hand areas.
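Non-maximum suppression, mentioned above, can be illustrated with a short sketch; the boxes, scores, and IoU threshold below are illustrative values, not BlazePalm's actual parameters.

```python
import numpy as np

# Sketch of non-maximum suppression: overlapping candidate boxes are
# pruned, keeping the highest-scoring detection per hand.
# Boxes are [x1, y1, x2, y2].

def iou(a, b):
    """Intersection-over-union of two boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Return indices of the boxes kept after suppression."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while len(order) > 0:
        i = order[0]
        keep.append(int(i))
        # drop every remaining box that overlaps the kept one too much
        order = np.array([j for j in order[1:]
                          if iou(boxes[i], boxes[j]) < iou_thresh])
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2] — the second box overlaps the first
```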

Results & Discussion

The hand landmark model in MediaPipe produces a direct coordinate prediction of the hand landmarks.

Results & Discussion

Recognizing hand gestures with a camera to display a word selected by the user is the primary goal of the study.

We employed four recorded hand motions, each immediately displaying a word, to form the phrases: “hello”, “good morning”, “what time is it”, “thank you”.

We collected our own data and produced a small dataset of 400 samples for this project.

The dataset covers four different types of hand movements.

Results & Discussion

Results & Discussion

A confusion matrix was used to measure the model's performance in machine learning.

The scikit-learn module in Python can be used to create a confusion matrix.

The model achieved an accuracy rate of 98%.
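The confusion-matrix evaluation can be sketched with toy labels; the true/predicted labels below are invented for illustration, and in practice scikit-learn's `sklearn.metrics.confusion_matrix` produces the same matrix from the same inputs.

```python
import numpy as np

# Confusion matrix for a 4-gesture classifier, built by hand.
# Class indices: 0=hello, 1=good morning, 2=what time is it, 3=thank you.

def confusion_matrix(y_true, y_pred, n_classes):
    """m[t, p] counts samples of true class t predicted as class p."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        m[t, p] += 1
    return m

y_true = [0, 0, 1, 1, 2, 2, 3, 3]          # invented test labels
y_pred = [0, 0, 1, 1, 2, 3, 3, 3]          # one gesture misread as class 3
cm = confusion_matrix(y_true, y_pred, 4)
accuracy = np.trace(cm) / cm.sum()          # correct predictions on the diagonal
print(cm)
print(f"accuracy = {accuracy:.2%}")         # 87.50% on this toy data
```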

Results & Discussion

Project management
Task ID | Task Name | Start Date | End Date | Duration (Weeks) | Milestone
1 | Project Planning | Week 1 | Week 2 | 1 | Project Plan Approved
2 | Component research | Week 3 | Week 3 | 1 | Research Completed
3 | Hardware procurement | Week 3 | Week 4 | 2 | Dataset Prepared
4 | Prototyping & assembly | Week 4 | Week 5 | 2 | Model Architecture Finalized
5 | Programming | Week 5 | Week 12 | 6 | Final code and testing completed
6 | Documentation and reporting | Week 5 | Week 14 | 2 | Project Submitted

Project Budget
Item | Price in $
Raspberry Pi 4 | 61.29
uHandPi | 300.00
HD Camera | 42.99
Microphone | 16.79
Speaker | 21.90
5V power supply | 7.99
Total | 450.96

Conclusion

Communication between humans and computer tools has evolved alongside the development of digital technologies.

It is in this context that this project takes place. The
objective is to implement a tool allowing us to interact in
a natural way using our hands.

This project deals with themes of artificial intelligence,
image processing and human-machine interaction.

The objective of this project was to enable
communication between humans and machines.

Future Works

As future work, we propose to:

experiment with both static and dynamic hand gesture recognition systems;

expand our system to work with other devices and additional human body parts.
