COURSE OBJECTIVES
CLO1: To understand the fundamentals of computer vision and digital
image processing
CLO2: To introduce the processes involved in image enhancement and
restoration.
CLO3: To help students gain an understanding of color image
processing and morphology.
CLO5: To impart the knowledge of image segmentation and object
recognition techniques.
COURSE OUTCOMES
At the end of the course, the student will be able to:
1. Explain the fundamentals of computer vision and its applications.
2. Apply the image enhancement techniques for smoothing and
sharpening of images.
3. Compare the different image restoration and segmentation
techniques.
4. Demonstrate the smoothing and sharpening techniques for color
images.
5. Explain morphological, feature extraction, and pattern classification
techniques for object recognition.
A PICTURE IS WORTH A THOUSAND WORDS
EVERY IMAGE TELLS A STORY
Traffic scene
Number of vehicles
Type of vehicles
Location of closest obstacle
Assessment of congestion
Location of the scene captured
Traffic rules assessment
Critical scene identification
Emotion recognition
GOAL OF COMPUTER VISION
The goal is to perceive the “story” behind the picture:
Compute properties of the world
Understand 3D shape
Recognize people or objects
Analyze patterns
Count objects and assign labels
Improve and enhance images
WHAT IS COMPUTER VISION?
1. Computer Graphics:
INPUT: Scene Representation OUTPUT: Image
2. Image Processing:
INPUT: Image OUTPUT: Image
Image enhancement, reconstruction, filter, compression
3. Computer Vision:
INPUT: Image OUTPUT: Interpretation
Image analysis, Image interpretation, scene understanding
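The difference between items 2 and 3 can be illustrated with a minimal sketch in plain Python. The tiny test image and the helper names `box_blur` and `count_objects` are assumptions made for illustration only, not part of the course material. The first function maps an image to an image (image processing); the second maps an image to an interpretation (computer vision), here simply the number of 4-connected bright regions.

```python
from collections import deque


def box_blur(img):
    """Image processing: image in, image out (3x3 mean filter).

    Border pixels average only the neighbours that exist.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, n = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        total += img[yy][xx]
                        n += 1
            out[y][x] = total / n
    return out


def count_objects(img, threshold=0.5):
    """Computer vision: image in, interpretation out.

    Counts 4-connected regions of pixels brighter than `threshold`
    using a breadth-first flood fill.
    """
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] > threshold and not seen[y][x]:
                count += 1  # found a new, unvisited region
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and img[ny][nx] > threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


# Hypothetical 4x5 binary image containing three separate bright regions.
image = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
smoothed = box_blur(image)          # still an image (same size)
num_objects = count_objects(image)  # an interpretation: 3
```

Real systems use far more capable methods for both steps; the sketch is only meant to show the difference in the type of output.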
Computer Vision is a field of computer science that
works on enabling computers to see, identify and
process images in the same way that human vision
does and then provide appropriate output.
Building machines that see
Enabling machines to interact with the real world
Modeling the biological perception of the human eye
VISION
Vision is the process of discovering what is
present in the world and where it is by seeing.
COMPUTER VISION
Computer Vision is the study of analyzing
pictures and videos in order to interpret the real
world in a way similar to human beings.
COMPUTER VISION VS GRAPHICS
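Computer vision (CV) reconstructs scene properties, such as shape,
illumination, and color distributions, from images. Computer
graphics goes the other way: it renders an image from a scene
representation.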
WHY COMPUTER VISION?
An image is worth 1000 words
Many biological systems rely on vision
The world is 3D and dynamic
Cameras and computers are cheap
…
COMPUTER VISION EXAMPLE 1
Finding People in images
Problem 1: Given an image ‘I’
Question: Does image ‘I’ contain an image of
a person?
“YES” INSTANCES
“NO” INSTANCES
COMPUTER VISION EXAMPLE 2
Recognizing objects in images
Problem 2: Given an image ‘I’
Question: Recognize different objects in ‘I’.
Object labels in the example image: sky, building, flag, wall,
banner, bus, cars, bus, face, street lamp
HUMAN PERCEPTION HAS ITS SHORTCOMINGS
BUT HUMANS CAN TELL A LOT ABOUT A SCENE FROM A LITTLE INFORMATION
APPLICATIONS OF COMPUTER VISION
A good number of computer vision
techniques are being used today in a
wide variety of real-world
applications, which include:
1. Real-life & Industry applications
2. Consumer-level applications
REAL-WORLD APPLICATIONS
Computer Vision is used in
many real-world applications,
including healthcare,
transportation, security, and
manufacturing.
OPTICAL CHARACTER RECOGNITION (OCR)
Reading handwritten
postal codes on letters
and automatic number
plate recognition
(ANPR) of vehicles.
MACHINE INSPECTION
Rapid parts inspection
for quality assurance,
using stereo vision with
specialized illumination
to measure tolerances on
aircraft wings or auto
body parts, or looking for
defects in steel castings
using X-ray vision.
RETAIL:
Object recognition for
automated checkout lanes and
fully automated stores.
Analyzing visual data to
improve the customer experience:
● Inventory management
● Theft prevention
● Personalized marketing
● Cashierless checkout
● Virtual mirroring
● Behavioral analytics
● Heat maps
WAREHOUSE LOGISTICS:
Autonomous package
delivery, pallet-carrying
“drives”, and parts
picking by robotic
manipulators.
MEDICAL IMAGING:
Registering pre-operative
and intra-operative imagery
or performing long-term
studies of people’s brain
morphology as they age.
Medical imaging helps
doctors diagnose, monitor,
and treat medical conditions.
Modalities include X-rays,
CT scans, MRIs,
ultrasounds, and nuclear
medicine imaging.
3D MODEL BUILDING (PHOTOGRAMMETRY):
Fully automated
construction of 3D
models from aerial and
drone photographs.
Transforming 2D images
into a 3D representation
of the scene.
MATCH MOVE:
Merging computer-
generated imagery (CGI)
with live action footage
by tracking feature points
in the source image or
video to estimate the 3D
camera motion and shape
of the environment.
MOTION CAPTURE (MOCAP):
Using retro-reflective
markers viewed from
multiple cameras or
other vision-based
techniques to capture
actors for computer
animation.
SURVEILLANCE:
Monitoring for intruders,
analyzing highway traffic
and monitoring pools
for drowning victims.
Computer vision systems
interpret and analyze
video feeds in real-time,
significantly enhancing
security system
capabilities.
FINGERPRINT RECOGNITION AND BIOMETRICS:
For automatic access
authentication as well as
forensic applications.
CONSUMER-LEVEL APPLICATIONS
In addition to all of these real-life and
industrial applications, there exist numerous
consumer-level applications, such as things
you can do with your own personal
photographs and video. These include:
IMAGE STITCHING (MOSAIC):
Turning overlapping photos into a single
seamlessly stitched panoramic view.
IMAGE STITCHING EXAMPLE:
EXPOSURE BRACKETING:
Exposure bracketing means taking a sequence of
photographs with different exposure levels and then blending
them together to create a photograph with a much higher
dynamic range.
This merges multiple exposures taken under challenging
lighting conditions (strong sunlight and shadows) into a
single well-exposed image.
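The blending step can be sketched as a per-pixel weighted average. This is a minimal illustration under several assumptions: the exposures are grayscale, already aligned, and stored as nested lists with values in [0, 1]. The names `well_exposedness`, `fuse_exposures`, and `sigma` are hypothetical, and the weighting rule (favoring pixels close to mid-gray 0.5) is a simplified choice in the spirit of exposure fusion, not the method of any particular camera or library. It blends the bracketed frames directly rather than recovering a true high-dynamic-range radiance map.

```python
import math


def well_exposedness(value, sigma=0.2):
    """Weight that peaks at mid-gray (0.5) and falls off towards 0 and 1."""
    return math.exp(-((value - 0.5) ** 2) / (2 * sigma ** 2))


def fuse_exposures(exposures, sigma=0.2):
    """Blend aligned grayscale exposures with per-pixel weights.

    `exposures` is a list of images (nested lists, values in [0, 1]),
    all of the same size.
    """
    h, w = len(exposures[0]), len(exposures[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            weights = [well_exposedness(img[y][x], sigma) for img in exposures]
            total = sum(weights)  # strictly positive for values in [0, 1]
            fused[y][x] = sum(
                wt * img[y][x] for wt, img in zip(weights, exposures)
            ) / total
    return fused


# Hypothetical 1x2 scene: left pixel in shadow, right pixel in sunlight.
under_exposed = [[0.05, 0.50]]  # shadow is crushed, highlight is fine
over_exposed = [[0.50, 0.95]]   # shadow is fine, highlight is blown out
fused = fuse_exposures([under_exposed, over_exposed])
```

Each fused pixel stays close to whichever exposure captured it nearest to mid-gray, so both the shadow and the highlight keep usable detail.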
IMAGE MORPHING:
Turning a picture of one object into another,
using a seamless morph transition.
3D MODELING:
Converting one or more snapshots into a 3D
model of the object or person you are
photographing.
VIDEO MATCH MOVE AND STABILIZATION:
Inserting 2D pictures or 3D models into your
videos by automatically tracking nearby
reference points, or using motion estimates to
remove shake from your videos.
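The stabilization half of this idea can be sketched as follows, assuming the frame-to-frame camera motion has already been estimated and reduced to one horizontal shift per frame. The function name `stabilizing_offsets` and the parameter `radius` are hypothetical, and a moving average is only one simple smoothing choice. The camera path is accumulated, smoothed, and the difference between the smoothed and the actual path is the shift to apply to each frame.

```python
def stabilizing_offsets(frame_motions, radius=2):
    """Return the per-frame shift that moves each frame onto a smoothed path.

    `frame_motions` holds the estimated horizontal motion between
    consecutive frames (1D only, for illustration).
    """
    # Accumulate frame-to-frame motion into an absolute camera trajectory.
    trajectory = []
    position = 0.0
    for motion in frame_motions:
        position += motion
        trajectory.append(position)

    # Smooth the trajectory with a moving average of up to 2*radius+1 frames.
    corrections = []
    n = len(trajectory)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        smoothed = sum(trajectory[lo:hi]) / (hi - lo)
        corrections.append(smoothed - trajectory[i])
    return corrections


# A steady pan needs no correction in the middle of the clip...
steady = stabilizing_offsets([1.0, 1.0, 1.0, 1.0, 1.0], radius=2)
# ...while alternating jitter is pushed back towards the average path.
shaky = stabilizing_offsets([2.0, -2.0, 2.0, -2.0, 2.0], radius=1)
```

A practical stabilizer would estimate full 2D or 3D motion (translation, rotation, scale) and crop or inpaint the borders exposed by the shift.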
PHOTO-BASED WALKTHROUGHS:
Navigating a large collection of photographs,
such as the interior of your house, by flying
between different photos in 3D.
FACE DETECTION & TAGGING:
For improved camera focusing as well as more
relevant image searching.
VISUAL AUTHENTICATION:
Automatically logging family members onto
your home computer as they sit down in front
of the webcam.
A BRIEF HISTORY (TIMELINE OF CV)
A BRIEF HISTORY: 1970’S
● Computer vision was initially intended to model visual
perception for robots with intelligent behavior based on
higher-level reasoning and planning.
● In 1966, Marvin Minsky at MIT asked undergraduate student
Gerald Jay Sussman to connect a camera to a computer and get
the computer to describe what it saw.
● What distinguished computer vision from the already existing
field of digital image processing was the aim of recovering the
3D structure of the world from images and using this as a
stepping stone towards full scene understanding.
● Early attempts at scene understanding involved extracting
edges and then inferring the 3D structure of an object or a
“blocks world” from the topological structure of the 2D
lines (Fig. a).
● 3D modeling of non-polyhedral objects was also being studied.
● One popular approach used generalized cylinders, i.e., solids
of revolution and swept closed curves, often arranged into
related parts (Figure c).
● Elastic arrangements of parts, called pictorial structures,
were also introduced (Figure b).
● A qualitative approach to understanding intensity and shading
variations, which explains them by the effects of image
formation phenomena such as surface orientation and shadows,
was studied under the name of intrinsic images (Figure d).
● More quantitative approaches to computer vision were also
developed at the time, including the first of many feature-based
stereo correspondence algorithms (Figure e).
● Intensity-based optical flow algorithms were also studied
(Figure f).
A BRIEF HISTORY: 1980’S
● In the 1980s, a lot of attention was focused on more
sophisticated mathematical techniques for performing
quantitative image and scene analysis.
● Image pyramids started being widely used to perform tasks
such as image blending (Figure a) and coarse-to-fine
correspondence search.
● Continuous versions of pyramids using the concept of
scale-space processing were also developed.
● In the late 1980s, wavelets started displacing or augmenting
regular image pyramids in some applications.
● The use of stereo as a quantitative shape cue was extended by a
wide variety of shape-from-X techniques, including shape
from shading.
● Research into better edge and contour detection (Figure 1.8c)
was also active during this period.
● Dynamically evolving contour trackers such as snakes, as well
as three-dimensional physically based models, were introduced
(Figure 1.8d).
● Researchers noticed that a lot of the stereo, flow, shape-from-X,
and edge detection algorithms could be unified, or at least
described, using the same mathematical framework.
● Online variants of Markov random field (MRF) algorithms that
modeled and updated uncertainties using the Kalman filter were
introduced.
● 3D range data processing (acquisition, merging, modeling,
and recognition; see Figure 1.8f) continued being actively
explored during this decade.
A BRIEF HISTORY: 1990’S
● Projective invariants for recognition of object motion.
● Factorization techniques, developed at the same time.
● Bundle adjustment techniques for photogrammetry.
● Global optimization techniques.
● Graph cuts.
● Multi-view stereo algorithms.
● Image tracking and reconstruction techniques.
● Image segmentation.
● Statistical learning models.
● Computer graphics.