Image Segmentation by Clustering
(Using Mahalanobis Distance)
-Manjit Chintapalli
What is Image Segmentation ?
To humans, an image is not just a random
collection of pixels; it is a meaningful
arrangement of regions and objects.
There also exits a variety of images: natural
scenes, paintings, etc. Despite the large
variations of these images, humans have no
problem to interpret them.
Image Segmentation ? (Contd..)
Image segmentation is the first step in
image analysis and pattern
recognition.
It is a critical and essential component
of image analysis system, is one of the
most difficult tasks in image
processing, and determines the quality
of the final result of analysis.
The Process…
Image segmentation is the process of
dividing an image into different
regions such that each region is
homogeneous.
Image Segmentation methods
Image segmentation methods can be
categorized as follows (this is not an
exhaustive list):
Histogram thresholding
Edge-based approaches
Region-based approaches
Hybrid: consider both edges and
regions.
Continued…
Segmentation of texture by PCA
Segmentation of texture with
moments
Color Image Segmentation
Image segmentation based on color
We use clustering to segment an
image according to color features
Clustering ?
Clustering is a common step to get a
segmentation
In the data space, clusters are
regarded as regions of similar data
points
Clustering….
Similar data points grouped together into
clusters.
Clustering…
Most popular clustering algorithms suffer from two
major drawbacks
First, the number of clusters is predefined, which
makes them inadequate for batch processing of
huge image databases
Secondly, the clusters are represented by their
centroid and built using an Euclidean distance
therefore inducing generally an hyperspheric cluster
shape, which makes them unable to capture the
real structure of the data.
This is especially true in the case of color clustering
where clusters are arbitrarily shaped
Clustering Algorithms
K-means
K-medoids
Hierarchical Clustering
There are many other algorithms used
for clustering.
K-means Clustering Algorithm
Step 1: Choose K cluster centers:
c¹(1),c²(1),c³(1)…..
Step 2: At the kth iterative step distribute the
samples among the K cluster
domains, using the relation
for all x in S(j), if |x –Cj (k)| < |x –Cj (k)|
Step 3: Compute the new cluster centers
Algorithm continued …
Step 4: If the algorithm has converged
and the procedure is terminated.
Otherwise go to Step 2
S(j): contains data points in RGB space
x: is the data point in an iteration
C(j): is the center of the cluster
How the problem was
approached ?
First the images are read from the
directory.
Then each image is transformed into a
3-D RGB space.
K-means clustering using Mahanalobis
distance and Euclidean distance was
applied.
Continued…..
Next, the mean color of each cluster is
calculated.
Then it is transformed back from 3-D
space.
Result : Segmented image
Mahalanobis Distance
M.D. is a very useful way of determining the
”similarity” of a set of values from an
”unknown”: sample to a set of values
measured from a collection of ”known”
samples
Superior to Euclidean distance because it
takes distribution of the points (correlations)
into account
Traditionally to classify observations into
different groups
Mahalanobis vs. other classical
statistical approaches
It takes into account not only the
average value but also its variance and
the covariance of the variables
measured
It compensates for interactions
(covariance) between variables
It is dimensionless
Mahalanobis Distance
D
t(x) = (x–m
t) S
-1
t (x–m
t)`
Mahalanobis
Distance
Euclidean Distance
The Euclidean distance is the straight-
line distance between two pixels
Euc dist = √((x1 -x2)² + (y1 -y2)²) ,
where (x1,y1) & (x2,y2) are two pixel
points or two data points
After executing the
program…
Original Images
More images…
During execution…
The images are transformed into 3-D RGB
space
K-means clustering is applied which groups
data points with similar color into clusters
Example:
Continued…
The data points with minimum
mahalanobis distance are grouped
together into clusters.
After grouping into clusters, the mean
color of the cluster is taken and
mapped back into the image
The program was executed with
different number of clusters
The same program was run with
Euclidean distance for distance
between data points
Continued…
Segmented Images
Segmented with 4 clusters
Original Image
Segmented with 6 clusters
Original Image 2 clusters
4 Clusters 6 Clusters
Original Image
4 Clusters
2 Clusters
6 Clusters
Mahalanobis vs. Euclidean
4 clusters Mahalanobis 4 clusters Euclidean
Original
Image
Original
Image
6 Clusters Mahalanobis 6 Clusters Euclidean
More detailed
description
using
Mahalanobis
Original
Image
6 Clusters Mahalanobis 6 Clusters Euclidean
Image more
clear than
Euclidean
Mahalanobis vs. Euclidean
Result :
•Using Mahalanobis distance in the k-
means clustering proved to be better
than Euclidean distance.
•This is because the Mahalanobis
considers the covariance between
variables.
Conclusion
The program runs with all jpeg images and
takes a lot of time to run for larger images.
This is due to the calculation of the inverse
of the covariance matrix for the
Mahalanobis distance.
The results show that Image segmentation
using Mahalanobis distance works and is
better than Euclidean distance.