Grid based method & model based clustering method

7,481 views 16 slides Jan 19, 2020

About This Presentation

Grid based method


Slide Content

GRID BASED METHOD & MODEL BASED CLUSTERING METHOD
Submitted by D. SHANMUGAPRIYA, I-MSC(IT)
NADAR SARASWATHI COLLEGE OF ARTS & SCIENCE

CONTENTS
INTRODUCTION
STING
WAVECLUSTER
CLIQUE (Clustering in QUEST)
FAST PROCESSING TIME

INTRODUCTION
The grid-based clustering approach uses a multi-resolution grid data structure.
The object space is quantized into a finite number of cells that form a grid structure.
The major advantage of this method is its fast processing time.
Processing time is typically independent of the number of data objects and depends only on the number of cells in each dimension of the quantized space.
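
The quantization step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `quantize`, the bounding box, and the sample points are all assumptions made for the example.

```python
from collections import Counter

def quantize(points, lower, upper, cells_per_dim):
    """Count how many points fall into each grid cell (hypothetical helper)."""
    counts = Counter()
    for p in points:
        idx = []
        for x, lo, hi in zip(p, lower, upper):
            # Scale the coordinate to [0, cells_per_dim) and clamp to valid indices.
            i = int((x - lo) / (hi - lo) * cells_per_dim)
            idx.append(min(max(i, 0), cells_per_dim - 1))
        counts[tuple(idx)] += 1
    return counts

# Illustrative data: two tight groups and one isolated point.
points = [(0.1, 0.2), (0.15, 0.22), (0.9, 0.8), (0.95, 0.85), (0.5, 0.5)]
cells = quantize(points, lower=(0.0, 0.0), upper=(1.0, 1.0), cells_per_dim=4)
```

After this single pass, later steps work on the cell counts rather than on the individual points, which is where the speed advantage comes from.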

STING
STING stands for STatistical INformation Grid.
The spatial area is divided into rectangular cells.
There are several levels of cells, corresponding to different levels of resolution.
Each high-level cell is partitioned into several lower-level cells.
Statistical attributes (such as mean, maximum, and minimum) are stored in each cell.

STING (cont.)
The grid computation is query independent.
Parallel processing is supported.
Data is processed in a single pass.
Clustering quality depends on the granularity of the lowest level of the grid.
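
The way a higher-level STING cell can be summarized from its lower-level cells might look like the sketch below. The dictionary layout and the function names are assumptions for illustration; a point count is stored alongside the mean, minimum, and maximum because the parent mean has to be weighted by it.

```python
def leaf_stats(values):
    """Statistics stored in a bottom-level cell (hypothetical layout)."""
    n = len(values)
    if n == 0:
        return {"n": 0, "mean": 0.0, "min": None, "max": None}
    return {"n": n, "mean": sum(values) / n, "min": min(values), "max": max(values)}

def merge_stats(children):
    """Derive a parent cell's statistics from its children without rereading the data."""
    non_empty = [c for c in children if c["n"] > 0]
    n = sum(c["n"] for c in non_empty)
    if n == 0:
        return {"n": 0, "mean": 0.0, "min": None, "max": None}
    return {
        "n": n,
        # Weighted mean: each child contributes in proportion to its point count.
        "mean": sum(c["mean"] * c["n"] for c in non_empty) / n,
        "min": min(c["min"] for c in non_empty),
        "max": max(c["max"] for c in non_empty),
    }

# Four illustrative child cells, one of them empty.
leaves = [leaf_stats([2.0, 4.0]), leaf_stats([10.0]), leaf_stats([]), leaf_stats([6.0, 8.0, 10.0])]
parent = merge_stats(leaves)
```

Because parents are computed from children only, the raw data needs to be scanned once, and queries can then be answered from the stored statistics.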

WAVECLUSTER
A multi-resolution clustering approach that applies a wavelet transform to the feature space.
A wavelet transform is a signal processing technique that decomposes a signal into different frequency sub-bands.
It is both grid-based and density-based.
Input parameters: the number of grid cells for each dimension, the wavelet, and the number of applications of the wavelet transform.

WAVECLUSTER FEATURES
Complexity is O(N).
Detects arbitrarily shaped clusters at different scales.
Not sensitive to noise or to input order.
Only applicable to low-dimensional data.
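
A heavily simplified, one-dimensional sketch of the WaveCluster idea is shown below. It assumes a Haar wavelet, keeps only the low-frequency band (plain pairwise averages rather than the orthonormal scaling), and uses an arbitrary density threshold; the cell counts are made up for illustration.

```python
def haar_approx(counts):
    """One level of a Haar transform, keeping only the low-frequency (average) band."""
    return [(counts[i] + counts[i + 1]) / 2 for i in range(0, len(counts) - 1, 2)]

def dense_runs(signal, threshold):
    """Group adjacent cells whose value exceeds the threshold into clusters."""
    runs, current = [], []
    for i, v in enumerate(signal):
        if v > threshold:
            current.append(i)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs

# Illustrative point counts per grid cell along a single dimension.
counts = [0, 1, 9, 8, 7, 0, 1, 0, 0, 6, 9, 7, 0, 0, 1, 0]
smooth = haar_approx(counts)                  # coarser, smoothed view of the grid
clusters = dense_runs(smooth, threshold=2.0)  # → [[1, 2], [4, 5]]
```

Averaging suppresses isolated low counts, which is one way to see why the method is not sensitive to noise. Applying `haar_approx` repeatedly would give coarser scales.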

CLIQUE (Clustering In QUEst)
CLIQUE can be considered both density-based and grid-based.
1. It partitions each dimension into the same number of equal-length intervals.
2. It partitions an m-dimensional data space into non-overlapping rectangular units.
3. A unit is dense if the fraction of total data points contained in the unit exceeds the input model parameter.
4. A cluster is a maximal set of connected dense units within a subspace.
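
The four steps above can be sketched for a single subspace as follows. This is an illustration under stated assumptions: the data is assumed to lie in [0, 1) in every dimension, `tau` stands in for the input density parameter, and the search over candidate subspaces that the full algorithm performs is left out.

```python
from collections import Counter

def dense_units(points, intervals, tau):
    """Return the grid units whose fraction of all points exceeds tau."""
    counts = Counter(
        tuple(min(int(x * intervals), intervals - 1) for x in p) for p in points
    )
    return {u for u, c in counts.items() if c / len(points) > tau}

def connected_clusters(units):
    """Group dense units that share a face into maximal connected sets."""
    remaining, clusters = set(units), []
    while remaining:
        stack = [remaining.pop()]
        cluster = set(stack)
        while stack:
            u = stack.pop()
            for d in range(len(u)):
                for step in (-1, 1):
                    v = u[:d] + (u[d] + step,) + u[d + 1:]
                    if v in remaining:
                        remaining.remove(v)
                        cluster.add(v)
                        stack.append(v)
        clusters.append(cluster)
    return clusters

# Illustrative data: two adjacent dense units, one separate dense unit, one noise point.
points = [(0.05, 0.05), (0.10, 0.15), (0.30, 0.10), (0.35, 0.20),
          (0.80, 0.80), (0.85, 0.90), (0.90, 0.85), (0.55, 0.30)]
units = dense_units(points, intervals=4, tau=0.2)
clusters = connected_clusters(units)
```

With these values the units (0, 0) and (1, 0) merge into one cluster, (3, 3) forms a second, and the unit holding the single noise point is dropped as not dense.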

MODEL BASED CLUSTERING METHODS
Attempt to optimize the fit between the data and some mathematical model.
ASSUMPTION: data are generated by a mixture of underlying probability distributions.
TECHNIQUES:
Expectation-maximization
Conceptual clustering
Neural network approach

EXPECTATION MAXIMIZATION
An iterative refinement algorithm used to find parameter estimates.
An extension of k-means.
Assigns an object to a cluster according to a weight representing the probability of membership.
Starts with an initial estimate of the parameters.
Iteratively reassigns the membership scores and re-estimates the parameters.
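
A minimal sketch of this loop for a one-dimensional mixture of Gaussians is given below. The choice of Gaussian components, the starting values, the fixed iteration count, and the sample data are all assumptions made for illustration.

```python
import math

def gaussian(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_1d(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture; returns parameters and membership probabilities."""
    k, resp = len(means), []
    for _ in range(iterations):
        # E-step: probability that each point belongs to each cluster.
        resp = []
        for x in data:
            p = [weights[j] * gaussian(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate the parameters from the weighted assignments.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density well defined
            weights[j] = nj / len(data)
    return means, variances, weights, resp

# Illustrative data: one group near 1 and one near 5.
data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8, 5.1]
means, variances, weights, resp = em_1d(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

Unlike k-means, every point keeps a fractional membership in every cluster (`resp`), and those fractions are what drive the parameter updates.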

CONCEPTUAL CLUSTERING
A form of clustering in machine learning.
Produces a classification scheme for a set of unlabeled objects.
Finds a characteristic description for each concept.
COBWEB
A popular and simple method of incremental conceptual learning.
Creates a hierarchical clustering in the form of a classification tree.

COBWEB CLUSTERING MODEL
Animal: P(C0)=1.0, P(scales|C0)=0.25
    Fish: P(C1)=0.25, P(scales|C1)=1.0
    Amphibian: P(C2)=0.25, P(moist|C2)=1.0
    Mammal/bird: P(C3)=0.5, P(hair|C3)=0.5
        Mammal: P(C4)=0.5, P(hair|C4)=1.0
        Bird: P(C5)=0.5, P(feathers|C5)=1.0

NEURAL NETWORK APPROACH
Represents each cluster as an exemplar, acting as a “prototype” of the cluster.
New objects are assigned to the cluster whose exemplar is the most similar according to some distance measure.
SELF ORGANIZING MAP
Uses competitive learning.
Involves a hierarchical architecture of several units.
The organization of units forms a feature map.
Example application: web document clustering.
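
Competitive learning on a small self-organizing map can be sketched as below. The one-dimensional chain of three units, the linearly decaying learning rate, the half-strength neighbour update, and the sample data are all illustrative assumptions rather than a prescribed configuration.

```python
def best_unit(units, x):
    """Index of the unit (exemplar) closest to x in squared Euclidean distance."""
    return min(range(len(units)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(units[i], x)))

def train_som(data, units, epochs=20, rate=0.5, radius=1):
    """Competitive learning on a 1-D chain of units."""
    for epoch in range(epochs):
        lr = rate * (1 - epoch / epochs)  # learning rate decays linearly
        for x in data:
            winner = best_unit(units, x)
            for i in range(len(units)):
                if abs(i - winner) <= radius:
                    # The winner moves toward x; its neighbours move half as far.
                    pull = lr if i == winner else lr / 2
                    units[i] = [w + pull * (v - w) for w, v in zip(units[i], x)]
    return units

# Illustrative data: two groups of 2-D points.
data = [(0.1, 0.1), (0.2, 0.1), (0.9, 0.9), (0.8, 0.9)]
units = train_som(data, [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
labels = [best_unit(units, x) for x in data]
```

After training, each object is assigned to the unit whose weight vector is nearest, so the two groups end up on opposite ends of the chain.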

CLUSTERING HIGH-DIMENSIONAL DATA
FEATURE TRANSFORMATION METHODS
PCA and SVD summarize data by creating linear combinations of attributes.
They do not remove any attributes, and the transformed attributes are complex to interpret.
FEATURE SELECTION METHODS
Find the attributes most relevant with respect to the class labels.
Example: entropy analysis.
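
The idea that PCA summarizes data with a linear combination of all attributes can be illustrated with the sketch below. It finds only the first principal component, using power iteration on the sample covariance matrix; the function name, the iteration count, and the data are assumptions for the example.

```python
def first_component(rows, iterations=100):
    """Leading principal component via power iteration on the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalize.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return means, v

# Illustrative data lying exactly on the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, pc = first_component(rows)
# Each score is a linear combination of BOTH original attributes.
scores = [sum((r[j] - means[j]) * pc[j] for j in range(2)) for r in rows]
```

Both attributes contribute to every score, which shows why no attribute is removed and why the new axis is harder to interpret than either original attribute.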