Grid based method & model based clustering method
7,481 views
16 slides
Jan 19, 2020
About This Presentation
Grid based method
Slide Content
GRID BASED METHOD & MODEL BASED CLUSTERING METHOD
Submitted by D. SHANMUGAPRIYA, I-MSC (IT)
NADAR SARASWATHI COLLEGE OF ARTS & SCIENCE
CONTENTS
INTRODUCTION
STING
WAVECLUSTER
CLIQUE (Clustering in QUEST)
FAST PROCESSING TIME
INTRODUCTION
The grid-based clustering approach uses a multiresolution grid data structure.
The object space is quantized into a finite number of cells that form a grid structure, and the clustering operations are performed on that grid.
The major advantage of this method is its fast processing time, which is typically independent of the number of data objects and depends only on the number of cells in each dimension of the quantized space.
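The quantization step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name `quantize`, the bounding box, and the sample points are all assumptions made for the example.

```python
from collections import Counter

def quantize(points, lower, upper, cells_per_dim):
    """Count how many points fall into each cell of a uniform grid.

    lower/upper are the (assumed known) bounds of the object space.
    """
    counts = Counter()
    for p in points:
        cell = []
        for x, lo, hi in zip(p, lower, upper):
            # Scale the coordinate to [0, cells_per_dim) and clamp the edges.
            idx = int((x - lo) / (hi - lo) * cells_per_dim)
            cell.append(min(max(idx, 0), cells_per_dim - 1))
        counts[tuple(cell)] += 1
    return counts

# Hypothetical sample data in the unit square.
points = [(0.1, 0.2), (0.15, 0.22), (0.8, 0.9), (0.82, 0.95), (0.5, 0.5)]
cell_counts = quantize(points, lower=(0.0, 0.0), upper=(1.0, 1.0), cells_per_dim=4)
```

After this single pass, later steps only need to look at `cell_counts`, whose size is bounded by the number of cells rather than by the number of objects.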
STING
STING stands for STatistical INformation Grid.
The spatial area is divided into rectangular cells.
There are several levels of cells, corresponding to different levels of resolution.
Each high-level cell is partitioned into several lower-level cells.
Statistical attributes of the objects in each cell (such as mean, maximum, and minimum) are stored in the cell.
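STING (cont.)
The statistics stored in the grid are computed independently of any query.
Parallel processing is supported.
The data is processed in a single pass to build the grid.
Clustering quality depends on the granularity of the lowest level of the grid.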
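One way to picture the cell hierarchy is that a higher-level cell can derive its statistics from its lower-level cells without revisiting the raw data. The sketch below assumes each cell also stores an object count (needed to combine means); the function names and the sample values are hypothetical.

```python
def leaf_stats(values):
    """Statistics for one bottom-level cell, given its objects' attribute values."""
    n = len(values)
    if n == 0:
        return {"n": 0, "mean": 0.0, "min": None, "max": None}
    return {"n": n, "mean": sum(values) / n, "min": min(values), "max": max(values)}

def merge_stats(children):
    """Derive a parent cell's statistics from its child cells only."""
    nonempty = [c for c in children if c["n"] > 0]
    n = sum(c["n"] for c in nonempty)
    if n == 0:
        return {"n": 0, "mean": 0.0, "min": None, "max": None}
    # Weighted mean of the child means; min/max are taken over the children.
    mean = sum(c["mean"] * c["n"] for c in nonempty) / n
    return {
        "n": n,
        "mean": mean,
        "min": min(c["min"] for c in nonempty),
        "max": max(c["max"] for c in nonempty),
    }

# Four hypothetical child cells of one parent cell (one of them empty).
children = [leaf_stats([2.0, 4.0]), leaf_stats([10.0]), leaf_stats([]), leaf_stats([6.0, 8.0, 10.0])]
parent = merge_stats(children)
```

Because the parent is built from stored summaries, a query can start at a coarse level and descend only into the cells that look relevant.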
WAVECLUSTER
A multiresolution clustering approach that applies a wavelet transform to the feature space.
A wavelet transform is a signal-processing technique that decomposes a signal into different frequency sub-bands.
It is both grid-based and density-based.
Input parameters: the number of grid cells for each dimension, the wavelet, and the number of applications of the wavelet transform.
WAVECLUSTER FEATURES
Complexity is O(N), where N is the number of objects.
Detects arbitrarily shaped clusters at different scales.
Not sensitive to noise or to input order.
Only applicable to low-dimensional data.
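The idea can be illustrated in one dimension. The sketch below is a deliberately simplified stand-in, not the WaveCluster algorithm itself: it assumes the data has already been quantized into per-cell counts, uses an unnormalized Haar low-pass step (pairwise averages) as the wavelet, and picks an arbitrary threshold. All names and numbers are hypothetical.

```python
def haar_lowpass(signal):
    """One level of a Haar-style low-frequency band: averages of adjacent pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]

def connected_runs(values, threshold):
    """Group consecutive cells whose value exceeds the threshold."""
    runs, current = [], []
    for i, v in enumerate(values):
        if v > threshold:
            current.append(i)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs

# Hypothetical per-cell object counts along one dimension (16 cells).
counts = [0, 1, 9, 8, 7, 0, 0, 1, 0, 0, 6, 9, 8, 0, 1, 0]
smoothed = haar_lowpass(counts)            # 8 cells at the coarser resolution
clusters = connected_runs(smoothed, 2.0)   # indices of dense, connected cells
```

Here the isolated counts of 1 are averaged away at the coarser scale, which loosely mirrors why the method is described as insensitive to noise. Applying `haar_lowpass` again would give a still coarser view, corresponding to the "number of applications" parameter.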
CLIQUE (Clustering in QUEST)
CLIQUE can be considered both density-based and grid-based.
1. It partitions each dimension into the same number of equal-length intervals.
2. It partitions an m-dimensional data space into non-overlapping rectangular units.
3. A unit is dense if the fraction of the total data points contained in the unit exceeds an input model parameter (the density threshold).
4. A cluster is a maximal set of connected dense units within a subspace.
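The four steps above can be sketched for a single, fixed subspace. This is an illustration under assumptions: the data is taken to be scaled into [0, 1), the names `dense_units`, `connected_clusters`, and `tau` are hypothetical, and the search over different subspaces that CLIQUE also performs is left out.

```python
from collections import Counter

def dense_units(points, intervals, tau):
    """Return the grid units that hold more than a fraction tau of all points."""
    counts = Counter(
        tuple(min(int(x * intervals), intervals - 1) for x in p) for p in points
    )
    return {u for u, c in counts.items() if c / len(points) > tau}

def connected_clusters(units):
    """Group dense units that share a face into maximal connected sets."""
    remaining, clusters = set(units), []
    while remaining:
        stack = [remaining.pop()]
        cluster = set(stack)
        while stack:
            u = stack.pop()
            for d in range(len(u)):
                for step in (-1, 1):
                    # Neighbouring unit that differs by one interval in dimension d.
                    v = u[:d] + (u[d] + step,) + u[d + 1:]
                    if v in remaining:
                        remaining.remove(v)
                        cluster.add(v)
                        stack.append(v)
        clusters.append(cluster)
    return clusters

# Hypothetical 2-D data: two groups plus one stray point.
points = [(0.05, 0.05), (0.08, 0.07), (0.15, 0.05), (0.17, 0.08),
          (0.85, 0.85), (0.88, 0.82), (0.5, 0.3)]
units = dense_units(points, intervals=10, tau=0.2)
clusters = connected_clusters(units)
```

With these values the stray point's unit falls below the threshold and is dropped, while the two adjacent dense units on the left merge into one cluster.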
MODEL BASED CLUSTERING METHODS
These methods attempt to optimize the fit between the data and some mathematical model.
Assumption: the data are generated by a mixture of underlying probability distributions.
Techniques:
Expectation-maximization
Conceptual clustering
Neural network approach
EXPECTATION MAXIMIZATION
An iterative refinement algorithm used to find parameter estimates.
It can be viewed as an extension of k-means.
Assigns each object to a cluster according to a weight representing its probability of membership.
Starts with an initial estimate of the parameters.
Iteratively rescores the objects against the current estimates and then updates the estimates.
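A minimal sketch of this loop, assuming a mixture of exactly two one-dimensional Gaussians. The function names, the initial estimates, the small variance floor, and the sample data are assumptions chosen for the example.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_two_gaussians(data, mu, var, weight, iterations=50):
    """Fit a two-component 1-D Gaussian mixture from initial parameter estimates."""
    mu, var, weight = list(mu), list(var), list(weight)
    resp = []
    for _ in range(iterations):
        # E-step: probability of membership of every object in each cluster.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate the parameters from the weighted objects.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the variance positive
            weight[k] = nk / len(data)
    return mu, var, weight, resp

# Hypothetical data drawn around 1.0 and 5.0.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
mu, var, weight, resp = em_two_gaussians(
    data, mu=(0.0, 6.0), var=(1.0, 1.0), weight=(0.5, 0.5)
)
```

Unlike k-means, each object keeps a fractional membership in both clusters (`resp`), and those fractions are what drive the parameter update.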
CONCEPTUAL CLUSTERING
A form of clustering in machine learning.
Produces a classification scheme for a set of unlabeled objects.
Finds a characteristic description for each concept.
COBWEB
A popular and simple method of incremental conceptual learning.
Creates a hierarchical clustering in the form of a classification tree.
NEURAL NETWORK APPROACH
Represents each cluster as an exemplar, acting as a “prototype” of the cluster.
New objects are distributed to the cluster whose exemplar is the most similar according to some distance measure.
SELF-ORGANIZING MAP
Based on competitive learning.
Involves a hierarchical architecture of several units.
The organization of the units forms a feature map.
Example application: Web document clustering.
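The exemplar idea and competitive learning can be sketched together. The example below is a simplified illustration, not a full self-organizing map: it assumes three units arranged in a one-dimensional chain, a fixed neighbourhood radius of 1, a linearly decaying learning rate, and hand-picked initial exemplars. All names and values are hypothetical.

```python
def best_unit(x, weights):
    """Index of the unit whose exemplar is closest to x (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in weights]
    return distances.index(min(distances))

def train_chain(data, weights, epochs=20, lr=0.5, radius=1):
    """Competitive learning on a 1-D chain of units."""
    weights = [list(w) for w in weights]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # learning rate decays over time
        for x in data:
            winner = best_unit(x, weights)
            for j in range(len(weights)):
                if abs(j - winner) <= radius:
                    # The winner moves fully; its chain neighbours move half as much.
                    influence = 1.0 if j == winner else 0.5
                    for d in range(len(x)):
                        weights[j][d] += rate * influence * (x[d] - weights[j][d])
    return weights

# Hypothetical 2-D data with two well-separated groups.
data = [(0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (9.0, 10.0), (10.0, 9.0), (10.0, 10.0)]
trained = train_chain(data, weights=[(0.0, 0.0), (5.0, 5.0), (10.0, 10.0)])

# A new object goes to the cluster whose exemplar is most similar.
near_first = best_unit((0.5, 0.5), trained)
near_second = best_unit((9.5, 9.5), trained)
```

The units compete for each object, and only the winner and its neighbours adapt, which is what lets the arrangement of units come to reflect the structure of the data.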
CLUSTERING HIGH-DIMENSIONAL DATA
FEATURE TRANSFORMATION METHODS
PCA and SVD summarize the data by creating linear combinations of the attributes.
They do not remove any attributes, and the transformed attributes are complex to interpret.
FEATURE SELECTION METHODS
Find the attributes that are most relevant with respect to the class labels.
Example: entropy analysis.
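One possible reading of entropy analysis for feature selection is to rank attributes by information gain with respect to the class labels. The sketch below takes that reading as an assumption; the attribute names and the tiny data set are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Label entropy minus the expected entropy after splitting on one attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = ["yes", "yes", "no", "no"]
attributes = {
    "colour": ["red", "red", "blue", "blue"],  # separates the labels completely
    "size": ["big", "small", "big", "small"],  # tells nothing about the labels
}
gains = {name: information_gain(col, labels) for name, col in attributes.items()}
ranked = sorted(gains, key=gains.get, reverse=True)
```

Attributes at the top of `ranked` would be kept and the rest dropped, so the surviving attributes stay directly interpretable, in contrast to the linear combinations produced by PCA or SVD.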