Competitive Learning [Deep Learning And Neural Networks].pptx
About This Presentation
Deep Learning
Size: 1.83 MB
Language: en
Added: Feb 11, 2024
Slides: 78 pages
Slide Content
NEURAL NETWORKS BASED ON COMPETITION Prof. J. Ujwala Rekha
Unsupervised Learning Neural network learning is not restricted to supervised learning, in which training pairs are provided, as in pattern classification, pattern association, and more general input-output mapping problems. A second major type of learning for neural networks is unsupervised learning, in which the net seeks to find patterns or regularity in the input data.
Competitive Learning Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. Models and algorithms based on the principle of competitive learning include winner-take-all nets, vector quantization, self-organizing maps, etc.
Architecture of Competitive Learning
Implementation of Competitive Learning
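The core update rule can be sketched in a few lines of NumPy. This is a minimal winner-take-all illustration (function name, parameter values, and data are illustrative, not from the slides): only the unit whose weight vector is closest to the current input is allowed to learn, and it moves toward that input.

```python
import numpy as np

def competitive_learning(X, n_units=2, alpha=0.3, epochs=20, seed=0):
    """Winner-take-all learning sketch: for each input, only the unit whose
    weight vector is closest to the input updates, moving toward the input."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # initialize the weight vectors from randomly chosen training inputs
    W = X[rng.choice(len(X), size=n_units, replace=False)].copy()
    for _ in range(epochs):
        for x in X:
            j = int(np.argmin(((W - x) ** 2).sum(axis=1)))  # competition
            W[j] += alpha * (x - W[j])                      # only the winner learns
    return W
```

After training on data drawn from two well-separated clusters, each unit's weight vector ends up near one cluster center.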
FIXED-WEIGHT COMPETITIVE NETS Many neural nets use the idea of competition among neurons to enhance the contrast in activations of the neurons. In the most extreme situation, often called Winner-Take-All, only the neuron with the largest activation is allowed to remain “on”. Examples of fixed-weight competitive nets include Maxnet, Mexican Hat, Hamming net, etc.
MAXNET
MAXNET MAXNET is a specific example of a neural net based on competition that can be used as a subnet to choose the neuron whose activation is the largest. The MAXNET has M neurons that are fully interconnected with symmetric weights: each neuron is connected to every other neuron in the net, including itself. There is no training algorithm for the MAXNET; the weights are fixed.
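The MAXNET iteration can be sketched as follows, assuming the standard formulation: self-excitation with weight 1, mutual inhibition with a small fixed weight ε satisfying 0 < ε < 1/M, and the activation function f(x) = max(0, x). Names and values here are illustrative:

```python
import numpy as np

def maxnet(activations, epsilon=0.1, max_iters=100):
    """Iterate MAXNET mutual inhibition until at most one neuron stays
    positive. Convergence requires 0 < epsilon < 1/M for M neurons."""
    a = np.asarray(activations, dtype=float)
    for _ in range(max_iters):
        # each unit keeps its own activation and is inhibited by all others
        a = np.maximum(0.0, a - epsilon * (a.sum() - a))
        if np.count_nonzero(a) <= 1:
            break
    return a
```

Calling `maxnet([0.2, 0.4, 0.6, 0.8])` drives every activation except the largest to zero, so the index of the surviving unit identifies the winner.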
Architecture of MAXNET
MAXNET
MAXNET Example
Mexican Hat
Mexican Hat The Mexican Hat network is a more general contrast-enhancing subnet than the MAXNET. Each neuron is connected with excitatory (positively weighted) links to a number of “cooperative neighbors”, neurons that are in close proximity. Each neuron is also connected with inhibitory links (with negative weights) to a number of “competitive neighbors”, neurons that are farther away. There may also be a number of neurons, farther away still, to which the neuron is not connected. The neurons receive an external signal in addition to these interconnection signals.
Mexican Hat
Mexican Hat
Mexican Hat
Mexican Hat Algorithm Parameter R1 specifies the radius of the region with positive reinforcement. Parameter R2 (> R1) specifies the radius of the region of interconnections. The weights are determined by these radii: excitatory (positive) for neurons within distance R1, inhibitory (negative) for neurons at distances between R1 and R2.
Mexican Hat Algorithm
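A sketch of the Mexican Hat lateral weights and one contrast-enhancement step, assuming the common two-parameter form: a fixed positive weight c1 for units within radius R1 and a fixed negative weight c2 for units at distances in (R1, R2]. The values of c1 and c2 below are illustrative:

```python
import numpy as np

def mexican_hat_weights(n, R1=1, R2=2, c1=0.6, c2=-0.4):
    """Lateral-weight matrix for a 1-D line of n units: excitatory (c1 > 0)
    within radius R1, inhibitory (c2 < 0) out to R2, zero beyond R2."""
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d = abs(i - j)
            if d <= R1:
                W[i, j] = c1
            elif d <= R2:
                W[i, j] = c2
    return W

def mexican_hat_step(x, W, s=None, x_max=2.0):
    """One contrast-enhancement step: x <- f(s + W @ x), where f clips the
    activation to the range [0, x_max]; s is the optional external signal."""
    total = W @ x if s is None else s + W @ x
    return np.clip(total, 0.0, x_max)
```

Applying one step to a bump-shaped signal sharpens it: the center grows while the weakly activated flanks are suppressed to zero.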
Hamming Net
Hamming Net A Hamming net is a maximum likelihood classification net that can be used to determine which of several exemplar vectors is most similar to an input vector. The exemplar vectors determine the weights of the net. The measure of similarity between the input vector and the stored exemplar vectors is n minus the Hamming distance between the vectors. The Hamming distance between two vectors is the number of components in which the vectors differ. For bipolar vectors x and y, x · y = a - d, where a is the number of components in which the vectors agree and d is the number of components in which they differ, i.e., the Hamming distance. Since a + d = n, the similarity n - d is simply a, the number of agreements.
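The similarity computation follows directly from the identity above: with each unit's weights set to e/2 and bias n/2, its net input equals the number of agreements a, i.e., n minus the Hamming distance. A minimal sketch (a MAXNET would normally select the winner; `argmax` stands in for it here):

```python
import numpy as np

def hamming_net(x, exemplars):
    """Score each bipolar exemplar e against bipolar input x.
    Net input = n/2 + (x . e)/2 = a = n - HammingDistance(x, e)."""
    E = np.asarray(exemplars, dtype=float)
    x = np.asarray(x, dtype=float)
    n = x.size
    scores = n / 2.0 + (E @ x) / 2.0        # one score per stored exemplar
    return scores, int(np.argmax(scores))   # argmax stands in for the MAXNET
```

For example, the input (1, 1, -1, -1) agrees with the exemplar (1, -1, -1, -1) in three components and with (-1, -1, -1, 1) in only one, so the first exemplar wins.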
Architecture of Hamming Net The sample architecture shown in Figure assumes input vectors are 4-tuples, to be categorized as belonging to one of two classes.
Hamming Net
Hamming Net
Kohonen Self-Organizing Maps
Kohonen Self-Organizing Maps Self-organizing maps are based on competitive learning. In a self-organizing map, the neurons are placed at the nodes of a lattice that is usually one- or two-dimensional. Higher-dimensional maps are also possible but not as common. The neurons become selectively tuned to various input patterns or classes of input patterns in the course of a competitive-learning process. The locations of the neurons so tuned (i.e., the winning neurons) become ordered with respect to each other in such a way that a meaningful coordinate system for different input features is created over the lattice.
Kohonen Self-Organizing Maps A self-organizing map is therefore characterized by the formation of a topographic map of the input patterns, in which the spatial locations (i.e., coordinates) of the neurons in the lattice are indicative of intrinsic statistical features contained in the input patterns; hence the name “self-organizing map”. The spatial location of an output neuron in a topographic map corresponds to a particular domain or feature of data drawn from the input space.
Kohonen Self-Organizing Maps Lines between the nodes simply represent adjacency; they do not signify connections as in other neural networks.
Kohonen Self-Organizing Maps Illustration of Feature Map after Self-Organizing Process
The principal goal of the self-organizing map (SOM) is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this transformation adaptively in a topologically ordered fashion. Kohonen Self-Organizing Maps
Kohonen Self-Organizing Maps
Each neuron in the lattice is fully connected to all the source nodes in the input layer. The network represents a feedforward structure with a single computational layer consisting of neurons arranged in rows and columns. A one-dimensional lattice is a special case. Kohonen Self-Organizing Maps
Each input pattern presented to the network typically consists of a localized region or “spot” of activity against a quiet background. The location and nature of such a spot usually varies from one realization of the input pattern to another. All the neurons in the network should therefore be exposed to a sufficient number of different realizations of the input pattern in order to ensure that the self-organization process has a chance to develop properly. Kohonen Self-Organizing Maps
The algorithm responsible for the formation of the SOM proceeds first by initializing the synaptic weights in the network. Once the network has been initialized, there are three processes involved in the formation of SOM: Competition Cooperation Synaptic Adaption Kohonen Self-Organizing Maps
Competitive Process: For each input pattern, the neurons in the network compute their respective values of a discriminant function. The particular neuron with the largest value of the discriminant function is declared the winner of the competition. Kohonen Self-Organizing Maps
Competitive Process Kohonen Self-Organizing Maps
Kohonen Self-Organizing Maps Competitive Process
Cooperative Process Cooperative Process: The winning neuron locates the center of a topological neighborhood of cooperating neurons. How do we define a topological neighborhood? A neuron that is firing tends to excite the neurons in its immediate neighborhood more than those farther away from it. Kohonen Self-Organizing Maps
Kohonen Self-Organizing Maps Example of Lattices and Neighborhoods
In the figure, neighborhoods of the unit designated by # with radii R = 2, 1, and 0 are shown for a 1-D topology with 10 clusters, a 2-D hexagonal grid with 49 units, and a 2-D rectangular grid with 49 units. The winning unit is indicated by the symbol “#”; the other units are denoted by “*”. Each unit has eight nearest neighbors in the rectangular grid but only six in the hexagonal grid. Winning units close to the edge of the grid will have some neighborhoods containing fewer units than shown. Neighborhoods do not wrap around from one side of the grid to the other; missing units are simply ignored. Cooperative Process Kohonen Self-Organizing Maps
The adaptive process enables the excited neurons to increase their individual values of the discriminant function in relation to the input pattern through suitable adjustments applied to their synaptic weights. The adjustments made are such that the response of the winning neuron to the subsequent application of a similar input pattern is enhanced. Adaptive Process Kohonen Self-Organizing Maps
Adaptive Process Kohonen Self-Organizing Maps
Kohonen Self-Organizing Maps
The learning rate α is a slowly decreasing function of time (or training epochs). The radius of the neighborhood around a cluster unit also decreases as the clustering process progresses. The formation of a map occurs in two phases: the initial formation of the correct order and final convergence. The second phase takes much longer than the first and requires a small value for the learning rate. Random values may be assigned for the initial weights. If some information is available concerning the distribution of clusters that might be appropriate for a particular problem, the initial weights can be taken to reflect that prior knowledge. Kohonen Self-Organizing Maps
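The three processes (competition, cooperation, adaptation), together with the decaying learning rate and shrinking neighborhood radius described above, can be sketched for a one-dimensional lattice as follows (all parameter values and names are illustrative):

```python
import numpy as np

def train_som(data, n_units=10, epochs=50, alpha0=0.5, R0=2, seed=0):
    """1-D SOM sketch: competition (nearest weight vector wins),
    cooperation (units within radius R of the winner also learn),
    adaptation (weights move toward the input). Both the learning
    rate alpha and the radius R shrink as training progresses."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    W = rng.random((n_units, data.shape[1]))       # random initial weights
    for epoch in range(epochs):
        alpha = alpha0 * (1.0 - epoch / epochs)    # slowly decreasing rate
        R = max(0, int(round(R0 * (1.0 - epoch / epochs))))  # shrinking radius
        for x in data:
            j = int(np.argmin(((W - x) ** 2).sum(axis=1)))   # competition
            lo, hi = max(0, j - R), min(n_units, j + R + 1)  # cooperation
            W[lo:hi] += alpha * (x - W[lo:hi])               # adaptation
    return W
```

Trained on data concentrated at two points, some unit's weight vector ends up near each of them, illustrating how the map tunes neurons to regions of the input space.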
Pros Techniques like dimensionality reduction and grid clustering make the data simple to understand and interpret. Cons The model cannot grasp how data is formed, since it is not a generative model. Self-Organizing Maps perform poorly on categorical data, and worse still on mixed forms of data. Model preparation is comparatively slow, making it challenging to train against slowly evolving data. Kohonen Self-Organizing Maps
Learning Vector Quantization
Learning Vector Quantization Learning vector quantization (LVQ) is a pattern classification method in which each output unit represents a particular class or category. The weight vector for an output unit is often referred to as a reference (or codebook) vector for the class that the unit represents. During training, the output units are positioned by adjusting weights through supervised training to approximate the decision surfaces of the theoretical Bayes classifier. It is assumed that a set of training patterns with known classifications is provided, along with an initial distribution of reference vectors. After training, an LVQ net classifies an input vector by assigning it to the same class as the output unit that has its weight vector (reference vector) closest to the input vector.
Architecture Learning Vector Quantization
Learning Vector Quantization
The architecture of an LVQ neural net is the same as that of a Kohonen self-organizing map, without a topological structure being assumed for the output units. In addition, each output unit has a known class that it represents. In the LVQ network, each neuron in the first layer learns a prototype vector by calculating the distance between the weight and input vectors. If x and w belong to the same class, the weights are moved toward the new input vector; if x and w belong to different classes, the weights are moved away from this input vector. The winning neuron indicates a subclass, rather than a class. There may be several different neurons or subclasses that make up each class. Learning Vector Quantization
The second layer is used to combine subclasses into a single class using the Wc matrix. The columns of Wc represent subclasses and the rows represent classes. Wc has a single 1 in each column, with the other elements set to 0; the row in which the 1 occurs indicates the class of that subclass. Wc is a non-trainable parameter and depends on the problem statement. The number of neurons depends on the number of classes the user selects for their problem statement. Before learning, each neuron in the first layer is assigned to an output neuron to generate the Wc matrix. Once Wc is defined, it is never altered. Learning Vector Quantization
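The first-layer learning rule described above corresponds to what is commonly called LVQ1. A minimal sketch (hyperparameters, data, and function names are illustrative):

```python
import numpy as np

def lvq1_train(X, labels, codebook, cb_labels, alpha=0.1, epochs=20):
    """LVQ1 sketch: find the closest reference (codebook) vector; move it
    toward the input if their classes agree, away from it otherwise."""
    W = np.array(codebook, dtype=float)
    for epoch in range(epochs):
        a = alpha * (1.0 - epoch / epochs)          # decaying learning rate
        for x, y in zip(np.asarray(X, dtype=float), labels):
            j = int(np.argmin(((W - x) ** 2).sum(axis=1)))
            if cb_labels[j] == y:
                W[j] += a * (x - W[j])              # same class: move toward x
            else:
                W[j] -= a * (x - W[j])              # different class: move away
    return W

def lvq1_classify(x, W, cb_labels):
    """Assign x the class of the nearest reference vector."""
    d = ((W - np.asarray(x, dtype=float)) ** 2).sum(axis=1)
    return cb_labels[int(np.argmin(d))]
```

After training on two labeled clusters, new points are classified by whichever reference vector they fall nearest to.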
Learning Vector Quantization
Learning Vector Quantization
Counter Propagation Networks
Counter Propagation Networks Counter propagation networks are multilayer networks based on a combination of input, clustering and output layers. Counter propagation nets can be used to compress data, to approximate functions or to associate patterns. A counter propagation net approximates its training input vector pairs by adaptively constructing a look-up table. Therefore, a large number of training data points can be compressed to a more manageable number of look-up table entries. If the training data represent function values, the net will approximate a function. A heteroassociative net is simply one interpretation of a function from a set of vectors (patterns) X to a set of vectors Y. The accuracy of the approximation is determined by the number of entries in the look-up table, which equals the number of units in the cluster layer of the net.
Counter propagation nets are trained in two stages: During the first stage, the input vectors are clustered. The clusters may be formed based on either the dot product metric or the Euclidean norm metric. During the second stage of training, the weights from the cluster units to the output units are adapted to produce the desired response. There are two types of counter propagation nets: full and forward-only. Counter Propagation Networks
Full Counter Propagation Full counter propagation represents a large number of vector pairs, x:y, by adaptively constructing a look-up table. It produces an approximation x*:y* based on input of an x vector only, or input of a y vector only, or input of an x:y pair, possibly with some distorted or missing elements in either or both vectors.
Architecture of Full Counter Propagation Network
Active Units during the First Phase of Counter Propagation Training
Second Phase of Counter Propagation Training
Full Counter Propagation Network First Phase of Training
Full Counter Propagation Network Second Phase of Training
Full Counter Propagation Network
Full Counter Propagation Network Algorithm for First Phase of Training
Full Counter Propagation Network Algorithm for Second Phase of Training
Full Counter Propagation Network Algorithm for Training
Full Counter Propagation Network Application
Full Counter Propagation Network Application
For testing with only an x vector for input (i.e., there is no information about the corresponding y ), it may be preferable to find the winning unit J based on comparing only the x vector and the first n components of the weight vector for each cluster layer unit. Full Counter Propagation Network Application
Forward-Only Counter Propagation Forward-only counter propagation nets are a simplified version of the full counter propagation nets. Forward-only nets are intended to approximate a function y = f(x) that is not necessarily invertible. That is, forward-only counter propagation nets may be used if the mapping from x to y is well-defined, but the mapping from y to x is not. Forward-only counter propagation differs from full counter propagation in using only the x vectors to form the clusters on the Kohonen units during the first stage of training.
Architecture of Forward-Only Counter Propagation Network
Although the architecture of a forward-only counter propagation net appears to be the same as the architecture of a back propagation net, the counter propagation net has interconnections among the units in the cluster layer which are not shown. In general, in forward-only counter propagation, after competition, only one unit in that layer will be active and send a signal to the output layer. Forward-Only Counter Propagation
The training procedure for the forward-only counter propagation net consists of the following steps: First, an input vector is presented to the input units. The units in the cluster layer compete for the right to learn the input vector. After the entire set of training vectors has been presented, the learning rate is reduced and the vectors are presented again; this continues through several iterations. After the weights from the input layer to the cluster layer have been trained, the weights from the cluster layer to the output layer are trained. Now, as each training input vector is presented to the input layer, the associated target vector is presented to the output layer. The winning cluster unit sends a signal of 1 to the output layer, so each output unit has a computed input signal and a target value. Using the difference between these values, the weights between the winning cluster unit and the output layer are updated. The learning rule for these weights is similar to the learning rule for the weights from the input units to the cluster units. Forward-Only Counter Propagation
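The two training stages above can be sketched as follows (parameter values are illustrative; Euclidean winner selection is assumed, and because the winning cluster unit sends a signal of 1, the output-weight update is simply a move of that unit's outgoing weights toward the target):

```python
import numpy as np

def train_forward_only_cpn(X, Y, n_clusters=4, epochs1=30, epochs2=30,
                           a0=0.5, b0=0.5, seed=0):
    """Forward-only counter propagation sketch.
    Stage 1: cluster the x vectors (winner-take-all Kohonen layer).
    Stage 2: with clusters frozen, move the winner's output weights
    toward the target vector."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    V = rng.random((n_clusters, X.shape[1]))   # input -> cluster weights
    W = np.zeros((n_clusters, Y.shape[1]))     # cluster -> output weights
    for epoch in range(epochs1):               # stage 1: form the clusters
        a = a0 * (1.0 - epoch / epochs1)       # reduced after each pass
        for x in X:
            j = int(np.argmin(((V - x) ** 2).sum(axis=1)))
            V[j] += a * (x - V[j])
    for epoch in range(epochs2):               # stage 2: train output weights
        b = b0 * (1.0 - epoch / epochs2)
        for x, y in zip(X, Y):
            j = int(np.argmin(((V - x) ** 2).sum(axis=1)))
            W[j] += b * (y - W[j])             # winner's signal is 1
    return V, W

def cpn_predict(x, V, W):
    """Look up the output stored for the winning cluster unit."""
    j = int(np.argmin(((V - np.asarray(x, dtype=float)) ** 2).sum(axis=1)))
    return W[j]
```

Trained on the pairs 0 → 0 and 1 → 2, the net behaves as a small look-up table: an input near 0 retrieves 0 and an input near 1 retrieves 2.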
Forward-Only Counter Propagation
Forward-Only Counter Propagation Network Algorithm for First Phase of Training
Forward-Only Counter Propagation Network Algorithm for Second Phase of Training
Forward-Only Counter Propagation Network Algorithm for Training