Competitive Learning [Deep Learning And Neural Networks].pptx

78 slides · Feb 11, 2024

About This Presentation

Deep Learning


Slide Content

NEURAL NETWORKS BASED ON COMPETITION Prof. J. Ujwala Rekha

Unsupervised Learning Neural network learning is not restricted to supervised learning, wherein training pairs are provided, as with the pattern classification and pattern association problems and the more general input-output mappings. A second major type of learning for neural networks is unsupervised learning, in which the net seeks to find patterns or regularity in the input data.

Competitive Learning Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. Models and algorithms based on the principle of competitive learning include winner-take-all nets, vector quantization, self-organizing maps, etc.

Architecture of Competitive Learning

Implementation of Competitive Learning

FIXED-WEIGHT COMPETITIVE NETS Many neural nets use the idea of competition among neurons to enhance the contrast in activations of the neurons. In the most extreme situation, often called Winner-Take-All, only the neuron with the largest activation is allowed to remain “on”. Examples of fixed-weight competitive nets include MAXNET, Mexican Hat, Hamming net, etc.
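The Winner-Take-All idea above can be sketched in a few lines. This is a minimal illustration, not from the slides; the function name and the use of NumPy are my own choices.

```python
import numpy as np

def winner_take_all(activations):
    """Keep only the neuron with the largest activation "on".

    Returns a vector that is zero everywhere except at the winner,
    which retains its activation value.
    """
    activations = np.asarray(activations, dtype=float)
    out = np.zeros_like(activations)
    j = int(np.argmax(activations))  # index of the winning neuron
    out[j] = activations[j]
    return out

print(winner_take_all([0.2, 0.9, 0.5]))  # only index 1 stays "on"
```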

MAXNET

MAXNET MAXNET is a specific example of a neural net based on competition that can be used as a subnet to choose the neuron whose activation is the largest. The MAXNET has M neurons that are fully interconnected with symmetric weights; each neuron is connected to every other neuron in the net, including itself. There is no training algorithm for the MAXNET; the weights are fixed.
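The MAXNET competition can be sketched with the usual fixed weights: +1 on each self-connection and a small mutual-inhibition weight −ε on every other connection, with 0 < ε < 1/M, and the activation function f(x) = max(0, x). The specific ε value below is illustrative, and the function name is my own.

```python
import numpy as np

def maxnet(a, epsilon=None, max_iters=100):
    """Iterate MAXNET until only one neuron remains positive.

    a       -- initial activations (assumed non-negative)
    epsilon -- mutual-inhibition weight; any value in (0, 1/M)
    """
    a = np.asarray(a, dtype=float)
    M = len(a)
    if epsilon is None:
        epsilon = 1.0 / (2 * M)          # illustrative choice inside (0, 1/M)
    for _ in range(max_iters):
        # each neuron inhibits every other neuron with weight -epsilon,
        # then applies f(x) = max(0, x)
        a = np.maximum(0.0, a - epsilon * (a.sum() - a))
        if np.count_nonzero(a) <= 1:     # competition resolved
            break
    return a

print(maxnet([0.2, 0.4, 0.6, 0.8]))      # only the largest activation survives
```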

Architecture of MAXNET

MAXNET

MAXNET Example

Mexican Hat

Mexican Hat The Mexican Hat network is a more general contrast-enhancing subnet than the MAXNET. Each neuron is connected with excitatory (positively weighted) links to a number of “cooperative neighbors”, neurons that are in close proximity. Each neuron is also connected with inhibitory links (with negative weights) to a number of “competitive neighbors”, neurons that are farther away. There may also be a number of neurons, farther away still, to which the neuron is not connected. The neurons receive an external signal in addition to these interconnection signals.

Mexican Hat

Mexican Hat

Mexican Hat

Mexican Hat Algorithm Parameter R1 specifies the radius of the region with positive reinforcement. Parameter R2 (> R1) specifies the radius of the region of interconnections. The weights are determined by these radii: connections to neighbors within distance R1 are excitatory, and connections to neighbors between R1 and R2 are inhibitory.
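The radius-based weight scheme above can be sketched as a function of the lattice distance d. The particular values c1 and c2 are illustrative stand-ins for the excitatory and inhibitory weights, which the slides leave as parameters.

```python
def mexican_hat_weight(d, R1, R2, c1=0.6, c2=-0.4):
    """Weight from a neuron to a neighbor at lattice distance d.

    c1 > 0: excitatory weight within radius R1 (cooperative neighbors)
    c2 < 0: inhibitory weight for R1 < d <= R2 (competitive neighbors)
    c1 and c2 are illustrative values, not taken from the slides.
    """
    if d <= R1:
        return c1
    if d <= R2:
        return c2
    return 0.0   # no connection beyond radius R2

# weights seen from neighbors at distances 0..4, with R1=1 and R2=3
print([mexican_hat_weight(d, R1=1, R2=3) for d in range(5)])
# [0.6, 0.6, -0.4, -0.4, 0.0]
```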

Mexican Hat Algorithm

Hamming Net

Hamming Net A Hamming net is a maximum likelihood classification net that can be used to determine which of several exemplar vectors is most similar to an input vector. The exemplar vectors determine the weights of the net. The measure of similarity between the input vector and the stored exemplar vectors is n minus the Hamming distance between the vectors. The Hamming distance between two vectors is the number of components in which the vectors differ. For bipolar vectors x and y, x · y = a - d, where a is the number of components in which the vectors agree and d is the number of components in which the vectors differ, i.e., the Hamming distance.
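Since a + d = n and x · y = a - d, the number of agreements is a = (x · y + n) / 2, so fixed weights of e/2 with a bias of n/2 give each output unit a net input of n minus the Hamming distance. The sketch below assumes that standard formulation; the function name is my own.

```python
import numpy as np

def hamming_net_scores(x, exemplars):
    """Net input of each output unit of a Hamming net.

    For bipolar vectors, x . e = a - d and a + d = n, so the
    similarity a = n - HammingDistance = (x . e + n) / 2.
    The weights (e/2) and bias (n/2) are fixed by the exemplars.
    """
    x = np.asarray(x, dtype=float)
    E = np.asarray(exemplars, dtype=float)   # one stored exemplar per row
    n = len(x)
    return E @ x / 2 + n / 2                 # = n - Hamming distance

e = [[1, -1, -1, -1], [-1, -1, -1, 1]]       # two stored exemplar vectors
print(hamming_net_scores([1, 1, -1, -1], e)) # unit 0 agrees in 3 components, unit 1 in only 1
```

A MAXNET can then be applied to these scores to select the single most similar exemplar.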

Architecture of Hamming Net The sample architecture shown in Figure assumes input vectors are 4-tuples, to be categorized as belonging to one of two classes.

Hamming Net

Hamming Net

Kohonen Self-Organizing Maps

Kohonen Self-Organizing Maps Self-organizing maps are based on competitive learning. In a self-organizing map, the neurons are placed at the nodes of a lattice that is usually one- or two-dimensional. Higher-dimensional maps are also possible but not as common. The neurons become selectively tuned to various input patterns or classes of input patterns in the course of a competitive-learning process. The locations of the neurons so tuned (i.e., the winning neurons) become ordered with respect to each other in such a way that a meaningful coordinate system for different input features is created over the lattice.

Kohonen Self-Organizing Maps A self-organizing map is therefore characterized by the formation of a topographic map of the input patterns, in which the spatial locations (i.e., coordinates) of the neurons in the lattice are indicative of intrinsic statistical features contained in the input patterns; hence the name “self-organizing map”. The spatial location of an output neuron in a topographic map corresponds to a particular domain or feature of data drawn from the input space.

Kohonen Self-Organizing Maps Lines between the nodes are simply a representation of adjacency and do not signify connections as in other neural networks.

Kohonen Self-Organizing Maps Illustration of Feature Map after Self-Organizing Process

Kohonen Self-Organizing Maps The principal goal of the self-organizing map (SOM) is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this transformation adaptively in a topologically ordered fashion.

Kohonen Self-Organizing Maps

Kohonen Self-Organizing Maps Each neuron in the lattice is fully connected to all the source nodes in the input layer. The network represents a feedforward structure with a single computational layer consisting of neurons arranged in rows and columns. A one-dimensional lattice is a special case.

Kohonen Self-Organizing Maps Each input pattern presented to the network typically consists of a localized region or “spot” of activity against a quiet background. The location and nature of such a spot usually vary from one realization of the input pattern to another. All the neurons in the network should therefore be exposed to a sufficient number of different realizations of the input pattern in order to ensure that the self-organization process has a chance to develop properly.

Kohonen Self-Organizing Maps The algorithm responsible for the formation of the SOM proceeds first by initializing the synaptic weights in the network. Once the network has been initialized, there are three processes involved in the formation of the SOM: competition, cooperation, and synaptic adaptation.

Kohonen Self-Organizing Maps Competitive Process: For each input pattern, the neurons in the network compute their respective values of a discriminant function. The particular neuron with the largest value of the discriminant function is declared the winner of the competition.

Kohonen Self-Organizing Maps Competitive Process

Kohonen Self-Organizing Maps Competitive Process

Kohonen Self-Organizing Maps Cooperative Process: The winning neuron locates the center of a topological neighborhood of cooperating neurons. How do we define a topological neighborhood? A neuron that is firing tends to excite the neurons in its immediate neighborhood more than those farther away from it.

Kohonen Self-Organizing Maps Example of Lattices and Neighborhoods

Kohonen Self-Organizing Maps Cooperative Process In the figure, neighborhoods of the unit designated by # with radii R = 2, 1, and 0 are shown in a 1-D topology with 10 clusters, a 2-D hexagonal grid with 49 units, and a 2-D rectangular grid with 49 units. While the winning unit is indicated by the symbol “#”, the other units are denoted by “*”. Each unit has eight nearest neighbors in the rectangular grid but only six in the hexagonal grid. Winning units that are close to the edge of the grid will have some neighborhoods with fewer units than shown in the respective figure. Neighborhoods do not wrap around from one side of the grid to the other; missing units are simply ignored.
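For the rectangular grid described above, a radius-R neighborhood can be sketched as follows. The code assumes the Chebyshev (chessboard) distance, which gives each interior unit eight nearest neighbors at R = 1 and matches the no-wrap-around convention; the function name is my own.

```python
def rect_neighborhood(winner, R, rows, cols):
    """Indices of units within radius R of the winner on a rectangular grid.

    Units that would fall outside the grid are simply ignored, so units
    near an edge have smaller neighborhoods, as described in the text.
    """
    wi, wj = winner
    return [(i, j)
            for i in range(max(0, wi - R), min(rows, wi + R + 1))
            for j in range(max(0, wj - R), min(cols, wj + R + 1))]

print(len(rect_neighborhood((3, 3), R=1, rows=7, cols=7)))  # 9: winner plus 8 neighbors
print(len(rect_neighborhood((0, 0), R=1, rows=7, cols=7)))  # 4: corner unit, smaller neighborhood
```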

Kohonen Self-Organizing Maps Adaptive Process: The adaptive process enables the excited neurons to increase their individual values of the discriminant function in relation to the input pattern through suitable adjustments applied to their synaptic weights. The adjustments are made such that the response of the winning neuron to a subsequent application of a similar input pattern is enhanced.

Kohonen Self-Organizing Maps Adaptive Process

Kohonen Self-Organizing Maps

Kohonen Self-Organizing Maps The learning rate α is a slowly decreasing function of time (or training epochs). The radius of the neighborhood around a cluster unit also decreases as the clustering process progresses. The formation of a map occurs in two phases: the initial formation of the correct order and the final convergence. The second phase takes much longer than the first and requires a small value for the learning rate. Random values may be assigned for the initial weights. If some information is available concerning the distribution of clusters that might be appropriate for a particular problem, the initial weights can be taken to reflect that prior knowledge.
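The three processes (competition, cooperation, adaptation) with a decreasing learning rate and a shrinking neighborhood can be combined into one training loop. This is only a sketch: the linear decay schedules, square neighborhood, and all names are illustrative choices, not prescribed by the slides.

```python
import numpy as np

def train_som(data, rows, cols, epochs=100, alpha0=0.5, R0=None, seed=0):
    """Minimal 2-D Kohonen SOM sketch."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    W = rng.random((rows, cols, dim))            # random initial weights
    R0 = max(rows, cols) // 2 if R0 is None else R0
    for t in range(epochs):
        alpha = alpha0 * (1 - t / epochs)        # slowly decreasing learning rate
        R = int(round(R0 * (1 - t / epochs)))    # shrinking neighborhood radius
        for x in data:
            # competition: the winner is the unit whose weights are closest to x
            d = ((W - x) ** 2).sum(axis=2)
            wi, wj = np.unravel_index(np.argmin(d), d.shape)
            # cooperation + adaptation: update the winner and its neighbors
            for i in range(max(0, wi - R), min(rows, wi + R + 1)):
                for j in range(max(0, wj - R), min(cols, wj + R + 1)):
                    W[i, j] += alpha * (x - W[i, j])
    return W

data = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]])
W = train_som(data, rows=3, cols=3, epochs=50)
print(W.shape)  # (3, 3, 2)
```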

Kohonen Self-Organizing Maps Pros Techniques like dimensionality reduction and grid clustering make the data simple to understand and interpret. Cons The model cannot grasp how the data is formed, since it does not produce a generative data model. Self-Organizing Maps perform poorly on categorical data, and worse still on mixed types of data. In comparison, the model preparation process is extremely slow, making it challenging to train against slowly evolving data.

Learning Vector Quantization

Learning Vector Quantization Learning vector quantization (LVQ) is a pattern classification method in which each output unit represents a particular class or category. The weight vector for an output unit is often referred to as a reference (or codebook) vector for the class that the unit represents. During training, the output units are positioned by adjusting weights through supervised training to approximate the decision surfaces of the theoretical Bayes classifier. It is assumed that a set of training patterns with known classifications is provided, along with an initial distribution of reference vectors. After training, an LVQ net classifies an input vector by assigning it to the same class as the output unit that has its weight vector (reference vector) closest to the input vector.

Architecture of Learning Vector Quantization

Learning Vector Quantization

Learning Vector Quantization The architecture of an LVQ neural net is the same as that of a Kohonen self-organizing map, without a topological structure being assumed for the output units. In addition, each output unit has a known class that it represents. In the LVQ network, each neuron in the first layer learns a prototype vector by calculating the distance between the weight and input vectors. If x and w belong to the same class, then we move the weights toward the input vector; if x and w belong to different classes, then we move the weights away from the input vector. The winning neuron indicates a subclass, rather than a class. There may be several different neurons or subclasses that make up each class.

Learning Vector Quantization The second layer is used to combine subclasses into a single class using the Wc matrix. The columns of Wc represent subclasses and the rows represent classes. Wc has a single 1 in each column, with the other elements set to 0. The row in which the 1 occurs indicates the class of that subclass. Wc is a non-trainable parameter and depends on the problem statement. The number of neurons depends on the number of classes the user selects for their problem. Before learning, each neuron in the first layer is assigned to an output neuron to generate the Wc matrix. Once Wc is defined, it is never altered.
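The first-layer update rule described above (move the winner toward the input if the classes match, away otherwise) is the basic LVQ1 step, and can be sketched as follows. The function name and learning rate are illustrative.

```python
import numpy as np

def lvq1_step(W, classes, x, target, alpha=0.1):
    """One LVQ1 update, a sketch of the rule described in the text.

    W       -- reference (codebook) vectors, one per row
    classes -- the known class of each reference vector
    x       -- training input vector
    target  -- known class of x
    """
    j = np.argmin(((W - x) ** 2).sum(axis=1))   # closest reference vector wins
    if classes[j] == target:
        W[j] += alpha * (x - W[j])              # same class: move toward x
    else:
        W[j] -= alpha * (x - W[j])              # different class: move away from x
    return W

W = np.array([[0.0, 0.0], [1.0, 1.0]])
classes = [0, 1]
lvq1_step(W, classes, np.array([0.2, 0.1]), target=0)
print(W[0])  # reference vector 0 has moved toward (0.2, 0.1)
```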

Learning Vector Quantization

Learning Vector Quantization

Counter Propagation Networks

Counter Propagation Networks Counter propagation networks are multilayer networks based on a combination of input, clustering, and output layers. Counter propagation nets can be used to compress data, to approximate functions, or to associate patterns. A counter propagation net approximates its training input vector pairs by adaptively constructing a look-up table. Therefore, a large number of training data points can be compressed to a more manageable number of look-up table entries. If the training data represent function values, the net will approximate a function. A heteroassociative net is simply one interpretation of a function from a set of vectors (patterns) X to a set of vectors Y. The accuracy of the approximation is determined by the number of entries in the look-up table, which equals the number of units in the cluster layer of the net.

Counter Propagation Networks Counter propagation nets are trained in two stages: During the first stage, the input vectors are clustered. The clusters may be formed based on either the dot product metric or the Euclidean norm metric. During the second stage of training, the weights from the cluster units to the output units are adapted to produce the desired response. There are two types of counter propagation nets: full and forward-only.

Full Counter Propagation Full counter propagation represents a large number of vector pairs, x:y, by adaptively constructing a look-up table. It produces an approximation x*:y* based on input of an x vector only, or input of a y vector only, or input of an x:y pair, possibly with some distorted or missing elements in either or both vectors.

Architecture of Full Counter Propagation Network

Active Units during the First Phase of Counter Propagation Training

Second Phase of Counter Propagation Training

Full Counter Propagation Network First Phase of Training

Full Counter Propagation Network Second Phase of Training

Full Counter Propagation Network

Full Counter Propagation Network Algorithm for First Phase of Training

Full Counter Propagation Network Algorithm for Second Phase of Training

Full Counter Propagation Network Algorithm for Training

Full Counter Propagation Network Application

Full Counter Propagation Network Application

Full Counter Propagation Network Application For testing with only an x vector for input (i.e., there is no information about the corresponding y), it may be preferable to find the winning unit J by comparing only the x vector and the first n components of the weight vector for each cluster-layer unit.

Forward-Only Counter Propagation Forward-only counter propagation nets are a simplified version of the full counter propagation nets. Forward-only nets are intended to approximate a function y = f(x) that is not necessarily invertible. That means forward-only counter propagation nets may be used if the mapping from x to y is well defined, but the mapping from y to x is not. Forward-only counter propagation differs from full counter propagation in using only the x vectors to form the clusters on the Kohonen units during the first stage of training.

Architecture of Forward-Only Counter Propagation Network

Forward-Only Counter Propagation Although the architecture of a forward-only counter propagation net appears to be the same as the architecture of a back propagation net, the counter propagation net has interconnections among the units in the cluster layer, which are not shown. In general, in forward-only counter propagation, after competition, only one unit in that layer will be active and send a signal to the output layer.

Forward-Only Counter Propagation The training procedure for the forward-only counter propagation net consists of the following steps: First, an input vector is presented to the input units. The units in the cluster layer compete for the right to learn the input vector. After the entire set of training vectors has been presented, the learning rate is reduced and the vectors are presented again; this continues through several iterations. After the weights from the input layer to the cluster layer have been trained, the weights from the cluster layer to the output layer are trained. Now, as each training input vector is presented to the input layer, the associated target vector is presented to the output layer. The winning cluster unit sends a signal of 1 to the output layer. Each output unit has a computed input signal and a target value. Using the difference between these values, the weights between the winning cluster unit and the output layer are updated. The learning rule for these weights is similar to the learning rule for the weights from the input units to the cluster units.
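The two-stage procedure above can be sketched as follows. This is an illustrative implementation only: it picks the Euclidean norm metric (one of the two options mentioned earlier), uses linear learning-rate decay, and all names are my own.

```python
import numpy as np

def train_forward_cpn(X, Y, n_clusters, epochs=50, alpha=0.5, a=0.5, seed=0):
    """Forward-only counter propagation sketch: cluster the x vectors,
    then train the cluster-to-output weights toward the targets."""
    rng = np.random.default_rng(seed)
    V = rng.random((n_clusters, X.shape[1]))      # input -> cluster weights
    U = np.zeros((n_clusters, Y.shape[1]))        # cluster -> output weights
    # Stage 1: cluster the input vectors; the learning rate is reduced each pass
    for t in range(epochs):
        lr = alpha * (1 - t / epochs)
        for x in X:
            J = np.argmin(((V - x) ** 2).sum(axis=1))  # winning cluster unit
            V[J] += lr * (x - V[J])
    # Stage 2: train the weights from the winning cluster unit to the outputs
    for t in range(epochs):
        lr = a * (1 - t / epochs)
        for x, y in zip(X, Y):
            J = np.argmin(((V - x) ** 2).sum(axis=1))
            U[J] += lr * (y - U[J])                    # move toward the target
    return V, U

def predict(V, U, x):
    J = np.argmin(((V - x) ** 2).sum(axis=1))
    return U[J]   # the winner sends a signal of 1; output = its outgoing weights

X = np.array([[0.0], [1.0]])
Y = np.array([[0.0], [2.0]])
V, U = train_forward_cpn(X, Y, n_clusters=2)
print(predict(V, U, np.array([0.95])))  # close to the target for x = 1
```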

Forward-Only Counter Propagation

Forward-Only Counter Propagation Network Algorithm for First Phase of Training

Forward-Only Counter Propagation Network Algorithm for Second Phase of Training

Forward-Only Counter Propagation Network Algorithm for Training

Forward-Only Counter Propagation Network Application

Thank You