Soft Computing Techniques Megha V Assistant Professor Dept. of Information Technology, Kannur University
Difference between Soft Computing and Hard Computing Hard Computing: Hard computing uses traditional mathematical methods to solve problems, such as algorithms and mathematical models. It is based on deterministic and precise calculations and is ideal for solving problems that have well-defined mathematical solutions.
Difference between Soft Computing and Hard Computing Soft Computing: Soft computing, on the other hand, uses techniques such as fuzzy logic, neural networks, genetic algorithms, and other heuristic methods to solve problems. It is based on the idea of approximation and is ideal for solving problems that are difficult or impossible to solve exactly.
Soft Computing Soft computing is a computing model that evolved to solve non-linear problems involving uncertain, imprecise and approximate solutions. Such problems are considered real-life problems, where human-like intelligence is required to solve them. Hard computing is the traditional approach used in computing, which requires an accurately stated analytical model.
Soft Computing The outcome of the hard computing approach is a guaranteed, deterministic, accurate result, and it defines definite control actions using a mathematical model or algorithm. It deals with binary and crisp logic that requires precise input data supplied sequentially. Hard computing is not capable of solving many real-world problems.
Need for Soft Computing The following are some of the reasons why soft computing is needed: 1. Complexity of real-world problems: Many real-world problems are complex and involve uncertainty, vagueness, and imprecision . Traditional computing methods are not well-suited to handle these complexities.
Need for Soft Computing 2. Incomplete information: In many cases, there is a lack of complete and accurate information available to solve a problem. Soft computing techniques can provide approximate solutions even in the absence of complete information.
Need for Soft Computing 3. Noise and uncertainty: Real-world data is often noisy and uncertain, and classical methods can produce incorrect results when dealing with such data. Soft computing techniques are designed to handle uncertainty and imprecision.
Need for Soft Computing 4. Non-linear problems: Many real-world problems are non-linear, and classical methods are not well-suited to solve them. Soft computing techniques such as fuzzy logic and neural networks can handle non-linear problems effectively.
Need for Soft Computing 5. Human-like reasoning: Soft computing techniques are designed to mimic human-like reasoning, which is often more effective in solving complex problems
Soft Computing: Recent Developments In the field of Big Data, soft computing is used for data-analysis models, data-behavior models, data-driven decisions, etc. In recommender systems, soft computing plays an important role in analyzing the problem algorithmically and producing precise results. In behavior and decision science, soft computing is used for analyzing behavior, and the soft computing model works accordingly.
Soft Computing: Recent Developments In mechanical engineering, soft computing serves as a model for computing problems such as how a machine will work and how it will make a decision for a specific problem or given input. In computer engineering, soft computing is a core part of advanced computing areas such as machine learning and artificial intelligence.
Soft Computing: Characteristics Human expertise Soft computing utilizes human expertise in the form of fuzzy if-then rules, as well as in conventional knowledge representations , to solve practical problems. Biologically inspired computing models – Inspired by biological neural networks, artificial neural networks are employed extensively in soft computing to deal with perception, pattern recognition, and nonlinear regression and classification problems .
Soft Computing: Characteristics New optimization techniques Soft computing applies innovative optimization methods arising from various sources, including genetic algorithms (inspired by the evolution and selection process), simulated annealing (motivated by thermodynamics), the random search method, and the downhill simplex method. These optimization methods do not require the gradient vector of an objective function, so they are more flexible in dealing with complex optimization problems.
Soft Computing: Characteristics Numerical computation Unlike symbolic AI, soft computing relies mainly on numerical computation. Incorporation of symbolic techniques in soft computing is an active research area within this field.
Soft Computing: Characteristics Real-world applications Goal-driven characteristics - neuro-fuzzy and soft computing are goal driven. Fault tolerance - both neural networks and fuzzy inference systems exhibit fault tolerance. Intensive computation - without assuming much background knowledge of the problem being solved, they rely heavily on intensive numerical computation. Model-free learning - neural networks and adaptive fuzzy inference systems learn from data without an explicit model. New application domains.
Soft Computing Applications Handwritten Script Recognition using Soft Computing Handwritten script recognition is one of the demanding areas of computer science. It can translate multilingual documents and sort the various scripts accordingly. It uses the concept of the "block-level technique", where the system recognizes a particular script from a number of given script documents. It uses the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) together to classify the scripts according to their features.
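Below is a minimal, hypothetical sketch of block-level feature extraction with DCT and DWT, assuming SciPy and PyWavelets are available; the block size, wavelet choice, and number of retained coefficients are illustrative assumptions, and a separate classifier would still be needed on top of these features.

```python
# Hypothetical block-level feature extraction for script recognition.
# Assumes a grayscale image block as a 2-D NumPy array; the block size,
# wavelet ('haar'), and number of retained DCT coefficients are illustrative.
import numpy as np
from scipy.fftpack import dct
import pywt

def dct_features(block, k=8):
    # 2-D DCT applied separably along rows and columns
    coeffs = dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')
    # keep the top-left k x k low-frequency coefficients as the feature vector
    return coeffs[:k, :k].flatten()

def dwt_features(block, wavelet='haar'):
    # single-level 2-D DWT; use summary statistics of each sub-band as features
    cA, (cH, cV, cD) = pywt.dwt2(block, wavelet)
    return np.array([b.mean() for b in (cA, cH, cV, cD)] +
                    [b.std() for b in (cA, cH, cV, cD)])

block = np.random.rand(64, 64)           # stand-in for a script image block
features = np.concatenate([dct_features(block), dwt_features(block)])
print(features.shape)                    # combined feature vector for a classifier
```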
Soft Computing Applications Image Processing and Data Compression using Soft Computing Image analysis is one of the most important tasks in the medical field. It is a high-level processing technique which includes recognition and bifurcation of patterns. Soft computing addresses the problems of computational complexity and efficiency in the classification.
Soft Computing Applications Image Processing and Data Compression using Soft Computing Techniques of soft computing used here include Genetic Algorithms, Genetic Programming, Classifier Systems, Evolution Strategies, artificial life, and a few others. These algorithms provide fast solutions to pattern recognition problems. They help in analyzing medical images obtained from microscopes as well as in examining X-rays.
Soft Computing Applications Soft Computing in Automotive Systems and Manufacturing The use of soft computing has dispelled a major misconception that the automobile industry is slow to adapt. Fuzzy logic is a technique used in vehicles to complement classic control methods. It models human behavior, which is described in the form of "If-Then" rules. The fuzzy logic controller then converts the sensor inputs into fuzzy variables that are defined according to these rules. Fuzzy logic techniques are used in engine control, automatic transmissions, antiskid steering, etc.
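As a small illustration of such rules, the sketch below implements a hypothetical fuzzy "If-Then" controller mapping engine temperature to fan speed; the membership functions, rules, and output values are illustrative assumptions, not an actual automotive rule set.

```python
# A minimal, hypothetical fuzzy if-then controller in the spirit described above.
# The membership functions, rule base, and "engine temperature -> fan speed"
# mapping are illustrative assumptions.
import numpy as np

def tri(x, a, b, c):
    # triangular membership function with feet at a, c and peak at b
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

def fan_speed(temp_c):
    # Step 1: fuzzify the crisp sensor reading into linguistic variables
    cold = tri(temp_c, -10, 20, 60)
    warm = tri(temp_c, 40, 75, 100)
    hot  = tri(temp_c, 80, 110, 140)
    # Step 2: apply the rules "If cold Then slow", "If warm Then medium",
    # "If hot Then fast", each rule weighted by its firing strength
    strengths = np.array([cold, warm, hot])
    speeds    = np.array([20.0, 55.0, 95.0])   # representative output levels
    # Step 3: defuzzify with a weighted average (centroid-like)
    return float(np.dot(strengths, speeds) / strengths.sum())

print(fan_speed(90))   # a hot engine yields a higher fan speed
```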
Soft Computing Applications Soft Computing based Architecture An intelligent building takes inputs from the sensors and controls effectors by using them. The construction industry uses the technique of DAI (Distributed Artificial Intelligence) and fuzzy genetic agents to provide the building with capabilities that match human intelligence. The fuzzy logic is used to create behavior-based architecture in intelligent buildings to deal with the unpredictable nature of the environment, and these agents embed sensory information in the buildings
Soft Computing Applications Soft Computing and Decision Support System Soft computing gives an advantage of reducing the cost of the decision support system. The techniques are used to design, maintain, and maximize the value of the decision process. The first application of fuzzy logic is to create a decision system that can predict any sort of risk. The second application is using fuzzy information that selects the areas which need replacement
Soft Computing Applications Soft Computing Techniques in Power System Analysis Soft computing uses the method of Artificial Neural Network (ANN) to predict any instability in the voltage of the power system. Using the ANN, the pending voltage instability can be predicted. The methods which are deployed here, are very low in cost.
Soft Computing Applications Soft Computing Techniques in Bioinformatics The techniques of soft computing help in handling the uncertainty and imprecision that bioinformatics data may have. Soft computing provides distinct low-cost solutions with the help of algorithms, databases, Fuzzy Sets (FSs), and Artificial Neural Networks (ANNs). These techniques are best suited to give quality results in an efficient way.
Soft Computing Applications Soft Computing in Investment and Trading The finance field produces data in abundance, and traditional computing is not able to handle and process that kind of data. Various soft computing approaches help to handle this noisy data. Pattern recognition techniques are used to analyze the pattern or behavior of the data, and time-series analysis is used to predict future trading points.
Soft Computing Applications Recent developments in soft computing People have started using soft computing techniques such as fuzzy set theory, neural nets, fuzzy neuro systems, and the adaptive neuro-fuzzy inference system (ANFIS) for various numerical simulation analyses. Soft computing has helped in modeling the processes of machines with the help of artificial intelligence.
Soft Computing Applications Also, there are certain areas where soft computing is in budding stages only and is expected to see a massive evolution: Big Data Recommender system Behavior and decision science Mechanical Engineering Computer Engineering Civil Engineering
Artificial Neural Network Neural networks, also known as Artificial Neural Networks (ANNs) or simulated neural networks (SNNs), are a subset of machine learning and are at the heart of deep learning algorithms. Their name and structure are inspired by the human brain, mimicking the way that biological neurons signal to one another.
Artificial Neural Network ANNs are also known as "artificial neural systems," "parallel distributed processing systems," or "connectionist systems." An ANN acquires a large collection of units that are interconnected in some pattern to allow communication between the units. These units, also referred to as nodes or neurons, are simple processors which operate in parallel.
Artificial Neural Network Every neuron is connected to other neurons through connection links. Each connection link is associated with a weight that holds information about the input signal. This is the most useful information for neurons to solve a particular problem, because the weight usually excites or inhibits the signal being communicated. Each neuron has an internal state, which is called an activation signal. Output signals, which are produced after combining the input signals and the activation rule, may be sent to other units.
Working of a Biological Neuron As shown in the above diagram, a typical neuron consists of the following four parts, with the help of which we can explain its working − Dendrites − They are tree-like branches, responsible for receiving information from other neurons. In a sense, they are like the ears of the neuron. Soma − It is the cell body of the neuron and is responsible for processing the information received from the dendrites. Axon − It is just like a cable through which a neuron sends information. Synapses − The connections between the axon of one neuron and the dendrites of other neurons.
Artificial Neural Network
Model of Artificial Neural Network
Architecture of ANN An ANN is comprised of node layers, containing an input layer, one or more hidden layers, and an output layer. Each node connects to others and has an associated weight and threshold. If the output of any individual node is above the specified threshold value, that node is activated, sending data to the next layer of the network.
Architecture of ANN Otherwise, no data is passed along to the next layer of the network. Neural networks rely on training data to learn and improve their accuracy over time.
Working of ANN
Working of ANN 1. Inputs are passed, i.e., data is passed with some weights attached to it, to the hidden layer. 2. Each hidden layer consists of neurons, and all the inputs are connected to each neuron. 3. After the inputs are passed on, all the computation is performed in the hidden layer.
Working of ANN Computation performed in the hidden layers is done in two steps. All the inputs are multiplied by their weights. A weight is the coefficient of each input variable and shows the strength of that particular input. After the weighted inputs are summed, a bias is added. Bias is a constant that helps the model fit the data in the best way possible. Z1 = W1*In1 + W2*In2 + W3*In3 + W4*In4 + ... + Wn*Inn + b
Working of ANN An activation function is then applied to the linear term Z1. It is a nonlinear transformation applied to the result before sending it to the next layer of neurons, and it is the mathematical equation that determines whether a neuron should be activated or not; its purpose is to introduce nonlinearity. There are several activation functions: Sigmoid, TanH, and the Rectified Linear Unit (ReLU).
Working of ANN 4. The whole process described in point 3 is performed in each hidden layer. After passing through every hidden layer, we move to the last layer, i.e., our output layer, which gives us the final output. The process explained above is known as forward propagation. 5. After getting the predictions from the output layer, the error is calculated. If the error is large, steps are taken to minimize it, and for this purpose back propagation is performed.
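As a concrete illustration of forward propagation, the following is a minimal sketch with one hidden layer, random weights, and a sigmoid activation; the layer sizes, random values, and target are illustrative assumptions.

```python
# A minimal forward-propagation sketch: weighted sum plus bias, then a
# nonlinear activation, layer by layer. Sizes and values are illustrative.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x  = rng.random(4)              # 4 input features
W1 = rng.random((4, 3))         # input -> hidden weights
b1 = rng.random(3)              # hidden-layer bias
W2 = rng.random((3, 1))         # hidden -> output weights
b2 = rng.random(1)              # output-layer bias

z1 = x @ W1 + b1                # Z1 = W*In + b (weighted sum plus bias)
a1 = sigmoid(z1)                # nonlinear activation in the hidden layer
z2 = a1 @ W2 + b2               # weighted sum at the output layer
y_hat = sigmoid(z2)             # final prediction from the output layer

error = 0.5 * (1.0 - y_hat) ** 2   # e.g. squared error against a target of 1
print(y_hat, error)                # if the error is large, back propagation follows
```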
Perceptron Developed by Frank Rosenblatt using the McCulloch and Pitts model, the perceptron is the basic operational unit of artificial neural networks. It employs a supervised learning rule and is able to classify data into two classes. Operational characteristics of the perceptron: it consists of a single neuron with an arbitrary number of inputs along with adjustable weights, but the output of the neuron is 1 or 0 depending upon the threshold. It also consists of a bias whose weight is always 1.
Perceptron Simple form of Neural Network Consists of a single layer where all the mathematical computations are performed.
Perceptron
Perceptron The perceptron thus has the following three basic elements − Links − A set of connection links, which carry weights, including a bias that always has weight 1. Adder − It adds the inputs after they are multiplied by their respective weights. Activation function − It limits the output of the neuron. The most basic activation function has two possible outputs: it returns 1 if the input is positive, and 0 for any negative input.
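A minimal sketch of these three elements (links, adder, threshold activation) in code; the input pattern, weights, and bias values are illustrative.

```python
# A minimal sketch of the perceptron's three elements: weighted links,
# an adder, and a threshold activation. Weights and inputs are illustrative.
import numpy as np

def step(net, theta=0.0):
    # basic activation: 1 for a positive net input, 0 otherwise
    return 1 if net > theta else 0

def perceptron_output(x, w, b):
    net = np.dot(x, w) + b      # adder: sum of weighted inputs plus bias
    return step(net)            # activation limits the output to {0, 1}

x = np.array([1, 0, 1])         # input pattern
w = np.array([0.4, -0.2, 0.6])  # adjustable weights on the links
b = -0.5                        # bias (its own link weight is fixed at 1)
print(perceptron_output(x, w, b))   # -> 1, since 0.4 + 0.6 - 0.5 > 0
```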
Multilayer Perceptron (ANN) Consists of more than one perceptron, grouped together to form a multi-layer neural network.
Perceptron The following diagram is the architecture of perceptron for multiple output classes.
PERCEPTRON NETWORK Training Algorithm Perceptron network can be trained for single output unit as well as multiple output units. Training Algorithm for Single Output Unit Step 1 − Initialize the following to start the training − Weights Bias Learning rate α
PERCEPTRON NETWORK Training Algorithm The learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function. For easy calculation and simplicity, the weights and bias may be set to 0 and the learning rate to 1. Step 2 − Continue steps 3-8 while the stopping condition is not true. Step 3 − Continue steps 4-6 for every training vector x.
PERCEPTRON NETWORK Training Algorithm Step 4 − Activate each input unit as follows: xi = si, for i = 1 to n. Step 5 − Now obtain the net input with the following relation: yin = b + Σi xi·wi. Here 'b' is the bias and 'n' is the total number of input neurons.
PERCEPTRON NETWORK Training Algorithm Step 6 − Apply the following activation function to obtain the final output: f(yin) = 1 if yin > θ, 0 if −θ ≤ yin ≤ θ, and −1 if yin < −θ. Step 7 − Adjust the weight and bias as follows − Case 1 − if y ≠ t, then wi(new) = wi(old) + α·t·xi and b(new) = b(old) + α·t.
PERCEPTRON NETWORK Training Algorithm Case 2 − if y = t, then wi(new) = wi(old) and b(new) = b(old), i.e., no change. Here 'y' is the actual output and 't' is the desired/target output. Step 8 − Test for the stopping condition, which is met when there is no change in the weights.
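The following is a runnable sketch of the single-output training algorithm above, assuming bipolar inputs and targets for the AND function, α = 1, θ = 0.2, and zero initial weights; these choices are illustrative.

```python
# A training sketch for the single-output perceptron algorithm above.
# Assumes bipolar inputs/targets for the AND function, alpha = 1,
# theta = 0.2, and zero initial weights.
import numpy as np

def f(net, theta=0.2):
    if net > theta:   return 1
    if net < -theta:  return -1
    return 0

X = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]])   # training vectors
T = np.array([1, -1, -1, -1])                        # AND targets
w, b, alpha = np.zeros(2), 0.0, 1.0                  # Step 1: initialize

for epoch in range(10):                              # Step 2: repeat until stable
    changed = False
    for x, t in zip(X, T):                           # Step 3: each training vector
        y_in = b + np.dot(x, w)                      # Step 5: net input
        y = f(y_in)                                  # Step 6: activation
        if y != t:                                   # Step 7, Case 1: adjust
            w = w + alpha * t * x
            b = b + alpha * t
            changed = True
    if not changed:                                  # Step 8: stopping condition
        break

print(w, b)   # learned weights and bias separating the AND patterns
```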
Back Propagation Neural Networks A Back Propagation Network (BPN) is a multilayer neural network consisting of an input layer, at least one hidden layer, and an output layer. As its name suggests, back-propagation takes place in this network. The error, which is calculated at the output layer by comparing the target output and the actual output, is propagated back towards the input layer.
Back Propagation 1. Weights are initialized randomly, and the errors are calculated after all the computation. 2. Then the gradient is calculated, i.e., the derivative of the error w.r.t. the current weights. 3. Then new weights are calculated as w(new) = w(old) − α·(∂Error/∂w). 4. This process continues till we reach a minimum (ideally the global minimum) and the loss is minimized.
Back Propagation Neural Networks
Back Propagation Neural Networks Architecture As shown in the diagram, the architecture of BPN has three interconnected layers having weights on them. The hidden layer as well as the output layer also has bias, whose weight is always 1, on them. As is clear from the diagram, the working of BPN is in two phases. One phase sends the signal from the input layer to the output layer, and the other phase back propagates the error from the output layer to the input layer .
Back Propagation Neural Networks Training Algorithm For training, BPN will use binary sigmoid activation function. The training of BPN will have the following three phases. Phase 1 − Feed Forward Phase Phase 2 − Back Propagation of error Phase 3 − Updating of weights
Back Propagation Neural Networks Training Algorithm All these phases are combined in the following algorithm. Step 1 − Initialize the following to start the training − Weights Learning rate α For easy calculation and simplicity, take some small random values. Step 2 − Continue steps 3-11 while the stopping condition is not true. Step 3 − Continue steps 4-10 for every training pair.
Back Propagation Neural Networks Training Algorithm Phase 1 Step 4 − Each input unit receives an input signal xi and sends it to the hidden units, for all i = 1 to n. Step 5 − Calculate the net input at hidden unit j using the relation Qinj = b0j + Σi xi·vij, where b0j is the bias on hidden unit j and vij is the weight on unit j of the hidden layer coming from unit i of the input layer. Now calculate the net output by applying the activation function: Qj = f(Qinj).
Back Propagation Neural Networks Training Algorithm Send these output signals of the hidden layer units to the output layer units. Step 6 − Calculate the net input at output unit k using the relation yink = b0k + Σj Qj·wjk, where b0k is the bias on output unit k and wjk is the weight on unit k of the output layer coming from unit j of the hidden layer. Calculate the net output by applying the activation function: yk = f(yink).
Back Propagation Neural Networks Training Algorithm Phase 2 Step 7 − Compute the error-correcting term at each output unit, in correspondence with the target pattern, as δk = (tk − yk)·f′(yink), and compute the weight and bias correction terms Δwjk = α·δk·Qj and Δb0k = α·δk.
Back Propagation Neural Networks Training Algorithm Step 8 − Now each hidden unit sums its delta inputs from the output units, δinj = Σk δk·wjk, computes its own error term δj = δinj·f′(Qinj), and forms the correction terms Δvij = α·δj·xi and Δb0j = α·δj.
Back Propagation Neural Networks Training Algorithm Phase 3 Step 9 − Each output unit (yk, k = 1 to m) updates its weights and bias: wjk(new) = wjk(old) + Δwjk and b0k(new) = b0k(old) + Δb0k. Step 10 − Each hidden unit (zj, j = 1 to p) updates its weights and bias: vij(new) = vij(old) + Δvij and b0j(new) = b0j(old) + Δb0j. Step 11 − Check for the stopping condition, which may be either the number of epochs reached or the target output matching the actual output.
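The following compact sketch puts the three phases together for one hidden layer with the binary sigmoid; the XOR data, layer sizes, learning rate, and epoch count are illustrative assumptions, not part of the algorithm above.

```python
# A compact sketch of the three BPN phases (feed forward, back propagation of
# error, weight updating) for one hidden layer, using the binary sigmoid.
import numpy as np

def sigmoid(z):        return 1.0 / (1.0 + np.exp(-z))
def sigmoid_prime(a):  return a * (1.0 - a)      # derivative in terms of the output

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets (illustrative)

rng = np.random.default_rng(1)
V, b_v = rng.uniform(-0.5, 0.5, (2, 4)), np.zeros(4)   # input -> hidden
W, b_w = rng.uniform(-0.5, 0.5, (4, 1)), np.zeros(1)   # hidden -> output
alpha = 0.5

for epoch in range(10000):
    # Phase 1: feed forward
    Q = sigmoid(X @ V + b_v)                     # hidden-layer outputs
    Y = sigmoid(Q @ W + b_w)                     # output-layer outputs
    # Phase 2: back propagation of error
    delta_k = (T - Y) * sigmoid_prime(Y)         # error term at the output units
    delta_j = (delta_k @ W.T) * sigmoid_prime(Q) # error term at the hidden units
    # Phase 3: updating of weights and biases
    W += alpha * Q.T @ delta_k
    b_w += alpha * delta_k.sum(axis=0)
    V += alpha * X.T @ delta_j
    b_v += alpha * delta_j.sum(axis=0)

print(np.round(Y, 2))   # after enough epochs the predictions should approach the targets
```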
Back Propagation Back propagation is the process of updating and finding the optimal values of the weights (or coefficients), which helps the model minimize the error. The weights are updated with the help of optimizers. Optimizers are the methods/mathematical formulations that change the attributes of the neural network. Gradient descent is one such optimizer.
Associate Memory Network An associate memory network refers to a content addressable memory structure that associates a relationship between the set of input patterns and output patterns . A content addressable memory structure is a kind of memory structure that enables the recollection of data based on the intensity of similarity between the input pattern and the patterns stored in the memory.
Associate Memory Network Let's understand this concept with an example:
Associate Memory Network The figure given above illustrates a memory containing the names of various people. If the given memory is content addressable, the incorrect string "Albert Einstin" as a key is sufficient to recover the correct name "Albert Einstein." In this sense, this type of memory is robust and fault-tolerant, because this kind of memory model has some form of error-correction capability.
Associate Memory Network There are two types of associative memory: auto-associative memory and hetero-associative memory. Auto-associative memory: An auto-associative memory recovers a previously stored pattern that most closely relates to the current pattern. It is also known as an auto-associative correlator. Consider x[1], x[2], x[3], ..., x[M] to be the stored pattern vectors, and let x[m] be the elements of these vectors, representing characteristics obtained from the patterns. The auto-associative memory will return the pattern vector x[m] when presented with a noisy or incomplete version of x[m].
Auto-associative memory:
Hetero-associative memory: In a hetero-associate memory , the recovered pattern is generally different from the input pattern not only in type and format but also in content. It is also known as a hetero-associative correlator.
Hetero-associative memory: Consider we have a number of key-response pairs {a(1), x(1)}, {a(2), x(2)}, ..., {a(M), x(M)}. The hetero-associative memory will give a pattern vector x(m) when a noisy or incomplete version of a(m) is given. Neural networks are usually used to implement these associative memory models, called neural associative memory (NAM). The linear associator is the simplest artificial neural associative memory.
Working of Associative Memory: Associative memory is a depository of associated pattern pairs, stored in some form. If the depository is triggered with a pattern, the associated pattern pair appears at the output. The input could be an exact or partial representation of a stored pattern.
Working of Associative Memory: If the memory is presented with an input pattern, say α, the associated pattern ω is recovered automatically. These are the terms related to the associative memory network: Encoding or memorization: Encoding or memorization refers to building an associative memory. It implies constructing an association weight matrix w such that when an input pattern is given, the stored pattern connected with the input pattern is recovered.
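A minimal sketch of encoding and recall for an auto-associative memory, using Hebbian (outer-product) learning on bipolar patterns; the stored patterns and the noisy probe are illustrative.

```python
# A minimal sketch of an auto-associative memory: encoding builds the
# association weight matrix W with Hebbian (outer-product) learning, and
# recall retrieves the stored pattern closest to a noisy input.
import numpy as np

patterns = np.array([[ 1,  1,  1,  1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1]])      # stored pattern vectors (bipolar)

# Encoding / memorization: W is the sum of outer products x xᵀ over stored patterns
W = sum(np.outer(p, p) for p in patterns)

def recall(x):
    # Triggering the memory with an input pattern returns the associated pattern
    return np.sign(W @ x)

noisy = np.array([-1, 1, 1, 1, -1, -1])              # first bit of pattern 0 flipped
print(recall(noisy))                                 # -> [ 1  1  1  1 -1 -1 ]
```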
Adaptive Resonance Theory (ART) Adaptive resonance theory is a type of neural network technique developed by Stephen Grossberg and Gail Carpenter in 1987. The basic ART uses an unsupervised learning technique. The terms "adaptive" and "resonance" suggest that these networks are open to new learning (adaptive) without discarding previous or old information (resonance).
Adaptive Resonance Theory (ART) ART networks are known to solve the stability-plasticity dilemma: stability refers to their nature of memorizing what has been learned, and plasticity refers to the fact that they are flexible enough to gain new information. Due to this nature, ART networks are always able to learn new input patterns without forgetting the past. ART networks implement a clustering algorithm: input is presented to the network, and the algorithm checks whether it fits into one of the already stored clusters. If it fits, the input is added to the cluster that matches it most closely; otherwise a new cluster is formed.
Operating Principle of ART The main operation of ART classification can be divided into the following phases. Recognition phase − The input vector is compared with the classification presented at every node in the output layer. The output of a neuron becomes "1" if it best matches the classification applied, otherwise it becomes "0". Comparison phase − In this phase, a comparison of the input vector to the comparison layer vector is done. The condition for reset is that the degree of similarity is less than the vigilance parameter.
Operating Principle of ART Search phase − In this phase, the network searches for a reset as well as for the match done in the above phases. Hence, if there is no reset and the match is quite good, the classification is over. Otherwise, the process is repeated and other stored patterns must be tried to find the correct match.
Types of Adaptive Resonance Theory (ART) Carpenter and Grossberg developed different ART architectures as a result of 20 years of research. The ARTs can be classified as follows: ART1 – It is the simplest and the most basic ART architecture. It is capable of clustering binary input values. ART2 – It is an extension of ART1 that is capable of clustering continuous-valued input data. Fuzzy ART – It is the augmentation of fuzzy logic and ART. ARTMAP – It is a supervised form of ART learning where one ART module learns based on the previous ART module. It is also known as predictive ART. FARTMAP – This is a supervised ART architecture with fuzzy logic included.
Basic of Adaptive Resonance Theory (ART) Architecture Adaptive resonance theory is a type of neural network that is self-organizing and competitive. It can be of both types: the unsupervised ones (ART1, ART2, ART3, etc.) or the supervised ones (ARTMAP). Generally, the supervised algorithms are named with the suffix "MAP". The basic ART model is unsupervised in nature and consists of: F1 layer or the comparison field (where the inputs are processed) F2 layer or the recognition field (which consists of the clustering units)
Basic of Adaptive Resonance Theory (ART) Architecture The Reset Module (that acts as a control mechanism) Generally, two types of learning exist: slow learning and fast learning. In fast learning, the weight update during resonance occurs rapidly; it is used in ART1. In slow learning, the weight change occurs slowly relative to the duration of a learning trial; it is used in ART2.
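The following is a simplified, ART1-style clustering sketch for binary inputs; the vigilance value, choice parameter, and sample data are illustrative assumptions, and a full ART1 implementation has additional details (separate bottom-up and top-down weight sets, for example).

```python
# A simplified ART1-style clustering sketch for binary inputs: recognition
# (competition), comparison (vigilance test), and resonance (template learning).
# This is a teaching sketch, not a complete ART1 implementation.
import numpy as np

def art1_cluster(inputs, rho=0.6, L=2.0):
    prototypes = []                        # stored templates, one per cluster
    labels = []
    for x in inputs:
        x = np.asarray(x)
        # Recognition phase: rank existing clusters by a choice-like score
        order = sorted(range(len(prototypes)),
                       key=lambda j: -np.sum(np.minimum(prototypes[j], x))
                                       / (L - 1 + prototypes[j].sum()))
        chosen = None
        for j in order:
            # Comparison phase: vigilance test against the winning template
            match = np.sum(np.minimum(prototypes[j], x)) / max(x.sum(), 1)
            if match >= rho:
                prototypes[j] = np.minimum(prototypes[j], x)   # resonance: learn
                chosen = j
                break                       # otherwise reset and search the next node
        if chosen is None:                  # no stored cluster matched: add a new one
            prototypes.append(x.copy())
            chosen = len(prototypes) - 1
        labels.append(chosen)
    return labels, prototypes

data = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
labels, protos = art1_cluster(data, rho=0.6)
print(labels)    # -> [0, 0, 1, 1] with this data and vigilance value
```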
Advantage of Adaptive Resonance Theory (ART) It exhibits stability and is not disturbed by a wide variety of inputs provided to the network. It can be integrated and used with various other techniques to give better results. It can be used in various fields such as mobile robot control, face recognition, land cover classification, target recognition, medical diagnosis, signature verification, clustering web users, etc. It has advantages over competitive learning (such as BPNN): competitive learning lacks the capability to add new clusters when necessary and does not guarantee stability in forming clusters.
Limitations of Adaptive Resonance Theory Some ART networks are inconsistent (such as Fuzzy ART and ART1), as their results depend upon the order of the training data and upon the learning rate.
Application of ART ART neural networks, used for fast, stable learning and prediction, have been applied in many areas. Applications include target recognition, face recognition, medical diagnosis, signature verification, and mobile robot control.
Application of ART Target recognition: A fuzzy ARTMAP neural network can be used for automatic classification of targets based on their radar range profiles. Tests on synthetic data show that fuzzy ARTMAP can result in substantial savings in memory requirements compared with k-nearest-neighbor (kNN) classifiers. The utilization of multiwavelength profiles mainly improves the performance of both kinds of classifiers.
Application of ART Medical diagnosis: Medical databases present huge numbers of challenges found in general information management settings where speed, use, efficiency, and accuracy are the prime concerns. A direct objective of improved computer-assisted medicine is to help to deliver intensive care in situations that may be less than ideal. Working with these issues has stimulated several ART architecture developments, including ARTMAP-IC.
Application of ART Signature verification: Automatic signature verification is a well-known and active area of research with various applications such as bank check confirmation, ATM access, etc. The training of the network is done using ART1, which takes global features as the input vector; the verification and recognition phase uses a two-step process. In the first step, the input vector is matched with the stored reference vector that was used as the training set, and in the second step cluster formation takes place.
Application of ART Mobile robot control: Nowadays, we see a wide range of robotic devices. Their programming, the artificial intelligence part, is still a field of research. The human brain is an interesting subject as a model for such an intelligent system. Inspired by the structure of the human brain, artificial neural networks emerged. Like the brain, an artificial neural network contains numerous simple computational units (neurons) that are interconnected to allow the transfer of signals from neuron to neuron. Artificial neural networks are used to solve different problems with good outcomes compared to other decision algorithms.
Adaptive Resonance Theory Grossberg presented the basic principles of adaptive resonance theory. A category of ART called ART1 has been described as an arrangement of ordinary differential equations by Carpenter and Grossberg. Their theorems can predict both the order of search as a function of the learning history of the system and the input patterns.
Adaptive Resonance Theory
Self Organizing Maps – Kohonen Maps A Self Organizing Map (or Kohonen Map or SOM) is a type of Artificial Neural Network, also inspired by biological models of neural systems from the 1970s. It follows an unsupervised learning approach and trains its network through a competitive learning algorithm. SOM is used for clustering and mapping (or dimensionality reduction), mapping multidimensional data onto a lower-dimensional space, which allows people to reduce complex problems for easier interpretation. SOM has two layers: one is the input layer and the other is the output layer.
Self Organizing Maps – Kohonen Maps The architecture of the Self Organizing Map with two clusters and n input features of any sample is given below:
How does SOM work? Consider input data of size (m, n), where m is the number of training examples and n is the number of features in each example. First, the SOM initializes the weights of size (n, C), where C is the number of clusters. Then, iterating over the input data, for each training example it updates the winning vector (the weight vector with the shortest distance, e.g., Euclidean distance, from the training example). The weight updating rule is given by: wij(new) = wij(old) + α(t)·(xik − wij(old)),
How does SOM work? where alpha is the learning rate at time t, j denotes the winning vector, i denotes the i-th feature of the training example, and k denotes the k-th training example from the input data. After training the SOM network, the trained weights are used for clustering new examples. A new example falls in the cluster of its winning vector.
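A minimal training sketch following the rule above: weights of size (n, C) are initialized, the winning vector is found by Euclidean distance, and it is pulled toward the training example. The data, number of clusters, learning-rate schedule, and epoch count are illustrative, and no neighborhood function is used, matching the simplified rule.

```python
# A minimal SOM training sketch following the simplified update rule above.
import numpy as np

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),     # m = 40 examples, n = 2 features
                  rng.normal(1.0, 0.1, (20, 2))])

C = 2                                                 # number of clusters / output units
weights = rng.random((2, C))                          # weight matrix of size (n, C)

alpha = 0.5
for epoch in range(100):
    for x in data:
        dists = np.linalg.norm(weights.T - x, axis=1) # distance of x to each unit
        j = np.argmin(dists)                          # index of the winning vector
        # w_ij(new) = w_ij(old) + alpha * (x_i - w_ij(old))
        weights[:, j] += alpha * (x - weights[:, j])
    alpha *= 0.99                                     # decay the learning rate over time

new_x = np.array([0.9, 1.1])
print(np.argmin(np.linalg.norm(weights.T - new_x, axis=1)))  # cluster of the new example
```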
Self Organizing Maps
Applications Clustering Clustering is probably the most straightforward application of SOFM. In the case of clustering, we treat every neuron as the center of a separate cluster. One of the problems is that during the training procedure, when we pull one neuron closer to one of the clusters, we are forced to pull its neighbors as well. In order to avoid this issue, we need to break the relations between neighbors, so that an update of one neuron does not influence the others. If we set the learning radius to 0, the winning neuron has no relations with other neurons, which is exactly what we need for clustering.
Applications Clustering In the image below, two features from the iris dataset are visualized, and there are three SOFM neurons colored in grey. As you can see, the SOFM managed to find pretty good centers for the clusters.
Applications Space approximation Space approximation is similar to clustering, but the goal here is to find the minimum number of points that cover as much data as possible. Since it is similar to clustering, we can use SOFM here as well. But as we saw in the previous example, the data points did not use the space efficiently: some points were very close to each other and some were farther apart. The problem is that clusters do not know about the existence of other clusters and behave independently. To get more cooperative behavior between clusters, we can enable a learning radius in SOFM.
Applications High-dimensional data visualization Datasets with only two features are easy to visualize directly with a two-dimensional feature map. If we increase the number of dimensions to three, it is still possible to visualize the result, but with four or more dimensions it becomes a bit trickier. If we use a two-dimensional grid and train a SOFM over the high-dimensional data, we can encode the network as a heat map where each neuron in the network is represented by the average distance to its neighbors.
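A short sketch of this heat-map idea: after training, color each neuron of the two-dimensional grid by the average distance to its grid neighbors. The grid size, feature dimension, and the random stand-in for trained weights are illustrative assumptions.

```python
# A sketch of the heat-map encoding described above: each neuron of a trained
# two-dimensional SOFM grid is represented by the average distance to its
# grid neighbours. Random weights stand in for a trained map.
import numpy as np

rows, cols, n_features = 10, 10, 20
rng = np.random.default_rng(0)
weights = rng.random((rows, cols, n_features))   # stand-in for trained SOFM weights

heat = np.zeros((rows, cols))
for r in range(rows):
    for c in range(cols):
        neighbours = [(r + dr, c + dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                      if 0 <= r + dr < rows and 0 <= c + dc < cols]
        dists = [np.linalg.norm(weights[r, c] - weights[nr, nc]) for nr, nc in neighbours]
        heat[r, c] = np.mean(dists)              # average distance to grid neighbours

print(heat.round(2))   # large values mark boundaries between clusters on the map
```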