Unit 1 – Introduction to Neural Networks
This unit explains: the history of neural network research, biological inspiration, networks of neurons, artificial neural networks, the neuron model, and network architectures.
What is a Neural Network? A neural network is a computational model made up of layers of interconnected nodes ("neurons"), where each node processes its input and passes the result to the next layer to perform tasks like classification, prediction, or pattern recognition.
How does it work?
Why are NNs powerful? They can learn complex patterns and handle non-linear relationships. Used in: image and speech recognition, language translation, medical diagnosis, self-driving cars, and chatbots like ChatGPT.
Artificial neural network An Artificial Neural Network is a machine learning model made up of nodes (neurons) arranged in layers, where each node mimics the function of a brain neuron. ANNs are used to approximate functions, classify data, and recognize patterns.
What can an ANN do? Classification (e.g., Is this a cat or a dog?) Regression (e.g., Predict house prices) Pattern Recognition (e.g., Handwriting, voice, or face) Natural Language Processing (e.g., Chatbots, translation) Medical Diagnosis, Autonomous Vehicles, and more
Key features of an ANN
Example Imagine a network that classifies animals based on features: Input: [fur=1, claws=1, swims=0] Hidden layer: Processes these values Output: "Cat" or "Dog"
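As a rough illustration (not part of the original slides), a minimal NumPy sketch of such a network is shown below; the weights are hand-picked for the example rather than learned, and the "output > 0.5 means Cat" convention is just an assumption for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical, hand-picked weights; a trained network would learn these.
W_hidden = np.array([[ 1.5, 1.0, -2.0],   # hidden unit 1
                     [-1.0, 0.5,  2.0]])  # hidden unit 2
b_hidden = np.array([0.1, -0.2])
W_out = np.array([2.0, -1.5])             # maps hidden layer to one output
b_out = 0.0

x = np.array([1, 1, 0])                   # [fur=1, claws=1, swims=0]
h = sigmoid(W_hidden @ x + b_hidden)      # hidden layer processes the inputs
y = sigmoid(W_out @ h + b_out)            # output probability
print("Cat" if y > 0.5 else "Dog")        # threshold chosen for illustration
```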
Types of ANN
History of NN
Biological inspiration
Neural Computation Neural computation is the process by which neurons (biological or artificial) process, transform, and transmit information through networks, leading to perception, learning, memory, or decision-making.
Models of computation A model of computation is a formal or conceptual model that describes how information is processed and computed.
Elements of computing models
Network of Neurons
Structure of Neuron
Information processing
The brain is a complex network of interconnected neurons that process, transmit, and store information. This occurs through electrical and chemical signaling at the level of individual neurons and synapses. Neurons communicate via electrical impulses (action potentials) and chemical signals (neurotransmitters).
Key steps in neuronal information processing:
1. Dendritic input: dendrites receive signals from other neurons via synapses. Excitatory inputs depolarize the neuron (bringing it closer to the firing threshold); inhibitory inputs hyperpolarize it (making firing less likely).
2. Integration at the soma (cell body): inputs are summed (spatial and temporal summation). If the combined input reaches the threshold potential, an action potential is triggered.
3. Action potential (electrical signal): a rapid depolarization and repolarization wave traveling down the axon. All-or-none principle: the strength is constant; information is encoded in firing frequency.
4. Synaptic transmission (chemical signal): at the axon terminal, neurotransmitters are released into the synaptic cleft. These bind to receptors on the next neuron, continuing the signal.
Information storage at synapses
Memory and learning depend on changes in synaptic strength, primarily through:
A. Short-term plasticity (temporary changes): facilitation (increased neurotransmitter release due to rapid successive stimuli) and depression (reduced release due to neurotransmitter depletion).
B. Long-term plasticity (long-lasting changes):
- Long-term potentiation (LTP): repeated strong stimulation strengthens synapses. Involves NMDA receptor activation, calcium influx, and insertion of AMPA receptors. Associated with learning and memory.
- Long-term depression (LTD): weakening of synapses due to low-frequency stimulation. Important for pruning unnecessary connections.
C. Structural plasticity: formation of new synapses (synaptogenesis) or removal of old ones; changes in dendritic spines (tiny protrusions where synapses form).
Neurons as self-organizing systems
Self-organization refers to the spontaneous emergence of order in complex systems without external control. Neurons exhibit self-organization through network formation, plasticity, and adaptation, enabling learning, memory, and intelligence.
Key principles of self-organization in neurons:
(A) Local interactions → global order: neurons communicate via synapses, forming networks without a central controller. Simple rules (e.g., Hebbian learning: "neurons that fire together wire together") lead to complex behaviors (e.g., memory, pattern recognition).
(B) Plasticity (adaptive rewiring): synaptic plasticity (LTP/LTD) strengthens or weakens connections based on activity; structural plasticity (new dendrites/synapses) reshapes networks dynamically.
(C) Competition and cooperation: competition, because limited synaptic resources force neurons to "compete" for connections (e.g., neural Darwinism); cooperation, because synchronized firing (e.g., gamma oscillations) enhances signal transmission.
(D) Criticality and phase transitions: neuronal networks operate near a critical state (a balance between order and chaos), optimizing information processing.
Examples of Self-Organization in Neural Systems (A) Development of Neural Circuits Spontaneous activity (e.g., retinal waves) refines synaptic connections before birth. Axon guidance via chemical gradients (e.g., netrins, semaphorins ) shapes brain wiring. (B) Learning & Memory Experience-dependent plasticity: Sensory input (e.g., visual cortex tuning) reorganizes neural maps. Memory consolidation: Hippocampus replays experiences during sleep, strengthening cortical connections. (C) Recovery After Injury Neuroplasticity: After stroke, surviving neurons reorganize to compensate for damage.
Comparison of biological neurons and artificial self-organizing models:
- Learning rule: Hebbian plasticity, STDP (biological) vs. unsupervised learning, e.g., SOMs, GANs (artificial).
- Adaptation: continuous, lifelong plasticity vs. fixed after training (unless online learning).
- Robustness: fault-tolerant (degeneracy) vs. fragile (adversarial attacks).
- Energy efficiency: ultra-low power (~20 W for the human brain) vs. high power (e.g., GPT-3: megawatts).
Artificial neural networks
Network of primitive functions
An Artificial Neural Network (ANN) is a composition of simple, differentiable functions (neurons) organized in layers. Each neuron performs a primitive operation, and their combination allows the network to approximate complex functions.
1. Core primitive functions in ANNs
(A) Linear transformation (weighted sum): each neuron computes a weighted sum of its inputs,
z = w1*x1 + w2*x2 + ... + wn*xn + b
Inputs: x1, x2, ..., xn (from the previous layer or the data).
Weights: w1, w2, ..., wn (learnable parameters).
Bias: b (shifts the activation threshold).
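A one-line sketch of this operation in NumPy (the values of x, w, and b below are made up purely for illustration):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])          # inputs x1, x2, x3
w = np.array([0.8,  0.2, -0.5])         # weights w1, w2, w3
b = 0.1                                  # bias

z_explicit = w[0]*x[0] + w[1]*x[1] + w[2]*x[2] + b   # w1*x1 + w2*x2 + w3*x3 + b
z_vector = np.dot(w, x) + b                           # same thing, vectorized
print(z_explicit, z_vector)                           # both print -0.7
```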
Approximation of functions
Function approximation refers to the ability of the network to learn and represent an unknown function from a set of input-output data points. Essentially, ANNs act as function approximators, learning the underlying relationship between inputs and outputs in a given dataset. This allows them to predict outputs for new, unseen inputs based on the learned approximation.
A neural network is just a composition of linear transformations and non-linear functions. Armed with these non-linearities, a NN can approximate any function given enough hidden units. This is called the universal approximation theorem and has been proven under relatively weak assumptions. Intuitively, if a NN is able to represent a step function, which is a function made up of finitely many piecewise-constant components, then it is able to approximate any other function with arbitrary accuracy given enough of these step components.
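To make the step-function intuition concrete, here is a small sketch (not from the slides): each pair of steep sigmoids forms one "step", and summing enough of them approximates a target function; sin(x) and the interval count are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(x):
    # Target function to approximate (chosen only for this example).
    return np.sin(x)

xs = np.linspace(0, 2 * np.pi, 200)
edges = np.linspace(0, 2 * np.pi, 30)   # boundaries of the piecewise intervals
steepness = 50.0                        # larger -> sharper, more step-like sigmoids
approx = np.zeros_like(xs)

for left, right in zip(edges[:-1], edges[1:]):
    height = f((left + right) / 2)      # constant value used on this interval
    # A steep sigmoid turning on at the left edge minus one turning on at the
    # right edge is approximately 1 inside the interval and 0 outside: a "step".
    bump = sigmoid(steepness * (xs - left)) - sigmoid(steepness * (xs - right))
    approx += height * bump             # sum of steps approximates f

print("max abs error:", np.max(np.abs(approx - f(xs))))
```

Each sigmoid here is exactly one hidden unit of a one-hidden-layer network, so the sketch is a (fixed-weight) instance of the construction behind the theorem: more intervals means more hidden units and a smaller error.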
Neuron model: single-input neuron, multiple-input neuron, transfer functions.
Single input neuron vs multiple input neuron
Transfer function A transfer function, also called an activation function, is a key part of a neuron in artificial neural networks. It transforms the weighted sum of inputs into the neuron's output. Why is it needed? Without a transfer function, the neuron would only perform linear operations, and even deep networks would behave like a single-layer linear model. The transfer function introduces non-linearity, allowing the network to learn complex patterns like speech, images, and text.
Common activation functions
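The specific functions shown on this slide are not reproduced here; the following NumPy sketch lists a few widely used transfer functions as illustrative examples.

```python
import numpy as np

def hardlim(z):          # hard limit (step): outputs 0 or 1
    return np.where(z >= 0, 1.0, 0.0)

def purelin(z):          # linear: passes the input through unchanged
    return z

def logsig(z):           # log-sigmoid: squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):             # hyperbolic tangent: squashes values into (-1, 1)
    return np.tanh(z)

def relu(z):             # rectified linear unit: max(0, z)
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
for fn in (hardlim, purelin, logsig, tanh, relu):
    print(fn.__name__, fn(z))
```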
Network architectures
Single layer neuron architecture
Multilayer neuron architecture
Recurrent networks Recurrent Neural Networks (RNNs) differ from regular neural networks in how they process information. While standard neural networks pass information in one direction, i.e., from input to output, RNNs feed information back into the network at each step. Imagine reading a sentence and trying to predict the next word: you don't rely only on the current word but also remember the words that came before. RNNs work similarly by "remembering" past information and passing the output from one step as input to the next, i.e., they consider all the earlier words to choose the most likely next word. This memory of previous steps helps the network understand context and make better predictions.
How does an RNN work? At each time step, an RNN processes its units with a fixed activation function. These units have an internal hidden state that acts as memory, retaining information from previous time steps. This memory allows the network to store past knowledge and adapt based on new inputs. Updating the hidden state in RNNs:
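A minimal NumPy sketch of this update (illustrative only: the weight matrices W_xh and W_hh, the sizes, and the three-step sequence are assumptions, not values from the slides):

```python
import numpy as np

input_size, hidden_size = 4, 3
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden (recurrent) weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    # h_t = tanh(W_xh @ x_t + W_hh @ h_prev + b_h): the new state mixes the
    # current input with the memory carried in the previous hidden state.
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

h = np.zeros(hidden_size)                 # h0: initial hidden state
sequence = rng.normal(size=(3, input_size))
for x_t in sequence:
    h = rnn_step(x_t, h)                  # the same weights are reused at every step
print("final hidden state h3:", h)
```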
Backpropagation through time
Since RNNs process sequential data, Backpropagation Through Time (BPTT) is used to update the network's parameters. The loss function L(θ) depends on the final hidden state h3, and each hidden state relies on the preceding ones, forming a sequential dependency chain: h3 depends on h2, h2 depends on h1, and h1 depends on h0.
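Spelling out that chain with the chain rule (using the same notation as above), the gradient reaching the earliest state is a product of step-wise factors: ∂L(θ)/∂h0 = (∂L(θ)/∂h3) · (∂h3/∂h2) · (∂h2/∂h1) · (∂h1/∂h0). When these factors are repeatedly smaller or larger than one, the product shrinks toward zero or blows up, which is the root of the limitations discussed next.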
Types of RNN
Limitations of Recurrent Neural Networks (RNNs) While RNNs excel at handling sequential data, they face two main training challenges: the vanishing gradient and exploding gradient problems. Vanishing gradient: during backpropagation, gradients diminish as they pass through each time step, leading to minimal weight updates. This limits the RNN's ability to learn long-term dependencies, which is crucial for tasks like language translation. Exploding gradient: sometimes gradients grow uncontrollably, causing excessively large weight updates that destabilize training. These challenges can hinder the performance of standard RNNs on complex, long-sequence tasks.
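A tiny numerical sketch (not from the slides) of why this happens: the gradient flowing back through T time steps is roughly a product of T similar factors, so it decays or grows exponentially with sequence length; the weights 0.5 and 1.5 below are arbitrary stand-ins for the per-step factor.

```python
# Repeatedly multiplying by the same per-step factor is a simplified stand-in
# for the product of Jacobians in BPTT: |factor| < 1 vanishes, |factor| > 1 explodes.
def gradient_factor_after(steps, factor):
    grad = 1.0
    for _ in range(steps):
        grad *= factor          # one multiplication per time step
    return abs(grad)

for w in (0.5, 1.5):
    print(f"per-step factor {w}: after 50 steps the gradient scale is "
          f"{gradient_factor_after(50, w):.3e}")
```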
Applications of Recurrent Neural Networks Time-Series Prediction : RNNs excel in forecasting tasks, such as stock market predictions and weather forecasting. Natural Language Processing (NLP) : RNNs are fundamental in NLP tasks like language modeling, sentiment analysis and machine translation. Speech Recognition : RNNs capture temporal patterns in speech data, aiding in speech-to-text and other audio-related applications. Image and Video Processing : When combined with convolutional layers, RNNs help analyze video sequences, facial expressions and gesture recognition.