Demystifying Neural Networks And Building Cybersecurity Applications
cisoplatform7
About This Presentation
In today's rapidly evolving technological landscape, Artificial Neural Networks (ANNs) have emerged as a cornerstone of artificial intelligence, revolutionizing various fields including cybersecurity. Inspired by the intricacies of the human brain, ANNs have a rich history and a complex structure that enables them to learn and make decisions. This blog aims to unravel the mysteries of neural networks, explore their mathematical foundations, and demonstrate their practical applications, particularly in building robust malware detection systems using Convolutional Neural Networks (CNNs).
Size: 2.52 MB
Language: en
Added: Jul 17, 2024
Slides: 55 pages
Slide Content
Demystifying Neural Networks
FireCompass Technologies Inc.
www.firecompass.com
Arnab Chattopadhyay, CTO
TOC
1. The inspiration of Artificial Neural Networks - the biology behind them
2. History of ANN
3. Structure of ANN
4. Neural Network Zoo
5. The Mathematics behind ANN
6. Build an ANN from scratch
7. Deep Neural Network
8. Convolutional Neural Network (CNN)
9. Additional Mathematics for CNN
10. Build a malware detection system using CNN
The Inspiration of ANN - the Biology behind It
1
Some Interesting Facts about the Human Brain
●The human brain contains 100 billion+ neurons and 100 trillion+ synapses
●There are 10,000+ specific types (by structure) of neurons in the human brain
●The small piece of the human brain used in the experiment:
○Has around 4,000 nerve fibres connected to a single neuron
○Holds about 1.4 petabytes of data, beyond which the researchers could not measure because the data set was so huge
The image is from Google's mapping of a piece of the human brain
Taking a Look at a Neuron
What is a Neuron?
●A neuron is a special type of cell whose sole purpose is transferring information around the body
●A neuron is similar to other cells in that it has a cell body with a nucleus and organelles
●A neuron has a few extra features that allow it to transfer "action potentials"
Key Components of a Neuron
●Dendrite: receives signals from neighboring neurons (like a radio antenna)
●Soma: integration and signal processing, protein synthesis, metabolic activities
●Axon: transmits signals over a distance (like telephone wires)
●Axon Terminal: transmits signals to other neurons' dendrites or to tissues (like a radio transmitter)
So, how does a neuron send signals around the body?
●Action Potential
○Electrical impulses that send signals around the body
●There are a few more key properties on which the function of the action potential depends:
○Concentration Gradient
○Resting Membrane Potential
How Action Potentials Work
●Action potentials are temporary shifts (from negative to positive) in the neuron's membrane potential, caused by ions suddenly flowing in and out of the neuron
○Sufficient current is required to initiate a voltage response in a cell membrane; if the current is insufficient to depolarize the membrane to the threshold level, an action potential will not fire
●There exist "voltage-gated" channels - they open and close depending on the voltage difference across the cell membrane
●Voltage-gated sodium channels have two gates
○gate m
○gate h
●The voltage-gated potassium channel has only one
○gate n
●Gate m (the activation gate) is normally closed, and opens when the cell starts to get more positive
●Gate h (the inactivation gate) is normally open, and swings shut when the cell gets too positive
●Gate n is normally closed, but slowly opens when the cell is depolarized (very positive)
Firing a Neuron
●When we talk about neurons "firing" or being "active," we are talking about the action potential: a brief, positive change in the membrane potential along a neuron's axon
●When an action potential occurs, the neuron sends the signal to the next neuron in the communication chain, and, if an action potential also occurs in the next neuron, the signal continues to be transmitted
●What causes an action potential?
○When a neuron receives a signal from another neuron (in the form of neurotransmitters, for most neurons), the signal causes a change in the membrane potential of the receiving neuron
○The signal causes the opening or closing of voltage-gated ion channels - channels that open or close in response to changes in the membrane voltage
○The opening of voltage-gated ion channels causes the membrane to undergo either a hyperpolarization, where the membrane potential increases in magnitude (becomes more negative), or a depolarization, where the membrane potential decreases in magnitude (becomes more positive)
○Whether the membrane undergoes a hyperpolarization or a depolarization depends on the type of voltage-gated ion channel that opened
Firing a Neuron
●Not all depolarizations result in an action potential
○The signal must cause a depolarization large enough in magnitude to overcome the threshold potential - the specific voltage the membrane must reach for an action potential to occur
○The threshold potential is usually about -55 mV, compared to the resting potential of about -70 mV
○If the threshold potential is reached, an action potential is initiated at the axon hillock in the following stages, through the action of the associated ion channels:
Stages of Action Potential - Visualization
The formation of an action potential can be divided into five steps:
[1] A stimulus from a sensory cell or another neuron causes the target cell to depolarize toward the threshold potential.
[2] If the threshold of excitation is reached, all Na+ channels open and the membrane depolarizes.
[3] At the peak of the action potential, K+ channels open and K+ begins to leave the cell. At the same time, Na+ channels close.
[4] The membrane becomes hyperpolarized as K+ ions continue to leave the cell. The hyperpolarized membrane is in a refractory period and cannot fire.
[5] The K+ channels close and the Na+/K+ transporter restores the resting potential.
Activation Function
History of ANN Development
2
Structure of ANN
3
Typical ANN Structure - roughly mapping to the Biological Neuron
Neural Network Zoo
4
Mathematics of ANN
5
Single Perceptron
●Definition
○A neural network is basically a dense interconnection of layers, which are in turn made up of basic units called perceptrons.
○A perceptron consists of input terminals, a processing unit and output terminals.
○The input terminals of a perceptron are connected to the output terminals of the preceding perceptrons.
●A single perceptron receives a set of n input values from the previous layer.
●It calculates a weighted sum of the input vector's values, based on a weight vector w, and adds a bias to the result.
●The result of this calculation is passed through a non-linear activation function, which forms the output of the unit.
●The final goal is to optimize the weights and biases so as to minimize the inference error.
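The single-perceptron computation described above (weighted sum of the inputs plus a bias, passed through a non-linear activation) can be sketched in a few lines of NumPy; the input, weight, and bias values below are illustrative, not from the slides:

```python
import numpy as np

def perceptron(x, w, b, activation=np.tanh):
    """Single perceptron: weighted sum of inputs plus bias,
    passed through a non-linear activation function."""
    z = np.dot(w, x) + b          # linear combination
    return activation(z)          # non-linear output

# Illustrative values (not from the slides)
x = np.array([0.5, -1.2, 3.0])   # n input values from the previous layer
w = np.array([0.4, 0.7, -0.2])   # weight vector
b = 0.1                          # bias
y = perceptron(x, w, b)          # scalar output of the unit
```

In a full network, `y` would feed the input terminals of perceptrons in the next layer, and training would adjust `w` and `b`.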
Multi-layer Perceptron
L1, L2 and L3 represent a cascade of three layers, where each layer is a stack of perceptrons piled on one another.
For mathematical modelling, all values - the input, the output and the intermediate weights - are vectorized.
Keeping L2 as the reference layer, its input vector arises from the output of layer L1, whereas the output vector of L2 is fed as the input of L3.
Neural Network Structure
1. Input Layer: The input layer consists of neurons that receive the input data.
a. This layer does not perform any computations but simply passes the input features to the first hidden layer.
2. Hidden Layers: Each hidden layer is composed of neurons that apply an activation function to a weighted sum of inputs from the previous layer.
a. The purpose of these layers is to extract features and learn complex patterns in the data.
3. Output Layer: The output layer generates the final predictions of the network. The number of neurons in the output layer depends on the task (e.g., one neuron for binary classification, multiple neurons for multi-class classification, or regression tasks).
Neural Network Functions
●In a neural network, "feedforward" and "backpropagation" are two fundamental processes that define how the network makes predictions and learns from data, respectively.
●Feedforward refers to the process of passing input data through the neural network to generate an output, i.e., inference.
○This involves propagating the input data through each layer of the network until the final output is produced.
●Backpropagation is the process of training the neural network by adjusting its weights and biases based on the error of its predictions.
○It involves calculating the gradient of the loss function with respect to each weight and bias and updating them to minimize the loss.
Feedforward Steps
Input Layer: the input data X is fed into the network.
Hidden Layers: the data is transformed as it passes through each hidden layer. For each hidden layer l, l = 1, 2, ..., L-1, the following operations occur:
Linear Transformation: compute the weighted sum plus bias
z^(l) = W^(l) a^(l-1) + b^(l)
Activation Function: apply a nonlinear function f to z^(l)
a^(l) = f(z^(l))
Output Layer: the transformed data reaches the output layer, where a final linear transformation is applied, followed by an output activation function (depending on the type of task)
z^(L) = W^(L) a^(L-1) + b^(L)
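The feedforward steps can be sketched directly in NumPy. The layer sizes and random weights below are illustrative, and sigmoid stands in for both the hidden activation f and the output activation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, weights, biases):
    """Propagate input x through each layer:
    z^(l) = W^(l) a^(l-1) + b^(l),  a^(l) = f(z^(l))."""
    a = x
    for W, b in zip(weights, biases):
        z = W @ a + b          # linear transformation
        a = sigmoid(z)         # non-linear activation
    return a                   # network output a^(L)

# Illustrative 3-4-2 network (3 inputs, 4 hidden, 2 outputs)
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
biases = [rng.standard_normal(4), rng.standard_normal(2)]
y = feedforward(np.array([1.0, 0.5, -0.5]), weights, biases)
```

In practice the output activation is chosen per task (e.g. softmax for multi-class classification, identity for regression); sigmoid here is just a placeholder.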
Backpropagation (1)
●Backpropagation is the process of training the neural network by adjusting its weights and biases based on the error of its predictions.
●It involves calculating the gradient of the loss function with respect to each weight and bias and updating them to minimize the loss.
●Loss Calculation Steps:
Backpropagation (2)
Backpropagation (3)
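The backpropagation idea above, computing the gradient of the loss with respect to each weight layer by layer via the chain rule and stepping the weights against it, can be sketched for a one-hidden-layer network with sigmoid activations and a squared-error loss; all values are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative one-hidden-layer network; all values are made up
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((4, 3)), np.zeros(4)
W2, b2 = rng.standard_normal((1, 4)), np.zeros(1)

x = np.array([0.2, -0.4, 0.9])
target = np.array([1.0])

# Forward pass, keeping the intermediates the backward pass needs
a1 = sigmoid(W1 @ x + b1)
y = sigmoid(W2 @ a1 + b2)
loss = 0.5 * np.sum((y - target) ** 2)

# Backward pass: chain rule, output layer first
delta2 = (y - target) * y * (1 - y)        # dL/dz2 for sigmoid + squared error
dW2, db2 = np.outer(delta2, a1), delta2
delta1 = (W2.T @ delta2) * a1 * (1 - a1)   # error propagated back to layer 1
dW1, db1 = np.outer(delta1, x), delta1

# One gradient-descent update
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```

After this single update, recomputing the forward pass gives a slightly smaller loss; training repeats these two passes over many examples.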
Class of Functions - Activation Functions
Class of Functions - Loss (Error) Functions
Putting all together
Build an ANN from Scratch
6
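The original slides' from-scratch code is not transcribed here; as a minimal sketch of what this section builds, the feedforward and backpropagation steps above can be combined into a tiny training loop on the classic XOR problem. The layer sizes, learning rate, and iteration count are illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR: the classic minimal problem a single perceptron cannot solve
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)   # 2 inputs -> 8 hidden
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)   # 8 hidden -> 1 output

initial_loss = np.mean((sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) - T) ** 2)

lr = 1.0
for _ in range(5000):
    # Feedforward
    A1 = sigmoid(X @ W1 + b1)
    Y = sigmoid(A1 @ W2 + b2)
    # Backpropagation of the squared error, layer by layer
    d2 = (Y - T) * Y * (1 - Y)
    d1 = (d2 @ W2.T) * A1 * (1 - A1)
    W2 -= lr * A1.T @ d2; b2 -= lr * d2.sum(axis=0)
    W1 -= lr * X.T @ d1;  b1 -= lr * d1.sum(axis=0)

pred = (Y > 0.5).astype(int)   # thresholded predictions
```

With these settings the network usually learns XOR; the same loop structure (forward pass, backward pass, weight update) scales up to the deep networks discussed next.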
Deep Neural Network
7
What is a Deep Neural Network
●Deep learning is a subset of machine learning based on artificial neural networks, with 'deep' referring to the multiple layers in these neural networks
●Compared to conventional machine learning algorithms, deep learning models can extract richer, more abstract representations from large and complex datasets like images, text, and audio
Father of Deep Learning, Geoffrey Hinton
Ref: TheAiEdge.io
Convolutional Neural Network (CNN)
8
Introducing CNN
●CNNs are a class of deep neural networks that can recognize and classify particular features from images and are widely used for analyzing visual images
●There are two main parts to a CNN architecture
○A convolution tool that separates and identifies the various features of the image for analysis, in a process called Feature Extraction
○A fully connected layer that utilizes the output from the convolution process and predicts the class of the image based on the features extracted in previous stages
●Convolutional Layer
○This layer is used to extract the various features from the input images
○In this layer, the mathematical operation of convolution is performed between the input image and a filter of a particular size MxM
○By sliding the filter over the input image, the dot product is taken between the filter and the parts of the input image corresponding to the size of the filter (MxM)
○The output is termed the feature map, which gives us information about the image such as corners and edges
○Later, this feature map is fed to other layers to learn several other features of the input image
Architecture of CNN - Pooling Layer
●Generally, a convolutional layer is followed by a pooling layer
●The primary aim of this layer is to decrease the size of the convolved feature map to reduce computational costs
●This is achieved by decreasing the connections between layers; the pooling layer operates independently on each feature map
●Depending upon the method used, there are several types of pooling operations:
○In Max Pooling, the largest element is taken from each region of the feature map
○Average Pooling calculates the average of the elements in a predefined-size image section
○In Sum Pooling, the total sum of the elements in the predefined section is computed
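The three pooling operations above can be sketched with a single NumPy helper, assuming non-overlapping windows with stride equal to the pool size (the common convention); the feature-map values are illustrative:

```python
import numpy as np

def pool2d(fmap, size=2, mode="max"):
    """Non-overlapping pooling over a 2-D feature map
    (stride equal to the pool size, no padding)."""
    h, w = fmap.shape
    h2, w2 = h // size, w // size
    # Reshape into (h2, size, w2, size) blocks, then reduce each block
    blocks = fmap[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))    # Max Pooling
    if mode == "avg":
        return blocks.mean(axis=(1, 3))   # Average Pooling
    if mode == "sum":
        return blocks.sum(axis=(1, 3))    # Sum Pooling
    raise ValueError(mode)

fmap = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12],
                 [13, 14, 15, 16]], dtype=float)
pooled = pool2d(fmap, size=2, mode="max")   # 4x4 map shrinks to 2x2
```

Each 2x2 block of the input collapses to a single value, halving each spatial dimension, which is exactly the size reduction the slide describes.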
Architecture of CNN - FC Layer
●The Fully Connected (FC) layer consists of the weights and biases along with the neurons, and is used to connect the neurons between two different layers (structurally similar to a multilayer perceptron, but playing a different role)
●These layers are usually placed before the output layer and form the last few layers of a CNN architecture
●Here, the input from the previous layers is flattened and fed to the FC layer
●The flattened vector then passes through a few more FC layers, where the mathematical function operations usually take place
○It is at this stage that the classification process begins to take place
Introducing Convolution
●Convolution is a mathematical operation that allows the merging of two sets of information
●In the case of CNNs, convolution is applied to the input data to filter the information and produce a feature map
●This filter is also called a kernel, or feature detector, and its dimensions can be, for example, 3x3
●To perform convolution, the kernel goes over the input image, doing element-wise multiplication
●The result for each receptive field (the area where convolution takes place) is written down in the feature map
Receptive Field
Introducing Convolution
●Filters are responsible for feature extraction in CNNs
○For example, some filters might become specialized to detect horizontal edges in an image; others might detect vertical edges, colors, textures, etc.
○As the model becomes deeper, the filters can recognize more complex patterns
●During training, the CNN learns the best values for the filter weights to achieve the task at hand (e.g., classifying images, detecting objects, etc.)
●This is done through backpropagation and optimization algorithms, just like for any other weights in a neural network
Visualizing Filters: Plot of the First 6 Filters From VGG16, With One Subplot per Channel
Convolution Operation in Simple Terms
●Place the Filter: Start by placing the filter on the top-left corner of the picture. The filter covers a small portion of the picture (say a 3x3 area if the filter is 3x3).
●Multiply and Add: For each position of the filter:
○Multiply each number in the filter by the corresponding number in the picture that the filter is covering.
○Add up all these multiplied numbers to get a single number.
●Record the Result: Write down this single number in a new grid. This new grid will be the output of the convolution operation.
●Slide the Filter: Move the filter a little bit to the right (or down, depending on the order of movement). Repeat the multiply-and-add process, and write down the result in the new grid.
●Repeat: Keep sliding the filter over the whole picture, repeating the multiply-and-add process and writing down the results, until you have covered the entire picture.
●By the end, you have a new grid of numbers that represents the original picture, transformed in a way that highlights certain features, like edges or textures.
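The place/multiply-and-add/slide procedure above can be sketched as a naive 'valid' convolution in NumPy (stride 1, no kernel flipping, i.e. the cross-correlation CNNs actually compute); the tiny image and edge-detector kernel are illustrative:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position multiply
    element-wise and add up ('valid' mode, stride 1)."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]   # current receptive field
            out[i, j] = np.sum(patch * kernel)  # multiply and add
    return out

# Illustrative: a vertical-edge-detecting kernel on a tiny two-tone image
image = np.array([[0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9],
                  [0, 0, 9, 9]], dtype=float)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)
fmap = convolve2d(image, kernel)   # the resulting feature map
```

The kernel responds strongly wherever the left and right columns of its receptive field differ, which is exactly how a vertical-edge filter highlights edges in the feature map.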
Additional Mathematics for CNN
9
Mathematics of Convolution Operation (1)
Mathematics of Convolution Operation (2)
Mathematics of Pooling (1)
Mathematics of Pooling (2)
Example Calculation of Convolution and Pooling
Malware Detection using CNN
10
Logical Flow of the Solution
Malware Binary → Convert to 8-bit Vector → Convert the 8-bit Vector to a Grayscale Image → Preprocess the Converted Image → Pass the Preprocessed Input through the CNN → Prediction Output → Predicted Malware Family Name
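The binary-to-grayscale step in the flow above can be sketched in NumPy: read the file's bytes as a vector of 8-bit unsigned integers, zero-pad, and reshape to a fixed width. The width and padding scheme here are illustrative assumptions, not the slides' exact preprocessing:

```python
import numpy as np

def binary_to_grayscale(data: bytes, width: int = 64):
    """Interpret a binary as an 8-bit vector and reshape it into a
    2-D grayscale image of the given width (illustrative choice)."""
    vec = np.frombuffer(data, dtype=np.uint8)       # 8-bit vector
    height = -(-len(vec) // width)                  # ceiling division
    padded = np.zeros(width * height, dtype=np.uint8)
    padded[:len(vec)] = vec                         # zero-pad the tail
    return padded.reshape(height, width)            # grayscale image

# Illustrative: 200 pseudo-random bytes stand in for a real malware sample
sample = np.random.default_rng(7).integers(0, 256, 200, dtype=np.uint8).tobytes()
img = binary_to_grayscale(sample, width=64)
```

Each byte becomes one pixel intensity (0-255), so structurally similar binaries produce visually similar images, which is what lets a CNN distinguish malware families.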
CNN Architecture
The CNN model includes the following layers to make it perform feature and pattern extraction from images and help classify the malware family:
●Convolutional Layer: 30 filters, (3 x 3) kernel size
●Max Pooling Layer: (2 x 2) pool size
●Convolutional Layer: 15 filters, (3 x 3) kernel size
●Max Pooling Layer: (2 x 2) pool size
●Dropout Layer: dropping 25% of neurons
●Flatten Layer
●Dense/Fully Connected Layer: 128 neurons, ReLU activation function
●Dropout Layer: dropping 50% of neurons
●Dense/Fully Connected Layer: 50 neurons, Softmax activation function
●Dense/Fully Connected Layer: num_class neurons, Softmax activation function
The input has a shape of [64 x 64 x 3]: [width x height x depth]. In our case, each malware sample is an RGB image.
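Tracing the tensor shapes through the layers listed above (assuming 'valid' convolutions with stride 1 and non-overlapping 2x2 pooling, which are common defaults) shows how the 64 x 64 x 3 input shrinks before classification:

```python
def conv_shape(h, w, k):
    # 'valid' convolution, stride 1: each dimension shrinks by k - 1
    return h - k + 1, w - k + 1

def pool_shape(h, w, p):
    # non-overlapping p x p pooling: each dimension is divided by p
    return h // p, w // p

h, w = 64, 64                  # input: 64 x 64 x 3 RGB image
h, w = conv_shape(h, w, 3)     # Conv, 30 filters, 3x3 kernel -> 62 x 62 x 30
h, w = pool_shape(h, w, 2)     # MaxPool 2x2                  -> 31 x 31 x 30
h, w = conv_shape(h, w, 3)     # Conv, 15 filters, 3x3 kernel -> 29 x 29 x 15
h, w = pool_shape(h, w, 2)     # MaxPool 2x2                  -> 14 x 14 x 15
flattened = h * w * 15         # Flatten: values into the dense layers
```

Dropout and Flatten do not change the per-sample value count, so the dense layers then map flattened → 128 → 50 → num_class for the final softmax over malware families.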
A malware sample can belong to one of the following classes:
●Adialer.C
●Agent.FYI
●Allaple.A
●Allaple.L
●Alueron.gen!J
●Autorun.K
●C2LOP.P
●C2LOP.gen!g
●Dialplatform.B
●Dontovo.A
●Fakerean
●Instantaccess
●Lolyda.AA1
●Lolyda.AA2
●Lolyda.AA3
●Lolyda.AT
●Malex.gen!J
●Obfuscator.AD
●Rbot!gen
●Skintrim.N
●Swizzor.gen!E
●Swizzor.gen!I
●VB.AT
●Wintrim.BX
●Yuner.A