Generative AI and the Rise of Large Language Models
dsclaubyblos
About This Presentation
A slideshow tackling a hot new topic: GenAI. Visually explains the core fundamental mechanisms behind LLMs and generative models.
Slide Content
Generative AI
The Rise of Language Models
AI is the new Black
Google releasing their new model with 80B params
Oral-B dropping their AI toothbrush
AI, ML, DL, GenAI...
AI: Simulated intelligence exhibited by machines.
ML: Pattern recognition from data (linear regression, random forests, ...).
DL: Uses neural networks to learn complex data relationships.
GenAI: Creates new content (images, text, music) based on patterns learned from existing data.
Large Language Models
Petabytes of input data, billions of parameters
Designed to understand and interact with human language, grasping grammar, context, and even cultural references
A model that makes predictions or generates outputs based on input
Large Language Models: Applications
Text summarization, chatbots, code generation, copywriting, synthetic data, language translation
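
To make "generates outputs based on input" concrete, here is a minimal text-generation sketch assuming the Hugging Face transformers library; the GPT-2 checkpoint and the generation settings are illustrative choices, not from the slides:

# Minimal text generation (assumes `pip install transformers torch`).
from transformers import pipeline

# GPT-2 is a small, freely available LLM; any causal LM would do here.
generator = pipeline("text-generation", model="gpt2")

# The model predicts a continuation of the prompt, token by token.
result = generator("Generative AI is", max_new_tokens=30)
print(result[0]["generated_text"])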
The Transformer, centerpiece of GenAI
[Diagram: encoder-decoder architecture with input embedding, attention layers, decoder, output embedding, and a final softmax]
Attention layers, where every word becomes aware of its surroundings
Pretrained in an unsupervised manner on large unlabeled datasets
Fine-tuned with supervised training to perform better
Unlike RNNs, runs sequences in parallel, which makes them very fast
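
A minimal sketch of the parallelism point, assuming PyTorch: a TransformerEncoderLayer consumes a whole sequence of token embeddings in one call, instead of stepping through tokens one at a time like an RNN (the dimensions are arbitrary toy values):

import torch
import torch.nn as nn

d_model = 64  # embedding size, arbitrary for this sketch
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)

# A batch of 2 sequences, 9 tokens each, already turned into vectors.
tokens = torch.randn(2, 9, d_model)

# One parallel pass: the attention layers inside let every token
# attend to all of its surroundings at once.
out = layer(tokens)
print(out.shape)  # torch.Size([2, 9, 64])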
The Underlying Magic: Encoders and Decoders
[Diagram: a stack of decoders on one side and a stack of encoders on the other, each fed by an embedding layer with a softmax on top]
GPT (Generative Pretrained Transformer): a stack of decoders
BERT: a stack of encoders
But how does it actually work?
Word Embeddings: Vectorizing Words
[Diagram: from a text corpus, masked-word prediction tasks such as "[ ? ] went to the woods..." and "[ ? ] sold all their stocks in..." are used to extract embedding vectors like [0.7, 0.5, 1.1, 3.4] and [0.2, 0.1, 9.1, 4.4]]
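
A minimal sketch of an embedding table, assuming PyTorch; the toy vocabulary and the 4-dimensional vectors mirroring the slide are illustrative assumptions:

import torch
import torch.nn as nn

# Toy vocabulary; real models use tens of thousands of (sub)word tokens.
vocab = {"fox": 0, "investor": 1, "woods": 2, "stocks": 3}

# Each token ID maps to a learnable 4-dimensional vector.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

fox_vector = embedding(torch.tensor(vocab["fox"]))
print(fox_vector)  # after training, a vector like [0.2, 0.1, 9.1, 4.4]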
Attention Mechanism
[Diagram: "The quick brown fox jumps over the lazy dog"; the word "fox", with vector [0.2, 0.1, 9.1, 4.4], asks: "Any adjectives in front of me?"]
[Diagram: "quick" and "brown" reply "Right here! I am!"; "fox" says "Let me adjust my definition..." and its vector updates to [0.2, 0.3, 7.8, 4.4]]
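
The question-and-answer story above is essentially scaled dot-product attention: each word's query is scored against every word's key, and the softmax-weighted sum of values becomes its updated vector. A minimal sketch, assuming PyTorch, with random stand-ins for the learned projection matrices:

import math
import torch

torch.manual_seed(0)
seq_len, d = 9, 4  # "The quick brown fox jumps over the lazy dog" has 9 tokens

x = torch.randn(seq_len, d)  # token embeddings (random stand-ins)
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))  # learned in practice

Q, K, V = x @ Wq, x @ Wk, x @ Wv

# Each token's query ("any adjectives in front of me?") is scored
# against every token's key ...
scores = Q @ K.T / math.sqrt(d)
weights = torch.softmax(scores, dim=-1)

# ... and the weighted sum of values is its updated vector, e.g.
# "fox" shifting toward its adjectives "quick" and "brown".
updated = weights @ V
print(updated.shape)  # torch.Size([9, 4])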
Transfer Learning
Acquired knowledge from the pretraining phase is transferred to downstream tasks
The model is fine-tuned for a specific task using an annotated dataset
Enables generalization to new tasks and reduces the need for large annotated datasets
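
A minimal fine-tuning sketch, assuming the Hugging Face transformers Trainer; the checkpoint, the two-example dataset, and the hyperparameters are illustrative assumptions:

from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Start from a pretrained checkpoint: the "acquired knowledge" to transfer.
name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# A tiny annotated dataset, purely illustrative.
texts, labels = ["great movie", "awful movie"], [1, 0]
enc = tokenizer(texts, truncation=True, padding=True)
dataset = [{"input_ids": enc["input_ids"][i],
            "attention_mask": enc["attention_mask"][i],
            "labels": labels[i]} for i in range(len(texts))]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()  # adapts the pretrained weights; only the task head is new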
As always, this sounds too good to be true...
Hallucinations
Where do they come from?
Due to:
Limited knowledge: the user is asking about something the model was not trained on
Model: limitations of the current LLM (a small model, or one that needs fine-tuning)
Data: the most common grounding technique is Retrieval Augmented Generation; if the data is not well organized, it will retrieve the wrong information
Hallucination is inevitable: https://arxiv.org/abs/2401.11817
Hallucinations
How to deal with them?
Model fine-tuning
Retrieval Augmented Generation (RAG)
Prompt engineering
Retrieval Augmented Generation
[Diagram: our data is embedded and stored in a vector DB; an incoming query runs a similarity search, and the retrieved context is combined with the query to generate the response]
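
A minimal RAG sketch, assuming the sentence-transformers library, with a plain Python list standing in for the vector DB; the documents, the model name, and the email address are illustrative assumptions:

from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model

# "Our Data": a toy corpus standing in for a real vector DB.
docs = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on public holidays.",
    "Support is reachable at support@example.com.",  # hypothetical address
]
doc_vectors = embedder.encode(docs)

# Query -> similarity search over the stored vectors.
query = "How long do I have to return a product?"
scores = util.cos_sim(embedder.encode([query]), doc_vectors)[0]
best = docs[int(scores.argmax())]

# The retrieved passage is added to the prompt, grounding the answer.
prompt = f"Answer using this context:\n{best}\n\nQuestion: {query}"
print(prompt)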
LangChain
Simplifies LLM-powered app development, connecting to diverse data sources for personalized experiences
Supports chatbots, retrieval-augmented generation, summarization, and synthetic data; integrates with Amazon, Google, Microsoft, OpenAI, etc.
Chains together programming languages, platforms, financial data, web scraping, and scientific papers
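
A minimal chain sketch, assuming LangChain's expression language and the langchain-openai integration; the model name is an illustrative choice, an OPENAI_API_KEY is assumed in the environment, and older LangChain releases organize these imports differently:

# Assumes `pip install langchain-openai`.
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Summarize the following in one sentence:\n{text}"
)
llm = ChatOpenAI(model="gpt-4o-mini")

# LangChain "chains" components together: prompt -> model, invoked as one unit.
chain = prompt | llm
reply = chain.invoke({"text": "LangChain connects LLMs to external data sources."})
print(reply.content)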