Introduction to Multilingual Retrieval Augmented Generation (RAG)


About This Presentation

Retrieval augmented generation (RAG) is the most popular style of large language model application to emerge from 2023. The most basic style of RAG works by vectorizing your data and injecting it into a vector database like Milvus for retrieval to augment the text output generated by an LLM. This is...


Slide Content

Slide 1
Multilingual RAG
Yujian Tang | Zilliz

Slide 2
Speaker
Yujian Tang
Senior Developer Advocate, Zilliz
[email protected]
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang

Slide 3
CONTENTS
01 RAG Review
02 LLMs and Embedding Models
03 Vector Databases
04 Demo

Slide 4
01 RAG Review

Slide 5
RAG
Inject your data via a vector database like Milvus/Zilliz.
Primary use cases:
- Factual recall
- Forced data injection
- Cost optimization

Slide 6
[Architecture diagram: your data and the query pass through an embedding model into Milvus; retrieved results augment the LLM's input]
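A minimal sketch of that flow in Python, assuming pymilvus is installed, a collection named "docs" already holds your embedded data, and embed() / llm_complete() are hypothetical stand-ins for whatever embedding model and LLM you use:

from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # or your Zilliz Cloud URI

def answer(query: str) -> str:
    # Embed the query with the same model used to embed your data.
    query_vector = embed(query)  # hypothetical embedding call
    # Retrieve the most semantically similar chunks from Milvus.
    hits = client.search(
        collection_name="docs",
        data=[query_vector],
        limit=3,
        output_fields=["text"],
    )
    context = "\n".join(hit["entity"]["text"] for hit in hits[0])
    # Inject the retrieved context into the LLM's prompt.
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return llm_complete(prompt)  # hypothetical LLM call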

Slide 7
02 LLMs and Embedding Models

Slide 8
How did LLMs come about?

Slide 9
A Basic Neural Net

Slide 10
A Recurrent Neural Network

Slide 11
A Transformer Architecture

Slide 12
GPT

Slide 13
What about Embedding Models?

Slide 14
Embedding models: deep learning models without the last layer.
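To illustrate the idea, a toy PyTorch sketch (the network and sizes are made up): chopping the final classification layer off a trained model leaves a network whose output is a reusable embedding.

import torch
import torch.nn as nn

# Toy classifier: the last Linear layer maps hidden features to class logits.
model = nn.Sequential(
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),  # penultimate output: 32-d hidden features
    nn.Linear(32, 4),              # "last layer": class logits
)

# Dropping the last layer turns the classifier into an embedding model.
embedder = nn.Sequential(*list(model.children())[:-1])

x = torch.randn(1, 16)
print(embedder(x).shape)  # torch.Size([1, 32]) -- a vector for similarity search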

Slide 15
LLMs:
- Large models
- Generate text
- Reasoning capability
- Based on transformers
Embedding models:
- Smaller
- Non-predictive
- Non-generative

Slide 16
03 Vector Databases

Slide 17
Find Semantically Similar Data
Apple made profits of $97 Billion in 2023

I like to eat apple pie for profit in 2023

Apple’s bottom line increased by record numbers in 2023
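One way to reproduce this comparison, assuming the sentence-transformers package is installed (the model choice is just an example):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice
sentences = [
    "Apple made profits of $97 Billion in 2023",
    "I like to eat apple pie for profit in 2023",
    "Apple's bottom line increased by record numbers in 2023",
]
vectors = model.encode(sentences)
# The financially similar pair should score higher than the word-overlap pair.
print(util.cos_sim(vectors[0], vectors[1]))  # lexically close, semantically far
print(util.cos_sim(vectors[0], vectors[2]))  # lexically far, semantically close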

Slide 18
But wait! There’s more!

Slide 19
Semantic Similarity (image from Sutor et al.)
Woman = [0.3, 0.4]
Man = [0.5, 0.2]
Queen = [0.3, 0.9]
King = [0.5, 0.7]

Queen − Woman + Man = King:
[0.3, 0.9] − [0.3, 0.4] = [0.0, 0.5]
[0.0, 0.5] + [0.5, 0.2] = [0.5, 0.7] = King
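The same arithmetic, checked with numpy:

import numpy as np

woman = np.array([0.3, 0.4])
man = np.array([0.5, 0.2])
queen = np.array([0.3, 0.9])
king = np.array([0.5, 0.7])

# Queen - Woman + Man lands exactly on King in this toy 2-d example.
result = queen - woman + man
print(result)                     # [0.5 0.7]
print(np.allclose(result, king))  # True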

Slide 20
Similarity metrics are ways to measure distance in vector space.

Slide 21
Vector Similarity Metric: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
d(Queen, King) = √((0.3 − 0.5)² + (0.9 − 0.7)²)
= √((−0.2)² + (0.2)²)
= √(0.04 + 0.04)
= √0.08 ≈ 0.28
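The same computation in numpy:

import numpy as np

queen = np.array([0.3, 0.9])
king = np.array([0.5, 0.7])
# L2 distance: square root of the summed squared differences.
print(np.linalg.norm(queen - king))  # ~0.283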

Slide 22
Vector Similarity Metric: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3 × 0.5) + (0.9 × 0.7)
= 0.15 + 0.63 = 0.78
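And in numpy:

import numpy as np

queen = np.array([0.3, 0.9])
king = np.array([0.5, 0.7])
# Inner product: multiply element-wise, then sum.
print(np.dot(queen, king))  # 0.78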

Slide 23
Vector Similarity Metric: Cosine
Queen = [0.3, 0.9]
King = [0.5, 0.7]
cos(Queen, King) = ((0.3 × 0.5) + (0.9 × 0.7)) / (√(0.3² + 0.9²) × √(0.5² + 0.7²))
= (0.15 + 0.63) / (√0.90 × √0.74)
= 0.78 / √0.666
≈ 0.96
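Verified in numpy:

import numpy as np

queen = np.array([0.3, 0.9])
king = np.array([0.5, 0.7])
# Cosine: inner product divided by the product of the vector lengths.
cos = np.dot(queen, king) / (np.linalg.norm(queen) * np.linalg.norm(king))
print(cos)  # ~0.956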

Slide 24
Vector Similarity Metrics
- Euclidean: spatial distance
- Cosine: orientational distance
- Inner Product: both

With normalized vectors, IP = Cosine
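A quick numpy check of that identity:

import numpy as np

queen = np.array([0.3, 0.9])
king = np.array([0.5, 0.7])
q = queen / np.linalg.norm(queen)  # scale to unit length
k = king / np.linalg.norm(king)
# With unit-length vectors, the inner product equals the cosine similarity.
print(np.dot(q, k))  # ~0.956, matching the cosine value above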

Slide 25
Indexes organize the way we access our data.

Slide 26
Inverted File Index
Source: https://towardsdatascience.com/similarity-search-with-ivfpq-9c6348fd4db3

Slide 27
Hierarchical Navigable Small Worlds (HNSW)
Source: https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf

Slide 28
Scalar Quantization (SQ)

Slide 29
Product Quantization (PQ)
Source: https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd

Slide 30
Indexes Overview
- IVF = intuitive, medium memory, performant
- HNSW = graph based, high memory, highly performant
- Flat = brute force
- SQ = bucketize across one dimension; accuracy × memory tradeoff
- PQ = bucketize across two dimensions; more accuracy × memory tradeoff
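In Milvus, this trade-off is chosen when the index is built. A sketch assuming pymilvus 2.4+ and a collection with a vector field named "vector" (all names are illustrative):

from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",  # illustrative field name
    index_type="HNSW",    # graph based: high memory, highly performant
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)
client.create_index(collection_name="docs", index_params=index_params)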

Slide 31
04 Demo

Slide 32
[Architecture diagram: the multilingual variant of the RAG flow — the query and language data pass through embedding model(s) before reaching the LLM]
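The multilingual twist is using an embedding model trained across languages, so text in different languages lands in one shared vector space. A sketch with sentence-transformers (the model choice and sentences are illustrative):

from sentence_transformers import SentenceTransformer, util

# A multilingual model maps all of these into the same vector space.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
docs = [
    "Milvus is an open-source vector database.",    # English
    "Milvus est une base de données vectorielle.",  # French
    "Milvus es una base de datos vectorial.",       # Spanish
]
query = "What is Milvus?"
# All three documents score as close matches despite the language mix.
print(util.cos_sim(model.encode(query), model.encode(docs)))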

Slide 33
RAG

Slide 34
Start building with Zilliz Cloud today!
zilliz.com/cloud