01-Oct-2024_PES-VectorDatabasesAndAI.pdf

bunkertor 186 views 80 slides Oct 01, 2024


1 | © Copyright 2024 Zilliz1
Deep Dive on Vector Databases and AI
Tim Spann @ Zilliz

Tim Spann
Principal Developer Advocate, Zilliz
https://www.linkedin.com/in/timothyspann/
https://x.com/PaaSDev

Hosted by the "Linux Legion Club" of PES University.
The session follows an AMA format:
●5 min: intro
●30+ min: presentation
●Remaining time: Q&A with the audience

Vector Database: making sense of unstructured data
A vector database stores embedding vectors and allows for semantic retrieval of various types of unstructured data.

Milvus: The most widely-adopted vector database
Milvus is an open-source vector database to store, index, manage, and use the massive number of embedding vectors generated by deep neural networks and LLMs.
●Contributors: 400
●Stars: 29K
●Docker pulls: 66M
●Forks: 2.7K+

Rich functionality

Well-connected in LLM infrastructure to enable RAG use cases
[Diagram labels: Framework, Hardware Infrastructure, Embedding Models, LLMs, Software Infrastructure, Vector Database]

Retrieval-Augmented Generation (RAG)
A technique that combines the strengths of retrieval-based and generative models:
●Improve accuracy and relevance
●Reduce hallucination by grounding answers in retrieved context
●Provide domain-specific knowledge
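The retrieve-then-generate flow can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the deck's implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `retrieve` and `build_prompt` are hypothetical helper names. A real pipeline would query a vector database and send the prompt to an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(question, documents, k=2):
    """Augment the question with retrieved context before calling a generator."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"


documents = [
    "Milvus is an open-source vector database",
    "HNSW is a graph-based index",
    "Bananas are yellow",
]
print(build_prompt("what is milvus", documents, k=1))
```

The generator only sees the retrieved context, which is how domain-specific knowledge reaches the model without retraining.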

Demos

Multi-Modal Search
multimodal-demo.milvus.io

Architecture

Learn
https://zilliz.com/learn/Retrieval-Augmented-Generation
https://zilliz.com/blog/scale-search-with-milvus-handle-massive-datasets-with-ease

Milvus Architecture
[Diagram] Components shown:
●SDK, Load Balancer
●Access Layer: Proxy
●Coordinator Service: Root, Query, Data, Index
●Worker Node: Query Node, Data Node, Index Node
●Meta Storage: etcd
●Message Storage: Log Broker
●Object Storage (MinIO / S3 / Azure Blob): Log Snapshot, Delta File, Index File
●Flows: DDL/DCL, DML, Notification, Control Signal

Distributed Architecture

Dynamic Scaling
Stateless components for easy scaling
●Query, Index, and Data Nodes can be scaled independently
●Allows for optimized resource allocation based on workload characteristics
Data sharding across multiple nodes
●Distributes large datasets across multiple Data Nodes
●Enables parallel processing for improved query performance
Horizontal Pod Autoscaler (HPA)
●Automatically scales up and down
●Custom metrics can be used (e.g., query latency, throughput)

Stateless Architecture
Stateless Components
All Milvus components are deployed as stateless services.
Object Storage
Milvus relies on object storage (MinIO, S3, etc.) for data persistence.
Vectors are stored in object storage; metadata is stored in etcd.
Scaling and Failover
Scaling and failover don't involve traditional data rebalancing.
When new pods are added, or replace failed ones, they can immediately start handling requests by accessing data from the shared object storage.

Different Consistency Levels
Consistency ensures every node or replica has the same view of data at a given time.
●Strong: Guaranteed up-to-date reads, highest latency
●Bounded: Reads may be slightly stale, but within a time bound
●Session: Consistent reads within a session, may be stale across sessions
●Eventually: Lowest latency, reads may be stale
Trade-offs
●Strong consistency for critical applications requiring accurate results
●Eventual consistency for high-throughput, latency-sensitive apps
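One way to picture the four levels is as a rule for when a replica may answer a read. The sketch below is a conceptual model only, not how Milvus implements consistency: `can_serve_read` and all of its parameters are hypothetical names, and timestamps are plain integers.

```python
def can_serve_read(level, replica_ts, latest_write_ts, session_write_ts=0, staleness_bound=0):
    """Return True if a replica synced up to replica_ts may answer the read now.

    replica_ts       -- timestamp of the newest write the replica has applied
    latest_write_ts  -- timestamp of the newest write in the whole system
    session_write_ts -- timestamp of the newest write made by this client session
    staleness_bound  -- maximum staleness tolerated by the Bounded level
    """
    if level == "Strong":
        return replica_ts >= latest_write_ts
    if level == "Bounded":
        return replica_ts >= latest_write_ts - staleness_bound
    if level == "Session":
        return replica_ts >= session_write_ts
    if level == "Eventually":
        return True
    raise ValueError(f"unknown consistency level: {level}")


# A replica that is 10 ticks behind the newest write.
for level in ("Strong", "Bounded", "Session", "Eventually"):
    print(level, can_serve_read(level, replica_ts=90, latest_write_ts=100,
                                session_write_ts=80, staleness_bound=20))
```

A read that cannot be served yet has to wait for the replica to catch up, which is where the extra latency of the stricter levels comes from.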

Schemas

Schemas
https://milvus.io/docs/schema.md
Properties | Description | Note
name | Name of the field in the collection to create | Data type: String. Mandatory
dtype | Data type of the field | Mandatory
description | Description of the field | Data type: String. Optional
is_primary | Whether to set the field as the primary key field or not | Data type: Boolean (true or false). Mandatory for the primary key field

Schemas (continued)
Properties | Description | Note
auto_id (Mandatory for primary key field) | Switch to enable or disable automatic ID (primary key) allocation. | True or False
max_length (Mandatory for VARCHAR field) | Maximum length of strings allowed to be inserted. | [1, 65,535]
dim | Dimension of the vector | Data type: Integer ∈ [1, 32768]. Mandatory for a dense vector field. Omit for a sparse vector field.
is_partition_key | Whether this field is a partition-key field. | Data type: Boolean (true or false).
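The rules in the schema tables can be checked mechanically. The following is a plain-Python sketch, not the Milvus SDK: `validate_field` is a hypothetical helper, fields are ordinary dictionaries, and treating BINARY_VECTOR as a type that requires `dim` is an assumption.

```python
# Assumption: every non-sparse vector type needs a dim.
VECTOR_TYPES_WITH_DIM = {"FLOAT_VECTOR", "FLOAT16_VECTOR", "BFLOAT16_VECTOR", "BINARY_VECTOR"}


def validate_field(field):
    """Return a list of problems found in one field definition (empty if none)."""
    errors = []
    name = field.get("name")
    if not isinstance(name, str) or not name:
        errors.append("name is mandatory and must be a string")
    dtype = field.get("dtype")
    if dtype is None:
        errors.append("dtype is mandatory")
    if field.get("is_primary"):
        if dtype not in ("INT64", "VARCHAR"):
            errors.append("primary key must be INT64 or VARCHAR")
        if "auto_id" not in field:
            errors.append("auto_id is mandatory for the primary key field")
    if dtype == "VARCHAR":
        max_length = field.get("max_length")
        if not isinstance(max_length, int) or not 1 <= max_length <= 65535:
            errors.append("max_length is mandatory for VARCHAR and must be in [1, 65535]")
    if dtype in VECTOR_TYPES_WITH_DIM:
        dim = field.get("dim")
        if not isinstance(dim, int) or not 1 <= dim <= 32768:
            errors.append("dim is mandatory for dense vectors and must be in [1, 32768]")
    if dtype == "SPARSE_FLOAT_VECTOR" and "dim" in field:
        errors.append("dim must be omitted for a sparse vector field")
    return errors


schema = [
    {"name": "id", "dtype": "INT64", "is_primary": True, "auto_id": False},
    {"name": "title", "dtype": "VARCHAR", "max_length": 512},
    {"name": "embedding", "dtype": "FLOAT_VECTOR", "dim": 768},
]
for field in schema:
    print(field["name"], validate_field(field))
```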

Data Types
Primary key field supports:
INT64: numpy.int64
VARCHAR: VARCHAR

Scalar field supports:
BOOL: Boolean (true or false)
INT8: numpy.int8
INT16: numpy.int16
INT32: numpy.int32
INT64: numpy.int64
FLOAT: numpy.float32
DOUBLE: numpy.double
VARCHAR: VARCHAR
JSON: JSON
Array: Array

Data Types
JSON as a composite data type is available. A JSON field comprises key-value pairs. Each key is a string,
and a value can be a number, string, boolean value, array, or list. For details, refer to JSON: a new data
type.

Vector field supports:

BINARY_VECTOR: Stores binary data as a sequence of 0s and 1s, used for compact feature representation
in image processing and information retrieval.

FLOAT_VECTOR: Stores 32-bit floating-point numbers, commonly used in scientific computing and
machine learning for representing real numbers.

FLOAT16_VECTOR: Stores 16-bit half-precision floating-point numbers, used in deep learning and GPU
computations for memory and bandwidth efficiency.

BFLOAT16_VECTOR: Stores 16-bit floating-point numbers with reduced precision but the same exponent
range as Float32, popular in deep learning for reducing memory and computational requirements without
significantly impacting accuracy.

SPARSE_FLOAT_VECTOR: Stores a list of non-zero elements and their corresponding indices, used for
representing sparse vectors. For more information, refer to Sparse Vectors.
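The storage trade-off between the dense vector types can be made concrete with a little arithmetic. This is a rough sketch that counts only raw values and ignores index overhead; `raw_vector_bytes` and `to_bfloat16` are hypothetical helpers, and the bfloat16 conversion truncates instead of rounding to keep the code short.

```python
import struct

# Bits used per dimension by each dense vector type.
BITS_PER_DIM = {
    "BINARY_VECTOR": 1,
    "FLOAT_VECTOR": 32,
    "FLOAT16_VECTOR": 16,
    "BFLOAT16_VECTOR": 16,
}


def raw_vector_bytes(dtype, dim):
    """Bytes needed for the raw values of one vector (index overhead excluded)."""
    bits = BITS_PER_DIM[dtype] * dim
    return (bits + 7) // 8


def to_bfloat16(x):
    """Round-trip a float through bfloat16 by keeping the top 16 bits of its float32 form.

    bfloat16 keeps float32's sign and 8 exponent bits and drops mantissa bits,
    which is why it covers the same range with less precision.
    """
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]


for dtype in BITS_PER_DIM:
    print(dtype, raw_vector_bytes(dtype, 768), "bytes per 768-dim vector")
print(to_bfloat16(3.14159))
```

For a 768-dimensional embedding, the half-precision types halve the raw footprint of FLOAT_VECTOR, and BINARY_VECTOR needs one bit per dimension.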

Embedding Models

Vector Embedding

Vector Space

Choose Your Embedding Function

Further Dive
https://zilliz.com/learn/generative-ai
https://zilliz.com/learn/what-are-binary-vector-embedding

Indexing

Choose Your Own Index
https://zilliz.com/learn/choosing-right-vector-index-for-your-project

Category | Index | Accuracy | Latency | Throughput | Index Time | Cost
Graph-based | Cagra (GPU) | High | Low | Very High | Fast | Very High
Graph-based | HNSW | High | Low | High | Slow | High
Graph-based | DiskANN | High | High | Mid | Very Slow | Low
Quantization-based or cluster-based | ScaNN | Mid | Mid | High | Mid | Mid
Quantization-based or cluster-based | IVF_FLAT | Mid | Mid | Low | Fast | Mid
Quantization-based or cluster-based | IVF + Quantization | Low | Mid | Mid | Mid | Low

Picking an Index
●100% Recall – Use FLAT search if you need 100% accuracy
●10MB < index_size < 2GB → Standard IVF
●2GB < index_size < 20GB → Consider PQ and HNSW
●20GB < index_size < 200GB → Composite Index, IVF_PQ or HNSW_SQ
●Disk-based indexes
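The rules of thumb above can be written as a small lookup. This is only a sketch of the slide's guidance: `suggest_index` is a hypothetical helper, and two branches are assumptions the slide does not state, namely that anything under 10MB uses FLAT and that anything of 200GB or more goes to a disk-based index.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def suggest_index(index_size_bytes, need_exact=False):
    """Map an estimated index size to the slide's suggested index family."""
    if need_exact:
        return "FLAT"  # 100% recall requires exhaustive search
    if index_size_bytes < 10 * MB:
        return "FLAT"  # assumption: tiny data sets are cheap to scan exhaustively
    if index_size_bytes < 2 * GB:
        return "Standard IVF"
    if index_size_bytes < 20 * GB:
        return "PQ or HNSW"
    if index_size_bytes < 200 * GB:
        return "Composite index: IVF_PQ or HNSW_SQ"
    return "Disk-based index"  # assumption: the last bullet covers larger sizes


for size in (500 * MB, 5 * GB, 50 * GB, 500 * GB):
    print(size // MB, "MB ->", suggest_index(size))
```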

Indexes
Most of the vector index types supported by Milvus use approximate nearest neighbor search (ANNS):
●HNSW: A graph-based index, best suited for scenarios with a high demand for search efficiency. There is also a GPU graph-based counterpart, GPU_CAGRA, thanks to Nvidia's contribution.
●FLAT: Best suited for scenarios that need perfectly accurate, exact search results on a relatively small, million-scale dataset. There is also a GPU version, GPU_BRUTE_FORCE.
●IVF_FLAT: A quantization-based index, best suited for scenarios that seek a balance between accuracy and query speed. There is also a GPU version, GPU_IVF_FLAT.
●IVF_SQ8: A quantization-based index, best suited for scenarios that need a significant reduction in disk, CPU, and GPU memory consumption because these resources are very limited.
●IVF_PQ: A quantization-based index, best suited for scenarios that seek high query speed even at the cost of accuracy. There is also a GPU version, GPU_IVF_PQ.

Indexes (continued)
●SCANN: SCANN is similar to IVF_PQ in terms of vector clustering and product quantization. The differences lie in the implementation details of product quantization and the use of SIMD (Single-Instruction / Multi-Data) for efficient calculation.
●DiskANN: Based on Vamana graphs, DiskANN powers efficient searches within large datasets.
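The difference between exact FLAT search and a cluster-probed IVF_FLAT-style search can be illustrated with a toy implementation. This is a pure-Python sketch, not Milvus code: `flat_search`, `build_ivf`, and `ivf_search` are hypothetical names, and clustering uses a few simple k-means rounds. The `nlist` and `nprobe` names follow common IVF terminology.

```python
import math
import random


def flat_search(vectors, query, k):
    """Exact search: compare the query with every stored vector."""
    return sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))[:k]


def assign(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: math.dist(centroids[c], v))
        lists[nearest].append(i)
    return lists


def build_ivf(vectors, nlist, iters=10, seed=0):
    """Cluster the vectors with a few k-means rounds; return centroids and inverted lists."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    dim = len(vectors[0])
    for _ in range(iters):
        for c, members in enumerate(assign(vectors, centroids)):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(vectors[i][d] for i in members) / len(members) for d in range(dim)]
    return centroids, assign(vectors, centroids)


def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Approximate search: scan only the nprobe clusters closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: math.dist(centroids[c], query))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: math.dist(vectors[i], query))[:k]


rng = random.Random(1)
vectors = [[rng.random(), rng.random()] for _ in range(200)]
centroids, lists = build_ivf(vectors, nlist=8)
query = [0.5, 0.5]
print("exact :", flat_search(vectors, query, 5))
print("probe1:", ivf_search(vectors, centroids, lists, query, 5, nprobe=1))
```

Raising `nprobe` scans more clusters, trading speed for recall; with `nprobe` equal to `nlist` every vector is scanned and the result matches the exact search.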

Indexing Strategies
•Tree based
•Graph based
•Hash based
•Cluster based

Quantization Based Indexes
https://medium.com/@tspann/how-good-is-quantization-in-milvus-6d224b5160b0

https://zilliz.com/learn/scalar-quantization-and-product-quantization?source=post_page-----6d224b5160b0--------------------------------

FLAT

FLAT Index

Inverted File FLAT (IVF_FLAT)

IVF_FLAT Index

Hierarchical Navigable Small World (HNSW)

HNSW: Skip List

HNSW: NSW Graph
•Built by randomly shuffling data points and inserting them one by one, with each point connected to a predefined number of edges (M).
⇒ Creates a graph structure that exhibits the "small world" property.
⇒ Any two points are connected through a relatively short path.
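The NSW construction described above can be sketched in Python. This is a simplified illustration under stated assumptions, not the HNSW algorithm itself: `build_nsw` and `greedy_search` are hypothetical names, each new point is linked to its M nearest already-inserted points by exhaustive comparison (real builds find them with a graph search), and there is no layer hierarchy.

```python
import math
import random


def build_nsw(points, m, seed=0):
    """Shuffle the points, then insert them one by one, linking each to up to m neighbors."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    graph = {}
    for idx in order:
        nearest = sorted(graph, key=lambda j: math.dist(points[idx], points[j]))[:m]
        graph[idx] = set(nearest)
        for j in nearest:
            graph[j].add(idx)  # edges are bidirectional, so older nodes may exceed m links
    return graph


def greedy_search(points, graph, query, entry):
    """Walk from the entry node to whichever neighbor is closer to the query until none is."""
    current = entry
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query), default=current)
        if math.dist(points[best], query) < math.dist(points[current], query):
            current = best
        else:
            return current


rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(100)]
graph = build_nsw(points, m=4)
found = greedy_search(points, graph, (0.5, 0.5), entry=0)
print("stopped at node", found, "at", points[found])
```

Greedy search stops at a local minimum, which is not guaranteed to be the true nearest neighbor; that is the approximation that graph indexes accept in exchange for speed.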

HNSW

Search

How Similarity Search Works
[Diagram, five numbered steps] Unstructured Data (Images, User Generated Content, Video, Documents, Audio) → Transform into Vectors (Vector Embeddings) → Store in Vector Database → Perform Query (Perform Approximate Nearest Neighbor Similarity Search) → Get Results

Vector Search Overview
Image from Nvidia

Vector Similarity Measures: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
$$d(\text{Queen}, \text{King}) = \sqrt{(0.3-0.5)^2 + (0.9-0.7)^2} = \sqrt{(-0.2)^2 + (0.2)^2} = \sqrt{0.04 + 0.04} = \sqrt{0.08} \approx 0.28$$

Vector Similarity Measures: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3*0.5) + (0.9*0.7)
= 0.15 + 0.63 = 0.78

Vector Similarity Measures: Cosine
Queen = [0.3, 0.9]
King = [0.5, 0.7]
$$\cos(\text{Queen}, \text{King}) = \frac{(0.3 \cdot 0.5) + (0.9 \cdot 0.7)}{\sqrt{0.3^2 + 0.9^2} \cdot \sqrt{0.5^2 + 0.7^2}} = \frac{0.15 + 0.63}{\sqrt{0.9} \cdot \sqrt{0.74}} = \frac{0.78}{\sqrt{0.666}} \approx 0.96$$
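The three measures can be checked on the Queen and King vectors with a few lines of Python. This is a minimal sketch using only the standard library; the function names are illustrative and are not taken from any SDK.

```python
import math

queen = [0.3, 0.9]
king = [0.5, 0.7]


def l2(a, b):
    """Euclidean distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner product: larger means more similar and is sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: the inner product of the two vectors after normalizing their lengths."""
    return inner_product(a, b) / (math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b)))


print(round(l2(queen, king), 4))             # 0.2828
print(round(inner_product(queen, king), 4))  # 0.78
print(round(cosine(queen, king), 4))         # 0.9558
```

A cosine value close to 1 indicates that the two vectors point in nearly the same direction.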

Hybrid Search

METADATA FILTERING

Filtering on Metadata
●Search Space Reduction w/ Pre-Filtering
●Bitset Wizardry
○Use Compact Bitsets to represent Filter Matches
○Low-level CPU operations for speed
●Scalar Indexing
○Bloom Filter
○Hash
○Tree-based
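Pre-filtering with bitsets can be sketched as follows. This is an illustration of the idea only, not Milvus internals: `build_bitset` and `filtered_search` are hypothetical helpers, and a Python integer stands in for a compact bitset, with bit i set when row i matches the filter.

```python
import math


def build_bitset(rows, predicate):
    """Set bit i when row i satisfies the scalar filter."""
    bits = 0
    for i, row in enumerate(rows):
        if predicate(row):
            bits |= 1 << i
    return bits


def filtered_search(vectors, bitset, query, k):
    """Search only the vectors whose bit is set, shrinking the search space up front."""
    allowed = [i for i in range(len(vectors)) if (bitset >> i) & 1]
    return sorted(allowed, key=lambda i: math.dist(vectors[i], query))[:k]


rows = [
    {"color": "red", "price": 10},
    {"color": "blue", "price": 25},
    {"color": "red", "price": 40},
    {"color": "red", "price": 5},
]
vectors = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [0.1, 0.1]]

red = build_bitset(rows, lambda r: r["color"] == "red")  # 0b1101
cheap = build_bitset(rows, lambda r: r["price"] < 20)    # 0b1001
both = red & cheap  # combining filters is a single bitwise AND

print(bin(both))                                        # 0b1001
print(filtered_search(vectors, both, [2.0, 2.0], k=1))  # [3]
```

Without the filter the nearest vector to the query would be row 2, but it fails the price condition, so the search never considers it.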

Partition by Name or Key

How To
https://medium.com/@tspann/partitioning-collections-by-name-395eb48a2238

https://medium.com/@tspann/super-charge-your-ai-applications-with-easy-high-speed-multi-tenancy-with-partition-keys-5fa581127dd6

RERANKING

Choose Your Rerank Function
https://medium.com/@tspann/ranking-for-relevance-with-bm25-b2d9dd62e2f8

Rerankers
https://zilliz.com/learn/what-are-rerankers-enhance-information-retrieval

Q & A

https://zilliz.com/learn/milvus-notebooks

Vector Database Resources
Give Milvus a Star! https://github.com/milvus-io/milvus
Chat with me on Discord!

Unstructured Data Meetup


https://www.meetup.com/unstructured-data-meetup-new-york/

This meetup is for people working with unstructured data. Speakers present on related topics such as vector databases, LLMs, and managing data at scale. The intended audience includes machine learning engineers, data scientists, data engineers, software engineers, and PMs.
This meetup was formerly the Milvus Meetup and is sponsored by Zilliz, the maintainers of Milvus.

https://medium.com/@tspann/unstructured-street-data-in-new-york-8d3cde0a1e5b

https://medium.com/@tspann/not-every-field-is-just-text-numbers-or-vectors-976231e90e4d

https://medium.com/@tspann/shining-some-light-on-the-new-milvus-lite-5a0565eb5dd9

https://medium.com/@tspann/unstructured-data-processing-with-a-raspberry-pi-ai-kit-c959dd7fff47
Raspberry Pi AI Kit Hailo
Edge AI

AIM Weekly by Tim Spann
This week in Milvus, Towhee, Attu, GPTCache, Gen AI, LLM, Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java, Vector DB and Open Source friends.
https://bit.ly/32dAJft
https://github.com/milvus-io/milvus

milvus.io
github.com/milvus-io/
@milvusio
@paasDev


/in/timothyspann
Connect with me! Thank you!

Getting Started with Vector Databases
Milvus: Open Source, Self-Managed
github.com/milvus-io/milvus
29K - Star us on GitHub!

Zilliz Cloud: SaaS, Fully-Managed
zilliz.com/cloud

Zilliz is Hiring!

Join our Team

zilliz.com/careers
•Developer Advocate
•Staff Software Engineer
•Solutions Architect
•and more!

Get started for free
zilliz.com/cloud