The curse of dimensionality, KNN edition ‹#› [figure; surviving label: "3D: 0.1%"]
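The effect can be reproduced in a few lines of numpy (an illustrative sketch, not from the slides; `distance_contrast` is my own helper): as dimension grows, the gap between the nearest and farthest neighbor shrinks relative to the distances themselves, which is what undermines brute-force KNN intuition.

```python
import numpy as np

def distance_contrast(dim, n=1000, seed=0):
    """Relative contrast (max - min) / min of distances from a random
    query to n random points in the unit hypercube. Values near 0 mean
    'everyone is roughly equally far away'."""
    rng = np.random.default_rng(seed)
    points = rng.random((n, dim))
    query = rng.random(dim)
    d = np.linalg.norm(points - query, axis=1)
    return (d.max() - d.min()) / d.min()

low = distance_contrast(3)      # low dimension: large contrast
high = distance_contrast(1000)  # high dimension: distances concentrate
```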
ANN (Approximate Nearest Neighbor) ‹#›
Partitioned and graph indexes ‹#›
Milestones in ANN ‹#›
2015: FastScan (Compression)
2016: HNSW (Graph construction)
2017: Quick ADC (Compression)
2018: Quicker ADC (Compression)
2019: DiskANN (Graph construction + compression)
2020: SCANN and APQ (Compression)
2021: NGT QG (Compression)
2022: SPANN (Partitioning)
2023: LVQ (Compression)
2024: JVector LTM (Graph construction)
IVF Partitioning ‹#›
Search ‹#›
KMeans ‹#›
IVF search ‹#› Coarse pass: O(centroids). Accurate pass: O(M * N/centroids), where M is the number of probed lists. The centroid count needs to be relatively high: FAISS recommends O(sqrt(N)), e.g. 64K centroids for N in 1M..10M.
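The coarse/accurate split above can be sketched in a few lines of numpy (a minimal illustration with my own function names, not FAISS code): k-means assigns every vector to a posting list; at query time the coarse pass scans only the centroids, and the accurate pass scans only the selected lists.

```python
import numpy as np

def build_ivf(data, n_centroids, n_iters=10, seed=0):
    """Partition data into posting lists via plain Lloyd's k-means."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), n_centroids, replace=False)].copy()
    for _ in range(n_iters):
        assign = np.argmin(np.linalg.norm(data[:, None] - centroids[None], axis=2), axis=1)
        for c in range(n_centroids):
            members = data[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # Final assignment against the updated centroids.
    assign = np.argmin(np.linalg.norm(data[:, None] - centroids[None], axis=2), axis=1)
    lists = [np.flatnonzero(assign == c) for c in range(n_centroids)]
    return centroids, lists

def ivf_search(query, data, centroids, lists, n_probe=2, k=5):
    # Coarse: O(centroids) distance computations to pick n_probe lists.
    order = np.argsort(np.linalg.norm(centroids - query, axis=1))[:n_probe]
    cand = np.concatenate([lists[c] for c in order])
    # Accurate: exact distances over roughly n_probe * N/centroids candidates.
    d = np.linalg.norm(data[cand] - query, axis=1)
    return cand[np.argsort(d)[:k]]
```

With `n_probe` equal to the centroid count this degenerates to exact search; lowering it trades recall for speed.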
SPANN (MSR, 2022) ‹#› Better partitioning. Dynamic pruning during search. Hybrid architecture: centroids in memory, postings lists on disk. Scales to 1B+ vectors.
Partitioning downsides ‹#› KMeans is O(t*k*n*d) for t iterations, k clusters, and n vectors of dimension d. Incremental construction is difficult and slow. Deletes are difficult to handle. SOTA is relatively complex.
Graph indexes ‹#›
HNSW (Malkov and Yashunin, 2016) ‹#› First modern graph index. Still in use in e.g. Lucene. Single-pass search; everything in memory.
HNSW: diversity heuristic ‹#›
Hierarchical NSW ‹#›
Larger-than-memory HNSW ‹#›
DiskANN (MSR, 2019) ‹#› Single graph layer. Coarse + accurate passes: the coarse pass is performed using compressed vectors in memory; the accurate pass reranks coarse results using full-resolution vectors from disk. Scales to 1B+ vectors.
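The coarse-then-rerank idea can be sketched as follows (a simplification of mine, not the actual DiskANN code: it uses 8-bit scalar quantization as a stand-in for the compressed in-memory representation, where DiskANN itself uses PQ, and scans linearly instead of walking a graph):

```python
import numpy as np

def quantize_u8(x, lo, hi):
    """Lossy 8-bit representation that stands in for 'compressed, in memory'."""
    return np.round((x - lo) / (hi - lo) * 255).astype(np.uint8)

def search_with_rerank(query, full, compressed, lo, hi, k=5, rerank=20):
    q = quantize_u8(query, lo, hi).astype(np.float32)
    # Coarse pass: cheap approximate distances on the compressed vectors.
    coarse = np.linalg.norm(compressed.astype(np.float32) - q, axis=1)
    cand = np.argsort(coarse)[:rerank]
    # Accurate pass: exact distances, touching only `rerank` full-resolution
    # vectors (the ones that would be read from disk).
    exact = np.linalg.norm(full[cand] - query, axis=1)
    return cand[np.argsort(exact)[:k]]
```

The key property is that only the short candidate list ever needs full-resolution vectors, so the expensive storage can live on disk.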
DiskANN single layer design ‹#›
Non-blocking concurrency = linear scaling ‹#›
Product Quantization (PQ) ‹#›
Without reranking ‹#›
PQ with transparent reranking ‹#›
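A minimal PQ sketch (illustrative, not FAISS; function names are mine): split each vector into M subvectors, learn a small codebook per subspace, store one code per subvector, and answer queries via an ADC (asymmetric distance computation) lookup table. Transparent reranking then just recomputes exact distances for the best ADC candidates.

```python
import numpy as np

def train_pq(data, M, ksub=16, iters=10, seed=0):
    """One k-means codebook per subspace."""
    rng = np.random.default_rng(seed)
    d_sub = data.shape[1] // M
    codebooks = []
    for m in range(M):
        sub = data[:, m * d_sub:(m + 1) * d_sub]
        cb = sub[rng.choice(len(sub), ksub, replace=False)].copy()
        for _ in range(iters):
            assign = np.argmin(((sub[:, None] - cb[None]) ** 2).sum(-1), axis=1)
            for c in range(ksub):
                if (assign == c).any():
                    cb[c] = sub[assign == c].mean(axis=0)
        codebooks.append(cb)
    return codebooks

def pq_encode(data, codebooks):
    """Each vector becomes M small codebook indexes."""
    M, d_sub = len(codebooks), codebooks[0].shape[1]
    codes = np.empty((len(data), M), dtype=np.uint8)
    for m, cb in enumerate(codebooks):
        sub = data[:, m * d_sub:(m + 1) * d_sub]
        codes[:, m] = np.argmin(((sub[:, None] - cb[None]) ** 2).sum(-1), axis=1)
    return codes

def adc_distances(query, codes, codebooks):
    """Squared-distance estimates: M table lookups per database vector."""
    d_sub = codebooks[0].shape[1]
    # Per-query table: distance from each query subvector to each centroid.
    table = np.stack([((cb - query[m * d_sub:(m + 1) * d_sub]) ** 2).sum(-1)
                      for m, cb in enumerate(codebooks)])
    return table[np.arange(len(codebooks)), codes].sum(axis=1)
```

Note that the ADC estimate is exactly the distance from the query to the *reconstructed* (decoded) vector; the error comes entirely from quantization, which is why reranking with full-resolution vectors recovers accuracy.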
Binary Quantization is very lossy ‹#›
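Binary Quantization keeps only the sign bit of each dimension, a 32x compression over float32. A tiny illustration of the lossiness (my own example, not from the slides): two vectors with enormously different magnitudes collapse to the identical binary code, so their true distance is unrecoverable.

```python
import numpy as np

a = np.full(8, 0.01)   # small positive vector
b = np.full(8, 100.0)  # large positive vector, far away from a
code_a = (a > 0).astype(np.uint8)  # sign bits: all ones
code_b = (b > 0).astype(np.uint8)  # sign bits: also all ones
same_code = np.array_equal(code_a, code_b)  # the codes are identical...
gap = np.linalg.norm(a - b)                 # ...but the true L2 gap is huge
```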
PQ is very, very hard to beat consistently ‹#›
LVQ: better than PQ at small(er) compression ratios ‹#›
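LVQ's core recipe, per the paper's high-level description: center the data globally, then scalar-quantize each vector with its *own* locally-adaptive min/max scale. A simplified sketch (the helper names and details are mine):

```python
import numpy as np

def lvq_encode(data, bits=8):
    """Center globally, then quantize each vector against its own range."""
    mean = data.mean(axis=0)
    centered = data - mean
    lo = centered.min(axis=1, keepdims=True)  # per-vector lower bound
    hi = centered.max(axis=1, keepdims=True)  # per-vector upper bound
    levels = (1 << bits) - 1
    codes = np.round((centered - lo) / (hi - lo) * levels).astype(np.uint8)
    return codes, lo, hi, mean

def lvq_decode(codes, lo, hi, mean, bits=8):
    """Invert the per-vector scaling and re-add the global mean."""
    levels = (1 << bits) - 1
    return codes.astype(np.float64) / levels * (hi - lo) + lo + mean
```

Because each vector gets its own scale, the quantization step adapts to local magnitudes instead of a single global range.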
DiskANN performance ‹#› O(log N) coarse search. O(topK) rerank. Still O(N) memory use.
Beyond DiskANN ‹#› 2023: 10M vectors is a big index. 2024: 1B vectors is a big index (and customers are asking when they can have 10B).
Larger-than-memory index construction ‹#› DiskANN (2019): split the dataset into 40 partitions using kmeans; index each partition separately, adding each node to the closest 2 partitions; take the union of edges across all partitioned indexes to make one big index. 5 days to index the Deep1B dataset (350GB).
Larger-than-memory index construction ‹#› JVector (2024): build the index using two-phase search (PQ in memory, full resolution on disk). 3h to index Cohere-v3-wikipedia (180GB).
Reducing memory footprint from O(N) to O(1) ‹#› Fused ADC: first implemented by NGT (Yahoo! Japan, 2021). Applies Quicker ADC to graph indexes. PQ lookup tables are stored on disk, not in memory.
What actually matters ‹#› Basic: support for PQ; support for reranking. Advanced: larger-than-memory index construction; O(1) memory footprint for queries; support for LVQ.
Further reading ‹#›
DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node
Quicker ADC: Unlocking the Hidden Potential of Product Quantization with SIMD
Locally-Adaptive Quantization for Streaming Vector Search