Multimodal Retrieval Augmented Generation (RAG) with Milvus

chloewilliams62 536 views 26 slides Jun 27, 2024

About This Presentation

We've seen an influx of powerful multimodal capabilities in many LLMs. In this talk, we'll vectorize a dataset of images and texts into the same embedding space, store them in Milvus, retrieve all relevant data using multilingual texts and/or images and input multimodal data as context into ...


Slide Content

Slide 1 | © Copyright 2024 Zilliz
Multimodal RAG with Milvus
Yi Wang @ Zilliz

Slide 2
CONTENTS
01  RAG is the New Search
02  Multimodal Retrieval with Milvus

Slide 3
RAG is the New Search

Slide 4
Retrieval-Augmented Generation

Slide 5
A Typical Search System
Picture Credit: https://web.eecs.umich.edu/~nham/EECS398F19/

Slide 6
Recap of RAG Architecture
[diagram: Indexing → Query → Retrieval → Prompt & Generation]

Slide 7
Recap of RAG Architecture: Offline Indexing
[same diagram, highlighting the offline Indexing path]

Slide 8
Recap of RAG Architecture: Online Serving
[same diagram, highlighting the online Query → Retrieval → Prompt & Generation path]
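The online-serving path above can be sketched as plain function composition. This is a minimal illustrative sketch, not the deck's implementation: retrieve(), build_prompt(), and generate() are hypothetical stand-ins for a vector-store query, prompt templating, and an LLM call.

```python
# Toy RAG pipeline: retrieve -> build prompt -> generate.
def retrieve(query, index, top_k=2):
    # Toy retrieval: rank stored chunks by word overlap with the query.
    # A real system would use vector similarity search (e.g. Milvus).
    scored = sorted(index, key=lambda chunk: -len(set(query.split()) & set(chunk.split())))
    return scored[:top_k]

def build_prompt(query, contexts):
    # Stuff the retrieved chunks into the prompt as grounding context.
    context_block = "\n".join(contexts)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"

def generate(prompt):
    # Placeholder for an LLM call (e.g. an API or local-model request).
    return f"[LLM answer grounded in the prompt: {prompt[:40]}...]"

index = ["milvus stores vectors", "rag retrieves context", "leaves turn brown in autumn"]
query = "why do leaves turn brown"
answer = generate(build_prompt(query, retrieve(query, index)))
```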

Slide 9
How RAG Resembles Search

Slide 10
Multimodal Retrieval with Milvus

Slide 11
Multi-modal Retrieval
● Combining text and image in the search query
● Retrieving multi-modal content for generation
Query = "feuilles brunes pendant la journée" (i.e. "brown leaves during daytime")
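The key idea behind the multilingual query above is a shared embedding space: a multimodal encoder (e.g. a CLIP-style model) maps both texts and images to vectors that can be compared directly. The sketch below is purely illustrative; embed_text and the hand-picked vectors are hypothetical stand-ins for a real encoder.

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors in the shared space.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Pretend 3-d space: axis 0 ~ "leaves", axis 1 ~ "brown", axis 2 ~ "daytime".
def embed_text(text):
    vocab = {"leaves": [1, 0, 0], "brown": [0, 1, 0], "daytime": [0, 0, 1]}
    vecs = [vocab[w] for w in text.split() if w in vocab]
    return [sum(col) for col in zip(*vecs)]

# Hypothetical encoder output for a photo of brown leaves in daylight.
image_vector = [0.9, 0.8, 0.7]
query_vector = embed_text("brown leaves daytime")

similarity = cosine(query_vector, image_vector)
```

A real multilingual encoder would place "feuilles brunes" and "brown leaves" near each other too, which is what makes the French text query retrieve English-described images.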

Slide 12

Slide 13
Easy to start with, can even run on edge devices!

Slide 14
Scale-up on Docker

Slide 15
Up to 100 billion vectors with K8s!

Slide 16

Slide 17
Data Preparation
Download the images.zip file directly from:
https://huggingface.co/datasets/unum-cloud/ann-unsplash-25k/tree/main

import glob, time, pprint
import numpy as np
from PIL import Image
import pandas as pd

# Load image file names and their AI-generated descriptions.
image_data = pd.read_csv('images.csv')
print(image_data.shape)
display(image_data.head(2))  # display() assumes a Jupyter notebook

# Lists of image ids and their text descriptions.
image_urls = list(image_data.photo_id)
image_texts = list(image_data.ai_description)
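The insertion snippet on a later slide iterates over batch_texts and batch_urls, which the deck never defines. One plausible shape, sketched here under that assumption, is a simple fixed-size batching helper over the lists loaded above (the sample values are stand-ins for the real photo_id / ai_description columns):

```python
def batched(items, batch_size):
    # Yield successive fixed-size slices of a list.
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Stand-ins for image_data.photo_id and image_data.ai_description.
image_urls = ["photo-1", "photo-2", "photo-3"]
image_texts = ["brown leaves", "sunset over rocks", "calm sea"]

# Pair up url/text batches so each batch can be embedded and inserted together.
batches = list(zip(batched(image_urls, 2), batched(image_texts, 2)))
```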

Slide 18
Create a Milvus Collection

from pymilvus import (
    connections, FieldSchema, CollectionSchema, DataType, Collection,
)

# STEP 1. Connect to Milvus.
connection = connections.connect(
    alias="default",
    host='localhost',
    port='19530'
)

# STEP 2. Create a new collection and build indexes.
EMBEDDING_DIM = 256
MAX_LENGTH = 65535

# Step 2.1 Define the data schema for the new collection.
fields = [
    # Use an auto-generated id as the primary key.
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema(name="text_vector", dtype=DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM),
    FieldSchema(name="image_vector", dtype=DataType.FLOAT_VECTOR, dim=EMBEDDING_DIM),
    FieldSchema(name="chunk", dtype=DataType.VARCHAR, max_length=MAX_LENGTH),
    FieldSchema(name="image_filepath", dtype=DataType.VARCHAR, max_length=MAX_LENGTH),
]
schema = CollectionSchema(fields, "")

# Step 2.2 Create the collection.
col = Collection("Demo_multimodal", schema)

# Step 2.3 Build an index on both vector columns.
image_index = {"metric_type": "COSINE"}
col.create_index("image_vector", image_index)
text_index = {"metric_type": "COSINE"}
col.create_index("text_vector", text_index)
col.load()

Slide 19
Data Vectorization & Insertion

# STEP 3. Data vectorization (i.e. embedding).
image_embeddings, text_embeddings = embedding_model(
    batch_images=batch_images,
    batch_texts=batch_texts)

# STEP 4. Insert data into Milvus or Zilliz.
# Prepare the data batch.
chunk_dict_list = []
for chunk, img_url, img_embed, text_embed in zip(
        batch_texts,
        batch_urls,
        image_embeddings, text_embeddings):
    # Assemble embedding vectors, original text chunk, and metadata.
    chunk_dict = {
        'chunk': chunk,
        'image_filepath': img_url,
        'text_vector': text_embed,
        'image_vector': img_embed,
    }
    chunk_dict_list.append(chunk_dict)

# Insert the data batch.
# If the data size is large, consider bulk_insert().
col.insert(data=chunk_dict_list)
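The deck treats embedding_model as a black box. Its implied contract is: given a batch of images and a batch of texts, return one image vector and one text vector per row, each of EMBEDDING_DIM (= 256) dimensions so they fit the two FLOAT_VECTOR fields in the collection schema. A stub sketch of that contract, with random vectors standing in for a real multimodal encoder:

```python
import random

EMBEDDING_DIM = 256  # must match the dim of the FLOAT_VECTOR fields

def embedding_model(batch_images, batch_texts):
    # Stub: random vectors stand in for a real multimodal encoder's output.
    rng = random.Random(0)
    image_embeddings = [[rng.random() for _ in range(EMBEDDING_DIM)]
                        for _ in batch_images]
    text_embeddings = [[rng.random() for _ in range(EMBEDDING_DIM)]
                       for _ in batch_texts]
    return image_embeddings, text_embeddings

img_vecs, txt_vecs = embedding_model(["img-a", "img-b"], ["text-a", "text-b"])
```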

Slide 20
Final Step: Search

# STEP 5. hybrid_search() is the API for multimodal search.
results = col.hybrid_search(
    reqs=[image_req, text_req],  # one AnnSearchRequest per vector field
    rerank=RRFRanker(),
    limit=top_k,
    output_fields=output_fields)
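RRFRanker fuses the per-field result lists with Reciprocal Rank Fusion: each document scores the sum of 1/(k + rank) over the lists it appears in (k = 60 is a common default). A pure-Python sketch of that fusion, with hypothetical result ids:

```python
def rrf_fuse(result_lists, k=60):
    # Reciprocal Rank Fusion over several ranked id lists.
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

image_hits = ["a", "b", "c"]  # ranked ids from the image_vector search
text_hits = ["b", "d", "a"]   # ranked ids from the text_vector search

fused = rrf_fuse([image_hits, text_hits])
```

"b" wins because it ranks high in both lists, even though neither list ranks it first; that is the behavior RRF is chosen for.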

Slide 21
[Multimodal] search with text-only query
Query = "feuilles brunes pendant la journée" (i.e. "brown leaves during daytime")

Slide 22
[Multimodal] search with image-only query
[Query is an image]

Slide 23
[Multimodal] search with text + image query

Query = text + image
1. "silhouette d'une personne assise sur une roche au coucher du soleil"
   (i.e. "silhouette of a person sitting on a rock formation during golden hour")
2. Image below

Result

Slide 24
Q&A

Slide 25

Slide 26
curl --request POST \
  --url "${MILVUS_HOST}:${MILVUS_PORT}/v2/vectordb/entities/advanced_search" \
  --header "Authorization: Bearer ${TOKEN}" \
  --header "accept: application/json" \
  --header "content-type: application/json" \
  -d '{
    "collectionName": "book",
    "search": [
      {
        "field": "book_intro_vector",
        "data": [1, 2, ...]
      },
      {
        "field": "book_cover_vector",
        "data": [2, 3, ...]
      }
    ],
    "rerank": {
      "strategy": "rrf"
    },
    "limit": 10
  }'

Retrieve params: the per-field "search" entries. Re-rank params: the "rerank" block.
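The same request body can be assembled in Python before sending it. The field names follow the slide (the exact REST schema is as shown there, not re-verified against current Milvus docs), and the short vectors are hypothetical placeholders for real query embeddings; note that JSON forbids duplicate keys, so the two per-field requests belong in a list:

```python
import json

# Build the advanced_search request body shown in the curl example.
payload = {
    "collectionName": "book",
    "search": [
        {"field": "book_intro_vector", "data": [0.1, 0.2]},  # placeholder vector
        {"field": "book_cover_vector", "data": [0.2, 0.3]},  # placeholder vector
    ],
    "rerank": {"strategy": "rrf"},
    "limit": 10,
}

body = json.dumps(payload)
# A real call would POST `body` to
# f"{MILVUS_HOST}:{MILVUS_PORT}/v2/vectordb/entities/advanced_search"
# with the Authorization / content-type headers from the curl example.
```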