Multimodal Search with Open-Source Tools

chloewilliams62 122 views 37 slides Oct 11, 2024

About This Presentation

A recent and exciting development in the world of Generative AI has been the use of language to understand images, video, and sound. One example is multi-modal retrieval, which is the process of using one modality, like text, to search another modality, like images. It is not only useful for search ...


Slide Content

1| © Copyright 10/22/23 Zilliz
Stefan Webb
Developer Advocate, Zilliz
https://www.linkedin.com/in/stefan-webb
https://x.com/stefan_webb
Unstructured Data Meetup | Host

Welcome Speakers
●TECH TALK: Practical Guide to Deploying LLMs. Chaoyu Yang, Founder & CEO, BentoML
●TECH TALK: Timeplus - One single binary to tackle streaming and historical analytics. Ken Chen, Co-Founder & Chief Architect, Timeplus
●LIGHTNING TALK: Enhancing Context Provision for LLMs: Beyond RAG and Prompts. Brian Hart, Engineer, Tecton

Scan for a
chance to
win a prize
tonight!

Milvus
Open Source Self-Managed

Zilliz Cloud
SaaS Fully-Managed

github.com/milvus-io/milvus

Getting Started with Vector Databases
zilliz.com/cloud

Searching the Web with Gen AI

or
Apple

or
Rising dough

or
Change car tire
Rising Dough
Proofing
Bread

Why is Semantic Search Difficult?

Why is Semantic Search Important?
●90% of newly generated data in 2025 will be unstructured data
●10% Other
Data Source: The Digitization of the World by IDC

Solution: Deep Learning
Similarity Search

Vector Embedding

Vector Space

Embeddings Models

New Challenge: Search in Vector Spaces
How to Index and
Search?

●High-dimensional
●> 1000 dims
How to Scale?

●10-100 million vectors?
●Billions?
●Trillions?
●Billions of users?
Multiple Data Types?

●Text
●Images
●Audio
●Graphs
●…

Milvus is an Open-Source Vector Database to
store, index, manage, and use the massive
number of embedding vectors generated by
deep neural networks and LLMs.
●Contributors: 400
●Stars: 30K
●Docker pulls: 66M
●Forks: 2.7K
Milvus: High-performance, scalable vector database

Common AI Use Cases
●Retrieval Augmented Generation (RAG): Expand LLMs' knowledge by incorporating external data sources into LLMs and your AI applications
●Recommender System (RecSys): Match user behavior or content features with other similar ones to make effective recommendations
●Text/Semantic Similarity Search: Search for semantically similar texts across vast amounts of natural language documents
●Molecular Similarity Search: Search for similar substructures, superstructures, and other structures for a specific molecule
●Fraud & Anomaly Detection: Detect data points, events, and observations that deviate significantly from the usual pattern
●Multimodal Similarity Search: Search over multiple types of data simultaneously, e.g. text, audio, images, video

Deployment Options
Milvus Lite

●Locally hosted
●Suitable for prototyping
and demos
●10s of millions of vectors
Milvus Standalone

●Single remote/local server
●“Medium” scale
●Simplified setup, maintenance, etc. compared to cluster
●100s of millions of vectors
Milvus Cluster

●Distributed system
●Many different types of
nodes
●100s of billions of vectors

Why Open-Source?
●Cost-effective
●Innovation
●Community

Why Not Traditional Databases?
●Suboptimal Indexing / Search
●Scaling
●Inadequate Query & Analytics Support

Benchmarks
●3-20x faster compared with open-source Milvus
●At least 6x faster than other vector databases
https://github.com/zilliztech/VectorDBBench

Integrated into Generative AI Tooling Ecosystem
Framework
Hardware
Infrastructure
Embedding Models LLMs
Software Infrastructure
Vector Database

Milvus Users

The Forrester Wave™ Vector
Database Providers, Q3 2024
Zilliz is recognized as the Leader in
the Vector DB Space

How Does Similarity Search Work?
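As a rough illustration of the idea behind this slide, the sketch below ranks stored vectors by cosine similarity to a query vector. It is an exhaustive (brute-force) scan over toy 3-dimensional vectors, not how Milvus is implemented: a vector database typically relies on approximate nearest-neighbor indexes to avoid scanning everything. The function names `cosine` and `top_k` and the toy data are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=3):
    """Return [(index, score), ...] for the k vectors most similar to query."""
    scored = [(cosine(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)  # highest similarity first
    return [(i, score) for score, i in scored[:k]]

# Toy "embeddings" (assumed values, purely for illustration).
database = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [1.0, 0.05, 0.0]
results = top_k(query, database, k=2)
```

Here `results` holds the two stored vectors that point in nearly the same direction as the query. With real embeddings the vectors would come from an embedding model and have hundreds or thousands of dimensions.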

Multimodal Semantic Similarity?
(multimodal models)

Multi-Modal Embeddings
text
encoder
image
encoder
“the lion sleeps”
“a lion roars”
“a dog is walked”





Why Multi-Modal?
Retrieval /
Similarity Search
Foundation
Models
RAG
vector
database
large
vision-language
model
“what is the user
clicking on?”
“from the screenshot
provided, I see the user
is…”
“photos and
recordings of Iberian
lynx”
“produce a graph of
revenue from
2021-2023”

How Does it Work?
•Dataset of, e.g., (image, text) pairs
•Typically mined from the web
•For example, (<img src>, <img alt>)
•Pre-processing required for performance
•Train encoders so that embeddings for matching (image, text) pairs are close
•Encoders are either initialized from pre-trained models or trained from scratch
•Simultaneously, penalize closeness for mismatched pairs, i.e. (image, random text)
•This is called contrastive learning
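The bullets above can be sketched with an InfoNCE-style loss, the kind used by CLIP-like models. The slide does not name a specific loss, so this choice is an assumption, as are the function names, the temperature value, and the toy 2-dimensional embeddings. Within a batch, the other texts play the role of "random text" for each image.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """Mean image-to-text InfoNCE loss; pair i is (image_embs[i], text_embs[i])."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(imgs):
        # Similarity of image i to every text in the batch.
        logits = [sum(a * b for a, b in zip(img, t)) / temperature for t in txts]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Low when the matching text scores higher than the mismatched ones.
        total += log_denominator - logits[i]
    return total / len(imgs)

# Matching pairs already aligned versus deliberately mismatched pairs.
aligned = contrastive_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = contrastive_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

The loss is near zero when each image is closest to its own text and large when it is closest to another text, which is the signal that training uses to pull matching pairs together and push mismatched pairs apart.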

MagicLens: Task
[Figure 1; DeepMind, 2024]
Input:
•a query image, and
•an instruction that specifies a specific semantic relation

Output:
•matching image(s) from our database

MagicLens: Data
[Figure 2; DeepMind, 2024]
Steps:
•Find image pairs on the same webpages
•With PaLI, use images + alt. text to generate descriptions of each image and the relationship between them
•Filter for sensitive content, similarity
•With PaLM2, form an instruction that relates query to target

MagicLens: Model
[Figure 4; DeepMind, 2024]
1⃣ initialized to pre-trained models
2⃣ additional layers
3⃣ query image + instruction text
4⃣ target image + empty text

MagicLens: Training
Minimize a contrastive loss function, defined per element of a minibatch, with these terms:
1⃣ similarity between (query, instruction) and (target, “”)
2⃣ sum over minibatch
3⃣ similarity between (query, instruction) and (non-target, “”)
4⃣ similarity between (query, instruction) and (query, “”)
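A minimal sketch of a loss built from the four annotated terms is given below. The exact formula appears only as an image on the slide, so the form used here is an assumption and may differ in detail from the MagicLens paper: the positive is the paired target, the negatives are the other targets in the minibatch, and the query image encoded with empty text is added as one extra negative. All names, the temperature, and the toy embeddings are hypothetical.

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def query_negative_loss(query_embs, target_embs, query_only_embs, temperature=0.1):
    """query_embs[i]: embedding of (query image, instruction)
    target_embs[i]: embedding of (target image, "")
    query_only_embs[i]: embedding of (query image, "")"""
    queries = [normalize(v) for v in query_embs]
    targets = [normalize(v) for v in target_embs]
    query_only = [normalize(v) for v in query_only_embs]
    total = 0.0
    for i, q in enumerate(queries):
        # Terms 1 and 3: similarity to the target and to in-batch non-targets.
        target_logits = [dot(q, t) / temperature for t in targets]
        # Term 4: similarity to the query image itself with empty text.
        query_logit = dot(q, query_only[i]) / temperature
        # Term 2: sum over the minibatch, plus the query negative.
        logits = target_logits + [query_logit]
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - target_logits[i]
    return total / len(queries)

queries = [[1, 0], [0, 1]]
targets = [[1, 0], [0, 1]]
# Query-only embeddings far from the instructed query: an easy negative.
easy = query_negative_loss(queries, targets, [[0, 1], [1, 0]])
# Query-only embeddings identical to the instructed query: a hard negative.
hard = query_negative_loss(queries, targets, [[1, 0], [0, 1]])
```

The comparison shows why the fourth term matters under this assumed form: if the instruction has no effect on the embedding, the query image itself scores as high as the target and the loss stays high, so the model is pushed to make use of the instruction.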

Milvus Demo: Multimodal Image Search
●https://multimodal-demo.milvus.io/

Join the
Milvus
Discord!

Have you built
something cool
using Milvus or
Zilliz? We want to
hear all about it.
Share Your Story

Zilliz is
Hiring!

Join our Team

Zilliz.com/careers
•Account Executives
•Revenue Operations Manager
•Engineering Manager
•Staff Software Engineer

Become a
Speaker!
Interested in speaking at and/or
sponsoring a Zilliz Unstructured
Data Meetup? Fill out this form!



Join us at our next meetup!
lu.ma/unstructured-data-meetup