Multimodal Embeddings
Frank Liu | Head of AI & ML, Zilliz
Agenda
01 A Quick Refresher
02 Beyond Text and Image Embeddings
03 Multimodal Embeddings
04 Demo Time!
About Milvus
Milvus is an open-source vector database
for GenAI projects. pip install on your laptop,
plug into popular AI dev tools, and push to
production with a single line of code.
29K
GitHub Stars
25M
Downloads
250
Contributors
2,600
Forks
Easy Setup
Pip-install to start coding in a notebook within seconds
Integration
Plug into OpenAI, LangChain, LlamaIndex, and many more
Reusable Code
Write once, then deploy to production with one line of code
Feature-rich
Dense & sparse embeddings, filtering, reranking and beyond
A Quick Refresher
Vectors Unlock Unstructured Data
Knowledge Base (Documents)
Embedding Models → Vectors → Vector Databases
Embedding models are the workhorses of AI apps
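To make the pipeline above concrete, here is a minimal sketch of what a vector database does at query time: rank stored vectors by similarity to a query vector. The three-dimensional vectors and the document IDs are made up for illustration; a real embedding model would produce vectors with hundreds or thousands of dimensions, and a system such as Milvus would use an index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Brute-force nearest-neighbor search over an in-memory dict."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Hypothetical, hand-written vectors standing in for model output.
store = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_tax": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]
results = top_k(query, store, k=2)
print(results)
```

With these toy vectors, the two pet documents rank ahead of the tax document because they point in nearly the same direction as the query.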
Beyond Text and Image Embeddings
Back in the day…
…feature vectors were handcrafted
SIFT
TFIDF
Harris Corner Detector
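As an illustration of a handcrafted feature vector, the sketch below computes TF-IDF vectors for a tiny corpus in plain Python. It assumes whitespace tokenization and the unsmoothed weighting tf × log(N / df), which is only one of several common TF-IDF variants; the example sentences are invented.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocab, vectors) using tf * log(N / df), without smoothing."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[tok] / len(toks) * idf[tok] for tok in vocab])
    return vocab, vectors

docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vec])
```

Every rule here is written by hand: a word that appears in all documents ("the") gets zero weight, and rarer words get more. Learned embeddings replace these fixed rules with weights fit to data.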
Circa 2012, convnets became immensely popular
Source: CS230 notes
And many people discovered the power of vectors
RNNs were the OG language model
Source: CS230 notes
Vectors are for more than just text and images
Multimodal Embeddings
Visual + language embeddings (CLIP-like)
Source: CLIP blog
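A rough sketch of how a CLIP-like model can be used once images and text share an embedding space: normalize both sides, score an image against each caption by dot product, and turn the scores into probabilities with a softmax. The vectors, the captions, and the temperature value are illustrative assumptions, not output from any real model.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def match_image_to_texts(image_vec, text_vecs, temperature=0.07):
    """Probability of each text matching the image; temperature is an assumed value."""
    img = normalize(image_vec)
    logits = []
    for text_vec in text_vecs:
        txt = normalize(text_vec)
        logits.append(sum(a * b for a, b in zip(img, txt)) / temperature)
    return softmax(logits)

# Hypothetical embeddings; a real model would produce these.
captions = ["a photo of a dog", "a photo of a cat", "a diagram"]
text_vecs = [[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]]
image_vec = [0.8, 0.2, 0.1]

probs = match_image_to_texts(image_vec, text_vecs)
best = captions[probs.index(max(probs))]
print(best)
```

Because the comparison happens in one shared space, the same scoring works in either direction: text-to-image search uses the text vector as the query instead.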
One embedding space, six modalities (ImageBind)
Source: Girdhar, et al.
LLMs are becoming natively multimodal…
Source: Llama Team, AI @ Meta
… and the best embedding models will be too
Original: Fuyu-8B blog
Demo Time
Vanilla RAG is no longer enough…
[Diagram: RAG pipeline with components Your Documents, Embedding Model, Milvus, Search, Gen AI Model, and Reliable Answers; flow labels Question and Question + Context]
Question: What is the default AUTOINDEX distance metric in Milvus Client?
Answer: The default AUTOINDEX distance metric in Milvus Client is L2.
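The retrieval and "Question + Context" steps of the pipeline above can be sketched in a few lines. This is a toy, assuming a word-count embedder in place of a real embedding model and an in-memory list in place of Milvus; the function names and the prompt template are hypothetical, and the call to the generative model is left out.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, chunks, k=1):
    """Search step: return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question, context_chunks):
    """Combine question and retrieved context for the generative model."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

chunks = [
    "Milvus is an open-source vector database for GenAI projects.",
    "Milvus supports dense and sparse embeddings, filtering, and reranking.",
]
question = "Which embeddings does Milvus support?"
context = retrieve(question, chunks, k=1)
prompt = build_prompt(question, context)
print(prompt)
```

The resulting prompt is what would be sent to the generative model; grounding the answer in retrieved text is what the diagram labels "Reliable Answers".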
… we need multimodal RAG
[Diagram: multimodal RAG pipeline with components Multimodal Model, Milvus, Search, Gen AI Model, and Reliable Answers; flow labels Question and Question + Context]
Question: What kind of music did they play in the pre-show?
Answer: The musician played improvised electronic music.