Building an Agentic RAG locally with Ollama and Milvus

chloewilliams62 1,136 views 21 slides Jun 27, 2024

About This Presentation

With the rise of Open-Source LLMs like Llama, Mistral, Gemma, and more, it has become apparent that LLMs might also be useful even when run locally. In this talk, we will see how to deploy an Agentic Retrieval Augmented Generation (RAG) setup using Ollama, with Milvus as the vector database on your ...


Slide Content

1 | © Copyright 8/16/23 Zilliz
Stephen Batifol | Zilliz

Unstructured Data Meetup, June 25th
Using LLM Agents with Llama 3, LangGraph and Milvus

Speaker
Stephen Batifol
Developer Advocate, EMEA, Zilliz
[email protected]
https://www.linkedin.com/in/stephen-batifol/
https://twitter.com/stephenbtl

•27K+ GitHub Stars
•25M+ Downloads
•250+ Contributors
•2,600+ Forks

Milvus is an open-source vector database for GenAI projects. pip install on your laptop, plug into popular AI dev tools, and push to production with a single line of code.

•Easy Setup: Pip-install to start coding in a notebook within seconds.
•Reusable Code: Write once, and deploy with one line of code into the production environment.
•Integration: Plug into OpenAI, LangChain, LlamaIndex, and many more.
•Feature-rich: Dense & sparse embeddings, filtering, reranking and beyond.

Seamless integration with all popular AI toolkits

RAG
(Retrieval Augmented Generation)

Basic Idea
Use RAG to force the LLM to work with your data
by injecting it via a vector database like Milvus
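
A minimal sketch of that idea, assuming a toy word-count "embedding" and an in-memory list in place of a real embedding model and Milvus. The names `embed`, `retrieve`, and `build_prompt` are hypothetical helpers for illustration, not part of any library.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query (the vector database's job)."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, context_docs):
    """Inject the retrieved documents into the prompt sent to the LLM."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

DOCS = [
    "Milvus is an open-source vector database.",
    "Ollama runs LLMs locally.",
    "LangGraph builds stateful agent workflows.",
]

top = retrieve("what is milvus", DOCS, k=1)
prompt = build_prompt("what is milvus", top)
```

The LLM only sees what `build_prompt` puts in front of it, which is how retrieval steers the answer toward your own data.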

Basic RAG Architecture

01
Tech Stack

LangChain
•Framework for building LLM applications
•Focus on retrieving data and integrating with LLMs
•Integrations with most popular AI tools

LangGraph by LangChain
•Build stateful apps with LLMs and multi-agent workflows
•Cycles and Branching
•Human-in-the-Loop
•Persistence
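
To make "stateful", "cycles" and "branching" concrete, here is a rough sketch in plain Python. `TinyGraph` is a hypothetical class written for this illustration; it is not the LangGraph API, and it leaves out human-in-the-loop and persistence.

```python
class TinyGraph:
    """Hypothetical mini state graph: nodes update state, routers pick the next node."""

    def __init__(self):
        self.nodes = {}
        self.routers = {}

    def add_node(self, name, fn):
        self.nodes[name] = fn

    def add_router(self, name, router):
        self.routers[name] = router

    def run(self, start, state, max_steps=10):
        current = start
        for _ in range(max_steps):          # step cap guards against endless cycles
            if current == "END":
                break
            state = {**state, **self.nodes[current](state)}  # merge the node's update
            current = self.routers[current](state)           # branch on the new state
        return state

def draft(state):
    return {"attempts": state["attempts"] + 1}

def review(state):
    return {"approved": state["attempts"] >= 2}

graph = TinyGraph()
graph.add_node("draft", draft)
graph.add_node("review", review)
graph.add_router("draft", lambda s: "review")
graph.add_router("review", lambda s: "END" if s["approved"] else "draft")  # cycle back

final = graph.run("draft", {"attempts": 0, "approved": False})
```

The review step sends the flow back to `draft` until the state says it is approved, which is the kind of loop a plain chain of calls cannot express.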

Ollama
•Run LLMs anywhere

•Run Embedding Models
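
As a hedged sketch of both uses, the code below builds requests for Ollama's local REST API. It assumes a server on the default address `http://localhost:11434`, the `/api/generate` and `/api/embeddings` endpoints, and a pulled model named `llama3`; check these against your installed Ollama version. The helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default local address

def build_request(endpoint, payload):
    """Build (but do not send) a JSON POST request to the local Ollama server."""
    return urllib.request.Request(
        f"{OLLAMA_URL}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def generate_request(prompt, model="llama3"):
    """Request a single, non-streamed completion."""
    return build_request("/api/generate", {"model": model, "prompt": prompt, "stream": False})

def embedding_request(text, model="llama3"):
    """Request an embedding vector for the given text."""
    return build_request("/api/embeddings", {"model": model, "prompt": text})

def send(request):
    """Send the request; needs a running Ollama server with the model pulled."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())
```

With the server running, `send(generate_request("Why is the sky blue?"))` should return a JSON object whose `response` field holds the generated text.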


02
Agentic RAG

General Ideas
•Routing: Adaptive RAG
  •Route questions to different retrieval approaches

•Fallback: Corrective RAG
  •Fall back to web search if docs are not relevant to the query

•Self-Correction: Self-RAG
  •Try to fix answers that contain hallucinations or don't address the question
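
The three ideas can be sketched as one control flow. This is an illustration only: the keyword rule, word-overlap grader and substring check below are stand-ins for the LLM calls a real setup would make, and every function name is hypothetical.

```python
def route(question):
    """Adaptive RAG: choose a retrieval approach (keyword rule stands in for an LLM router)."""
    return "vectorstore" if "milvus" in question.lower() else "web_search"

def grade_documents(question, docs):
    """Corrective RAG: keep docs sharing a word with the question (stand-in for an LLM grader)."""
    words = set(question.lower().split())
    return [d for d in docs if words & set(d.lower().split())]

def is_grounded(answer, docs):
    """Self-RAG: accept the answer only if a retrieved doc supports it (stand-in check)."""
    return any(answer.lower() in d.lower() for d in docs)

def answer_question(question, vector_search, web_search, generate, max_retries=2):
    source = route(question)
    docs = vector_search(question) if source == "vectorstore" else web_search(question)
    docs = grade_documents(question, docs)
    if not docs:  # nothing relevant survived grading: fall back to web search
        source, docs = "web_search", web_search(question)
    for attempt in range(max_retries + 1):  # regenerate until the answer is grounded
        answer = generate(question, docs, attempt)
        if is_grounded(answer, docs):
            return {"answer": answer, "source": source, "attempts": attempt + 1}
    return {"answer": None, "source": source, "attempts": max_retries + 1}
```

`vector_search`, `web_search` and `generate` are passed in as plain callables, so the same flow can be exercised with stubs before wiring in Milvus, a search tool and a local model.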

General Ideas for Agents
•Reflection
  •Self-correction mechanism

•Planning
  •The agent doesn't just react to the query
  •Lays out a step-by-step process to retrieve or generate the best answer

•Tool use
  •Search the knowledge base in Milvus
  •Search the web for more information
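
A small sketch of planning plus tool use, assuming two stub tools in place of a real Milvus search and a real web search. The fixed plan and the "stop when a step returns evidence" rule are simplifications; an actual agent would ask the LLM to plan and reflect. All names are hypothetical.

```python
def search_knowledge_base(query):
    """Stand-in for a Milvus similarity search over your documents."""
    kb = {"milvus": "Milvus is an open-source vector database."}
    return [text for key, text in kb.items() if key in query.lower()]

def search_web(query):
    """Stand-in for a web search tool."""
    return [f"(web result for: {query})"]

TOOLS = {"knowledge_base": search_knowledge_base, "web": search_web}

def plan(query):
    """Planning: lay out ordered steps (fixed here; an LLM would decide in practice)."""
    return ["knowledge_base", "web"]

def run_agent(query):
    """Run the planned tools in order and stop once a step yields evidence."""
    trace = []
    for tool_name in plan(query):
        results = TOOLS[tool_name](query)
        trace.append(tool_name)
        if results:  # simple reflection step: is there enough to answer?
            return {"evidence": results, "trace": trace}
    return {"evidence": [], "trace": trace}
```

Keeping tools in a dictionary keyed by name makes it straightforward to let the model pick a tool by returning that name.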

General Ideas

Demo!

milvus.io
github.com/milvus-io/
@milvusio
@stephenbtl


/in/stephen-batifol
Questions?

02
Advanced RAG techniques

Types of RAG Enhancement Techniques
●Divide & Conquer
○Query Enhancement: better express or process the query intent.
○Indexing Enhancement: data cleanup, better parser and chunking
○Retriever Enhancement: more retrievers and hybrid search strategy
○Generator Enhancement: prompt engineering and more powerful LLM
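
As one possible illustration of a hybrid search strategy, the sketch below merges a dense and a sparse ranking with reciprocal rank fusion (RRF). RRF is chosen here as an example; the slides do not say which fusion method was used. The document ids and the constant `k=60` are assumptions for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each doc scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for a stable order
    return sorted(scores, key=lambda d: (-scores[d], d))

dense = ["doc_a", "doc_b", "doc_c"]    # hypothetical dense-embedding ranking
sparse = ["doc_b", "doc_d", "doc_a"]   # hypothetical sparse/keyword ranking
fused = reciprocal_rank_fusion([dense, sparse])
```

Documents that rank well in both lists rise to the top, so `doc_b` ends up ahead of `doc_a` even though the dense retriever alone preferred `doc_a`.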

Milvus Architecture
[Diagram of the Milvus vector database. Components shown: SDK; Load Balancer; Access Layer (Proxy, Proxy); Coordinator Service (Root, Query, Data, Index); Worker Node (Query Node, Data Node, Index Node); Meta Storage (etcd); Message Storage (Log Broker); Object Storage (Minio / S3 / AzureBlob) holding Log Snapshot, Delta File and Index File. Labeled flows: DDL/DCL, DML, NOTIFICATION, CONTROL SIGNAL.]