This deck explains how to build a RAG system that incorporates your own data into an LLM so it can generate more accurate and relevant responses. 📝
If you are interested, you can check out the demo here:
https://github.com/endrol/RagStudy
Slide Content
Build your own RAG system
What is RAG
Retrieval-Augmented Generation (RAG) solves the customization problem by adding
your own data to an LLM
-An LLM alone is powerful
-An LLM + your customized data is more useful
-That data can be large files
-RAG bridges the two *
Steps to build a RAG
Loading: text files, PDFs, websites, databases
Indexing: create vector embeddings
Storing: store the index and its metadata
Querying: match the query against the index and answer
Evaluating: check the pipeline's performance
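A minimal end-to-end sketch of these five stages, assuming a LlamaIndex-based pipeline (the demo repo's actual stack may differ); the `data/` folder and the query string are placeholders:

```python
# Minimal RAG pipeline sketch (assumes `pip install llama-index` and an
# OPENAI_API_KEY in the environment; paths and queries are placeholders).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Loading: read text files / PDFs from a local folder
documents = SimpleDirectoryReader("data").load_data()

# Indexing + Storing: build an in-memory vector index over the documents
index = VectorStoreIndex.from_documents(documents)

# Querying: retrieve matching chunks and synthesize an answer
query_engine = index.as_query_engine()
response = query_engine.query("What does this document say about RAG?")
print(response)
```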
Documents
Many types of files can fit into a RAG system:
-Text files *
-Tables
-PDFs
-Websites *
-Multimodal data (image, video, voice, etc.) -> multimodal (MM) search
Step: Documents
Load documents with open-source community readers.
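For sources beyond local files, a community reader can do the loading; a sketch assuming the LlamaHub web reader package (`llama-index-readers-web`), with the URL as a placeholder:

```python
# Load a web page with an open-source community reader
# (assumes `pip install llama-index-readers-web`; the URL is a placeholder).
from llama_index.readers.web import SimpleWebPageReader

documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://example.com/article"]
)
```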
Indexing and Storage
A RAG framework helps index your data into a structure that is easy to retrieve, usually vector embeddings.
Embedding Model
-OpenAI
-Hugging Face
-etc.
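Swapping embedding providers is usually a one-line configuration change; a sketch assuming LlamaIndex's `Settings` object and the `llama-index-embeddings-huggingface` package (the model name is just an example, not the deck's choice):

```python
# Use a Hugging Face embedding model instead of the OpenAI default
# (assumes `pip install llama-index-embeddings-huggingface`).
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# "BAAI/bge-small-en-v1.5" is an example model, chosen for illustration
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
```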
Step: Indexing and Storage
Once the embeddings are created, we can store them for fast retrieval.
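One way to store and reload the index, assuming LlamaIndex's default local persistence (the `storage/` directory is a placeholder):

```python
# Persist the index to disk, then reload it later for fast retrieval
from llama_index.core import StorageContext, load_index_from_storage

index.storage_context.persist(persist_dir="storage")

# Later, or in another process:
storage_context = StorageContext.from_defaults(persist_dir="storage")
index = load_index_from_storage(storage_context)
```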
Indexing and Storage
Graph structure: GraphRAG
Basic element: the triplet (subject, predicate, object)
Step: NebulaGraph
Create the knowledge graph (KG) and store it
A graph structure lets you:
-analyze relations across multiple hops (A -> B -> C)
-summarize concepts
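A sketch of building a triplet-based KG index backed by NebulaGraph, assuming the `llama-index-graph-stores-nebula` integration and a running NebulaGraph instance; the space, edge, and tag names are placeholders:

```python
# Extract (subject, predicate, object) triplets into a NebulaGraph-backed KG
# (assumes `pip install llama-index-graph-stores-nebula` and a running
# NebulaGraph instance; space/edge/tag names are placeholders).
from llama_index.core import KnowledgeGraphIndex, StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore

graph_store = NebulaGraphStore(
    space_name="rag_demo",
    edge_types=["relationship"],
    rel_prop_names=["relationship"],
    tags=["entity"],
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)

# Create the KG and store it: the LLM extracts triplets from each chunk
kg_index = KnowledgeGraphIndex.from_documents(
    documents,
    storage_context=storage_context,
    max_triplets_per_chunk=2,
)
```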
Answer
After retrieving the top-k sources, the LLM synthesizes an answer from them.
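The retrieval count k is typically configurable on the query engine; a sketch, with `similarity_top_k=3` as an arbitrary example value:

```python
# Retrieve the top-k most similar chunks, then let the LLM synthesize
# an answer grounded in those sources (k=3 is an arbitrary example).
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("How do I build a RAG system?")
print(response)                        # the synthesized answer
print(response.source_nodes[0].text)   # one of the retrieved sources
```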
Evaluations
It's important to have evaluation metrics to check whether the RAG system works.
-Response evaluation
  -does the response match the query?
  -does the response match the retrieved context?
  -does the answer match the ground truth?
  -etc.
-Retrieval evaluation
  -are the retrieved sources relevant to the query?
Steps (RAGAS):
1. generate a test set
2. evaluate the metrics
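A sketch of step 2 with RAGAS, assuming its v0.1-style API (the toy test set stands in for the output of step 1, and RAGAS's API has shifted across versions):

```python
# Evaluate a RAG pipeline with RAGAS metrics
# (assumes `pip install ragas datasets`; API details vary by RAGAS version).
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_precision, faithfulness

# Toy test set: in practice this comes from step 1 (test set generation)
data = Dataset.from_dict({
    "question": ["What is RAG?"],
    "answer": ["RAG adds your own data to an LLM via retrieval."],
    "contexts": [["Retrieval-Augmented Generation combines retrieval with an LLM."]],
    "ground_truth": ["RAG augments LLM generation with retrieved data."],
})

result = evaluate(data, metrics=[faithfulness, answer_relevancy, context_precision])
print(result)
```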
End
-RAG is information retrieval with a smart filter
-It lets you easily build a customized system