The rapid advancement of language models has greatly improved text generation and interaction, but integrating these models into industry poses challenges in balancing data utility with privacy. This presentation will provide practical insights into building a Retrieval Augmented Generation (RAG) system, with a focus on protecting sensitive data. RAG combines the generative power of language models with precise information retrieval, allowing relevant data to be incorporated on demand without exposing private information. We’ll explore the technical aspects of RAG, including its architecture and privacy safeguards, and present a case study demonstrating how RAG meets specific privacy requirements and the challenges faced during implementation.
Size: 4.72 MB
Language: en
Added: Sep 16, 2024
Slides: 27 pages
Slide Content
12/09/2024 Privacy-Aware Retrieval Augmented Generation | Taming the uncharted
About us: CROZ is a Croatian-German biz-tech consulting company that brings the most complex organizations to where new and improved value is created.
The stories that tell our story
Industrialized AI: Industrialized AI means transferring the best practices of software engineering and DevOps to AI models and associated artifacts (data, models, pipelines, automation, monitoring, audit trail, ...). Simple / old school: Day 1 software; suspect process; single developer; manual build, no test; lucky punch. Industrialized AI: living software, constant dev; auditable; DevOps team(s); CI/CD pipeline; reproducible at any time.
THE WHAT_ Why does GenAI matter? It has general knowledge; it communicates using natural language; it is a basic reasoning engine.
THE WHAT_ GenAI is much more than token generation. TEXT: summarize; classify; chat with; translate; generate content (user stories, acceptance criteria, test cases, code translation, code generation). AUDIO: speech to text; text to speech; voice editing; translation. IMAGE: image generation; video generation; OCR; transcribe and analyze.
THE WHAT_ Should we build or buy?
THE WHAT_ “Data is still the new oil”
Case study: HR virtual assistant in a bank. Goal: an assistant capable of answering frequent HR questions, built to demonstrate the RAG architecture. Challenges: How to trim the assistant's knowledge base access if some documents are restricted? What should the UX be when latency is noticeable? How to answer a question if the answer is part of a diagram in the document? What if documents change revisions frequently? GDPR compliance (processing in the EU, anonymization).
Case study: HR virtual assistant in a bank
THE HOW_ The brute force way: prompt + all documents from the knowledge base + question -> LLM (queries the documents) -> answer.
THE HOW_ The brute force way. Number/size of the documents: the number of documents may be huge; there may be multiple versions of a document; the LLM context window may be too small; large context = large cost; what to do with images or diagrams? Privacy: documents may be confidential; documents can contain PII; a good prompt does not prevent data leakage; prompt injection is possible.
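The PII concern above can be made concrete with a minimal masking sketch. The slides do not describe how anonymization was done, so the two regular expressions and the `mask_pii` helper below are hypothetical. The phone pattern is deliberately crude and will also match other digit runs such as dates. Real PII detection needs far broader coverage.

```python
import re

# Hypothetical patterns for illustration only; not taken from the slides.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s/-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like digit runs with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


masked = mask_pii("Contact ana@example.com or +385 1 2345 678.")
print(masked)
```

Masking before documents reach the index or the prompt means that a leaked chunk exposes placeholders rather than the original values.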
THE WHY_ What does it take to build a virtual assistant differently? 1. Understand the intent/query and the context. 2. Find relevant documents or actions. 3. Generate an answer or action.
THE HOW_ What do we have in the knowledge base? Documents with mainly machine-readable text. Every document contains its metadata. Documents may include images, charts or diagrams.
THE HOW_ Can lexical search help us? Example based on Wikipedia search. Query: "What is the capital of the United States?" Lexical search (BM25) results: "Capital punishment (the death penalty) has existed in the United States …"; "Ohio is one of the 50 states in the United States. Its capital is Columbus."; "Nevada is one of the United States’ states. Its capital …"
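To show why the lexical results above match words rather than meaning, here is a minimal sketch of Okapi BM25 scoring in plain Python. The tokenizer, the parameter values `k1=1.5` and `b=0.75`, and the third document are assumptions for illustration. The search engine behind the slide's example is not specified.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for d in tokenized for term in set(d))
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Non-negative IDF variant.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Capital punishment has existed in the United States.",
    "Ohio is one of the 50 states in the United States. Its capital is Columbus.",
    "Bananas are yellow.",  # hypothetical document with no query terms
]
scores = bm25_scores("What is the capital of the United States?", docs)
```

The first two documents receive positive scores simply because they share the words "capital", "United" and "States" with the query, even though neither answers it.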
THE HOW_ Multilingual lexical search: document or query -> language identification -> English index / Croatian index / German index. Not scalable!
Where semantic search shines. Lexical gap: USA vs US vs United States. Respecting word order: "Flight from Zagreb to Brazil" vs "Flight from Brazil to Zagreb". Related terms: "correlation NumPy" vs "correlation SciPy". Multilingual: "Human resources" vs "ljudski resursi".
THE HOW_ Transforming content into a short code: embedding. Word -> embedding -> vector. Find semantically relevant documents.
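Once content is represented as vectors, "semantically relevant" is usually measured by vector similarity. The sketch below uses cosine similarity. The three-dimensional vectors and the document names are made up by hand for illustration and do not come from a real embedding model, which would produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy vectors (assumption): not produced by any embedding model.
doc_vecs = {
    "salary_policy": [0.9, 0.1, 0.0],
    "car_policy": [0.1, 0.9, 0.0],
    "travel_policy": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]
best = top_k(query_vec, doc_vecs, k=2)
```

A vector database performs the same nearest-neighbour lookup, typically with an approximate index so that it scales beyond a linear scan.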
What do we get from each search type? Lexical search: very fast; works great for exact matching; works great with short queries; interpretable. Semantic search: semantic context understanding; multilingual; multimodal; works well with longer queries.
What if we use both? Query: "Which day of the month can I expect my salary to be paid?" Lexical search candidates + semantic search candidates -> hybrid search that combines the user query with the relevance scores from both candidate groups.
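The slide does not say how the two candidate groups are merged. One common option is reciprocal rank fusion (RRF), which combines ranks rather than raw scores and therefore avoids having to normalize BM25 and cosine values onto one scale. The sketch below assumes RRF with the commonly used constant `k=60`. The document ids are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists; each hit adds 1 / (k + rank) to its id."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical candidate lists, best match first.
lexical = ["payroll_faq", "salary_policy", "car_policy"]
semantic = ["salary_policy", "benefits", "payroll_faq"]
fused = reciprocal_rank_fusion([lexical, semantic])
```

Documents that appear in both lists rise to the top, while documents found by only one retriever stay in the result with a lower score.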
Find semantically relevant documents: user query -> embedding -> vector -> vector database -> top most relevant documents from the database. Examples: "Where can I find details of my account?"; "How do I make a loan installment payment?"; "How do I retrieve monthly statements for a transaction account?"; "What’s my account balance?" Re-ranking [screenshot of app].
LLM as a basic reasoning engine: documents -> chunking strategy -> relevant chunks using hybrid search + question ("Which day of the month can I expect my salary to be paid?") -> LLM -> answer.
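The slide names a chunking strategy without saying which one was used. As a minimal sketch, the hypothetical `chunk_words` helper below assumes fixed-size word windows with overlap. The window and overlap sizes are illustrative defaults, not values from the presentation.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into word windows of chunk_size that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of indexing some text twice.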
THE WHY_ Let's build a virtual assistant (or a copilot). 2. Find relevant documents. "Which day of the month can I expect my salary to be paid?" "You can expect your salary to be paid by the 15th of the month." "Can you tell me more about the car policy in the company?" "Certainly! All employees that are at L6 or above can get a company car if they are more than 3 years ..." "Unfortunately, I cannot answer this question." (Roles shown: Student, HR manager.)
Security trimming with roles. Define different user access groups (e.g. all vs restricted): {user: dkopljar, group_id: 1001}. Define document-level permissions for defined groups: {document: contract.txt, group_id: [1001]}. Filter before searching relevant documents: {filter: (group_ids: search.in(1001))}.
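The three steps above can be sketched in memory as follows. The `Document` class, the `trim_by_groups` helper, group 1000 as the "all" group, and `faq.txt` are hypothetical. Only `contract.txt` and group 1001 come from the slide. In a real deployment the filter would be pushed into the search query, as the slide's filter expression suggests, rather than applied in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    name: str
    group_ids: frozenset  # groups allowed to read this document
    text: str


def trim_by_groups(documents, user_group_ids):
    """Keep only documents that share at least one group with the user."""
    allowed = set(user_group_ids)
    return [doc for doc in documents if doc.group_ids & allowed]


documents = [
    Document("contract.txt", frozenset({1001}), "Restricted contract terms."),
    Document("faq.txt", frozenset({1000, 1001}), "Salary is paid monthly."),
]

restricted_user_docs = trim_by_groups(documents, {1001})  # e.g. dkopljar
general_user_docs = trim_by_groups(documents, {1000})
```

Filtering before retrieval means that restricted text never becomes a search candidate, so it cannot reach the prompt even if the model is manipulated.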
RAG with security trimming: question -> filter documents based on role -> lexical search candidates + semantic search candidates -> re-ranking using hybrid search -> generate answer using 1) prepared prompt, 2) relevant documents, 3) user question -> answer.
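The final generation step combines a prepared prompt, the relevant documents and the user question. A minimal sketch of that assembly is shown below. The wording of `PREPARED_PROMPT` and the layout of the prompt are assumptions, since the slides do not show the actual prompt.

```python
# Hypothetical system prompt; the presentation does not show the real one.
PREPARED_PROMPT = (
    "You are an HR assistant. Answer only from the documents below. "
    "If the answer is not in them, say that you cannot answer."
)


def build_prompt(question, chunks, system_prompt=PREPARED_PROMPT):
    """Assemble prepared prompt, numbered document chunks and the question."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        f"{system_prompt}\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "Which day of the month can I expect my salary to be paid?",
    ["Salaries are paid by the 15th of the month."],
)
```

Because only chunks that survived the role filter are passed in, the prompt contains nothing the user is not allowed to see.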
Case study: HR virtual assistant in a bank. Goal: an assistant capable of answering frequent HR questions, built to demonstrate the RAG architecture. Specifics of the environment: deploy in the cloud or on-prem (Azure or K8s); access management (Entra ID or Keycloak); all documents are stored in SharePoint or MinIO; use Azure OpenAI GPT or Llama 3; backend in Spring, frontend in React; modular design (change LLM, embeddings, …).