GPT, LLM, RAG, and RAG in Action: Understanding the Future of AI-Powered Information Retrieval
Muralidharantnj
Mar 09, 2025
About This Presentation
✅ Understanding GPT & LLMs – How Large Language Models work and their impact on AI-driven applications.
✅ Challenges of LLMs – Why traditional LLMs struggle with real-time knowledge updates, factual accuracy, and dynamic content.
✅ Introduction to RAG – How RAG enhances LLMs by combining retrieval-based search with AI-generated responses.
✅ RAG in Action – Real-world applications, including chatbots, document search, and AI-driven knowledge management.
✅ Future of RAG & AI – How RAG can improve AI systems and what’s next in the field of AI-driven search.
Slide Content
RAG in Action
Let's Build AI-Powered Document Intelligence
Transform Static Documents into Interactive Knowledge
Muralidharan Deenathayalan
🚀 Director – Solution Architect & Technology
🌍 www.ryvalx.com
Agenda
We will cover:
1. Introduction to GPT, LLM, and RAG
2. Deep dive into RAG
3. Let's build the app
4. Q & A
Introduction to GPT
Let's understand what GPT is.
GPT stands for Generative Pre-trained Transformer: an AI model designed to generate human-like text based on the input it receives. It learns patterns in the input and generates meaningful output based on its massive training data.
G (Generative): The model can generate text, code, summaries, and even entirely new content based on the given input. Example: "Write a hello world program in C#".
P (Pre-trained): The model has been exposed to huge amounts of text data before being fine-tuned for specific tasks. Example: "Expand JSON".
T (Transformer): The transformer architecture captures how each word in a sentence relates to the others. Example: "Ravi helped Ramu with his homework because he is good at math." Here, the transformer resolves that "he" refers to Ravi (the helper who is good at math), not Ramu.
Evolution to GPT
Key capabilities of GPT
Limitations of GPT
Components of GPT
How GPT Works
GPT Architecture
Frameworks for GPT
Apps using GPT
1. ChatGPT (by OpenAI): OpenAI's chatbot that interacts in a conversational way. Use Cases: customer support, personal assistance, content generation. Example: users ask questions, and ChatGPT provides detailed, human-like responses.
2. GitHub Copilot: AI-powered coding assistant. Use Cases: auto-generating code, suggesting code completions, improving developer productivity. Example: when writing Python functions, Copilot suggests the next lines of code automatically.
3. Jasper AI: AI-powered content writing tool. Use Cases: blog posts, marketing copy, email content, SEO writing. Example: a company uses Jasper to generate product descriptions and social media posts.
4. Notion AI: AI-powered writing assistant integrated into Notion. Use Cases: automating note-taking, summarizing documents, improving writing.
Introduction to LLM
Let's understand what LLM is.
LLM stands for Large Language Model: an advanced AI system trained on vast amounts of text data to understand, generate, and manipulate human language.
L (Large): Refers to the model size, typically billions to trillions of parameters.
L (Language): Specializes in understanding and generating text-based data.
M (Model): A neural network trained to perform NLP (Natural Language Processing) tasks.
Evolution of LLMs

| Year | Model | Organization | Parameters | Key Innovations/Features |
|------|-------|--------------|------------|--------------------------|
| 2017 | Transformer | Google | N/A | Introduced the transformer architecture with attention mechanisms |
| 2018 | BERT | Google | 340M | Bidirectional training, masked language modeling |
| 2018 | GPT-1 | OpenAI | 117M | First GPT model, unidirectional training |
| 2019 | GPT-2 | OpenAI | 1.5B | Improved text generation, more coherent long-form content |
| 2019 | XLNet | Google | 340M | Permutation-based training, addressing BERT limitations |
| 2019 | RoBERTa | Facebook | 355M | Optimized BERT training methodology |
| 2020 | T5 | Google | 11B | Text-to-text transfer transformer, unified NLP tasks |
| 2020 | GPT-3 | OpenAI | 175B | Few-shot learning, emergent abilities at scale |
| 2021 | Codex | OpenAI | 12B | Code generation, powers GitHub Copilot |
| 2021 | Gopher | DeepMind | 280B | Enhanced knowledge representation |
| 2021 | DALL-E | OpenAI | 12B | Text-to-image generation capabilities |
| 2022 | PaLM | Google | 540B | Improved reasoning, multilingual capabilities |
| 2022 | ChatGPT | OpenAI | 175B | RLHF-optimized conversational interface |
| 2022 | Chinchilla | DeepMind | 70B | Compute-optimal scaling laws |
| 2023 | GPT-4 | OpenAI | Not disclosed | Multimodal capabilities, improved reasoning |
| 2023 | Claude | Anthropic | Not disclosed | Constitutional AI, focus on alignment |
| 2023 | Llama | Meta | 7B-70B | Open-weight models for research |
| 2023 | Gemini | Google | Not disclosed | Natively multimodal architecture |
| 2024 | Claude 3 | Anthropic | Not disclosed | Enhanced reasoning, multimodal capabilities |
| 2024 | GPT-4o | OpenAI | Not disclosed | Optimized multimodal processing, reduced latency |
| 2024 | Llama 3 | Meta | 8B-400B | Improved instruction following, multilingual support |
LLM == GPT

| Similarity | Description |
|------------|-------------|
| Transformer-Based | Both utilize the transformer architecture as their foundational design |
| Large Parameter Count | Typically billions to trillions of parameters |
| Pre-training Approach | Both are pre-trained on massive text corpora before any specialized fine-tuning |
| Text Generation Capabilities | Both can generate coherent, contextually relevant text in response to prompts |
| Fine-tuning Applications | Both can be fine-tuned for specialized tasks like summarization, translation, and question-answering |
LLM != GPT

| Concept | GPT (Generative Pre-trained Transformer) | LLM (Large Language Model) |
|---------|------------------------------------------|----------------------------|
| Classification | A specific type/family of language models | The broader category that includes GPT and many others |
| Architecture | Uses the transformer decoder architecture | Can use various architectures (encoder, decoder, or both) |
| Developer | Exclusively developed by OpenAI | Created by many organizations (Google, Meta, Anthropic, etc.) |
| Training Method | Autoregressive pre-training (predicting the next token) | Various methods, including masked language modeling, span corruption, etc. |
| Specialization | Primarily optimized for generative tasks | Includes models specialized for different purposes (understanding, translation, multimodal processing) |
LLM Architecture
LLM Frameworks
Types of LLM
Apps built using LLM
Introduction to RAG
Let's understand what RAG is.
RAG stands for Retrieval-Augmented Generation. RAG is an AI method that helps language models find useful information before answering, making responses more accurate and reliable. It bridges the gap between search and generation.
R (Retrieval): Finding the information you need from a collection of documents or data.
A (Augmented): Adding the retrieved information to enhance the AI's knowledge.
G (Generation): Creating a new, helpful response based on the retrieved information.
A toy sketch of this retrieve-augment-generate flow is shown below.
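To make the three steps concrete before looking at frameworks, here is a minimal toy sketch in plain Python. Everything in it (the sample documents, the keyword-overlap scoring, the prompt wording) is a hypothetical illustration; real systems use embedding-based retrieval rather than keyword overlap.

```python
# A toy Retrieve -> Augment -> Generate flow in plain Python.
# The document list, scoring, and prompt wording are illustrative placeholders.

documents = [
    "RAG combines retrieval-based search with AI-generated responses.",
    "Azure AI Search indexes documents for fast retrieval.",
    "GPT models generate human-like text from a prompt.",
]

def retrieve(query, k=2):
    """R: rank documents by naive keyword overlap and keep the top k."""
    terms = set(query.lower().split())
    ranked = sorted(documents,
                    key=lambda d: len(terms & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def augment(query, context):
    """A: prepend the retrieved passages to the user's question."""
    return "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query

query = "How does RAG improve AI-generated responses?"
prompt = augment(query, retrieve(query))
print(prompt)  # G: in a real system, this augmented prompt is sent to an LLM
```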
RAG Frameworks
- LangChain - Popular framework for building RAG applications with components for document processing, embedding, retrieval, and generation.
- LlamaIndex - Specialized framework for connecting LLMs to external data.
- Semantic Kernel - Microsoft's framework for AI orchestration, including RAG.
- Chroma / FAISS / Pinecone - Vector databases for storing and retrieving embeddings.
- Sentence-Transformers - Framework for creating text embeddings.
- Unstructured - Library for extracting text from various document formats.
- Weaviate / Milvus / Qdrant - Vector databases with hybrid search capabilities.
A short retrieval sketch using two of these frameworks is shown below.
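As a hedged illustration of two building blocks from this list, the sketch below embeds a few texts with Sentence-Transformers and searches them with FAISS. The model name all-MiniLM-L6-v2 and the sample texts are assumptions chosen for the example, not taken from the slides.

```python
# Minimal embedding retrieval with Sentence-Transformers + FAISS.
# Requires: pip install sentence-transformers faiss-cpu
import faiss
from sentence_transformers import SentenceTransformer

texts = [
    "RAG enhances LLMs by combining retrieval with generation.",
    "Vector databases store embeddings for similarity search.",
    "Transformers model how words in a sentence relate to each other.",
]

# Small general-purpose embedding model (an illustrative choice)
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(texts, convert_to_numpy=True).astype("float32")

# Exact L2-distance index sized to the embedding dimension
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

query = model.encode(["How do I store embeddings?"], convert_to_numpy=True).astype("float32")
distances, ids = index.search(query, 2)  # top-2 nearest neighbours
for rank, doc_id in enumerate(ids[0]):
    print(f"{rank + 1}. {texts[doc_id]} (distance {distances[0][rank]:.3f})")
```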
Apps using RAG
- Microsoft Copilot for Microsoft 365: Retrieves information from your documents, emails, meetings, and chats to provide contextual assistance.
- Zendesk AI: Retrieves relevant support documentation to assist agents and customers.
- Klarity: Analyzes contracts and legal documents using RAG to extract and validate information.
- Glean: Enterprise search that uses RAG to provide more relevant results from company documents.
- Claude with Claude Artifact Search: Retrieves information from uploaded documents to augment responses.
- GitHub Copilot X: Uses RAG to pull information from documentation and codebase context.
Let's Build AI-Powered Document Intelligence
Prerequisites
Azure Resources:
- Azure Account – an active Azure subscription
- Azure OpenAI Service – with API access and a deployed model
- Azure AI Search – for document indexing and retrieval
- Azure Storage Account – for storing your documents (e.g., Blob Storage)
Technical Requirements:
- Python 3.8+ – the application is written in Python
- Required Python packages: azure-search-documents, azure-core, openai, python-dotenv
Configuration:
- Azure OpenAI configuration – endpoint, API key, and model name
- Azure AI Search configuration – endpoint, API key, and index name
A configuration sketch based on these prerequisites is shown below.
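A minimal configuration sketch under these prerequisites might look like the following. The environment-variable names and the API version are assumptions; only the packages listed above are used.

```python
# Configuration sketch, assuming a .env file defining the variables read below.
# Requires: pip install azure-search-documents azure-core openai python-dotenv
import os

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()  # reads .env into environment variables

# Azure OpenAI configuration: endpoint, API key, and model (deployment) name
openai_client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",  # pick the API version your deployment supports
)
deployment = os.environ["AZURE_OPENAI_DEPLOYMENT"]

# Azure AI Search configuration: endpoint, API key, and index name
search_client = SearchClient(
    endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
    index_name=os.environ["AZURE_SEARCH_INDEX"],
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_API_KEY"]),
)
```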
Let's Build AI-Powered Document Intelligence
(These slides present the application code as screenshots; an end-to-end sketch of the flow follows below.)
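Since the code on these slides is not reproduced here, the following is a hedged end-to-end sketch of how the pieces above typically fit together: retrieve matching chunks from Azure AI Search, augment the prompt with them, and generate an answer with the deployed Azure OpenAI model. The index field name "content", the question string, and the prompt wording are assumptions and must match your own index schema.

```python
# End-to-end RAG query sketch (continuing from the configuration above).
# The "content" field name is an assumption; adjust it to your index schema.

def ask(question, top=3):
    # R: retrieve the most relevant document chunks from Azure AI Search
    results = search_client.search(search_text=question, top=top)
    context = "\n\n".join(doc["content"] for doc in results)

    # A: augment the user's question with the retrieved context
    messages = [
        {"role": "system", "content": "Answer using only the provided context. "
                                      "If the answer is not in the context, say so."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

    # G: generate the answer with the deployed Azure OpenAI model
    response = openai_client.chat.completions.create(model=deployment, messages=messages)
    return response.choices[0].message.content

print(ask("What does this document say about quarterly revenue?"))
```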
Stay In Touch
Muralidharan Deenathayalan
🚀 Director – Solution Architect & Technology
🌍 www.ryvalx.com
📖 Blog: https://blogs.codingfreaks.net
💼 LinkedIn: https://www.linkedin.com/in/muralidharand