GPT, LLM, RAG, and RAG in Action: Understanding the Future of AI-Powered Information Retrieval

Muralidharantnj · 36 slides · Mar 09, 2025

About This Presentation

✅ Understanding GPT & LLMs – How Large Language Models work and their impact on AI-driven applications.
✅ Challenges of LLMs – Why traditional LLMs struggle with real-time knowledge updates, factual accuracy, and dynamic content.
✅ Introduction to RAG – How RAG enhances LLMs by combining...


Slide Content

RAG in Action – Let's Build AI-Powered Document Intelligence
Transform Static Documents into Interactive Knowledge
Muralidharan Deenathayalan
🚀 Director – Solution Architect & Technology
🌍 www.ryvalx.com

Agenda
We will discuss:
1. Introduction to GPT, LLM, RAG
2. Deep dive into RAG
3. Let's build the app
4. Q & A

Introduction to GPT
Let's understand what GPT is. GPT stands for Generative Pre-trained Transformer: an AI model designed to generate human-like text based on the input it receives. It learns the pattern of the input and generates meaningful output based on its massive training data.
G (Generative): The model can generate text, code, and summaries, and can even create new content based on the input it is given. Example: "Write a hello world program in C#."
P (Pre-trained): The model has been exposed to a huge amount of text data before being fine-tuned for specific tasks. Example: "Expand JSON."
T (Transformer): The transformer architecture captures how each word in a sentence relates to the others. Example: "Ravi helped Ramu with his homework because he is good at math." Here, the transformer intelligently identifies that "he" refers to Ravi, not Ramu.
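The slide's "Write a hello world program in C#" prompt can be exercised programmatically. A minimal sketch in Python using the official `openai` package; the model name `gpt-4o-mini` and the system prompt are illustrative assumptions, and an `OPENAI_API_KEY` environment variable is required for the actual call:

```python
# Sketch of prompting a GPT model from Python (pip install openai).
# Model name and system prompt below are assumptions, not fixed choices.

def build_messages(user_prompt: str) -> list[dict]:
    """Assemble the chat-format message list the API expects."""
    return [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": user_prompt},
    ]

def ask_gpt(user_prompt: str) -> str:
    """Send the prompt to a GPT model and return the generated text."""
    from openai import OpenAI  # deferred import: only needed for the real call

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=build_messages(user_prompt),
    )
    return response.choices[0].message.content

# Example (requires network and an API key):
# print(ask_gpt("Write a hello world program in C#"))
```

The prompt assembly is kept separate from the network call so it can be reused and tested independently of the API.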

Evolution to GPT

Key capabilities of GPT

Limitations of GPT

Components of GPT

How GPT Works

GPT Architecture

Frameworks for GPT

Apps using GPT
1. ChatGPT (by OpenAI): OpenAI's chatbot that interacts in a conversational way. Use cases: customer support, personal assistance, content generation. Example: users ask questions, and ChatGPT provides detailed, human-like responses.
2. GitHub Copilot: AI-powered coding assistant. Use cases: auto-generating code, suggesting code completions, improving developer productivity. Example: when writing Python functions, Copilot suggests the next lines of code automatically.
3. Jasper AI: AI-powered content writing tool. Use cases: blog posts, marketing copy, email content, SEO writing. Example: a company uses Jasper to generate product descriptions and social media posts.
4. Notion AI: AI-powered writing assistant integrated into Notion. Use cases: automating note-taking, summarizing documents, improving writing.

Introduction to LLM
Let's understand what an LLM is. LLM stands for Large Language Model: an advanced AI system trained on vast amounts of text data to understand, generate, and manipulate human language.
L (Large): Refers to the model size, typically billions to trillions of parameters.
L (Language): Specializes in understanding and generating text-based data.
M (Model): A neural network trained to perform NLP (Natural Language Processing) tasks.

Evolution of LLMs
Year | Model | Organization | Parameters | Key Innovations/Features
2017 | Transformer | Google | N/A | Introduced the transformer architecture with attention mechanisms
2018 | BERT | Google | 340M | Bidirectional training, masked language modeling
2018 | GPT-1 | OpenAI | 117M | First GPT model, unidirectional training
2019 | GPT-2 | OpenAI | 1.5B | Improved text generation, more coherent long-form content
2019 | XLNet | Google | 340M | Permutation-based training, addressing BERT limitations
2019 | RoBERTa | Facebook | 355M | Optimized BERT training methodology
2020 | T5 | Google | 11B | Text-to-text transfer transformer, unified NLP tasks
2020 | GPT-3 | OpenAI | 175B | Few-shot learning, emergent abilities at scale
2021 | Codex | OpenAI | 12B | Code generation, powers GitHub Copilot
2021 | Gopher | DeepMind | 280B | Enhanced knowledge representation
2021 | DALL-E | OpenAI | 12B | Text-to-image generation capabilities
2022 | PaLM | Google | 540B | Improved reasoning, multilingual capabilities
2022 | ChatGPT | OpenAI | 175B | RLHF-optimized conversational interface
2022 | Chinchilla | DeepMind | 70B | Compute-optimal scaling laws
2023 | GPT-4 | OpenAI | Not disclosed | Multimodal capabilities, improved reasoning
2023 | Claude | Anthropic | Not disclosed | Constitutional AI, focus on alignment
2023 | Llama | Meta | 7B–70B | Open-weight models for research
2023 | Gemini | Google | Not disclosed | Natively multimodal architecture
2024 | Claude 3 | Anthropic | Not disclosed | Enhanced reasoning, multimodal capabilities
2024 | GPT-4o | OpenAI | Not disclosed | Optimized multimodal processing, reduced latency
2024 | Llama 3 | Meta | 8B–400B | Improved instruction following, multilingual support

LLM == GPT (similarities)
Transformer-Based: Both utilize the transformer architecture as their foundational design.
Large Parameter Count: Typically billions to trillions of parameters.
Pre-training Approach: Both are pre-trained on massive text corpora before any specialized fine-tuning.
Text Generation Capabilities: Both can generate coherent, contextually relevant text in response to prompts.
Fine-tuning Applications: Both can be fine-tuned for specialized tasks like summarization, translation, and question-answering.

LLM != GPT (differences)
Classification: GPT is a specific type/family of language models; LLM is the broader category that includes GPT and many others.
Architecture: GPT uses the transformer decoder architecture; LLMs can use various architectures (encoder, decoder, or both).
Developer: GPT is developed exclusively by OpenAI; LLMs are created by many organizations (Google, Meta, Anthropic, etc.).
Training Method: GPT uses autoregressive pre-training (predicting the next token); LLMs use various methods including masked language modeling, span corruption, etc.
Specialization: GPT is primarily optimized for generative tasks; LLMs include models specialized for different purposes (understanding, translation, multimodal processing).

LLM Architecture

LLM Frameworks

Types of LLM

Apps built using LLM

Introduction to RAG
Let's understand what RAG is. RAG stands for Retrieval-Augmented Generation. RAG is an AI method that helps language models find useful information before answering, making responses more accurate and reliable. It bridges the gap between search and generation.
R (Retrieval): Finding the information you need from a collection of documents or data.
A (Augmented): Adding the retrieved information to enhance the AI's knowledge.
G (Generation): Creating a new, helpful response based on the retrieved information.
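The three stages above can be sketched end to end in a few lines. This toy uses keyword overlap for retrieval and a string template in place of the LLM call; a real system would use embeddings and a model, but the Retrieve, Augment, Generate flow is the same:

```python
# Toy Retrieve -> Augment -> Generate pipeline illustrating the RAG flow.
# Retrieval is naive keyword overlap; "generation" is a placeholder template.

DOCUMENTS = [
    "RAG stands for Retrieval-Augmented Generation.",
    "Transformers capture how words in a sentence relate to each other.",
    "Azure AI Search can index documents stored in Blob Storage.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """R: rank documents by how many query words they share."""
    words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def augment(query: str, context: list[str]) -> str:
    """A: stuff the retrieved context into the prompt."""
    return f"Context: {' '.join(context)}\nQuestion: {query}"

def generate(prompt: str) -> str:
    """G: placeholder for the LLM call that would answer from the prompt."""
    return f"Answer grounded in: {prompt}"

question = "What does RAG stand for?"
answer = generate(augment(question, retrieve(question, DOCUMENTS)))
```

Swapping `retrieve` for a vector search and `generate` for a model call turns this skeleton into a production pipeline without changing its shape.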

How RAG works

How RAG works

RAG - Architecture

Types of RAG Basic RAG Reranking RAG Multi-Vector RAG Self Query RAG Agent RAG

Types of RAG - Basic

Types of RAG - Reranking
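The reranking variant named on this slide adds a second, precision-oriented scoring pass over the candidates returned by a fast first-stage retriever. A minimal sketch; the scoring function here is a stand-in for the cross-encoder model a real reranker would use:

```python
# Reranking RAG sketch: cheap recall-oriented retrieval, then a second
# pass that reorders candidates before they reach the generator.

def first_stage_retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    """Fast pass: keyword overlap, tuned for recall."""
    q = set(query.lower().split())
    return sorted(docs,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Slow pass, tuned for precision. A real system would score each
    (query, candidate) pair with a cross-encoder; this stand-in simply
    prefers candidates containing the query as an exact phrase."""
    return sorted(candidates,
                  key=lambda d: query.lower() in d.lower(),
                  reverse=True)

docs = [
    "vector databases store embeddings",
    "reranking improves retrieval precision",
    "retrieval precision matters for RAG",
]
top = rerank("retrieval precision", first_stage_retrieve("retrieval precision", docs))
```

The two-pass split is the design point: the first stage keeps latency low over a large corpus, while the reranker spends its compute budget on only a handful of candidates.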

Types of RAG – Multi-Vector

Types of RAG – Self-Query

RAG Frameworks
LangChain: Popular framework for building RAG applications, with components for document processing, embedding, retrieval, and generation.
LlamaIndex: Specialized framework for connecting LLMs to external data.
Semantic Kernel: Microsoft's framework for AI orchestration, including RAG.
Chroma / FAISS / Pinecone: Vector stores and libraries for storing and retrieving embeddings.
Sentence-Transformers: Framework for creating text embeddings.
Unstructured: Library for extracting text from various document formats.
Weaviate / Milvus / Qdrant: Vector databases with hybrid search capabilities.
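Under the hood, the vector databases listed above all perform the same core operation: nearest-neighbor search over embedding vectors. A minimal cosine-similarity sketch of that operation, with tiny hand-made vectors standing in for real embeddings:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: the standard relevance measure for embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-d "embeddings"; real ones come from a model such as
# Sentence-Transformers and have hundreds of dimensions.
index = {
    "doc_a": [1.0, 0.0, 0.1],
    "doc_b": [0.0, 1.0, 0.0],
}

def nearest(query_vec: list[float], index: dict, k: int = 1) -> list[str]:
    """The lookup a vector database accelerates with ANN index structures."""
    return sorted(index,
                  key=lambda d: cosine(query_vec, index[d]),
                  reverse=True)[:k]
```

This brute-force scan is exact but linear in corpus size; the databases above exist to approximate it efficiently over millions of vectors.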

Apps using RAG Microsoft Copilot for Microsoft 365 : Retrieves information from your documents, emails, meetings, and chats to provide contextual assistance. Zendesk AI : Retrieves relevant support documentation to assist agents and customers. Klarity : Analyzes contracts and legal documents using RAG to extract and validate information. Glean : Enterprise search that uses RAG to provide more relevant results from company documents. Claude with Claude Artifact Search : Retrieves information from uploaded documents to augment responses. GitHub Copilot X : Uses RAG to pull information from documentation and codebase context.

Let's Build AI-Powered Document Intelligence – Prerequisites
Azure Resources:
Azure Account: an active Azure subscription.
Azure OpenAI Service: with API access and a deployed model.
Azure AI Search: for document indexing and retrieval.
Azure Storage Account: for storing your documents (e.g., Blob Storage).
Technical Requirements:
Python 3.8+: the application is written in Python.
Required Python packages: azure-search-documents, azure-core, openai, python-dotenv.
Configuration:
Azure OpenAI: endpoint, API key, and model name.
Azure AI Search: endpoint, API key, and index name.
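A condensed sketch of how these pieces fit together, using the `azure-search-documents` and `openai` packages from the prerequisites. The environment-variable names, the `content` field of the search index, and the API version are assumptions to adapt to your setup:

```python
import os

def build_prompt(question: str, passages: list[str]) -> list[dict]:
    """Ground the model's answer in the passages retrieved from the index."""
    context = "\n\n".join(passages)
    return [
        {"role": "system",
         "content": "Answer using only the context below.\n\n" + context},
        {"role": "user", "content": question},
    ]

def run(question: str) -> str:
    # Imports deferred so the pure prompt-building logic has no dependencies.
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from openai import AzureOpenAI

    # Retrieval: top passages from Azure AI Search.
    search = SearchClient(
        endpoint=os.environ["SEARCH_ENDPOINT"],
        index_name=os.environ["SEARCH_INDEX"],
        credential=AzureKeyCredential(os.environ["SEARCH_KEY"]),
    )
    passages = [doc["content"] for doc in search.search(search_text=question, top=3)]

    # Generation: Azure OpenAI completes the augmented prompt.
    client = AzureOpenAI(
        azure_endpoint=os.environ["OPENAI_ENDPOINT"],
        api_key=os.environ["OPENAI_KEY"],
        api_version="2024-02-01",
    )
    response = client.chat.completions.create(
        model=os.environ["OPENAI_DEPLOYMENT"],  # deployment name, not base model name
        messages=build_prompt(question, passages),
    )
    return response.choices[0].message.content
```

Note that Azure OpenAI takes the deployment name where the standard OpenAI client takes a model name, a frequent source of confusion when porting code between the two.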


Muralidharan Deenathayalan
🚀 Director – Solution Architect & Technology
🌍 www.ryvalx.com
📖 Blog: https://blogs.codingfreaks.net
💼 LinkedIn: https://www.linkedin.com/in/muralidharand
Stay in touch!