Introduction to RAG (Retrieval-Augmented Generation) and Its Application

Uploaded by nikkishrija · 26 slides · Jul 31, 2024


Slide Content

Introduction to RAG
and Its Application
Presented By: Aayush Srivastava

KnolX Etiquettes
Lack of etiquette and manners is a huge turn-off.
▪ Punctuality: Join the session 5 minutes prior to the session start time. We start on time and conclude on time!
▪ Feedback: Make sure to submit constructive feedback for all sessions, as it is very helpful for the presenter.
▪ Silent Mode: Keep your mobile devices in silent mode; feel free to step out of the session if you need to attend an urgent call.
▪ Avoid Disturbance: Avoid unwanted chit-chat during the session.

Agenda
1. Introduction
   ▪ What is an LLM?
   ▪ What is RAG?
2. LLMs and Their Limitations
   ▪ Why is RAG important?
3. RAG Architecture
4. How Does RAG Work?
5. RAG vs. Fine-Tuning
6. Benefits of RAG
7. Applications
8. Demo

Introduction

What is an LLM?
• A large language model (LLM) is a type of artificial intelligence program that can recognize and generate text, among other tasks.
• LLMs are very large models that are pre-trained on vast amounts of data.
• They are built on the transformer architecture, a neural network design consisting of an encoder and a decoder with self-attention capabilities.
• An LLM can perform completely different tasks, such as answering questions, summarizing documents, translating languages, and completing sentences.
OpenAI's GPT-3 model has 175 billion parameters, and some more recent LLMs accept prompts of up to 100K tokens.

What is an LLM?
• In simpler terms, an LLM is a computer program that has been fed enough examples to be able to recognize and interpret human language or other types of complex data.
• The quality of the training samples affects how well an LLM learns natural language, so an LLM's developers may use a more curated dataset.

LLMs and Their Limitations
• Not updated with the latest information: Generative AI uses large language models to generate text, and these models only have information up to the date they were trained. If data beyond that date is requested, accuracy may be compromised.
• Hallucinations: Hallucinations are outputs that are factually incorrect or nonsensical, yet look coherent and grammatically correct. Such information can be misleading and can have a major impact on business decision-making.
• Domain-specific accuracy: LLM output often lacks accurate information when specificity matters more than a generalized answer. For instance, organizational HR policies tailored to specific employees may not be accurately addressed by an LLM-based AI because of its tendency toward generic responses.
• Source citations: In generative AI responses, we don't know which source was used to generate a particular answer. This makes citation difficult, and it is sometimes ethically questionable not to cite the source of information and give due credit.
• Updates require long training runs: Information changes very frequently, and re-training these models with new information requires huge resources and long training time, which is computationally intensive.
• Presenting false information when the model does not have the answer.

What is RAG?
• RAG stands for Retrieval-Augmented Generation.
• It is an advanced technique used with Large Language Models (LLMs).
• RAG combines retrieval and generation processes to enhance the capabilities of LLMs.
• In RAG, the model retrieves relevant information from a knowledge base or external sources.
• This retrieved information is then used in conjunction with the model's internal knowledge to generate coherent and contextually relevant responses.
• RAG enables LLMs to produce higher-quality and more context-aware outputs compared to traditional generation methods.
• Essentially, RAG empowers LLMs to leverage external knowledge for improved performance in various natural language processing tasks.
Retrieval-Augmented Generation (RAG) is an advanced artificial intelligence (AI) technique that combines information retrieval with text generation, allowing AI models to retrieve relevant information from a knowledge source and incorporate it into generated text.

Why is Retrieval-Augmented Generation important?
▪ You can think of the LLM as an over-enthusiastic new employee who refuses to stay informed on current events but will always answer every question with absolute confidence.
▪ Unfortunately, such an attitude can negatively impact user trust and is not something you want your chatbots to emulate!
▪RAG is one approach to solving some of these challenges. It redirects the LLM to retrieve relevant information from
authoritative, pre-determined knowledge sources.
▪Organizations have greater control over the generated text output, and users gain insights into how the LLM generates the
response.

RAG Architecture

Generalized RAG Approach
Let's delve into RAG's framework to understand how it mitigates these challenges.

How Does RAG Work?

Overview
• Retrieval-Augmented Generation (RAG) can be likened to a detective-and-storyteller duo. Imagine you are trying to solve a complex mystery. The detective's role is to gather clues, evidence, and historical records related to the case.
• Once the detective has compiled this information, the storyteller designs a compelling narrative that weaves together the facts and presents a coherent story. In the context of AI, RAG operates similarly.
• The Retriever component acts as the detective, scouring databases, documents, and knowledge sources for relevant information and evidence. It compiles a comprehensive set of facts and data points.
• The Generator component assumes the role of the storyteller, taking the collected information and transforming it into a coherent and engaging narrative, presenting a clear and detailed account of the mystery, much like a detective novel author.
This analogy illustrates how RAG combines the investigative power of retrieval with the creative skills of text generation to produce informative and engaging content, just as our detective and storyteller work together to unravel and present a compelling mystery.

RAG Components
• RAG is an AI framework that allows a generative AI model to access external information not included in its training data or model parameters to enhance its responses to prompts.
▪ RAG seeks to combine the strengths of both retrieval-based and generative methods.
▪ It typically involves using a retriever component to fetch relevant passages or documents from a large corpus of knowledge.
• The retrieved information is then used to augment the generative model's understanding and improve the quality of generated responses.
RAG components:
▪ Retriever
▪ Ranker
▪ Generator
▪ External Data
What is a prompt?
A prompt is the input provided by the user to generate a response. It could be a question, a statement, or any text that serves as the starting point for the model to generate a relevant and coherent continuation.

RAG Components
Let's understand each component in detail.
External data
The new data outside of the LLM's original training dataset is called external data. It can come from multiple data sources, such as APIs, databases, or document repositories. The data may exist in various formats like files, database records, or long-form text.
Vector embeddings
▪ ML models cannot interpret information intelligibly in its raw format and require numerical data as input. They use neural network embeddings to convert real-world information into numerical representations called vectors.
▪ Vectors are numerical values that represent information in a multi-dimensional space.
▪ Embedding vectors encode non-numerical data into a series of values that ML models can understand and relate. Example:
   The Conference (Horror, 2023, Movie) → The Conference (1.2, 2023, 20.0)
   Tales from the Crypt (Horror, 1989, TV Show, Season 7) → Tales from the Crypt (1.2, 1989, 36.7)
▪ The first number in each vector corresponds to a specific genre. An ML model would find that The Conference and Tales from the Crypt share the same genre. Likewise, the model will find more relationships.
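The example above can be sketched in code. The embedding values below are hypothetical hand-made vectors copied from the slide's example (a real system would produce them with a trained embedding model), and cosine similarity is one standard way to compare such vectors:

```python
import math

# Hypothetical, hand-made embeddings following the slide's example:
# first value encodes genre (1.2 = Horror), then year, then a score.
embeddings = {
    "The Conference": [1.2, 2023.0, 20.0],
    "Tales from the Crypt": [1.2, 1989.0, 36.7],
}

def cosine_similarity(a, b):
    """Angle-based similarity of two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical items are maximally similar; related items score close to 1.
sim = cosine_similarity(embeddings["The Conference"],
                        embeddings["Tales from the Crypt"])
```

In practice the vectors come from an embedding model with hundreds of dimensions, but the comparison step works exactly like this toy version.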

RAG Components
Vector DB
▪ A vector DB is a database that stores embeddings of words, phrases, or documents along with their corresponding identifiers.
▪ It allows for fast and scalable retrieval of similar items based on their vector representations.
▪ Vector DBs enable efficient retrieval of relevant information during the retrieval phase of RAG, improving the contextual relevance and quality of generated responses.
▪ Data chunking: Before the retrieval model can search through the data, it is typically divided into manageable "chunks" or segments.
▪ Vector DB examples: Chroma, Pinecone, Weaviate, Elasticsearch
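The chunking step mentioned above can be sketched as a small helper. This is a minimal illustration (fixed-size character chunks with overlap); real pipelines often chunk by tokens, sentences, or document structure instead:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks that overlap, so that a fact
    falling on a chunk boundary still appears whole in one chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# Each chunk would then be embedded and stored in the vector DB
# together with an identifier pointing back to the source document.
```

The overlap is a design trade-off: larger overlap reduces the risk of splitting a relevant passage but stores more redundant text.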

RAG Components
RAG Retriever
▪ The next step is to perform a relevancy search. The user query is converted to a vector representation and matched against the vector database.
▪ The retriever component is responsible for efficiently identifying and extracting relevant information from a vast amount of data. For example, consider a smart chatbot for human-resources questions in an organization. If an employee searches, "How much annual leave do I have?", the system will retrieve annual-leave policy documents alongside the individual employee's past leave record. These specific documents are returned because they are highly relevant to the employee's input; the relevancy was calculated and established using mathematical vector calculations and representations.
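The retrieval step can be sketched end to end with toy parts. Here the "embedding" is a simple bag-of-words vector and the document list is invented for illustration; a production retriever would use a neural embedding model and a vector DB, but the rank-by-similarity logic is the same:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented mini knowledge base for the HR-chatbot example.
documents = [
    "Annual leave policy: employees receive 20 days of paid leave per year.",
    "Expense reports must be filed within 30 days of purchase.",
    "Remote work requests are approved by the line manager.",
]

def retrieve(query, docs, top_k=2):
    """Embed the query, score every document, return the best matches."""
    query_vec = embed(query)
    scored = sorted(docs, key=lambda d: cosine(query_vec, embed(d)),
                    reverse=True)
    return scored[:top_k]
```

Asking `retrieve("How much annual leave do I have?", documents)` surfaces the leave-policy document first, mirroring the slide's example.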

RAG Components
RAG Ranker
▪ The RAG ranker component refines the retrieved information by assessing its relevance and importance. It assigns scores or ranks to the retrieved data points, helping prioritize the most relevant ones.
▪ In the HR chatbot example, the ranker orders the retrieved annual-leave policy documents and the employee's past leave record so that the most relevant items are passed to the model first.
Augment the LLM prompt
▪ Next, the RAG model augments the user input (or prompt) by adding the relevant retrieved data as context. This step uses prompt-engineering techniques to communicate effectively with the LLM.
▪ The augmented prompt allows the large language model to generate an accurate answer to user queries.
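The prompt-augmentation step is mostly string templating. The template below is a hypothetical example of a common pattern (instruct the model to answer only from the supplied context), not a prescribed format:

```python
# Hypothetical prompt template; {context} and {question} are filled in
# at request time with the retrieved chunks and the user's query.
PROMPT_TEMPLATE = """Answer the question using ONLY the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}
Answer:"""

def build_augmented_prompt(question, retrieved_chunks):
    """Number the retrieved chunks and splice them into the template,
    so the model can ground (and cite) its answer."""
    context = "\n\n".join(f"[{i + 1}] {c}"
                          for i, c in enumerate(retrieved_chunks))
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

Numbering the chunks also makes source citation possible: the model can be asked to reference `[1]`, `[2]`, etc. in its answer, addressing the citation limitation discussed earlier.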

RAG Components
RAG Generator
▪ The RAG generator component is essentially the LLM itself, such as GPT.
▪ It is responsible for taking the retrieved and ranked information, along with the user's original query, and generating the final response or output.
▪ The generator ensures that the response aligns with the user's query and incorporates the factual knowledge retrieved from external sources.
Update external data
▪ To keep information current for retrieval, asynchronously update the documents and their embedding representations.
▪ Automated real-time processes: Updates to documents and embeddings occur in real time as soon as new information becomes available. This ensures that the system always reflects the most recent data.
▪ Periodic batch processing: Updates are performed at regular intervals (e.g., daily, weekly) in batches. This approach may be more efficient for systems with large volumes of data or where real-time updates are not necessary.

RAG-Based Chat Application
Simplified sequence diagram illustrating the process of a RAG chat application:
Step 1 - User sends query: The process begins when the user sends a query or message to the chat application.

Understanding RAG Architecture
Step 2 - Chat app forwards query: Upon receiving the user's query, the chat application (ChatApp) forwards this query to the Retrieval-Augmented Generation (RAG) model for processing.
Step 3 - RAG retrieves and generates a response: The RAG model, which integrates retrieval and generation capabilities, processes the user's query. It first retrieves relevant information from a large corpus of data, then uses the LLM to generate a coherent and contextually relevant response based on the retrieved information and the user's query.
Step 4 - LLM returns response: Once the response is generated, the LLM sends it back to the chat application (ChatApp).
Step 5 - Chat app displays response: Finally, the chat application displays the generated response to the user, completing the interaction.
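The five steps above can be collapsed into one small function. Everything here is a stand-in: the knowledge base is invented, retrieval is naive keyword matching, and `generate` is a stub where a real system would call an LLM API:

```python
# Invented mini knowledge base for illustration.
KNOWLEDGE_BASE = {
    "leave": "Employees receive 20 days of paid annual leave per year.",
    "expenses": "Expense reports are due within 30 days.",
}

def retrieve(query):
    """Step 3a: naive keyword retrieval from the knowledge base."""
    return [text for key, text in KNOWLEDGE_BASE.items()
            if key in query.lower()]

def generate(query, context):
    """Step 3b: stand-in for the LLM call; a real system would send
    the augmented prompt to a model API here."""
    return f"Answer (grounded in {len(context)} source(s)): {' '.join(context)}"

def rag_chat(query):
    """Steps 1-5: user query -> retrieve -> generate -> response."""
    context = retrieve(query)            # Step 3: retrieval
    response = generate(query, context)  # Steps 3-4: generation and return
    return response                      # Step 5: shown to the user
```

The point of the sketch is the control flow: the chat app never calls the LLM with the raw query alone; retrieval always runs first, and the generation step receives both the query and the retrieved context.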

RAG vs. Fine-Tuning
▪ Objective
   − Fine-tuning aims to adapt a pre-trained LLM to a specific task or domain by adjusting its parameters based on task-specific data.
   − RAG focuses on improving the quality and relevance of generated text by incorporating retrieved information from external sources during the generation process.
▪ Training data
   − Fine-tuning requires task-specific labeled data/examples to update and optimize the model's parameters, leading to more time and cost.
   − RAG relies on a combination of a pre-trained LLM and external knowledge bases.
▪ Adaptability
   − Fine-tuning makes the LLM more specialized and tailored to a specific task or domain.
   − RAG maintains the generalizability of the pre-trained LLM by leveraging external knowledge, allowing it to adapt to a wide range of tasks.
▪ Model architecture
   − Fine-tuning typically involves modifying the parameters of the pre-trained LLM while keeping its architecture unchanged.
   − RAG combines the retrieval and generation components with the standard LLM architecture to incorporate the retrieval mechanism.

RAG Benefits
• Enhanced relevance: Incorporates external knowledge for more contextually relevant responses.
• Improved quality: Enhances the quality and accuracy of generated output.
• Versatility: Adaptable to various tasks and domains without task-specific fine-tuning.
• Efficient retrieval: Leverages existing knowledge bases, reducing the need for large labeled datasets.
• Dynamic updates: Allows for real-time or periodic updates to maintain current information.
• Trust and transparency: Accurate and reliable responses, underpinned by current and authoritative data, significantly enhance user trust in AI-driven applications.
• Customization and control: Organizations can tailor the external sources RAG draws from, allowing control over the type and scope of information integrated into the model's responses.
• Cost-effective: Updating a knowledge base is far cheaper than re-training or fine-tuning the model for every change in information.

Applications
• Conversational AI: RAG enables chatbots to provide more accurate and contextually relevant responses to user queries.
• Advanced question answering: RAG enhances question-answering systems by retrieving relevant passages or documents containing answers to user queries.
• Content generation: In content-generation tasks such as summarization, article writing, and content recommendation, RAG can augment the generation process with retrieved information, incorporating relevant facts, statistics, and examples from external sources.
• Healthcare: RAG can assist healthcare professionals in accessing relevant and up-to-date medical literature and guidelines.

Demo

Thank you