Large Language Models Updates and Hands On

bigvisiondeveloper, 49 slides, Sep 16, 2025

About This Presentation

LLM Hands on


Slide Content

Tools: Jupyter Notebook / Google Colab

What will we learn:
- Updates on LLMs
- Prompt techniques
- Hands-on prompt techniques
- Serving an LLM to end users: Chainlit
- Supervised fine-tuning
- LangSmith

LLM

Large Language Model Definition
Large language models (LLMs) are a category of foundation models trained on immense amounts of data, making them capable of understanding and generating natural language and other types of content to perform a wide range of tasks. (IBM)

LLMs for Bahasa Indonesia (and SEA languages):
- Komodo-7B (Yellow.ai): EN, ID, 11 regional languages
- Wiz.ai 13B: Bahasa Indonesia
- SEA-LION 3B/7B (AI Singapore): SEA languages

Popular LLMs: GPT-4, Claude, Gemini, Llama 3.1

GPT-4
- Estimated parameters: 1.76 trillion
- Max. context window: 128k tokens
- GPT-4 launch date: March 14, 2023
- GPT-4 Turbo and GPT-4 Turbo with Vision: November 2023
- GPT-4o (omni) launch date: May 13, 2024
- GPT-4o mini launch date: July 18, 2024


GPT-4o has the same high intelligence as GPT-4 Turbo but is much more efficient: it generates text 2x faster and is 50% cheaper. It is multimodal (accepting text or image inputs and outputting text) and has the best vision and non-English-language performance of any OpenAI model. gpt-4o-2024-08-06 provides 16,384 max output tokens (up from only 4,096).

GPT-4o mini is OpenAI's most advanced model in the small-model category, and their cheapest model yet. It is multimodal (accepting text or image inputs and outputting text) and has higher intelligence than gpt-3.5-turbo while being just as fast. It is meant for smaller tasks, including vision tasks. OpenAI recommends choosing gpt-4o-mini where you would previously have used gpt-3.5-turbo, as it is more capable and cheaper.

Claude
- Features: advanced reasoning, vision analysis, code generation, multilingual processing
- Latest: Claude 3.5 Sonnet (Jun 21, 2024)
- Context window: 200k+ tokens (about 500 pages of text)
- Claude model family: Haiku (light and fast), Sonnet (optimal balance), Opus (most powerful)

Gemini comes in different sizes:
- Gemini Nano (1.0): on-device
- Gemini Flash (1.5): lightweight
- Gemini Pro (1.5): best model for general performance
- Gemini Ultra (1.0): largest model
Gemini Pro (1.5) has a 2 million (!) token context window.

Llama 3.1
- Llama (v1) was trained on 2,048 A100 GPUs; Llama 3.1 was trained on 16,000 H100 GPUs
- Three parameter sizes: 8B, 70B, 405B
- Context window: 128k tokens
- Capabilities: code, multilinguality, math and reasoning, long context, tool use, factuality, steerability

LLM Comparison
Pricing: https://huggingface.co/spaces/philschmid/llm-pricing
Leaderboard: https://tatsu-lab.github.io/alpaca_eval/

Pricing for “Large” Models

Pricing for “Medium” Models

Pricing for “Small” Models

AlpacaEval: GPT-4 still leads. Llama 3.1 (8B Instruct), ranked 29, is readily available at Telkom to try.

Cost of Training LLM 2024 AI Index Report by Stanford

"Open" AI: Llama 3.1 405B vs GPT-4, a comparison at a similar number of parameters

Telkom Internal Deployed LLM
Postman collection: https://api.postman.com/collections/15922710-c4fb5d7c-55bc-4eb4-9929-c0a2d26418cf?access_key=PMAT-01J4GY8WP86NFW4XRSQEAKCWYW
Endpoint: https://telkom-ai-dag.api.apilogy.id/Telkom-LLM/0.0.3/chat/completion/telkomai
The key-auth key becomes the x-api-key header. For this demo, SECRET = gRwiTy4K1RnHEdT89pTFLSy3WtuvpAW6
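The endpoint URL and the x-api-key header come from the slide above; the request body below is an assumption, modeled on the common chat-completions format, so the actual schema may differ (check the Postman collection). A minimal sketch of calling the internal endpoint:

```python
import json

API_URL = "https://telkom-ai-dag.api.apilogy.id/Telkom-LLM/0.0.3/chat/completion/telkomai"
API_KEY = "gRwiTy4K1RnHEdT89pTFLSy3WtuvpAW6"  # demo key from the slide

# Header name (x-api-key) is from the slide; the payload shape is assumed,
# patterned on a typical chat-completions request body.
headers = {"x-api-key": API_KEY, "Content-Type": "application/json"}
payload = {"messages": [{"role": "user", "content": "Hello, who are you?"}]}
body = json.dumps(payload)

# To actually send the request (requires the `requests` package):
# import requests
# resp = requests.post(API_URL, headers=headers, data=body)
# print(resp.json())
```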

LLM Optimization Flow: highlights considerations for LLM optimization

LLMOps LLMOps allows for the efficient deployment, monitoring and maintenance of large language models.

Prompt Techniques

Paper we're referring to: a survey of prompting techniques. This presentation will only cover English text-prompting techniques. Alternative: https://www.promptingguide.ai/

Definitions
Prompt: an input to a generative AI model that is used to guide its output.
Prompt template: prompts are often constructed via a prompt template, i.e., a function that contains one or more variables which will be replaced by some media (usually text) to create a prompt.
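As a minimal illustration of the prompt-template definition (the template text below is invented, not from the slides), a template can be a format string whose variable is filled in to produce the final prompt:

```python
# A prompt template: a format string with one variable ({sentence})
# that is replaced by user text to create the actual prompt.
TEMPLATE = "Translate the following sentence to French:\n\n{sentence}"

def render_prompt(sentence: str) -> str:
    """Replace the template variable with the user's text."""
    return TEMPLATE.format(sentence=sentence)

prompt = render_prompt("Good morning!")
```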

Definitions
Prompting: the process of providing a prompt to a GenAI, which then generates a response.
Prompt engineering: the iterative process of developing a prompt by modifying or changing the prompting technique you are using.
Prompting technique: a blueprint that describes how to structure a prompt, prompts, or a dynamic sequencing of multiple prompts.

Components of a Prompt
- Directive: the intent of the prompt. Example: "Tell me five good books to read."
- Examples: demonstrations that guide the GenAI to accomplish a task. Example: "Night: Noch / Morning:"
- Output formatting: instructions to output in a certain format. Example: "{PARAGRAPH} Summarize this into a CSV."
- Style instructions: modify the output stylistically rather than structurally. Example: "Write a clear and curt paragraph about llamas."
- Role ("persona"). Example: "Pretend you are a shepherd and write a limerick about llamas."
- Additional information ("context"): e.g., if the directive is to write an email, you might include your name and position so the GenAI can properly sign the email.
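A quick sketch of assembling several of these components into one prompt string; the component values are illustrative examples from the slide, and the join order is an arbitrary choice:

```python
# Each key below corresponds to one prompt component; the values are
# illustrative, taken from or modeled on the slide's examples.
components = {
    "role": "Pretend you are a shepherd.",
    "directive": "Write a limerick about llamas.",
    "style": "Keep it clear and curt.",
    "output_format": "Return the limerick as five separate lines.",
}

# Concatenate the components in a fixed order to form the final prompt.
prompt = "\n".join(components[k] for k in ("role", "directive", "style", "output_format"))
```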

In-Context Learning (ICL): ICL refers to the ability of GenAIs to learn skills and tasks by providing them with exemplars and/or relevant instructions within the prompt, without the need for weight updates/retraining. Skills can be learned from exemplars and/or instructions.

1. Zero-Shot
Zero-shot prompting uses zero exemplars. It can be combined with other concepts, e.g., Chain of Thought. A few examples include: Role Prompting, Style Prompting, Emotion Prompting, System 2 Attention, SimToM, Rephrase and Respond (RaR), Re-reading (RE2), Self-Ask.

Example: Role Prompting
- Role prompt: assigns a role to the LLM. "You are [role]."
- Audience prompt: specifies the audience of the conversation. "You are talking to a [role]."
- Interpersonal prompt: connotes the relationship between the speaker and listener. "You are talking to your [role]."
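The three role-prompt patterns above can be sketched as simple template functions (a minimal illustration, not from the slides):

```python
def role_prompt(role: str) -> str:
    """Assign a role to the LLM itself."""
    return f"You are {role}."

def audience_prompt(role: str) -> str:
    """Specify the audience of the conversation."""
    return f"You are talking to a {role}."

def interpersonal_prompt(role: str) -> str:
    """Connote the relationship between speaker and listener."""
    return f"You are talking to your {role}."
```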

2. Few-Shot Prompting: the GenAI learns to complete a task with only a few examples (exemplars).

Few-Shot Prompting Design Decisions
- Exemplar quantity: increasing the number of exemplars in the prompt generally improves model performance, particularly in larger models; however, in some cases the benefits diminish beyond 20 exemplars.
- Exemplar ordering: the order of exemplars affects model behavior.
- Exemplar label distribution: if 10 exemplars from one class and 2 exemplars from another class are included, the model may be biased toward the first class.
- Exemplar label quality: despite the general benefit of multiple exemplars, the necessity of strictly valid demonstrations is unclear; some research shows that exemplars with incorrect labels may diminish performance, but other research does not support this.
- Exemplar format: a common format is "Q: {input}, A: {label}".
- Exemplar similarity: selecting exemplars that are similar to the test sample is generally beneficial for performance.
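A small sketch of building a few-shot prompt in the common "Q/A" exemplar format mentioned above; the arithmetic exemplars are invented for illustration:

```python
def build_few_shot_prompt(exemplars, query):
    """Format exemplars as Q/A pairs, then append the unanswered query."""
    lines = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    lines.append(f"Q: {query}\nA:")  # model completes the final answer
    return "\n\n".join(lines)

exemplars = [("2 + 2", "4"), ("3 + 5", "8")]
prompt = build_few_shot_prompt(exemplars, "7 + 6")
```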

3. Thought Generation
Zero-Shot CoT: contains 0 exemplars; uses a thought inducer such as "Let's think step by step." or "Let's work this out in a step by step way to be sure we have the right answer." Characteristics: no examples, task-agnostic.
Few-Shot CoT: contains multiple exemplars.
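Zero-shot CoT amounts to appending a thought inducer to the task with no exemplars; a minimal sketch (the example question is invented):

```python
# Thought inducers from the slide; appended to the task to elicit
# step-by-step reasoning without any exemplars.
THOUGHT_INDUCERS = [
    "Let's think step by step.",
    "Let's work this out in a step by step way to be sure we have the right answer.",
]

def zero_shot_cot(question: str, inducer: str = THOUGHT_INDUCERS[0]) -> str:
    """Build a zero-shot CoT prompt: the task plus a thought inducer."""
    return f"{question}\n\n{inducer}"

prompt = zero_shot_cot("If I have 3 apples and eat one, how many remain?")
```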


4. Ensembling: in GenAI, ensembling is the process of using multiple prompts to solve the same problem, then aggregating the responses into a final output.
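One common aggregation strategy is a majority vote over several sampled responses (self-consistency style); the sampled answers below are a hard-coded stand-in for real model calls:

```python
from collections import Counter

def majority_vote(answers):
    """Aggregate multiple model responses into a final output
    by picking the most frequent answer."""
    return Counter(answers).most_common(1)[0][0]

# Stand-in for several sampled LLM responses to the same prompt.
sampled = ["42", "42", "41", "42", "40"]
final = majority_vote(sampled)
```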

5. Self-Criticism: when creating GenAI systems, it can be useful to have LLMs criticize their own outputs.
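A sketch of a draft-critique-revise loop; the `llm` function here is a hypothetical stub standing in for a real model API call, and the three-step structure is one common way to implement self-criticism, not a prescribed recipe:

```python
def llm(prompt: str) -> str:
    # Hypothetical stub; in practice this would call a real model API.
    return "stub response to: " + prompt

def criticize_and_revise(task: str) -> str:
    """Draft an answer, ask the model to critique it, then revise."""
    draft = llm(task)
    critique = llm(f"Critique the following answer to '{task}':\n{draft}")
    revised = llm(f"Task: {task}\nDraft: {draft}\nCritique: {critique}\nRevise the draft.")
    return revised
```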

6. Decomposition: decomposing complex problems into simpler sub-questions.
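A sketch of the decomposition pattern: answer sub-questions first, then synthesize. The `llm` function is a hypothetical stub, and here the sub-questions are generated mechanically; in practice the model itself would usually propose them:

```python
def llm(prompt: str) -> str:
    # Hypothetical stub for a model call.
    return f"[answer to: {prompt}]"

def decompose_and_solve(question: str) -> str:
    """Split a question into sub-questions, answer each, then synthesize."""
    # Stand-in for LLM-generated sub-questions.
    subs = [f"sub-question {i} of: {question}" for i in (1, 2)]
    sub_answers = [llm(s) for s in subs]
    context = "\n".join(sub_answers)
    return llm(f"Using these intermediate answers:\n{context}\nAnswer: {question}")
```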

Hands On

Hands-on : Langsmith

LangSmith Sign up for LangSmith

LangSmith User API Keys Settings > API Keys

LangSmith Projects

LangSmith Dataset

Hands-on : Supervised Fine Tuning

Supervised Fine-Tuning: use a GPU runtime in the Google Colab environment (e.g., T4). LLM used: Mistral-7B-Instruct-v0.2. With limited time, the model will be tuned for only a few epochs on a small amount of data.
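Before training, each example must be rendered into the model's chat template; a minimal sketch of that preprocessing step, assuming Mistral-7B-Instruct's [INST] ... [/INST] format. The field names (`instruction`, `response`) are assumptions about the dataset schema, and the actual fine-tuning loop (e.g., via a trainer library) is omitted here:

```python
def format_for_sft(example: dict) -> str:
    """Render one instruction/response pair into the Mistral instruct
    template, so the tokenizer sees the same chat format the base model
    was trained with. Field names are assumed, not from the slides."""
    return f"<s>[INST] {example['instruction']} [/INST] {example['response']}</s>"

sample = {
    "instruction": "What is supervised fine-tuning?",
    "response": "Training a pretrained model further on labeled prompt/response pairs.",
}
text = format_for_sft(sample)
```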

Hands-on : Chainlit

ngrok: sign up for ngrok at https://dashboard.ngrok.com/signup, then navigate to your dashboard and locate the Authtoken.

That’s it, Thank you for joining!