Introducing Llama Family: Alpaca, Vicuna and LLaVA

YanXu646657 266 views 27 slides May 18, 2024

About This Presentation

Introducing Llama Family: Alpaca, Vicuna and LLaVA


Slide Content

Alpaca, Vicuna, LLaVA
LLM Reading Group, Houston Machine Learning Meetup, April 26, 2024
LLaMA Family

LLaMA Family (Vicuna, LLaVA). Ref: https://blackbearlabs.ai/blog-detail/open-large-language-models-history-2023-report

Overview: LLaMA 1, 2 & 3
- LLaMA 1: pretrained only
- LLaMA 2: pretrained + instruction fine-tuned (Llama2-chat)
- LLaMA 3: 15T training tokens; 8B, 70B, and 400B+ parameter sizes; 8k context; pretrained and instruction-fine-tuned

Alpaca: A Strong, Replicable Instruction-Following Model
There are two important challenges to training a high-quality instruction-following model under an academic budget:
- A strong pretrained language model: LLaMA
- High-quality instruction-following data: self-instruct with a strong LLM
Released Mar 13, 2023
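As a concrete illustration of what "instruction-following data" means here, the released 52K Alpaca dataset stores each demonstration as an instruction/input/output record; the example text below is made up:

```python
# A single Alpaca-style training record (field names follow the released 52K
# dataset; the example values here are illustrative only).
example_record = {
    "instruction": "Classify the sentiment of the following review.",
    "input": "The battery lasts all day and the screen is gorgeous.",
    "output": "Positive",
}

# Tasks that need no extra context simply leave "input" empty.
example_record_no_input = {
    "instruction": "List three renewable energy sources.",
    "input": "",
    "output": "Solar, wind, and hydroelectric power.",
}
```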

Alpaca: Framework

Self-Instruct: Aligning Language Models with Self-Generated Instructions

Step 1. Instruction generation

Step 2. Classification task identification

Step 3. Instance generation

Step 3. Instance generation
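To make these steps concrete, here is a minimal sketch of one self-instruct round: sample seed instructions, prompt a strong LLM for new ones, and keep only candidates that are not too similar to the existing pool. The `query_llm` callable is a placeholder for whatever completion API is used; the original pipeline filters with ROUGE-L (threshold around 0.7), approximated here with difflib for brevity, and steps 2 and 3 (classification-task identification and instance generation) are omitted.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, List

def self_instruct_round(
    pool: List[str],                     # seed + previously accepted instructions
    query_llm: Callable[[str], str],     # placeholder for a completion API call
    num_examples: int = 8,
    similarity_threshold: float = 0.7,   # stand-in for the ROUGE-L filter
) -> List[str]:
    """Generate one round of new instructions and filter near-duplicates."""
    # Step 1: prompt the LLM with a few in-context example instructions.
    examples = random.sample(pool, min(num_examples, len(pool)))
    prompt = "Come up with a list of diverse task instructions:\n"
    prompt += "\n".join(f"{i + 1}. {ins}" for i, ins in enumerate(examples))
    prompt += f"\n{len(examples) + 1}."
    completion = query_llm(prompt)

    # Parse one candidate instruction per line of the completion.
    candidates = [line.strip(" .0123456789")
                  for line in completion.splitlines() if line.strip()]

    # Keep only instructions that are sufficiently novel, then grow the pool.
    accepted = []
    for cand in candidates:
        too_similar = any(
            SequenceMatcher(None, cand.lower(), seen.lower()).ratio() > similarity_threshold
            for seen in pool
        )
        if cand and not too_similar:
            accepted.append(cand)
            pool.append(cand)
    return accepted
```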

Results High-quality

Alpaca: Results
For our initial run, fine-tuning a 7B LLaMA model took 3 hours on 8 80GB A100s, which costs less than $100 on most cloud compute providers. We note that training efficiency can be improved to further reduce the cost.
We performed a blind pairwise comparison between text-davinci-003 and Alpaca 7B, and we found that these two models have very similar performance: Alpaca wins 90 versus 89 comparisons against text-davinci-003.
We are releasing the following assets today:
- Demo: an interactive demo for everyone to try out Alpaca.
- Data: 52K demonstrations used to fine-tune Alpaca.
- Data generation process: the code for generating the data.
- Training code: for fine-tuning the model using the Hugging Face API.
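As a rough illustration of the last item, here is a minimal supervised fine-tuning sketch with the Hugging Face Trainer; the checkpoint name, data file, prompt template, and hyperparameters are placeholders that only approximate the released Alpaca recipe:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

MODEL = "huggyllama/llama-7b"   # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL)

# Turn each instruction/input/output record into one training string
# (roughly following the Alpaca prompt template).
PROMPT = ("Below is an instruction that describes a task.\n\n"
          "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n{output}")

def tokenize(record):
    text = PROMPT.format(**record) + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=512)

dataset = load_dataset("json", data_files="alpaca_data.json", split="train").map(tokenize)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="alpaca-7b-sft",
        per_device_train_batch_size=4,
        gradient_accumulation_steps=8,
        num_train_epochs=3,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```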

Vicuna
We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300.
Released Mar 30, 2023

Model Comparison

Model Performance
While this proposed framework based on GPT-4 shows a potential to automate chatbot assessment, it is not yet a rigorous approach. Building an evaluation system for chatbots remains an open question requiring further research.
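For context, GPT-4-as-judge amounts to asking GPT-4 to score two anonymized answers to the same question. Below is a minimal sketch using the OpenAI Python client; the rubric wording and score parsing are simplified and are not the exact Vicuna evaluation prompt.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

JUDGE_TEMPLATE = """You are a helpful and impartial judge.
Question: {question}

Assistant A's answer:
{answer_a}

Assistant B's answer:
{answer_b}

Rate each answer on helpfulness, relevance, accuracy, and level of detail.
Reply with two scores from 1 to 10 on the first line ("score_a score_b"),
then a short explanation."""

def judge_pair(question: str, answer_a: str, answer_b: str) -> tuple[float, float]:
    """Ask GPT-4 to score two candidate answers to the same question."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[{"role": "user", "content": JUDGE_TEMPLATE.format(
            question=question, answer_a=answer_a, answer_b=answer_b)}],
    )
    first_line = response.choices[0].message.content.splitlines()[0]
    score_a, score_b = (float(s) for s in first_line.split()[:2])
    return score_a, score_b
```

Position bias, verbosity bias, and the looseness of such rubrics are exactly why the slide calls this "not yet a rigorous approach".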

Releases: Vicuna demo and Chatbot Arena website

LLaVA: Large Language and Vision Assistant
Bridges the gap between vision and the LLM (Vicuna).

LLaVA: Large Language and Vision Assistant
Architecture: pre-trained CLIP visual encoder (ViT-L/14) connected to Vicuna
Results

Pre-training for Feature Alignment
Pre-trained CLIP visual encoder (ViT-L/14) + Vicuna, both FROZEN: the language model and the vision encoder are frozen, and only the projection matrix W is updated.
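A minimal PyTorch sketch of this stage-1 setup: both towers are frozen and only the projection W that maps CLIP patch features into the LLM embedding space receives gradients. The module arguments and hidden sizes below are placeholders standing in for the real CLIP ViT-L/14 and Vicuna checkpoints.

```python
import torch
import torch.nn as nn

class LlavaStage1(nn.Module):
    """Feature alignment: frozen vision encoder + frozen LLM, trainable projection W."""

    def __init__(self, vision_encoder: nn.Module, language_model: nn.Module,
                 vision_dim: int = 1024, llm_dim: int = 5120):
        super().__init__()
        self.vision_encoder = vision_encoder
        self.language_model = language_model
        # Original LLaVA uses a single linear projection (later versions swap in an MLP).
        self.projection = nn.Linear(vision_dim, llm_dim)

        # Stage 1: freeze everything except W.
        for p in self.vision_encoder.parameters():
            p.requires_grad = False
        for p in self.language_model.parameters():
            p.requires_grad = False

    def encode_image(self, pixel_values: torch.Tensor) -> torch.Tensor:
        # (batch, num_patches, vision_dim) -> (batch, num_patches, llm_dim),
        # ready to be spliced into the LLM's token embedding sequence.
        patch_features = self.vision_encoder(pixel_values)
        return self.projection(patch_features)
```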

Fine-tuning End-to-End
Pre-trained CLIP visual encoder (ViT-L/14) + Vicuna; the vision encoder stays FROZEN, while both W and the language model in LLaVA continue to be updated.
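Stage 2 only changes which parameters receive gradients, as in this short sketch reusing the hypothetical LlavaStage1 module from above:

```python
def prepare_stage2(model):
    """End-to-end fine-tuning: keep the vision encoder frozen, train W and the LLM."""
    for p in model.vision_encoder.parameters():
        p.requires_grad = False
    for p in model.language_model.parameters():
        p.requires_grad = True
    for p in model.projection.parameters():
        p.requires_grad = True
    # Hand only the trainable parameters to the optimizer.
    return [p for p in model.parameters() if p.requires_grad]

# optimizer = torch.optim.AdamW(prepare_stage2(model), lr=2e-5)
```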

Fine-tuning Datasets
Leverage ChatGPT/GPT-4 for multimodal instruction-following data collection:
- Input to GPT-4: captions and bounding boxes
- Prompt GPT-4 to curate a list of questions with the intent to instruct the assistant to describe the image content
We collect 158K unique language-image instruction-following samples in total, including 58K in conversations, 23K in detailed description, and 77K in complex reasoning, respectively.
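For illustration, a single sample in this instruction data is a multi-turn conversation tied to one image, roughly shaped like the record below; the file name and dialogue text are made up, and `<image>` marks where the projected visual tokens are spliced into the prompt.

```python
example_llava_sample = {
    "id": "000000123456",
    "image": "coco/train2017/000000123456.jpg",   # made-up path
    "conversations": [
        {"from": "human", "value": "<image>\nWhat is the person in the photo doing?"},
        {"from": "gpt",   "value": "The person is riding a bicycle along a tree-lined path."},
        {"from": "human", "value": "Is it likely to be raining?"},
        {"from": "gpt",   "value": "No, the ground looks dry and the scene is brightly lit."},
    ],
}
```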

The prompt to generate image-based conversation from ChatGPT/GPT-4

Results

Results: LLaVA-Bench (GPT-4 as the evaluator)

Results: ScienceQA (CoT: Chain of Thought)

LLaMA Family (Vicuna, LLaVA). Ref: https://blackbearlabs.ai/blog-detail/open-large-language-models-history-2023-report

LLaMA series
- LLaMA: Open and Efficient Foundation Language Models, Feb 2023
- LLaMA 2: Open Foundation and Fine-Tuned Chat Models, July 2023
- Variants of LLaMA: Alpaca, Vicuna, LLaVA
- Hands-on session: Fine-tune a LLaMA model (see the LoRA sketch below)
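Hands-on fine-tuning on a single GPU is usually done parameter-efficiently. Here is a minimal LoRA sketch with the peft library; the checkpoint name and hyperparameters are placeholders, not the meetup's actual hands-on material.

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "meta-llama/Llama-2-7b-hf"   # placeholder; any LLaMA-family checkpoint works
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# Wrap the base model with low-rank adapters on the attention projections,
# so only a small fraction of the weights is trained.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the (small) trainable fraction

# From here the model plugs into the same Trainer loop as in the Alpaca sketch above.
```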