[D2T2S04] Generative AI Foundation Model Training and Tuning with SageMaker

DonghwanLee4 288 views 24 slides Jul 01, 2024

About This Presentation

This session presents ways to pre-train or fine-tune a Foundation Model using SageMaker Training Jobs / SageMaker JumpStart. It covers the following three topics:
1. Training a foundation model from scratch
2. Open-source mo...


Slide Content

© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
JUNE 2024
Customizing Foundation Models on Amazon SageMaker
Miron Perel, Principal AI/ML GTM Specialist
Kristine Pearce, Principal AI/ML GTM Specialist

Today’s Agenda
1. GenAI Trends
2. GenAI on Amazon SageMaker
3. SageMaker Training Demo
4. Getting Started
GENERATIVE AI

Generative AI is transforming all industries
Financial Services
Healthcare and Life Sciences
Automotive
Manufacturing
Media & Entertainment
Telecom
Energy

Innovations at Amazon
Rufus: An expert shopping assistant
Customer reviews: On Amazon.com
Amazon Pharmacy: Faster prescriptions and more helpful support
SCALE: How can I scale this?

More than use Amazon SageMaker for GenAI/ML

GenAI Trends in the Enterprise
Ways Enterprises are Building, Buying and Optimizing FMs (a16z)
Primary reason cited is control and customization for a given use case
Prefer fine-tuning or training models for specific needs
Enterprises increasing open source usage or switching in 2024 and onward
80% 78% 90%
Source: https://a16z.com/generative-ai-enterprise-2024/

The Problem of Sustained Inference
Cost differentials can skyrocket with high-end LLMs at scale
Price to generate a million tokens with an open-source fine-tuned LLM
Price to generate a million tokens with a closed-source LLM
$60 $3
* Upwards of
** For example
***
Source: https://medium.com/@ja_adimi/comparison-cost-analysis-should-we-invest-in-open-source-or-closed-source-llms-bfd646ae1f74
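
The price gap on this slide is easiest to appreciate as arithmetic at volume. Below is a minimal sketch using the two per-million-token prices quoted on the slide ($60 and $3). The function name and the monthly token volume are assumptions chosen for illustration, not figures from the presentation.

```python
def generation_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars of generating `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# The two per-million-token prices quoted on the slide.
HIGH_PRICE = 60.0
LOW_PRICE = 3.0

# Hypothetical workload: 500 million generated tokens per month (an assumption).
monthly_tokens = 500_000_000

high_cost = generation_cost(monthly_tokens, HIGH_PRICE)
low_cost = generation_cost(monthly_tokens, LOW_PRICE)

print(f"At ${HIGH_PRICE:.0f} per million tokens: ${high_cost:,.0f} per month")
print(f"At ${LOW_PRICE:.0f} per million tokens: ${low_cost:,.0f} per month")
print(f"Cost ratio: {high_cost / low_cost:.0f}x")
```

The ratio is independent of volume, but the absolute difference grows linearly with it, which is why the gap matters most for sustained inference.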

Common approaches for customizing FMs
Zero-shot
Few-shot
Retrieval Augmented Generation (RAG)
Fine-tuning
Re-Training
Axis: Complexity, Quality, Cost, Time

Common approaches for customizing FMs
Zero-shot
Few-shot
Retrieval Augmented Generation (RAG)
Fine-tuning
Pre-/Re-Training
Axis: Complexity, Quality, Cost, Time
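
The two lightest approaches in this list differ only in what goes into the prompt. Here is a minimal sketch, assuming a hypothetical sentiment-classification task. The helper names, the example reviews, and the prompt layout are illustrative and not taken from the presentation.

```python
def zero_shot_prompt(task: str, query: str) -> str:
    """Zero-shot: the prompt holds only the task description and the new input."""
    return f"{task}\n\nInput: {query}\nOutput:"


def few_shot_prompt(task: str, examples: list[tuple[str, str]], query: str) -> str:
    """Few-shot: a handful of worked input/output pairs precede the new input."""
    shots = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
    return f"{task}\n\n{shots}\n\nInput: {query}\nOutput:"


# Hypothetical task and examples, for illustration only.
task = "Classify the sentiment of the review as positive or negative."
examples = [
    ("The delivery was fast and the product works well.", "positive"),
    ("The item arrived broken.", "negative"),
]
query = "Setup took five minutes and everything just worked."

print(zero_shot_prompt(task, query))
print("---")
print(few_shot_prompt(task, examples, query))
```

Neither variant changes model weights, which is why both sit at the low end of the complexity and cost axis.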

Use Case Examples of Domain & Task Adapted LLM
72% of enterprises are fine-tuning Domain/Task adapted LLMs, balancing customization with cost and efficiency
Customer Support: Automatically generate a customer response with the knowledge of your products
PDF Extraction: Make faster decisions by automatically extracting data from documents
Classification: Automating the process of manually categorizing documents and content
Customer Sentiment: Use an LLM to understand how customers feel about products and services

Fine Tuning & Re-Training Techniques
72% of enterprises are fine-tuning Domain/Task adapted LLMs, balancing customization with cost and efficiency
Re-Training: Parameter efficient re-training of FM parameters with new training data
Supervised Fine Tuning (SFT): Adjusting pre-trained FMs on a specific dataset with labeled examples
Domain Adaptation Fine Tuning (DAFT): Develop more precise context-aware FMs based on un/semi-supervised learning
Instruction Fine-Tuning (IT): FM is fine-tuned on the data, with specific instructions or guidelines, and formats
Fine-tune any open source LLM for your Domain Adapted LLMs (finance, healthcare, programming) OR your Task Adapted LLMs (data summarization, PDF extraction, sentiment analysis, audio transcription)
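
Instruction fine-tuning, as described on this slide, trains on labeled examples rendered into a fixed instruction format. The following is a minimal sketch of preparing such records as JSON Lines. The template, field names, and sample record are assumptions for illustration, and the format a particular model or training script expects may differ.

```python
import json

# Hypothetical prompt template; real templates are model-specific.
TEMPLATE = (
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n{response}"
)


def to_training_text(record: dict) -> str:
    """Render one labeled example into a single training string."""
    return TEMPLATE.format(**record)


def to_jsonl(records: list[dict]) -> str:
    """Serialize rendered examples as JSON Lines, one {"text": ...} object per line."""
    return "\n".join(json.dumps({"text": to_training_text(r)}) for r in records)


# Illustrative record for a summarization task (not from the presentation).
records = [
    {
        "instruction": "Summarize the customer review in one sentence.",
        "context": "The headphones sound great, but the battery lasts only two hours.",
        "response": "Good sound quality, poor battery life.",
    }
]

jsonl = to_jsonl(records)
print(jsonl)
```

Keeping the template in one place makes it easier to apply the same format at inference time, where only the response section is left empty.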

FM Customization approaches using Amazon SageMaker
Customizing FMs for task and domain specific use cases
Foundation model → Customize → Task-specific FM
WHY YOU USE IT
• Faster time to market
• Maximize accuracy for specific tasks
• Achieve domain adaptation
Build/ReTrain FMs from scratch
Build → Train → Deploy
WHY YOU USE IT
• Build a proprietary FM or re-train an open-source one
• Commercialize or deploy for internal use
• Drive revenue and cut operational costs

SageMaker Generative AI for ML Practitioners
Customizing FMs for Task & Domain Specific Use Cases
Amazon SageMaker: Build, train, and deploy ML models at scale, including FMs
Select, Evaluate, Customize, Deploy
JumpStart, Clarify, Ground Truth, Training, Studio, Inference
Customization and control with TTM / TCO

Select from the broadest and latest selection of foundation models
AVAILABLE IN AMAZON SAGEMAKER JUMPSTART
Discover
Fine tune, evaluate, and deploy

Unique challenges to manage hardware resources efficiently for large-scale FM training
Collect data
Strategies for distributed training
Infrastructure stability
Cluster provisioning & management

Amazon SageMaker for Large Scale FM Training:
Resilient environment: Self-healing clusters reduce training time by up to 20%
AND
Streamline distributed training: SageMaker distributed training libraries improve performance by up to 20%
AND
Optimized resources utilization (SMHP): Control over compute environment and workload scheduling
OR
Managed training environment (SMTJ): Focus on ML without the need to manage infrastructure
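
For the managed training option (SMTJ), a job is described by a single request to the SageMaker `CreateTrainingJob` API. The sketch below builds that request as a plain dictionary. Every value in it (job name, image URI, role ARN, bucket paths, instance type and count, hyperparameters) is a placeholder assumption, not a setting from the presentation. The actual submission through `boto3` is left as a comment because it needs AWS credentials.

```python
def build_training_job_request(
    job_name: str,
    image_uri: str,
    role_arn: str,
    train_s3_uri: str,
    output_s3_uri: str,
    instance_type: str = "ml.p4d.24xlarge",
    instance_count: int = 2,
) -> dict:
    """Assemble keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [
            {
                "ChannelName": "training",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 500,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
        # The API requires hyperparameter values to be strings.
        "HyperParameters": {"epochs": "1", "learning_rate": "2e-5"},
    }


# Placeholder values throughout; replace with your own resources.
request = build_training_job_request(
    job_name="fm-finetune-example",
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<training-image>:<tag>",
    role_arn="arn:aws:iam::<account>:role/<sagemaker-execution-role>",
    train_s3_uri="s3://<bucket>/train/",
    output_s3_uri="s3://<bucket>/output/",
)
print(request["ResourceConfig"])

# To submit (requires AWS credentials and real resource names):
# import boto3
# boto3.client("sagemaker").create_training_job(**request)
```

With an instance count above one, the training container is responsible for setting up distributed training across the nodes; the request only provisions them.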

Demo video

SageMaker Training Jobs for FM Training

GenAI Success Stories
HELPING ENTERPRISES TO USE FOUNDATION MODELS
50% lower costs for hosting FMs
7 months reduced time-to-value from 12-18 months
80% reduction in inference latency
66% cost savings with GPU utilization
Enhance Customer Experience
Enhance Customer Experience Boost Employee Productivity
Streamline Business Processes

Start your generative AI journey today
Additional Resources
View product page
Watch deep dive demo videos
Read blog posts
Read the technical docs
SageMaker Training Example

Thank you!