Intel® Tiber™ Developer Cloud Overview (30 min)

Uploaded by cheerchain · 21 slides · May 22, 2024

About This Presentation

Since its general availability last September, Intel® Tiber™ Developer Cloud has become a widely used platform for evaluating new Intel® hardware and deploying AI workloads.

It is built on scale-out infrastructure and an open software foundation including oneAPI, supporting eval…


Slide Content

Slide 1
Intel Confidential | Department or Event Name
Product Overview

Slide 2

Intel® Tiber™ Developer Cloud

Developers: Provides an easy path for the developer community, research & education to access and use Intel-optimized AI software on Intel's latest CPUs, GPUs & deep learning accelerators.
Companies: Offers a delivery vehicle to bring new products based on Intel technologies & products to market faster. A platform to monetize AI services.
Partners: Provides performance- & cost-optimized AI compute services to internal & external AI SaaS providers.

Ready Access to Intel's AI Portfolio
Evaluate performance on 4th & 5th Gen Intel® Xeon® processors with Intel® AMX, Intel® Gaudi® 2 AI accelerators, Intel® Max Series CPUs & GPUs, & Intel® Flex Series GPUs
Reference & solution designs for OEM & ISV enablement, AI training & inferencing, & benchmark testing

Increased Productivity
Build on top of a genAI/LLM software stack optimized for Intel on all major ML frameworks & popular data science libraries
Start from Intel-optimized Model Zoo popular models; fine-tune to domain/customer-specific datasets
No lock-in; the same models can be used on-prem or in a 3rd-party CSP

Accelerated TTM & Scale
Agility to adapt to rapidly evolving technology with new models/services: LLMs, genAI, multi-modal models
Choose the right-sized platform for AI phase requirements: training, fine-tuning, transfer learning, or inferencing

A platform for technology adoption & to accelerate solutions using Intel products.
Built for enterprises & developers to access the latest Intel CPUs, GPUs & AI accelerators.
Companies use it to develop & deploy Intel-powered AI solutions.
Boutique Cloud Services for AI

Slide 3

Intel compute services for third-party AI SaaS
Cloud interface(s)
Cloud Services
Cloud Infrastructure

Intel® Tiber™ Developer Cloud
Portfolio for Customers

For Developers: Get started with Intel hardware and software
For Companies: Benchmark and deploy AI at scale
For Partners: AI Infrastructure services for AI SaaS providers

Intel Cloud Portal & APIs
AI Infrastructure & Platform Services (BMaaS, VMs, K8s)
Core Services (ticketing, observability, service metering, & chargeback)
Sales Enablement | Ecosystem Development

Hardware Services
New platform evaluation & software bring-up & porting (e.g., 4th & 5th Gen Intel® Xeon® processor accelerators: Intel® AMX, DSA, IAA, QAT, etc.)
Enterprise AI benchmark evaluation on Intel® Gaudi® 2 AI accelerators
HPC benchmark testing on Intel® Data Center GPU Max Series
Audience: ecosystem partners (e.g., OEMs, CSPs, ISVs) enabling; large enterprise customers (data center)
Use cases: hardware qualification and benchmarking; evaluation, education, development, research

AI Infrastructure Services
LLM model training & optimization
AI model training & inferencing deployment
AI model deployment via CLI/SSH automation
AI container deployment via k8s APIs
Hosting platform for deploying AIaaS
Audience: AI disruptors (startups); established AI-savvy enterprises
Use cases: AI training and inference production workloads

Register at cloud.intel.com

Slide 4

Intel AI Hardware Portfolio
Scope & Positioning of Intel® Tiber™ Developer Cloud

Edge and network AI inference · Client AI usages
General Purpose · General Acceleration · Deep Learning Acceleration (available on Intel® Tiber™ Developer Cloud)
Gaudi: dedicated deep learning training & inference
Cloud gaming, VDI, media analytics, real-time dense video
Parallel compute, HPC, AI for HPC
Real-time, medium-throughput, low-latency, and sparse inference
Medium-to-small-scale training & fine-tuning

Slide 5

Intel® Tiber™ Developer Cloud
What's New – April 2024 Update

Intel® Kubernetes Service was launched for optimizing and running cloud-native AI/ML training and inference workloads.
Over the past months, compute capacity for enterprise customers has steadily increased with additional U.S. regions, new state-of-the-art hardware platforms, and new service capabilities.
Bare metal as a service (BMaaS) offerings now include large-scale Intel® Gaudi® 2 (128-node) and Intel® Max Series GPU 1100/1550 clusters, and virtual machines (VMs) running on Gaudi 2 AI accelerators.
Storage as a service (STaaS), including file and object storage, is available.
See Release Notes for more.
Providing access to multiarchitecture systems on the latest Intel® hardware is truly unique; today, no other cloud service provider offers this.
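Because Intel® Kubernetes Service exposes the standard Kubernetes API, the usual client tooling applies unchanged. A minimal sketch, assuming the third-party `kubernetes` Python client and a kubeconfig downloaded from the cloud console (both are assumptions for illustration, not a documented Tiber-specific API):

```python
def list_pods(namespace="default"):
    """List pod names in a namespace via the standard Kubernetes API.

    Returns [] when the `kubernetes` client or a kubeconfig is unavailable,
    so the sketch degrades gracefully outside a configured cluster.
    """
    try:
        from kubernetes import client, config
        config.load_kube_config()  # kubeconfig exported from the console
    except Exception:
        return []
    v1 = client.CoreV1Api()
    return [p.metadata.name for p in v1.list_namespaced_pod(namespace).items]

print(list_pods())  # [] when no cluster is configured
```

The same pattern extends to submitting training Jobs through `client.BatchV1Api()`.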

Slide 6
Customer Experience & Onboarding

Slide 7

Intel® Tiber™ Developer Cloud
How It Works – Access

Register your account: cloud.intel.com
Access documentation & training: console.cloud.intel.com

Developer Cloud How-to Videos:
Get Started
Gaudi 2
Using Jupyter Notebooks
GenAI Notebooks & more…

Slide 8

Intel® Tiber™ Developer Cloud
How It Works – Access to Resources

Log in to your account: cloud.intel.com
Select Intel hardware: console.cloud.intel.com/hardware
If help is needed, review documentation & training.
For service/technical support: supporttickets.intel.com

Slide 9

Intel® Tiber™ Developer Cloud
User Account Types

Self-service paying customers · Talk to your Intel account lead

Standard
Registration: contact information
Cloud services: free-tier services (freemium with guard rails)
Target user(s): individual developer; one-time/infrequent use
Support: free community support

Premium
Registration: contact information; credit card
Cloud services: paid-tier services with hourly rate (list price)
Target user(s): individual/enterprise developer; ongoing use
Support: bundled standard support; business hours; web ticketing; 8-hour response time

Enterprise
Registration: contact information; invoicing setup
Cloud services: paid-tier services with hourly rate (list price)
Target user(s): multi-enterprise developers; ongoing use
Support: Bundled Plus support, $29.99/month; business hours; phone, chat, web ticketing; 4-hour response time

Enterprise ELA
Registration: contact information; invoicing setup; contract & order form
Cloud services: paid-tier services with fixed monthly discounted rate (ELA contract)
Target user(s): multi-enterprise developers; ongoing use
Support: Bundled Plus support, 10% of contract, billed monthly; 24 x 7; phone, chat, web ticketing; 1-hour response time

Cloud credits may be available to facilitate initial use & ramp-up: talk to your Intel account manager.
Competitive on-demand pricing & discounted reserved pricing.
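The support pricing in the account-type table can be expressed as a small cost helper. The dollar figures come straight from the table; the tier keys and the example contract amount are illustrative:

```python
def monthly_support_cost(tier, contract_monthly=0.0):
    """Monthly support cost per the account-type table (illustrative helper).

    Standard and Premium bundle support at no extra monthly charge;
    Enterprise adds Bundled Plus support at $29.99/month; Enterprise ELA
    bills 10% of the monthly contract amount.
    """
    if tier == "enterprise_ela":
        return round(0.10 * contract_monthly, 2)  # 10% of contract, monthly
    flat = {"standard": 0.0, "premium": 0.0, "enterprise": 29.99}
    return flat[tier]

print(monthly_support_cost("enterprise"))              # 29.99
print(monthly_support_cost("enterprise_ela", 5000.0))  # 500.0
```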

Slide 10

Intel® Tiber™ Developer Cloud
Deploying a Dedicated System

Slide 11

Intel® Tiber™ Developer Cloud
Online Documentation in the Platform

Popular topics:
User account types
How to use SSH keys
Tutorials
Guides — Developer Cloud documentation

Slide 12

Intel® Developer Cloud
Online Documentation

Popular topics:
User account types
How to use SSH keys
Tutorials
Guides — Developer Cloud Docs documentation (intel.com)

Slide 13
Differentiators & Customer Use Cases

Slide 14

Differentiators
Intel® Tiber™ Developer Cloud Proof Points

Lower Cost
Average discount of 20-40% for bare-metal AI accelerator / GPU SKUs, compared to CSP offerings
ELA subscription option with reserved 3-, 6-, & 12-month bare-metal SKU pricing with further 10-30% discounts

Unique Capabilities
Intel® Gaudi® 2 availability (at scale)
Heterogeneous deployment environment with AI accelerators, GPUs & Xeon CPUs
Ready-to-use Intel-optimized software stacks with accelerator libraries & optimized AI frameworks & models

Success Stories
Thousands of users… competitive cost savings, continuing growth

Slide 15

Prediction Guard: Derisking LLM Applications, Reducing Cloud Costs by 80%

OVERVIEW
Prediction Guard provides an API service that enables enterprises to adopt AI models, harness LLMs in genAI applications, and check for content factuality and toxicity. For its customers, this reduces risks around reliability, variability, and security.

PAIN POINTS
Limited service uptime due to costs

WORKLOAD
LLMs as a service using Hugging Face models

OLD TECH STACK
GPU: Nvidia A100s
Models: Hugging Face

SOLUTION/NEW TECH STACK
Intel® Gaudi® 2 AI accelerator on Intel® Tiber™ Developer Cloud
Models: Hugging Face
Optimum Habana library
Intel® Extension for PyTorch*
Intel® Extension for Transformers*
The company was able to leverage Gaudi 2 within an hour with similar performance.

BUSINESS VALUE/RESULTS
Reduced costs by ~80% by moving from GPUs to Gaudi accelerators. Lower costs mean a better price for its customers.
For certain models, costs decreased while throughput increased 2x, which helps customers using LLMs for genAI applications generate outputs faster and more cost-effectively.
2x increase in performance on 7B-parameter models
Reduced latency from 8-9 seconds to 4.5 seconds.
Greater speed, security, scalability, and reliability for its solution in Intel's cloud. Prediction Guard is able to protect privacy and mitigate security and trust issues such as hallucinations, harmful outputs, and prompt injections.
With its solution hosted in Intel's cloud, its customers use the platform (as a backend), which brings revenue to both Prediction Guard and Intel.
Use of Intel's cloud has a measurable impact on the bottom line. It has a better view of its fixed costs and can conduct more accurate cost planning. It can flexibly adjust the models it runs.

LEARN MORE
Prediction Guard De-Risks LLM Applications case study

"ACC, our valued customer, achieved an impressive milestone by leveraging Prediction Guard. They've efficiently reduced the need for two full-time employees dedicated to customer experience during flash sales to just one person working half-time. This strategic optimization has been instrumental, especially during their most critical periods."
"Our participation in Intel Liftoff… allowed us to explore the art of the possible with Intel Gaudi 2 processors and demonstrate early on that our system could run effectively on Intel Developer Cloud at the speed and scale we needed."
- Daniel Whitenack, Prediction Guard's founder/CEO
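The reported improvements reduce to simple ratios, which can be sanity-checked with a few lines of arithmetic. The per-hour costs below are hypothetical placeholders chosen to match the reported ~80% figure; only the latency numbers come from the slide:

```python
def cost_reduction(old_cost, new_cost):
    """Fractional cost reduction: 0.8 means costs dropped by 80%."""
    return 1.0 - new_cost / old_cost

def speedup(old_latency_s, new_latency_s):
    """Latency ratio: values above 1 mean the new stack is faster."""
    return old_latency_s / new_latency_s

# Hypothetical hourly costs consistent with the reported ~80% reduction.
print(round(cost_reduction(10.0, 2.0), 2))  # 0.8
# Midpoint of the reported 8-9 s latency vs the new 4.5 s.
print(round(speedup(8.5, 4.5), 2))          # 1.89
```

Note that halving latency (8-9 s down to 4.5 s) is consistent with the separately reported 2x throughput gain.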

Slide 16

Seekr: Building Trustworthy LLMs for Evaluating & Generating Content at Scale

OVERVIEW
Advances the marketing/advertising industry through trusted AI content evaluation with its LLM platform, development toolset (SeekrFlow) & content analysis/scoring tool (Seekr Align).

PAIN POINTS
Networking, infrastructure, and AI workloads required immense compute capacity, with high infrastructure & energy costs.

WORKLOAD
Building, training & deploying advanced LLMs for audio transcription & diarization, and video.

OLD TECH STACK
Regional colocation provider hosting several GPU- and CPU-enabled systems / on-prem Nvidia A100 GPU systems (upgrading to newer GPUs was too expensive).

NEW TECH STACK
Intel® Tiber™ Developer Cloud with different implementations:
Intel® Gaudi® 2 AI accelerators for inference (audio scoring)
Intel® Max Series GPU 1550 for audio transcription & diarization, optimized by Intel® Extension for PyTorch
Intel Xeon processors (4th gen currently / 5th gen in future)
Hugging Face Optimum Habana & Habana PyTorch APIs for training & inference; Intel® Kubernetes Service

BUSINESS VALUE
Cost Reduction & Efficiency: cost savings through Intel's cloud for running AI workloads ranged from ~40% to 400%.[1]
Increased Performance: 20% faster AI training & 50% faster AI inference compared to on-prem system with current-generation hardware[1]; GPU use for audio was significantly faster than on-prem (more in notes).
Profitability: Seekr's customers (e.g., SimpliSafe) are using our cloud as a backend solution, bringing revenue to Seekr & Intel.
Scalability: following integration, Seekr will leverage large-scale Gaudi 2 clusters with Intel® Kubernetes Service to train its LLM model and add CPU/GPU compute capacity.
Strategic Collaboration: offers Seekr Align & soon SeekrFlow to developer cloud customers.

LEARN MORE: Seekr/Intel case study | Blog | CIO.com article | Seekr press release

"This strategic collaboration with Intel allows Seekr to build foundation models at the best price & performance using a supercomputer of 1,000s of the latest Intel Gaudi chips, all backed by high-bandwidth interconnectivity." - Rob Clark, Seekr's President & CTO
"The Seekr-Intel collaboration unlocks the capabilities of running AI on hybrid compute. Intel GPUs & Intel Gaudi 2 chips are leveraged to train & serve, at scale, large- & medium-size transformers. We saw improved performance for model training & inference compared to other chips in the market." - Stefanos Poulis, Seekr's Chief of AI Research & Development

[1] Benchmark source: Seekr. Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

Slide 17

More Resources, Software Optimizations for Intel® Xeon® Processors & Intel® Gaudi® 2 AI Accelerators

Slide 18
Intel® Tiber™ Developer Cloud - Additional Resources

Intel Blogs
Trusted AI in Intel® Developer Cloud
Democratizing AI Access with Intel® Developer Cloud

GenAI Essentials
Part 1: LLM Chatbots with Camel 5B & Open Llama 3B v2 on Latest Intel GPU
Part 2: Text-to-Image Stable Diffusion with Stability AI & CompVis Models on Intel GPU
Part 3: Image-to-Image Stable Diffusion with Runway ML & Stability AI on Intel GPU
Text-to-SQL Generation Using LLMs Fine-Tuned with QLoRA on Intel GPU
Unleash Your Creativity: Exploring GenAI and Jupyter Notebooks on Intel Tiber Developer Cloud
IDC Infographic: Accelerating Cloud-native AI Development

Intel® Tiber™ Developer Cloud site: cloud.intel.com
Marketing/lead gen site · Product Brief
Release Notes + documentation; educational sessions are in the cloud platform, accessible with an account

How-to Videos
Get Started with Intel Tiber Developer Cloud
Gaudi 2: Running Hugging Face on Intel Tiber Developer Cloud
Gaudi 2 on Intel Tiber Developer Cloud
Get Started Using Jupyter Notebooks on Intel Tiber Developer Cloud
Generative AI Notebooks on Intel Tiber Developer Cloud

Article & Video: Touring the Intel AI Playground – Inside Intel Developer Cloud - STH, by Patrick Kennedy, Nov. 2023

Customer Use Cases - case studies, articles, videos
Customer Use Cases deck (internal)
Prediction Guard: Derisking LLM for Enterprise
Seekr: Trustworthy LLMs for Evaluating & Generating Content at Scale case study | Rob Clark video | Pat Condo & Mark Castleman video | Markus Flierl video
4/16/24 Seekr/Intel strategic collaboration 3rd-party press release
Selection Technologies: AI personal assistant
SiteMana: Revolutionizing Email Generation

Slide 19

Software Optimization Resources to Boost Deep Learning & GenAI on Intel® Xeon® Processors

Intel® Extension for PyTorch* optimizations for CPUs
Intel® Extension for PyTorch for CPUs & GPUs
Performance Data for Training & Inference Using Key Models on 5th Gen Intel® Xeon® processors
Intel Transfer Learning – real-world use cases in computer vision & natural language processing (NLP)
Classical ML – Intel AI Model Library: pre-trained models, sample scripts, best practices, & step-by-step tutorials
AI Reference Kits – open source kits that help enterprises target specific challenges in energy/utilities, financial services, health & life sciences, retail, semiconductor, & more industries.
Introduce AI into applications more easily & accelerate deployment with a shorter workflow & performance improvements
Includes model code, training data, instructions for ML pipelines, libraries & oneAPI components, built on AI application tools for data scientists & developers

Optimize AI Inferencing & Computer Vision, Increase Performance, Streamline Deployment
Intel® OpenVINO™ toolkit – accelerates AI inference with lower latency & higher throughput while maintaining accuracy, reducing model footprint, & optimizing hardware use
OpenVINO toolkit 2024 – getting started
OpenVINO toolkit examples – a collection of ready-to-run Jupyter notebooks for learning & experimenting with the OpenVINO toolkit
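The basic OpenVINO flow is read-model, compile-for-device, infer. A hedged sketch of the first two steps, assuming the `openvino` package is installed and a `model.xml` IR file exists locally (neither is guaranteed, so failures fall back to None rather than raising):

```python
def compile_for_cpu(model_path="model.xml"):
    """Compile an OpenVINO IR (or ONNX) model for CPU; None if unavailable.

    With a compiled model in hand, inference is a call on the compiled
    model (or an explicit infer request); this sketch stops at compilation.
    """
    try:
        import openvino as ov
        core = ov.Core()
        model = core.read_model(model_path)      # .xml IR or .onnx
        return core.compile_model(model, "CPU")  # device name per ov.Core
    except Exception:
        return None

print(compile_for_cpu() is None)  # True unless openvino + model.xml exist
```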
Optimize Intel® Advanced Matrix Extensions (Intel® AMX) Performance
Tuning Guide for AI on 4th Gen Intel Xeon processors
Accelerate PyTorch* Training & Inference Performance Using Intel AMX
Tuning Guide for BERT-based AI Inference with Intel AMX on 4th Gen Intel Xeon processors
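In practice, the CPU tuning guides above boil down to one extra call after model creation. A minimal sketch, assuming `intel_extension_for_pytorch` is installed (the helper falls back to stock PyTorch when the extension is absent):

```python
def optimize_for_xeon(model, dtype=None):
    """Apply Intel Extension for PyTorch optimizations when available.

    ipex.optimize applies operator fusion and, with dtype=torch.bfloat16,
    enables the Intel AMX bf16 paths on 4th/5th Gen Xeon processors.
    Returns the model unchanged if the extension is not installed.
    """
    try:
        import intel_extension_for_pytorch as ipex
    except ImportError:
        return model  # stock PyTorch path
    return ipex.optimize(model, dtype=dtype)
```

Typical use before inference would be `model = optimize_for_xeon(model.eval(), dtype=torch.bfloat16)`.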

Slide 20

SynapseAI Software Suite: Designed for Performance & Ease of Use for Intel® Gaudi® 2 AI Accelerators

Shared software suite for training & inference
Start running on Gaudi accelerators with minimal code changes
Integrated with PyTorch* & TensorFlow*
Rich library of performance-optimized kernels
Advanced users can write their own custom kernels
Docker container images & how to install
Software Support Matrix
Gaudi Developer Site & Getting Started documentation
Gaudi Developer Forum
Habana/Gaudi AI reference models
Training & inference performance of reference models
Memory-Efficient Training on Gaudi with DeepSpeed
Create your own kernels with TPC
Optimum Habana for Hugging Face transformers
Port code easily from CUDA* to SynapseAI > PyTorch Model Porting
Resources to bootstrap developers on Gaudi 2 with container images, reference models & published performance
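Since SynapseAI integrates with stock PyTorch, porting a model mostly means targeting the `hpu` device. A sketch of the device-selection step (the `habana_frameworks` package ships with the SynapseAI suite; the CPU fallback is an assumption for machines without it):

```python
def pick_device():
    """Return 'hpu' when Habana's PyTorch bridge imports, else 'cpu'."""
    try:
        import habana_frameworks.torch.core  # noqa: F401  (SynapseAI bridge)
        return "hpu"
    except ImportError:
        return "cpu"

# Typical use in a ported training script:
#   device = torch.device(pick_device())
#   model.to(device); batch = batch.to(device)
print(pick_device())  # 'cpu' on machines without the Gaudi software stack
```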