KubeCon & CloudNativeCon 2024: Artificial Intelligence

Emre Gündoğdu, Jun 20, 2024

About This Presentation

The Cloud Native Computing Foundation and KubeCon 2024 - Paris

Cloud Native Artificial Intelligence (CNAI)


Slide Content

About Me
Personal Information
<<Software engineer with 12 years of experience. Cloud Architect at Siemens AŞ.>>
Education
<<Kocaeli University, Computer Engineering (2011)>>
Contact Information
<<[email protected] - [email protected]>>
Hobbies
<<Playing basketball and following the game>>
Skills / Languages
<<Software - English>>

KubeCon & CloudNativeCon 2024 - Paris

Linux Foundation
• 900 open source projects
• 3M+ developers trained
• 777K developers contributing code
• 51M lines of code added weekly
• 17K contributing organizations

The Cloud Native Computing Foundation (CNCF)
• 185 projects
• 239K contributors
• 16.2M contributions
• 190 countries

Projects
• SANDBOX: The CNCF Sandbox is the entry point for early-stage projects.
• INCUBATING and GRADUATED: Projects at these stages are considered stable and are used successfully in production environments.

Projects (logos of graduated and incubating projects)

Recap of KubeCon 2024 - Paris
• Over 12,000 attendees
• Key themes: artificial intelligence, platform engineering, greener computing

Key Components of an AI Model
• Data: The foundation of AI models. Models are trained on large datasets to learn patterns and relationships.
• Algorithms: The set of rules or procedures the model uses to process data and make decisions.
• Training: The process of feeding data into the model and adjusting its parameters to minimize errors and improve accuracy.
• Inference: The phase where the trained model makes predictions or decisions based on new, unseen data.

What We Talk About Today: Training an AI Model
• Data Collection: Gather and preprocess data relevant to the task.
• Model Selection: Choose the appropriate algorithm or architecture for the problem.
• Training: Feed the data into the model and adjust its parameters using optimization techniques (e.g., gradient descent).
• Evaluation: Assess the model's performance using metrics such as accuracy, precision, recall, and F1 score.
• Deployment: Implement the trained model in a real-world application for inference (see the sketch after this list).

CloudNative Artificial
Intelligence (CNAI)
• Cloud Native
Cloud Native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach. These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.
• Cloud Native Artificial Intelligence (CNAI)
Refers to approaches and patterns for building and deploying AI applications and workloads using the principles of Cloud Native. Enabling repeatable and scalable AI-focused workflows allows AI practitioners to focus on their domain.

Evolution of Artificial Intelligence
• Discriminative AI
• Generative AI
• Convolutional Neural Networks
• Transformers

Large Language Model
A Large Language Model (LLM) is just a larger version of a language model.
Why "large"?
• Number of parameters: billions of parameters
• Self-supervised learning
(Diagram: Language Model → Large Language Model)

Levels of LLMs
• Prompt Engineering (see the sketch after this list)
• Model Fine-Tuning
• Build Your Own LLM
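The lowest-effort level, prompt engineering, can be as simple as a reusable template. Below is a minimal LangChain sketch; the template wording is an illustrative assumption:

# Level 1, prompt engineering: reuse a parameterized template instead of retraining.
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template(
    "You are a Kubernetes expert. Answer concisely.\n\nQuestion: {query}\nAnswer:"
)

# Render the template; in a real chain this string is sent to an LLM.
print(prompt.format(query="What is a Pod?"))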

Steps for Building an LLM Application
PLAN
• Identify the problem to solve
BUILD
• Choose the LLM
• Customize the LLM
• Set up the application architecture
RUN
• Implement evaluation and feedback

One Way to Deploy Your LLM in Cloud Native
1. Model Definition
2. Model Consumption (Local or API)
3. Package the LLM
4. Containerize
5. Serve Multiple Models

Model Definition
Define Your Problem
• Conversational chatbot
• Text summarization
• Classification
• Question answering
Pick Your Model Strategy
• Foundational model (general knowledge), 70B/7B
• Fine-tune a model (context knowledge) on your own data
• Retrieval-Augmented Generation (RAG); a minimal sketch follows this list
Find Tools
• Hugging Face
• LangChain
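To make the RAG strategy concrete, here is a minimal sketch using LangChain with an in-memory FAISS vector store; the embedding model, documents, and package choices are illustrative assumptions, not from the talk:

# Minimal RAG retrieval sketch: embed documents, fetch the most relevant one for a query.
# Assumes: pip install langchain-community faiss-cpu sentence-transformers
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

docs = [
    "Kubernetes is an open source container orchestration platform.",
    "KubeCon + CloudNativeCon 2024 took place in Paris.",
]

# Index the documents with a small sentence-transformer embedding model.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(docs, embeddings)

# Retrieval step: the returned snippets would be stuffed into the LLM prompt.
retriever = store.as_retriever(search_kwargs={"k": 1})
print(retriever.invoke("Where was KubeCon 2024 held?"))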

Model Consumption
• Local
• External

Package LLMs
• The business needs a unified way to interact with models.
• The business needs different types of LLMs.
• Each model has different compute/storage requirements.
• Each model has a different way to interact.

Exposing LLMs: LangChain
LangChain: a framework for building apps powered by LLMs
• Python and JS/TypeScript library
• Native support for 80+ LLMs; open source models supported by templates
• Supports RAG pipelines and 75+ vector stores
• LangServe: deploy LangChain chains as a REST API (see the sketch after this list)
• LangSmith: developer platform
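As a brief illustration of the LangServe bullet, a chain can be mounted on a FastAPI app; the chain, path, and port below are illustrative assumptions:

# Minimal LangServe sketch: expose a LangChain runnable as a REST API.
# Assumes: pip install "langserve[all]" fastapi uvicorn langchain-core
from fastapi import FastAPI
from langchain_core.prompts import PromptTemplate
from langserve import add_routes

app = FastAPI(title="LLM demo")

# Any runnable can be served; a bare prompt template stands in for a full chain here.
chain = PromptTemplate.from_template("Tell me about {topic}")
add_routes(app, chain, path="/chain")

# Run with: uvicorn main:app --port 8000
# Then POST {"input": {"topic": "Kubernetes"}} to /chain/invoke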

LLM Logic: Local
Flow: "Tell me about K8s" → Prompt Template → Chain → Pipeline → Model (Llama 2, ~26 GB)

LLM Logic: Local Optimized
Flow: "Tell me about K8s" → Prompt Template → Chain → Pipeline → Llama optimized model (~7 GB)

chain = prompt | pipeline
question = "Tell me about K8s"
result = chain.invoke({"query": question})
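Filling in the pieces the slide leaves implicit, a runnable local version might look like the sketch below; the model ID and prompt wording are illustrative assumptions:

# Local LLM chain sketch: a Hugging Face pipeline wrapped as a LangChain runnable.
# Assumes: pip install langchain-huggingface transformers torch
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFacePipeline

prompt = PromptTemplate.from_template("Question: {query}\nAnswer:")

# Loads weights locally; swapping in a quantized/optimized model shrinks the footprint.
pipeline = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task="text-generation",
    pipeline_kwargs={"max_new_tokens": 128},
)

chain = prompt | pipeline
result = chain.invoke({"query": "Tell me about K8s"})
print(result)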

LLM Logic: External
Flow: "Tell me about K8s" → Prompt Template → Chain → Model Client → External LLM API

chain = prompt | pipeline
question = "Tell me about K8s"
result = chain.invoke({"query": question})
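For the external variant, the local pipeline is simply replaced by an API-backed model client. A minimal sketch with langchain-openai follows; the provider, model name, and environment variable are illustrative assumptions:

# External LLM chain sketch: same chain shape, but the model runs behind an API.
# Assumes: pip install langchain-openai, with OPENAI_API_KEY set in the environment.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

prompt = PromptTemplate.from_template("Question: {query}\nAnswer:")
model_client = ChatOpenAI(model="gpt-4o-mini")

# The pipe operator composes prompt, model client, and output parsing into one chain.
chain = prompt | model_client | StrOutputParser()
print(chain.invoke({"query": "Tell me about K8s"}))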

Integrating Multiple Models
Users → UI → LLM Proxy, which routes to:
• LLAMA2: finance fine-tuned model (local)
• LLAMA2: legal fine-tuned model (other local)
• LLAMA2_Optimized: private general-knowledge model (small/big)
• External LLM API: general-knowledge model

Integrating Multiple Models: Multi-Pod

Recap of Demo
• Pre-step: the local model is downloaded at each container launch.
• The UI (frontend) sends messages to the LLM proxy.
• The LLM proxy sends the message to the selected LLM (invoke).
• After processing, the answer is sent back to the UI.
A hypothetical sketch of such a proxy follows below.
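The talk does not show the proxy's code, so the following is a hypothetical minimal sketch of an LLM proxy in FastAPI; all model names, service URLs, and the request shape are invented for illustration:

# Hypothetical LLM proxy sketch: route a chat request to the selected model pod.
# All model names and service URLs below are invented for illustration.
import httpx
from fastapi import FastAPI, HTTPException

app = FastAPI()

# Each model runs in its own pod/service (the multi-pod layout from the demo).
MODEL_ENDPOINTS = {
    "finance": "http://llama2-finance:8000/invoke",
    "legal": "http://llama2-legal:8000/invoke",
    "general": "http://llama2-optimized:8000/invoke",
}

@app.post("/chat")
async def chat(payload: dict):
    model = payload.get("model", "general")
    if model not in MODEL_ENDPOINTS:
        raise HTTPException(status_code=400, detail="unknown model")
    # Forward the message to the selected LLM and relay its answer to the UI.
    async with httpx.AsyncClient(timeout=120) as http:
        resp = await http.post(MODEL_ENDPOINTS[model], json={"query": payload["message"]})
    return {"answer": resp.json()}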

DEMO

Merging of Cloud Native and Artificial Intelligence

Predictive and generative AI needs across computing, networking, and storage

Computational Power
• Generative AI: extremely high; requires specialized hardware.
• Predictive AI: moderate to high; general-purpose hardware can suffice.

Data Volume and Diversity
• Generative AI: massive, diverse datasets for training.
• Predictive AI: specific historical data for prediction.

Model Training and Fine-Tuning
• Generative AI: complex, iterative training with specialized compute.
• Predictive AI: moderate training.

Scalability and Elasticity
• Generative AI: highly scalable and elastic infrastructure (variable and intensive computational demands).
• Predictive AI: scalability is necessary, but elasticity demands are lower; batch processing or event-driven tasks.

Storage and Throughput
• Generative AI: high-performance storage with excellent throughput; diverse data types; requires high-throughput, low-latency access to data.
• Predictive AI: efficient storage with moderate throughput; focuses more on data analysis and less on data generation; data is mostly structured.

Networking
• Generative AI: high bandwidth and low latency for data transfer and model synchronization (e.g., during distributed training).
• Predictive AI: consistent and reliable connectivity for data access.

Enabling Tools and Techniques

Linux Foundation AI Landscape
• Distributed Training: Kubeflow Training Operator, PyTorch DDP, TorchX, TensorFlow Distributed, OpenMPI, DeepSpeed, Megatron, and so on
• General Orchestration: Kubernetes, Volcano, Armada, KubeRay, NVIDIA NeMo, YuniKorn, Kueue, and so on
• ML Serving: KServe, Seldon, vLLM, and so on
• CI/CD: Kubeflow Pipelines, MLflow, TFX, BentoML, MLRun, and so on
• Data Science: Jupyter, Kubeflow Notebooks, PyTorch, TensorFlow, Apache Zeppelin, and so on
• Workload Observability: Prometheus, Grafana, InfluxDB, OpenTelemetry, and so on
• AutoML: Hyperopt, Optuna, Kubeflow Katib, NNI, and so on
• Governance & Policy: Kyverno, OPA/Gatekeeper, and so on
• Data Architecture: ClickHouse, Apache Pinot, Apache Druid, Cassandra, Hadoop HDFS, Apache HBase, Apache Spark, Apache Flink, Apache Pulsar
• Vector Database: Milvus, Chroma, Qdrant, Pinecone
• LLM Observability: TruLens, Langfuse, OpenLLMetry

CHALLENGES FOR CLOUD NATIVE ARTIFICIAL INTELLIGENCE
The typical ML pipeline comprises:
• Data Preparation (collection, cleaning/pre-processing, feature engineering)
• Model Training (model selection, architecture, hyperparameter tuning)
• CI/CD, Model Registry (storage)
• Model Serving
• Observability (usage load, model drift, security)
A minimal pipeline sketch follows this list.
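Since Kubeflow Pipelines appears in the landscape above, here is a minimal sketch of the first two pipeline stages using the kfp v2 SDK; the component bodies, names, and paths are illustrative assumptions:

# Minimal Kubeflow Pipelines (kfp v2) sketch of the data-prep -> training stages.
# Assumes: pip install kfp
from kfp import compiler, dsl

@dsl.component
def prepare_data() -> str:
    # Stand-in for collection, cleaning/pre-processing, and feature engineering.
    return "/data/features.csv"

@dsl.component
def train_model(data_path: str) -> str:
    # Stand-in for model selection, training, and hyperparameter tuning.
    print(f"training on {data_path}")
    return "/models/model.joblib"

@dsl.pipeline(name="minimal-ml-pipeline")
def ml_pipeline():
    data = prepare_data()
    train_model(data_path=data.output)

# Compile to a spec that a Kubeflow Pipelines instance can run.
if __name__ == "__main__":
    compiler.Compiler().compile(ml_pipeline, "pipeline.yaml")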

Benefits of Kubernetes for ML
• Repeatability
• Pipelines
• Portability
• Scaling

Right Tools for ML/AI Jobs
GPU
• Huge number of cores
• Good for many lightweight tasks in parallel
• Designed for graphics computation tasks
CPU
• Small number of cores
• Good for heavy tasks
• Designed for general computation
A device-selection sketch follows this list.
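In code, picking the right processor is often a one-line device choice. A minimal PyTorch sketch (illustrative, not from the talk):

# Device-selection sketch: run the same tensor math on a GPU when one is available.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A large matrix multiply is massively parallel, so it suits the GPU's many cores.
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b
print(f"ran on {device}, result shape {tuple(c.shape)}")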

How GPUs Work with K8s
K8S GPU Worker Node
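The slide shows this as a diagram. As a minimal sketch, a pod can claim a GPU through the NVIDIA device plugin's nvidia.com/gpu resource; below it is built with the official Kubernetes Python client, where the image and names are illustrative assumptions:

# GPU worker-node sketch: request one GPU for a pod via the device-plugin resource.
# Assumes: pip install kubernetes, and a cluster with the NVIDIA device plugin installed.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda",
                image="nvidia/cuda:12.4.1-base-ubuntu22.04",  # illustrative image
                command=["nvidia-smi"],
                # The scheduler places this pod only on a node with a free GPU.
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)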

How GPUs Are Actually Used
Training environment
• Notebooks
• MLOps pipelines
• Data processing
• Tests
Inference environment
• ML model serving
• Online operations
• Data pre-processing (DPUs)

AI Landscape & Ecosystem
(Diagram: a stack of layers from Modeling at the top down through Deployment, Versioning, Orchestration, and Compute to Data at the bottom; one axis shows how much infrastructure is needed, the other how much data scientists care about each layer.)
AI Landscape & Ecosystem (full landscape diagram)

Cloud Native Production-Ready AI Platform

Summary
• What the LF and CNCF are, and KubeCon 2024 Paris
• Cloud Native Artificial Intelligence
• What is an LLM?
• Multi-model LLM demo on Kubernetes
• Cloud Native production-ready AI platform components

References
https://www.cncf.io/reports/cloud-native-artificial-intelligence-whitepaper/
https://www.youtube.com/watch?v=1u5LtsJqyrA&list=PLj6h78yzYM2N8nw1YcqqKveySH6_0VnI0
https://www.youtube.com/watch?v=Ek0eU_H9AoQ&list=PLj6h78yzYM2PWGv34W6w5ssq1b1meRmY7
https://huggingface.co/
https://www.langchain.com/langchain
https://ollama.com/

Q & A
Thank you