Understanding MLOps in the Era of Gen AI

gdgsurrey 95 views 12 slides Sep 04, 2024

About This Presentation

MLOps (or ML Ops) is a paradigm that aims to deploy and maintain machine learning models in production reliably and efficiently. The term combines "machine learning" with DevOps, the software practice built around continuous integration and continuous delivery (CI/CD).


Slide Content

Copyright © Dell Inc. All Rights Reserved.
An Overview…
MLOps in the era of GenAI
Raghavendra Guttur
Technical Director – GenAI
Dell Technologies, Singapore
MLOps

Agenda
• Understanding the modern app & data platforms
• DevSecOps framework
• DORA metrics for DevSecOps
• Market drivers for MLOps
• GenAI MLOps framework
• Google’s Vertex AI Platform
• Vertex AI MLOps Stages
• Vertex AI Pipelines & Code Samples
• Questions

The modern apps & data evolution
[Diagram: three evolution tracks]
• Architecture: Monolith → Microservices (microservices 1 to 9 running on Kubernetes)
• Processes: Waterfall / ITIL → Agile / DevOps (development, service delivery, lines of business)
• Deployment: Datacenter → Multicloud (hybrid cloud platforms)

DevSecOps framework for Modern applications

DevSecOps – DORA Metrics
• Lead Time: For the primary application or service you work on, what is your lead time for changes (that is, how long does it take to go from code committed to code successfully running in production)?
• Change Failure Rate: For the primary application or service you work on, what percentage of changes to production or released to users result in degraded service (for example, lead to service impairment or service outage) and subsequently require remediation (for example, require a hotfix, rollback, fix forward, or patch)?
• Deployment Frequency: For the primary application or service you work on, how often does your organization deploy code to production or release it to end users?
• Recovery Time: How long does it generally take to restore service after a change to production or release to users results in degraded service (for example, lead to service impairment or service outage) and subsequently requires remediation (for example, require a hotfix, rollback, fix forward, or patch)?
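
The four survey questions above map onto simple arithmetic over deployment records. The following is a minimal sketch, not an official DORA tool: the `Deployment` record layout, the use of medians, and the per-day frequency over a fixed window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record layout)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False                     # did the change degrade service?
    restored_at: Optional[datetime] = None   # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Compute the four metrics over a reporting window of `window_days` days."""
    if not deployments:
        raise ValueError("need at least one deployment")
    # Lead time: commit to running in production.
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    # Change failure rate: share of deployments that degraded service.
    failures = [d for d in deployments if d.failed]
    # Recovery time: failed deployment to restored service.
    recoveries = [d.restored_at - d.deployed_at
                  for d in failures if d.restored_at is not None]
    return {
        "lead_time": median(lead_times),
        "deployment_frequency_per_day": len(deployments) / window_days,
        "change_failure_rate": len(failures) / len(deployments),
        "recovery_time": median(recoveries) if recoveries else None,
    }


# Made-up sample data: four deployments over a two-day window, one of which failed.
t0 = datetime(2024, 9, 1, 9, 0)
SAMPLE = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4), failed=True,
               restored_at=t0 + timedelta(hours=5)),
    Deployment(t0, t0 + timedelta(hours=6)),
    Deployment(t0, t0 + timedelta(hours=8)),
]
METRICS = dora_metrics(SAMPLE, window_days=2)
```

In practice the records would come from the CI/CD system and the incident tracker; the sample data here is invented purely to exercise the function.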

Understanding the market drivers for MLOps
• 89% of organizations report that they are planning or implementing an MLOps transformation. [1]
• 143%: The portion of IT budget dedicated to GenAI & MLOps is to increase over the next 2-3 years.
• 80% of organizations consistently self-report increased or continued investment in MLOps initiatives. [2]

Drivers:
• AI/GenAI Mainstream App / Model Placement
• On-demand fulfilment
• Cost Optimization
• Performance Optimization
• Container workloads

[2] Gartner: Top Strategic Technology Trends for 2022: Hyperautomation

MLOps framework for GenAI stack & applications

Google’s Vertex AI

Vertex AI – MLOps Stages
• Vertex AI Feature Store
• Vertex AI Model Registry
• Vertex AI Model Evaluation
• Vertex AI Model Labels
• Vertex AI Orchestrator
• Vertex AI Metadata Analyzer
• Vertex AI Model Monitoring
• Vertex AI Model Experiments
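
To make two of the stages above concrete, here is a minimal sketch of how a model evaluation step can gate what enters a model registry. It is a toy in-memory stand-in and does not use the Vertex AI SDK: `InMemoryRegistry`, `passes_gate`, `evaluate_and_register`, the `accuracy` metric, and the model name `churn` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModelVersion:
    name: str
    version: int
    metrics: dict
    labels: dict = field(default_factory=dict)


class InMemoryRegistry:
    """Toy stand-in for a model registry (assumption: not the Vertex AI API)."""

    def __init__(self):
        self._versions = {}

    def register(self, name, metrics, labels=None):
        # Versions are numbered 1, 2, 3, ... per model name.
        versions = self._versions.setdefault(name, [])
        mv = ModelVersion(name, len(versions) + 1, dict(metrics), dict(labels or {}))
        versions.append(mv)
        return mv

    def latest(self, name):
        versions = self._versions.get(name, [])
        return versions[-1] if versions else None


def passes_gate(candidate_metrics, baseline, metric="accuracy", min_gain=0.0):
    """A candidate passes if there is no baseline or it is at least as good."""
    if baseline is None:
        return True
    return candidate_metrics[metric] >= baseline.metrics[metric] + min_gain


def evaluate_and_register(registry, name, metrics, labels=None):
    """Register the candidate only when it passes the evaluation gate."""
    baseline = registry.latest(name)
    if passes_gate(metrics, baseline):
        return registry.register(name, metrics, labels)
    return None


# Made-up metrics: the second candidate is worse than the baseline and is rejected.
REGISTRY = InMemoryRegistry()
V1 = evaluate_and_register(REGISTRY, "churn", {"accuracy": 0.80})
V2 = evaluate_and_register(REGISTRY, "churn", {"accuracy": 0.78})
V3 = evaluate_and_register(REGISTRY, "churn", {"accuracy": 0.85}, {"stage": "candidate"})
```

The point of the gate is that a regression never reaches the registry, so downstream deployment steps only ever see versions that beat or match the previous one.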

Without MLOps vs. With MLOps

Manual processes, without MLOps
• Error-prone manual steps & infra provisioning
• Heterogeneous tools
• Time-consuming reinvention
• Complex, high cost of maintenance

Vertex AI or open-source MLOps
• Fast to develop & deploy
• Scalable (container-first approach)
• Declarative pipeline builder (infra & models)
• Easy to maintain
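
The "declarative pipeline" idea in the list above can be sketched in plain Python: steps declare what they depend on, and a runner derives the execution order. This is an illustration only, assuming nothing about the Vertex AI Pipelines or Kubeflow APIs; the step names, the `PIPELINE` spec layout, and `run_pipeline` are hypothetical, and the "model" is just the mean of some placeholder numbers.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def ingest(ctx):
    ctx["rows"] = [1.0, 2.0, 3.0, 4.0]  # placeholder data


def train(ctx):
    # The "model" here is simply the mean of the data.
    ctx["model"] = sum(ctx["rows"]) / len(ctx["rows"])


def evaluate(ctx):
    # Mean absolute error of the constant prediction.
    ctx["mae"] = sum(abs(r - ctx["model"]) for r in ctx["rows"]) / len(ctx["rows"])


def deploy(ctx):
    # Deploy only if the error is below an arbitrary threshold.
    ctx["deployed"] = ctx["mae"] < 2.0


# Declarative spec: each step names its function and its upstream steps.
# The steps are deliberately listed out of order; the runner sorts them.
PIPELINE = {
    "deploy": {"run": deploy, "after": ["evaluate"]},
    "train": {"run": train, "after": ["ingest"]},
    "evaluate": {"run": evaluate, "after": ["train"]},
    "ingest": {"run": ingest, "after": []},
}


def run_pipeline(spec):
    """Run every step after its dependencies; return the order and shared context."""
    graph = {name: set(step["after"]) for name, step in spec.items()}
    order = list(TopologicalSorter(graph).static_order())
    ctx = {}
    for name in order:
        spec[name]["run"](ctx)
    return order, ctx


ORDER, CTX = run_pipeline(PIPELINE)
```

Because the order is computed from the declared dependencies, adding or reordering steps means editing the spec rather than the runner, which is the maintainability benefit the slide points to.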

PEOPLE PROCESS TECHNOLOGY
MLOps Touchpoints within the Org’s landscape
“We’re very excited about the integration with MLOps for Kubernetes platforms. This is the most important piece because it’s one thing to have doing MLOps/DevSecOps at Scale by integrating the on-demand infra provisioning code along with MLOps/DevSecOps pipelines”
Leading provider of Crypto Services, Singapore

The leading provider of crypto asset data reduced their digital footprint by 80% with the intelligent performance of MLOps/DevSecOps framework & IaC modules.

Questions?