Kubernetes running on the edge - KubeCon EU


About This Presentation

Kubernetes on the Edge


Slide Content

Myriam Fentanes, Principal Product Manager, Red Hat
Jacqueline Koehler, Senior Manager, Engineering, Red Hat
Demo of Kubernetes at the edge for edge model deployment

Lifecycle of the Model (diagram): Gather Data, Prepare Data, Experiment, Train, Build, Test, Review (Trusty AI), Push, Deploy, Serve Model, Monitor, Retrain/Tune, organized into an inner loop and an outer loop. The lifecycle runs across physical, virtual, private cloud, public cloud, and edge footprints with consistent observability and management.
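The slides show this as a diagram rather than as manifests. As a minimal sketch of the "consistent management" point, the same declarative Deployment for the inference service can be applied unchanged to a data-center cluster or an edge cluster; the name, image, and port below are hypothetical.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-edge-inference           # hypothetical name
  labels:
    app: ai-edge-inference
spec:
  replicas: 1                       # edge sites often run a single replica
  selector:
    matchLabels:
      app: ai-edge-inference
  template:
    metadata:
      labels:
        app: ai-edge-inference
    spec:
      containers:
      - name: inference-service
        image: quay.io/example/ai-edge-inference:1.0   # hypothetical image
        ports:
        - containerPort: 8080       # hypothetical serving port
        resources:
          limits:
            cpu: "1"
            memory: 1Gi

Applying this manifest works identically on any conformant cluster, which is what makes the footprints in the diagram interchangeable from a management point of view.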

The management challenge
"72% of businesses cited manageability as the biggest obstacle in adopting edge computing." [1]
[1] Source: Omdia, 2021 Trends to Watch in Cloud Computing, Jan 2021

Centralized lifecycle management
Challenges: number of model servers, complexity of AI apps, inconsistent deployment interfaces, disconnected operations, compliance
GitOps: declarative approach, approval process, version control, eventually consistent
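The slides do not name a specific GitOps controller. As one hedged illustration, an Argo CD Application can express the declarative, version-controlled, eventually consistent approach listed above; the repository URL, path, and cluster name are hypothetical.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ai-edge-inference           # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/ai-edge-manifests.git  # hypothetical GitOps repo
    targetRevision: main            # version control: every change is a reviewed commit
    path: edge/inference-service
  destination:
    name: edge-cluster-01           # hypothetical registered edge cluster
    namespace: ai-edge
  syncPolicy:
    automated:
      prune: true                   # eventually consistent: drift is reconciled automatically
      selfHeal: true

The approval process maps to pull-request review on the repository: nothing reaches an edge cluster until the change is merged.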

Centralized deployment management for models at the edge (Myriam)
Build: Code Repository → Download Dependencies → Test → Containerize → Register → AI Edge Inference Service container image → Distribution repository
Deploy: Review → Manifest repository → Kubernetes + controller at each edge site

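The slides do not say which CI engine drives the Download Dependencies, Test, Containerize, and Register steps. As a hedged sketch, a Tekton Pipeline could chain them; the task and parameter names below are hypothetical placeholders, not published catalog tasks.

apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: ai-edge-build               # hypothetical pipeline name
spec:
  params:
  - name: git-url
    type: string
  - name: image
    type: string                    # e.g. a tag in the distribution repository
  tasks:
  - name: download-dependencies
    taskRef:
      name: fetch-and-resolve-deps  # hypothetical task
    params:
    - name: url
      value: $(params.git-url)
  - name: test
    taskRef:
      name: run-model-tests         # hypothetical task
    runAfter: [download-dependencies]
  - name: containerize
    taskRef:
      name: build-container-image   # hypothetical task
    runAfter: [test]
    params:
    - name: image
      value: $(params.image)
  - name: register
    taskRef:
      name: push-to-registry        # hypothetical task: publish to the distribution repository
    runAfter: [containerize]
    params:
    - name: image
      value: $(params.image)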

Observability at the edge
Edge: each edge node runs an OTEL Collector that ships data to the core
Core (public/private cloud, data center): Prometheus, PagerDuty, Dynatrace
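A minimal OpenTelemetry Collector configuration for the edge node in this picture might look like the following sketch. The remote-write endpoint is an assumption, the prometheusremotewrite exporter ships in the contrib distribution, and the PagerDuty and Dynatrace integrations on the core side are omitted.

# OTEL Collector on the edge node: receive OTLP locally, batch, forward to the core
receivers:
  otlp:
    protocols:
      grpc:
      http:

processors:
  batch: {}                         # batch to reduce uplink traffic from the edge

exporters:
  prometheusremotewrite:
    endpoint: https://prometheus.core.example.com/api/v1/write  # hypothetical core endpoint

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheusremotewrite]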

Piecing it all together
Core (OCM Hub): model storage with trained models, pipeline, image registry, GitOps repo
1) Store trained models
2) Trigger pipeline
3) Retrieve model
4) Build/test inference service container image
5) Push inference service container image(s), with inference metadata, to the image registry
6) Open a PR against the GitOps repo with the latest metadata
GitOps flow to the edge (near/far, OCM Spoke):
a) Merge PR
b) The edge controller watches the GitOps repo and pulls the updated manifest
c) Pulls the inference container image from the image registry
d) Deploys pods running the AI Edge inference service container, observed by Prometheus and OTEL
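The hub/spoke pair in this slide is Open Cluster Management. As a hedged sketch, the hub could hand the inference Deployment to a spoke with a ManifestWork; the managed-cluster namespace, names, and image below are hypothetical, and in the GitOps variant described above the spoke pulls the manifest from the repo instead.

apiVersion: work.open-cluster-management.io/v1
kind: ManifestWork
metadata:
  name: ai-edge-inference
  namespace: edge-cluster-01        # hypothetical managed-cluster namespace on the OCM hub
spec:
  workload:
    manifests:
    - apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: ai-edge-inference
        namespace: ai-edge
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: ai-edge-inference
        template:
          metadata:
            labels:
              app: ai-edge-inference
          spec:
            containers:
            - name: inference-service
              image: quay.io/example/ai-edge-inference:1.0   # hypothetical image from the registry
              ports:
              - containerPort: 8080

The work agent on the OCM spoke applies the embedded manifests locally and reports their status back to the hub.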

Thank you!