“Testing Cloud-to-Edge Deep Learning Pipelines: Ensuring Robustness and Efficiency,” a Presentation from Instrumental

embeddedvision 36 views 32 slides Sep 30, 2024

About This Presentation

For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/09/testing-cloud-to-edge-deep-learning-pipelines-ensuring-robustness-and-efficiency-a-presentation-from-instrumental/

Rustem Feyzkhanov, Staff Machine Learning Engineer at Instrumental, presents the “Testin...


Slide Content

Testing Cloud-to-Edge Deep
Learning Pipelines: Ensuring
Robustness and Efficiency
Rustem Feyzkhanov
Senior Staff Machine Learning Engineer
Instrumental

Agenda
• Introduction to cloud-to-edge deep learning pipelines
• Testing strategies for cloud-to-edge pipelines
• ML pipeline dev tests
• ML pipeline internal tests
• Stage tests
• Summary
© 2024 Instrumental

Example – corgi/not corgi model
Business understanding → Data acquisition → Modeling → Deployment → Customer acceptance
Model → corgi
Model → not corgi

Example – corgi/not corgi model
Business understanding → Data acquisition → Modeling → Deployment → Customer acceptance
corgi


Why did it happen?
Prediction
service
Prediction
service

Possible edge cases with model on edge
• Data preprocessing is different between cloud and edge
• Inference framework has a different version and doesn’t support the model
• NPU drivers are not backward compatible and drop some of the tensors
• Data drift happened on edge
• Training set wasn’t comprehensive enough
=> Proper testing can solve most of the above issues (though not all)
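The first failure mode listed above (preprocessing that differs between cloud and edge) is one that a simple parity test can catch. The following is a minimal sketch, not the presenter's implementation: `preprocess_cloud`, `preprocess_edge`, and `check_preprocessing_parity` are hypothetical names, and the "preprocessing" is a toy pixel normalization standing in for a real image pipeline.

```python
from typing import Callable, List, Sequence

Pixels = List[float]


def preprocess_cloud(raw: Sequence[int]) -> Pixels:
    """Hypothetical cloud-side preprocessing: scale 0..255 to 0..1."""
    return [v / 255.0 for v in raw]


def preprocess_edge(raw: Sequence[int]) -> Pixels:
    """Hypothetical edge-side preprocessing; it should match the cloud version."""
    scale = 1.0 / 255.0
    return [v * scale for v in raw]


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two outputs."""
    if len(a) != len(b):
        raise ValueError("outputs have different lengths")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def check_preprocessing_parity(
    raw: Sequence[int],
    cloud_fn: Callable[[Sequence[int]], Pixels] = preprocess_cloud,
    edge_fn: Callable[[Sequence[int]], Pixels] = preprocess_edge,
    tol: float = 1e-6,
) -> bool:
    """Return True when both implementations agree within the tolerance."""
    return max_abs_diff(cloud_fn(raw), edge_fn(raw)) <= tol


sample = [0, 17, 128, 255]
parity_ok = check_preprocessing_parity(sample)
```

In practice the two functions would be the real training-side and device-side code paths, and the tolerance would depend on the numeric precision used on the edge device.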

Cloud-to-edge deep learning pipeline evolution

Cloud-to-edge deep learning pipeline – phase 1
Data extraction and analysis → Data preparation → Model training → Model evaluation and validation → Trained model → Model registry → Model serving → Prediction service
Environments: Research, Production
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – phase 2
ML training pipeline: Data extraction → Data validation → Data preparation → Model training → Model evaluation → Model validation
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – phase 2
• Environments: Research, Production
• Components: Data, Source code, Experimental ML training pipeline, Automated ML training pipeline, Model registry, Model serving, Prediction service
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – phase 3
• Environments: Research, Production
• Components: Data, Source code, Experimental ML training pipeline, CI: Build/test/package, Packages/artifacts, Automated ML training pipeline, Model registry, Model serving, Prediction service, Performance monitoring, Trigger
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – phase 3
• Pros
  • Fully automated model training
  • Quick releases/model retraining
• Cons
  • New model may run into errors on the edge device
  • New model may perform differently on the edge than in the cloud

Cloud-to-edge deep learning pipeline testing

Cloud-to-edge deep learning pipeline – model lifecycle
Let’s unwrap the pipeline into the ML model lifecycle:
• Source code
• Development/experiment
• Pipeline continuous integration
• Packages
• Pipeline continuous delivery
• Automated pipeline
• Continuous training
• Trained model
• Model continuous delivery
• Prediction service
• New data
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – model lifecycle
The same lifecycle stages, grouped into three phases:
• Pipeline development
• Training pipeline
• Model serving
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Cloud-to-edge deep learning pipeline – model lifecycle
Tests mapped onto the same lifecycle stages:
• ML pipeline dev tests
• Pre-train tests
• Post-train tests
• Stage tests – training
• Stage tests – prediction
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Categories of tests
• ML pipeline tests
  • Run in local dev and CI
  • Test pipeline code before packages are generated
• Pre-train tests
  • Run before training in the cloud
  • Check our assumptions about data
• Post-train tests
  • Run after training in the cloud
  • Check our assumptions about the model
• Stage test – training
  • Run in stage env after the training package is deployed
  • Checks for regression in the cloud
• Stage test – prediction
  • Run in stage env after the inference package is deployed
  • Checks for regression on edge

ML pipeline tests

ML pipeline tests
• Unit tests: check individual components of the pipeline
• Regression tests: check that a change didn’t affect existing functionality and artifacts
• Smoke tests: cheap tests to check that the model performs as expected on a simple dataset
• Integration tests: check that components work together in an expected way

ML pipeline tests examples
• Unit tests
  • Data split/preparation
  • Naive cases for training
  • Training-serving skew
• Regression tests
  • Backward compatibility with old models
  • Data shift due to preprocessing library change
• Smoke tests
  • Does the model correctly predict a sample from train?
  • Does the model return the result in the expected format?
• Integration tests
  • Test that the pipeline works together
  • Test that there is no discrepancy between train and serving env
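Two of the examples above, a unit test for the data split and a smoke test for the prediction format, can be sketched as plain test functions. This is an illustrative sketch only: `split_dataset` and `predict` are hypothetical stand-ins (the "model" is a one-line rule, not a trained network), and the functions are written so that a runner such as pytest could collect them.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_dataset(
    items: Sequence[int], val_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Hypothetical train/validation split with a fixed seed."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def predict(features: Dict[str, float]) -> Dict[str, object]:
    """Hypothetical stand-in for the corgi/not corgi model."""
    is_corgi = features["ear_size"] > 0.5
    score = features["ear_size"] if is_corgi else 1.0 - features["ear_size"]
    return {"label": "corgi" if is_corgi else "not corgi", "score": score}


def test_split_is_disjoint_and_complete() -> None:
    # Unit test: no sample leaks between train and validation, none is lost.
    items = list(range(100))
    train, val = split_dataset(items)
    assert len(val) == 20
    assert set(train).isdisjoint(val)
    assert sorted(train + val) == items


def test_split_is_deterministic() -> None:
    # Unit test: the same seed must reproduce the same split.
    items = list(range(50))
    assert split_dataset(items, seed=7) == split_dataset(items, seed=7)


def test_prediction_format() -> None:
    # Smoke test: the result has the expected keys, label set, and score range.
    result = predict({"ear_size": 0.9})
    assert set(result) == {"label", "score"}
    assert result["label"] in {"corgi", "not corgi"}
    assert 0.0 <= result["score"] <= 1.0
```

The same checks would normally be pointed at the real split and inference code, so that a change to either one fails fast in local dev and CI.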

ML pipeline internal tests

Cloud-to-edge deep learning pipeline
ML training pipeline: Data extraction → Data validation → Data preparation → Model training → Model evaluation → Model validation
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Pre-train and post-train tests
• Pre-train tests
  • Data validation
    • Explicit checks for data which we will use for training
    • Tests that identify data/pipeline assertion failures and fail early
• Post-train tests
  • Model evaluation
    • Performance on a validation or test dataset
    • Tests that the model was able to converge and reach acceptable performance
  • Model validation
    • Invariance tests: test that specific data perturbations don’t affect model output
    • Directional expectation tests: test that specific data perturbations do affect model output
    • Minimum functionality tests: ensure that the model works in critical scenarios/failure modes/edge cases
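A pre-train data validation step of the kind described above can be sketched as a function that collects every violated assumption, so the pipeline can fail before any training compute is spent. The record layout (`label`, `shape`), the allowed label set, and the expected image shape are assumptions made for illustration; they are not taken from the presentation.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed label set for the corgi/not corgi example.
ALLOWED_LABELS = {"corgi", "not corgi"}


def validate_training_records(
    records: Sequence[Dict[str, object]],
    expected_shape: Tuple[int, ...] = (224, 224, 3),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the data passed."""
    errors: List[str] = []
    if not records:
        errors.append("dataset is empty")
    for i, rec in enumerate(records):
        label = rec.get("label")
        if label not in ALLOWED_LABELS:
            errors.append(f"record {i}: unexpected label {label!r}")
        shape = tuple(rec.get("shape", ()))
        if shape != expected_shape:
            errors.append(f"record {i}: unexpected shape {shape}")
    return errors


def assert_data_is_valid(records: Sequence[Dict[str, object]]) -> None:
    """Fail early, before training starts, if any assumption is violated."""
    errors = validate_training_records(records)
    if errors:
        raise ValueError("pre-train validation failed: " + "; ".join(errors))


good = [{"label": "corgi", "shape": (224, 224, 3)}]
bad = [{"label": "dog", "shape": (224, 224, 3)}, {"label": "corgi", "shape": (64, 64)}]
```

Here `validate_training_records(good)` returns an empty list, while `assert_data_is_valid(bad)` raises with one message per broken record.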

Pre-train and post-train tests: examples
• Pre-train tests
  • Data validation: explicit checks for data which we will use for training
    • Incorrect labels
    • Incorrect features
• Post-train tests
  • Model evaluation: performance on a validation or test dataset
    • Low accuracy
  • Model validation
    • Invariance tests: rotated/shifted images have similar predictions
    • Directional expectation tests: apples are at the top when searching for apples
    • Minimum functionality tests: blurry images, overexposed images
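The invariance and directional expectation tests above can be sketched with a toy scorer. Everything here is illustrative: `toy_score` is a hypothetical stand-in that only looks at mean brightness (so it is invariant to shifts and rotations by construction), and images are small nested lists rather than real tensors.

```python
from typing import Callable, List

Image = List[List[int]]


def toy_score(image: Image) -> float:
    """Hypothetical model: mean pixel brightness scaled to 0..1."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / (255.0 * len(pixels))


def shift_right(image: Image, k: int = 1) -> Image:
    """Cyclically shift every row k pixels to the right."""
    return [row[-k:] + row[:-k] for row in image]


def rotate_90(image: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def brighten(image: Image, delta: int = 20) -> Image:
    """Increase brightness, clipping at 255."""
    return [[min(255, p + delta) for p in row] for row in image]


def is_invariant(
    model: Callable[[Image], float],
    image: Image,
    perturb: Callable[[Image], Image],
    tol: float = 1e-6,
) -> bool:
    """Invariance test: the perturbation should not change the output."""
    return abs(model(image) - model(perturb(image))) <= tol


def increases_score(
    model: Callable[[Image], float],
    image: Image,
    perturb: Callable[[Image], Image],
) -> bool:
    """Directional expectation test: the perturbation should raise the output."""
    return model(perturb(image)) > model(image)


image = [[10, 200, 30], [40, 50, 60]]
```

With a real model the tolerance would be looser, since rotated or shifted images are expected to give similar predictions rather than identical ones.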

Stage tests

What the cloud-to-edge deep learning pipeline looks like
Tests mapped onto the model lifecycle stages:
• ML pipeline dev tests
• Pre-train tests
• Post-train tests
• Stage tests – training
• Stage tests – prediction
From https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning

Stage tests
• Run on stage (could be a dev environment)
• Tests should expose bugs with more realistic data and load
• Tests can be more complex
• Examples
  • Shift in prediction scores
  • Shift in predictions themselves
  • CPU/RAM utilization (to catch memory leaks)
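The three example checks above can be sketched with the Python standard library. This is a rough sketch under stated assumptions: the function names are hypothetical, the baseline and candidate outputs are assumed to be collected on the same inputs, and `tracemalloc` only sees Python-level allocations, so a leak inside a native inference runtime or NPU driver would need OS-level monitoring instead.

```python
import tracemalloc
from statistics import mean
from typing import Callable, Sequence


def score_shift(baseline: Sequence[float], candidate: Sequence[float]) -> float:
    """Absolute change in mean prediction score between two releases."""
    return abs(mean(candidate) - mean(baseline))


def label_flip_rate(baseline: Sequence[str], candidate: Sequence[str]) -> float:
    """Fraction of inputs whose predicted label changed."""
    if len(baseline) != len(candidate):
        raise ValueError("prediction lists must have the same length")
    if not baseline:
        return 0.0
    flips = sum(a != b for a, b in zip(baseline, candidate))
    return flips / len(baseline)


def memory_growth_bytes(fn: Callable[[], object], iterations: int = 200) -> int:
    """Python-level memory growth over repeated calls, as a leak indicator."""
    tracemalloc.start()
    try:
        fn()  # warm-up call so one-time allocations are not counted
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            fn()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A stage test could then assert that each value stays under a threshold agreed for the product, for example a small allowed score shift and near-zero memory growth over a long run of inferences.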

Summary

Summary
• Tests are a way to ensure predictable and transparent model behavior
• Tests are the best way to catch ML bugs early
• ML tests ≠ classic software engineering tests, but a similar mindset can be applied:
  • Unit tests to check for bugs
  • Testing specific scenarios
  • Testing expected edge cases

Resources
• Posts and presentations
  • https://arseny.info/reliable_ML
  • https://www.jeremyjordan.me/testing-ml/
  • https://eugeneyan.com/writing/testing-ml/
  • https://krokotsch.eu/cleancode/2020/08/11/Unit-Tests-for-Deep-Learning.html
  • https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning
• GitHub repositories
  • https://github.com/marcotcr/checklist
  • https://github.com/great-expectations/great_expectations
  • https://github.com/HypothesisWorks/hypothesis