Productionalizing Machine Learning Models: The Good, The Bad and The Ugly

Irina Kukuyeva, Ph.D. · 27 slides · Apr 26, 2018

About This Presentation

Data science teams tend to put a lot of thought into developing a predictive model to address a business need, tackling data processing, model development, training, and validation. After validation, the model then tends to get rolled out -- without much testing -- into production. While software en...


Slide Content

Productionalizing
Machine Learning Models:
Good, Bad and Ugly


Irina Kukuyeva, Ph.D.

SoCal PyData
April 26, 2018

My Background
Ph.D.; industries: Fashion, Consulting, IoT, Healthcare, Media & Entertainment, Finance, CPG, Retail, Video Game Publisher, Online Advertising, Healthcare

Spectrum of (Select) Production Environments

Example: "Love, love, love leather/faux leather jackets! Especially the blue!" → Positive sentiment

•Fashion: Customer Retention; Near-real time
•Consulting: Revenue $3M+/year; ASAP
•Online Advertising*: Revenue $1M+/year; Near-real time
•Finance: Automation; Daily

What ML Model Lifecycle Really Looks Like [1]
what customer described / what data looked like / what DS began to build / what budget could buy / what pivot looked like / what code got reused / what got tested / what got pushed to prod / what got documented / what customer wanted

Agenda [1]
(the lifecycle panels above, grouped into the steps of this talk)
•Step 1: Establish business use case (Appendix)
•Step 2: Data QA (other talks)
•Step 3: ML Development (other talks)
•Step 4: Pre-prod
•Step 5: Prod

Step 4: Pre-Production



Pre-Production: Pay Down Tech Debt
Technical Debt: borrowing against the future to trade off code quality against speed of delivery now [2], [3]

•Incur debt: write code, including ML pipelines [4], [21]
•Pay down debt: extensively test and refactor pipeline end-to-end
[5]

→ Test 1: Joel’s Test: 12 Steps to Better Code [6]
•Spec?
•Source control? Best practices? [7]
•One-step build? Daily builds? CI?
•Bug database? Release schedule? QA?
•Fix bugs before writing new code?
Pre-Production: Pay Down Tech Debt

[8]

Pre-Production: Pay Down Tech Debt
Test 1: Joel’s Test: 12 Steps to Better Code … in practice:

•Consulting: daily stand-up; sprint planning; version control*; fix bugs first; bugs emailed/db; release on bugfix; one-step build; Atlassian suite; virtual machines
•Fashion: weekly stand-up; version control; fix bugs first; release on bugfix; one-step build; PRD, Trello; virtualenv, Docker, CircleCI, cookiecutter
•Online Advertising*: daily stand-up; sprint planning; bug database
•Finance: version control*; fix bugs first; one-step build; Trello
Pre-Production: Pay Down Tech Debt

→ Test 2: ML Test Score [9], [10]
•Data and feature quality
•Model development
•ML infrastructure
•Monitoring ML



Pre-Production: Pay Down Tech Debt
[11]

Pre-Production: Pay Down Tech Debt

→ Other tips (ML):
•Choose the simplest model appropriate for the task and the prod environment
•Test the model against (simulated) “ground truth” or a 2nd implementation [12]
•Evaluate the effects of floating point [12]
•Model validation beyond AUC [13] (see the sketch below)
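One way "model validation beyond AUC" can look in practice. This is a minimal sketch with a hypothetical dataset, model, and decision threshold (none of them from the talk), using scikit-learn metrics:

# Sketch: evaluate a binary classifier on more than AUC (hypothetical data and model).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (roc_auc_score, precision_score, recall_score,
                             confusion_matrix, brier_score_loss)

X, y = make_classification(n_samples=5_000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
proba = model.predict_proba(X_test)[:, 1]
pred = (proba >= 0.5).astype(int)  # the threshold is a business decision, not a given

print("AUC:      ", roc_auc_score(y_test, proba))
print("Precision:", precision_score(y_test, pred))    # cost of false positives
print("Recall:   ", recall_score(y_test, pred))       # cost of false negatives
print("Brier:    ", brier_score_loss(y_test, proba))  # calibration of the probabilities
print("Confusion matrix:\n", confusion_matrix(y_test, pred))

Precision/recall expose the asymmetric cost of errors and the Brier score checks calibration, both of which AUC alone can hide.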

Pre-Production: Pay Down Tech Debt

→ Other tips (code):
•What is the production environment?
•Set up logging (see the sketch below)
•Add an else to every if, and try/except with an explicit error path
•DRY → refactor
•Add regression tests
•Comment liberally (explain the “why”) and keep comments up to date [20]
•Lint

[14]
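A minimal sketch of the "set up logging" and "else / try/except + error" tips above; the function and feature names are hypothetical, not from the talk:

# Sketch: standard-library logging plus explicit error paths around scoring
# (function and field names are hypothetical).
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("model_service")

def score_record(record, model):
    # Guard the feature lookup: log and re-raise instead of failing silently.
    try:
        features = [record["age"], record["num_visits"]]  # hypothetical features
    except KeyError as err:
        logger.error("missing feature %s in record %r", err, record)
        raise
    if model is not None:
        prediction = model.predict([features])[0]
    else:
        # The explicit 'else' branch the slide asks for: surface the problem loudly.
        logger.error("no model loaded; refusing to score")
        raise RuntimeError("model not loaded")
    logger.info("scored record, prediction=%s", prediction)
    return prediction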

Pre-Production: Pay Down Tech Debt
Test 2: ML Test Score … in practice:

– Data and Feature Quality –
•Consulting: minimal time to add new feature; unsuitable features excluded
•Fashion: test input features (typing, pytest)
•Finance: minimal time to add new feature; privacy built-in

– Model Development – (across Consulting, Online Advertising*, Fashion and Finance)
•Baseline model; bias correction
•Proxy + actual metrics
•Simulated ground truth; baseline model + 2nd implementation
•Rolling refresh
•Performance overall + for those most likely to click
•Code review (PR)
•Hyperparameter optimization (sklearn.GridSearchCV) (see the sketch below)
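The slide names sklearn's GridSearchCV for hyperparameter optimization; the sketch below uses an illustrative estimator, grid, and scoring choice rather than the ones used in the projects:

# Sketch: hyperparameter search with scikit-learn's GridSearchCV
# (estimator, parameter grid, and scoring metric are illustrative choices).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=2_000, random_state=0)

param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [3, 6, None],
}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",   # pick the metric that matches the business goal
    cv=5,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)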

Pre-Production: Pay Down Tech Debt
Test 2: ML Test Score … in practice (cont’d):

– ML Infrastructure –
•Consulting: loosely coupled fcns; central repo for clients; regression testing; one-step build, prod
•Online Advertising*: streaming
•Fashion: loosely coupled fcns; streaming API (sanic, see the sketch below); integration test (pytest); one-step build, prod
•Finance: loosely coupled fcns; streaming; one-step build, prod*; reproducibility of training

– ML Monitoring – (across the same projects)
•Missing data check; data availability check
•Software + package versions check
•Logging (logging)
•Evaluates empty + factual responses
•Local = prod env (virtualenv, Docker)
•Comp time (timeit)
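The Fashion project serves predictions through a streaming API built on sanic. The sketch below shows what such an endpoint might look like; the route, payload fields, and scoring stub are hypothetical, not the actual service:

# Sketch: a near-real-time scoring endpoint with sanic
# (route name, payload fields, and the scoring logic are hypothetical).
from sanic import Sanic
from sanic.response import json as json_response

app = Sanic("model_server")

@app.route("/predict", methods=["POST"])
async def predict(request):
    payload = request.json or {}
    if "features" not in payload:
        # Explicit error path rather than a silent 500.
        return json_response({"error": "missing 'features'"}, status=400)
    # Stand-in scoring logic; a real service would call model.predict(...) here.
    features = payload["features"]
    score = sum(features) / max(len(features), 1)
    return json_response({"score": score})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)

An integration test (pytest) would then post a sample payload to the running service and assert on the response shape, which is the "integration test" item above.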

Step 5: Production



Production: Deploy Code and Monitor Performance
→ One-button push to prod branch/repo
→ Model Rollout
→ Monitoring (see the sketch below)
[15]
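A minimal sketch of what that monitoring might check on each scored batch (data availability, missing values, score drift); the column names, thresholds, and baseline value are hypothetical:

# Sketch: simple production monitoring checks on a batch of scored records
# (column names, thresholds, and the baseline mean are hypothetical).
import logging
import pandas as pd

logger = logging.getLogger("model_monitoring")

BASELINE_SCORE_MEAN = 0.12  # e.g. measured on the validation set at release time
MAX_MISSING_RATE = 0.05
MAX_MEAN_DRIFT = 0.03

def monitor_batch(df: pd.DataFrame) -> None:
    """Log alerts for empty batches, missing features, and score drift."""
    if df.empty:
        logger.error("no data available for this batch")
        return
    missing_rate = df["feature_x"].isna().mean()
    if missing_rate > MAX_MISSING_RATE:
        logger.warning("feature_x missing rate %.2f above %.2f",
                       missing_rate, MAX_MISSING_RATE)
    drift = abs(df["score"].mean() - BASELINE_SCORE_MEAN)
    if drift > MAX_MEAN_DRIFT:
        logger.warning("mean score drifted by %.3f from baseline", drift)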

Step 6: Post-Production



Post-Production: Keep Code Up and Running
→ Documentation + QA
→ Debugging, debugging, debugging
→ Bugfix vs. Feature
→ Container Management
→ Post-mortem
→ Use the product
→ Support and Training

[16]

Post-Production: Align Business and Team Goals
→ Team targets: deadlines and revenue goals
→ Team competitions


[17]

Key Takeaways
→ Communication, tooling, logging, documentation, debugging
→ Automatically evaluate all components of ML pipeline
→ High model AUC is not always the answer
→ Scope down, then scale up
[18]

You Did It!
Code is in prod!
Celebrate!
… But not too hard. Tomorrow you start on v2.

Questions?



https://goo.gl/DjkCBn

References

[1] http://www.projectcartoon.com/cartoon/1
[2] https://research.google.com/pubs/pub43146.html
[3] https://www.linkedin.com/pulse/when-your-tech-debt-comes-due-kevin-scott
[4] https://www.sec.gov/news/press-release/2013-222
[5] http://dilbert.com/strip/2017-01-03
[6] https://www.joelonsoftware.com/2000/08/09/the-joel-test-12-steps-to-better-code/
[7] https://solidgeargroup.com/wp-content/uploads/2016/07/tower_cheatsheet_white_EN_0.pdf
[8] http://geek-and-poke.com/geekandpoke/2014/2/23/dev-cycle-friday-evening-edition
[9] https://www.eecs.tufts.edu/~dsculley/papers/ml_test_score.pdf
[10] http://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf
[11] https://www.slideshare.net/Tech_InMobi/building-machine-learning-pipelines
[12] https://www.researchgate.net/publication/262569275_Testing_Scientific_Software_A_Systematic_Literature_Review
[13] https://classeval.wordpress.com/introduction/basic-evaluation-measures/
[14] https://xkcd.com/1024/
[15] https://xkcd.com/1319/
[16] https://www.devrant.io/search?term=debugging
[17] https://marketoonist.com/2015/03/hackathons.html
[18] https://s-media-cache-ak0.pinimg.com/originals/9c/25/08/9c25082f5c4d3477124356e45673d426.png
[19] https://www.pinterest.com/pin/177258935306351629/
[20] http://kellysutton.com/2017/09/01/comment-drift.html
[21] http://www.safetyresearch.net/blog/articles/toyota-unintended-acceleration-and-big-bowl-%E2%80%9Cspaghetti%E2%80%9D-code

Appendix: Establish Business Use Case
→ Kick-off meeting with stakeholders:
•Discuss use case, motivation and scope
•Brainstorm and discuss potential solutions
•Format of the deliverable? Production environment (if appropriate)
•Iron out deadlines, checkpoints, and the ongoing support structure
•Scope down, then scale up
•Close meeting with recap of action items

Key Takeaways: communication + clear expectations

[19]