MLflow: A Platform for Production Machine Learning
matei
15 slides
Dec 14, 2019
About This Presentation
Presentation about MLflow and the ML Platforms / MLOps class of software systems at the NeurIPS 2019 ML Systems workshop.
Slide Content
MLflow: A Platform for Production Machine Learning
Matei Zaharia
Databricks and Stanford University
@matei_zaharia
ML in Production is Different from ML Research
ML Research & Courses:
• Focus: designing a good model
• Data is provided and ready to use (e.g. benchmark dataset)
• No need to deploy, monitor, retrain
• Tools for model design & evaluation (e.g. TensorFlow, PyTorch, …)
ML Products:
• Focus: reliably solving a business problem
• Data is often the top challenge (for models, try many common ones)
• Must continuously deploy, monitor & retrain models to maintain quality
• Need new tools to enable this process! (reproducibility, monitoring, …)
Response: ML Platforms
Facebook FBLearner, Uber Michelangelo, Google TFX, …
+ Standardize the data prep / training / deploy cycle: if you work within the platform, you get these!
– Limited to a few algorithms or frameworks
– Tied to each company’s infrastructure
Can we provide similar benefits in an open manner?
Open source machine learning platform
• Works with any ML library, algorithm, language, etc.
• Open interface design (use with any code you already have)
Tracking: Record and query experiments (code, data, configs, results)
Projects: Packaging format for reproducible runs and workflows
Models: General format that standardizes deployment paths
Model Registry (new): Centralized model management, review & sharing
Community
158 contributors from >50 companies
• Integrated in RStudio, Azure ML, Faculty.ai, Neptune, Splice
900k downloads/month on PyPI
MLflow Model Registry
GitHub-like environment for organizing & reviewing models
[Figure: the Model Registry connecting the model developer with reviewers and CI/CD tools, downstream users, and REST serving]
Released in MLflow 1.4
Interesting MLflow Use Cases
1) Massive number of independent models
• Company wants to train a separate model for each {facility, chemical processing machine, household, …}
• Solution: large Spark job that runs an AutoML library for each task + MLflow for managing & selecting models
• ML scientists can’t look at each model ⇒ need hands-free ML!
Example: Millions of models trained on terabytes of data/day
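The hands-free selection step in this use case can be sketched in plain Python. This is only a toy illustration under stated assumptions, not the Spark / AutoML / MLflow pipeline from the talk: the two candidate models, the `select_best_model` helper, and the per-facility data are all hypothetical stand-ins.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate: one-variable least-squares line y = a*x + b."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    b = my - a * mx
    return lambda x: a * x + b

# Hypothetical candidate set; a real AutoML library would search far more.
CANDIDATES = {"mean": fit_mean, "line": fit_line}

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

def select_best_model(train, valid):
    """Fit every candidate; return (name, model, validation MSE) of the best."""
    results = []
    for name, fit in CANDIDATES.items():
        model = fit(*train)
        results.append((mse(model, *valid), name, model))
    best_err, best_name, best_model = min(results, key=lambda r: r[0])
    return best_name, best_model, best_err

# Made-up data per key: ((train_xs, train_ys), (valid_xs, valid_ys)).
datasets = {
    "facility_a": (([1, 2, 3, 4], [2, 4, 6, 8]), ([5, 6], [10, 12])),
    "facility_b": (([1, 2, 3, 4], [5, 5, 5, 5]), ([5, 6], [5, 5])),
}

# One independent model per key, chosen with no human in the loop.
selected = {
    key: select_best_model(train, valid)
    for key, (train, valid) in datasets.items()
}
for key, (name, _, err) in selected.items():
    print(f"{key}: chose {name!r} (validation MSE {err:.3f})")
```

At the scale described in the talk, the loop over keys would be distributed, and each candidate's metrics would be recorded in a tracking system so that selection can be audited later.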
Interesting MLflow Use Cases
2) Big data analytics on model training results
• ML developer wants to analyze the result of multiple runs interactively, possibly slicing across data points
• Solution: Pandas & SQL interfaces to MLflow tracking data
df = mlflow.search_runs([experiment_id], "metrics.loss < 2.5")
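To illustrate the kind of slicing this enables, here is a standard-library sketch that applies the same `loss < 2.5` filter in SQL to a few made-up run records. The table layout, column names, and values are hypothetical and are not MLflow's storage schema; in practice the DataFrame returned by `mlflow.search_runs` would be the starting point.

```python
import sqlite3

# Hypothetical run records: (run_id, learning_rate, loss). Not real results.
runs = [
    ("run_1", 0.1, 3.2),
    ("run_2", 0.01, 2.1),
    ("run_3", 0.001, 1.7),
    ("run_4", 0.01, 2.9),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id TEXT, learning_rate REAL, loss REAL)")
conn.executemany("INSERT INTO runs VALUES (?, ?, ?)", runs)

# Same filter as the search_runs call: keep runs with loss below 2.5.
good_runs = conn.execute(
    "SELECT run_id, loss FROM runs WHERE loss < 2.5 ORDER BY loss"
).fetchall()

# Slice further: best (lowest) loss per learning rate across all runs.
best_per_lr = dict(
    conn.execute(
        "SELECT learning_rate, MIN(loss) FROM runs GROUP BY learning_rate"
    ).fetchall()
)

print(good_runs)
print(best_per_lr)
```

The same two questions (which runs passed a threshold, and what is the best result per hyperparameter value) map directly onto DataFrame filtering and `groupby` when working with Pandas instead of SQL.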
Conclusion
Turning ML into reliable products is hard and requires a new class of systems (ML Platforms)
Try MLflow at mlflow.org
Join the MLOps workshop at MLSys 2020