PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)
RebeccaBilbro
30 slides
Jun 19, 2024
About This Presentation
To honor ten years of PyData London, join Dr. Rebecca Bilbro as she takes us back in time to reflect on a little over ten years working as a data scientist. One of the many renegade PhDs who joined the fledgling field of data science in the 2010s, Rebecca will share lessons learned the hard way, often from watching data science projects go sideways and learning to fix broken things. Through the lens of these canon events, she'll identify some of the anti-patterns and red flags she's learned to steer around.
Slide Content
Mistakes Were Made
Lessons Learned ~15 Years In
PyData London X | Dr. Rebecca Bilbro
What is a mistake?
Classification Errors
Overtraining
Dubious Data Sourcing
Is dunking on genAI in bad faith?
(The first ~10 years of data science)
Low ROI is an existential threat
“The ROI in AI (and How to Find It)”
“Those who cannot remember the past are condemned to repeat it.”
George Santayana
So let’s get vulnerable
Types of mistakes in data science
[Chart: mistakes placed along two axes, embarrassing and costly]
-not following PEP8
-using Jupyter notebooks in production
-using up non-renewable energy
-vendor lock
-everything else
Fig 1. How ancient data scientists conceived ML architectures.
What it actually takes to build an ML product:
●Model microservices
●Rules-based agents
●Multi-modal models
●Expert systems
●Agentic AI
●“Crews”
●Mixture of experts (MoE)
●TBD
More vulnerable
Synthetic Oversampling
Active Learning
Visual Hallucinations & The Crowding Problem
Even more vulnerable
310 miles / 499 km, 4 hr 50 min (one way)
Inspection targeting
Setting: OSHA, the Obama years
Business goal: Make the most of a small budget for health and safety inspections and hopefully save more lives.
Data science angle:
●Use ML to identify features of companies where workers get sick/hurt/killed.
●Find companies that match the characteristics and haven’t already been inspected in the last 3 years.
●Fix government with data science.
In the US, the Occupational Safety and Health Administration can only do inspections under the following circumstances:
-Fatalities and/or serious injuries
-Worker complaints
-Random selection
Machine learning is not a clear fit.
Setting: Department of Commerce, still the Obama years
Business goal: Unlock greater economic opportunity for American exporters
Data science angle:
-Use ML to identify features of companies that export their goods and services.
-Find US-based companies that match the characteristics but aren’t exporting yet.
-Fix government with data science.
Imports and Exports
-Used LASSO to identify most informative features.
-Added a random forest for binary classification.
-This was a dumb idea
Imports and Exports
Attempt #2
-Found a huge number of companies within the same industry (NAICS 488510) that were not yet exporting.
-Presented results to domain experts, who all looked at me like I was born yesterday. The industry was “Freight Transportation Arrangement”.
-The feedback was gentle but clear – you can’t export freight logistics, dumdum. Had to go back to the drawing board again.
Setting: Analytics for sales enablement
Business goal: Link profiles across social media platforms and enhance with demographics
Data science angle:
-Use graph-based entity resolution and network analysis to infer linkages.
-Build classifiers to label incoming text data with metadata for sales enablement.
REDACTED
-The initial models were “thrown over the wall” and it took months to build the inference API.
-Discovered some models had been trained on ethically questionable data.
-The data science team had to build a production-grade ingestion system.
Setting: Customer service organization
Business goal: Use AI to interpret customer service requests to route them more efficiently
Data science angle:
-Train a neural network on historic CSR data to do multiclass classification.
-Use human CSR routing decisions as the “labels”
-Serialize the model and use an inferencing API to route incoming queries.
The company said they wanted “AI”. We took what they said at face value.
We picked a preliminary model that was way too complex
-TensorFlow architecture with many layers and hyperparameters.
-Training took a long time.
-Tuning was difficult.
-We didn’t do a good job of capturing and versioning our experiments.
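One lightweight way to capture and version experiments is to append every run to a log file, keyed by a hash of its hyperparameters. The sketch below is only an illustration of that idea, not what the team used: the function names `log_run` and `best_run`, the JSON-lines layout, and all parameter and metric values are assumptions made up for this example.

```python
import hashlib
import json
import tempfile
import time
from pathlib import Path

def log_run(log_path, params, metrics):
    """Append one experiment record to a JSON-lines file and return its id."""
    # Hash the key-sorted params so identical configurations share an id.
    blob = json.dumps(params, sort_keys=True).encode("utf-8")
    run_id = hashlib.sha256(blob).hexdigest()[:12]
    record = {"run_id": run_id, "logged_at": time.time(),
              "params": params, "metrics": metrics}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return run_id

def best_run(log_path, metric):
    """Return the logged record with the highest value for `metric`."""
    with open(log_path, encoding="utf-8") as fh:
        records = [json.loads(line) for line in fh if line.strip()]
    return max(records, key=lambda r: r["metrics"][metric])

# Hypothetical hyperparameters and scores, invented for illustration only.
log_file = Path(tempfile.mkdtemp()) / "runs.jsonl"
first = log_run(log_file, {"layers": 8, "lr": 0.001}, {"f1": 0.61})
second = log_run(log_file, {"layers": 2, "lr": 0.01}, {"f1": 0.66})
winner = best_run(log_file, "f1")
```

Because the id depends only on the parameters, rerunning the same configuration is easy to spot in the log. A dedicated experiment tracker would add more, but even a flat file like this makes runs comparable after the fact.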
Attempt #2
-A simpler model performed better but still underwhelmed leadership
-0.68 overall F1 score was not exciting, even though some per-class scores were good
-Presented visualizations that made sense to us but not to them
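A single overall F1 can hide strong per-class results when one weak class drags the average down. The sketch below illustrates this with a macro-averaged F1; the deck does not say which averaging produced the overall score, so macro-averaging is an assumption, and the routing labels and predictions are invented for the example.

```python
from collections import Counter

def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)

# Hypothetical routing labels, invented for illustration only.
y_true = ["billing"] * 10 + ["outage"] * 10 + ["other"] * 10
y_pred = (["billing"] * 9 + ["other"]
          + ["outage"] * 9 + ["other"]
          + ["other"] * 4 + ["billing"] * 3 + ["outage"] * 3)

scores = per_class_f1(y_true, y_pred)
overall = macro_f1(scores)
for label, value in scores.items():
    print(f"{label}: {value:.2f}")
print(f"macro F1: {overall:.2f}")
```

Here two classes score well while a catch-all class pulls the average down, which is the kind of detail a per-class table communicates and a single headline number does not.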
Attempt #3:
-Finally got a model that was “good enough”.
-Team went straight from research to deployment.
-Deployed Jupyter notebooks as code
-No packaging
-No versioning
-No linting
-No tests
-No plan for model drift detection or retraining
-Incoming customer requests after March 2020 did not match the training data.
-Everyone was suddenly working from home and experiencing issues that did not match any known labels.
-“Production is noise”
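One possible shape for a drift check is to compare the mix of categories seen in production against the mix seen in training. The sketch below uses the Population Stability Index (PSI) for that comparison. It is an assumption about how such a check could look, not the plan the team lacked: the topic names, the `eps` floor, and the `ALERT_THRESHOLD` value are all hypothetical and would need tuning for a real application.

```python
import math
from collections import Counter

def distribution(items, categories, eps=1e-6):
    """Share of each category, floored at eps so the log below is defined."""
    counts = Counter(items)
    total = len(items)
    return {c: max(counts[c] / total, eps) for c in categories}

def psi(expected_items, actual_items):
    """Population Stability Index between two samples of categorical values."""
    categories = set(expected_items) | set(actual_items)
    e = distribution(expected_items, categories)
    a = distribution(actual_items, categories)
    return sum((a[c] - e[c]) * math.log(a[c] / e[c]) for c in categories)

# Hypothetical request topics, invented for illustration only.
training = ["billing"] * 50 + ["outage"] * 30 + ["returns"] * 20
same_mix = ["billing"] * 5 + ["outage"] * 3 + ["returns"] * 2
after_shift = ["billing"] * 2 + ["outage"] * 2 + ["vpn"] * 6

ALERT_THRESHOLD = 0.25  # assumed cutoff; tune for the application

stable = psi(training, same_mix)
drifted = psi(training, after_shift)
needs_review = drifted > ALERT_THRESHOLD
```

A category that never appeared in training (here the made-up "vpn" topic) dominates the index, which matches the failure described above: requests that fit none of the known labels. Wiring a check like this to an alert or a retraining job is a separate design decision.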
In engineering, we call these kinds of things “anti-patterns”
We need to talk more about the anti-patterns of data science.
SOME anti-patterns from the first 10 years
-Self-indulgent experimentation
-Lack of understanding of and/or interest in the downstream user
-Disregard for data provenance
-Estrangement from engineering culture
-Overreliance on batch-wise training
-Ignorance of application contexts
Practice using first person when talking about mistakes.
“I made a mistake when…”
Actually, your mistakes make you.
“An expert is a person who has made all the mistakes that can be made in a very narrow field.”
- Niels Bohr
Dr. Rebecca Bilbro
Co-Founder/CTO @ Rotational Labs
Applied Text Analysis with Python (O’Reilly)
Scikit-Yellowbrick