This presentation dives into the practical applications of machine learning within Google's operations, providing a comprehensive overview of how to leverage AI technologies to solve real-world business challenges.
Key Points Covered:
- Introduction to Machine Learning at Google: Discussion on the role of ML and its evolution in enhancing Google's operational efficiency.
- Experience Sharing: Insights into the team's long-term engagement with machine learning projects and the impacts on Google’s operational strategies.
- Practical Applications: Real-world examples of ML applications within Google’s daily operations, providing a blueprint to adapt similar strategies.
- Challenges and Solutions: Discussion on the challenges faced during the implementation of ML projects and the strategic solutions employed to overcome them.
- Future of ML at Google: Insights into future trends in machine learning at Google and how they plan to continue integrating AI into their ecosystem.
Added: May 21, 2024
Slides: 24 pages
Slide Content
Keyvan Azami
Enterprise AI, Google
ML for the Enterprise
UXDX Conference 2024
Applied AI research as enterprise competitive advantage
Our Mission
Google's IT landscape:
● Billions of users globally using products and services
● Millions of support inquiries raised each year
● Thousands of help articles
Self-Service Automation goal:
Identify critical self-help offerings and make them intuitive, fast, accessible, and effective
Our goal with this project:
Surface relevant support articles directly to internal users when they open a ticket, so that they can solve their problems without hands-on support
Case Study Problem Statement
Impact
Thousands of hours saved in Googler wait time
Many issues solved on first touch
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Surface relevant support articles directly to users when they open a ticket, so that they can solve their problems without hands-on support
What's the business impact?
Tens of thousands of hours of efficiency gains
Is ML a good solution for the problem?
Yes: lots of data, automated decisions, natural-language data, and a retrieval (Q&A) problem
Can we measure success for solving the problem?
Yes, because we can tie model metrics to estimated business impact
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Project                            Impact   Complexity
Article Suggestion                 High     Medium
Tech Support Trends                High     High
Anomaly Detection of Machine Logs  Medium   High
What data do we have?
● Millions of historical resolved support tickets (title, description, location)
● 10% of those tickets are resolved with a link to a help article
● Thousands of articles (title and description) in our knowledge base
What data would we want?
Validated responses from users on whether or not the linked article solved the ticket
Data Acquisition
Gather or create data, verify usability and access
Ticket: "Can't remote into my machine" -- I can't remote connect into my desktop, which is turned on, and I've checked the connection and it doesn't seem to be the problem. (3 hour wait)
Article: "Common SSH connection issues" -- My network connection is poor and ruining my SSH session. Changing the following settings may help. Run: ...
Features
Ticket
● All tickets have titles and most have descriptions
● 10% of those tickets are resolved with a link
● Some of the tickets have templates
● Some of the tickets are machine generated
Article
● Articles have meaningful titles and descriptions
● There is a long tail of articles that solve tickets
Response
90% of tickets in our data don't have a self-service solution
Data Exploration
Understand the data and domain, create features and response
Does my model not perform well enough, but I still have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Start with a simple model:
TF-IDF for both ticket and article
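A minimal sketch of this starting point, assuming a scikit-learn setup (the actual pipeline is not described in the deck): tickets and articles are embedded as TF-IDF vectors in a shared vocabulary, and articles are ranked by cosine similarity. All data here is illustrative.

```python
# Toy sketch: TF-IDF vectors for tickets and articles, ranked by cosine
# similarity. Data is illustrative, not the real knowledge base.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "Common SSH connection issues: settings that help with poor networks",
    "Installing certificates on your machine",
    "Choosing the correct monitor cable",
]
tickets = ["SSH connection keeps dropping, can't remote into my machine"]

# Fit the vocabulary on both corpora so they share one vector space.
vectorizer = TfidfVectorizer(stop_words="english")
vectorizer.fit(articles + tickets)

article_vecs = vectorizer.transform(articles)
ticket_vecs = vectorizer.transform(tickets)

# For each ticket, rank articles by cosine similarity and take the best.
scores = cosine_similarity(ticket_vecs, article_vecs)
best = scores.argmax(axis=1)
print(articles[best[0]])  # top-ranked article for the ticket
```

In practice a similarity threshold would also be needed, since most tickets have no matching article at all.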
Does my model not perform well enough because of my features?
Have I framed the problem incorrectly?
Precision: 40%, Recall: 15%
Then move to embeddings, and then to contextual embeddings
Wait...
I thought this was supposed to move linearly?
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Really focus on the impact we are trying to achieve
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Focus on returning the right article only when there's an article to return (precision)
Reframe as classification and reduce the number of classes that we predict
Can we simplify the problem?
Do we really need to suggest from that entire long tail of articles?
Does my model not perform well enough, but I still have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Then move to embeddings, and then to contextual embeddings
Does my model not perform well enough because of my features?
Precision: 50%, Recall: 25%
[Diagram: ticket embedding fed into a classifier over Article 1 ... Article k, plus a "No Article" class]
Start with a simple model:
TF-IDF for the tickets -- both summary and description (articles are now the classes, so no article encoding)
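The reframed setup can be sketched as a (k+1)-way classifier with an explicit "no article" class, only suggesting when the predicted probability clears a threshold (favoring precision). This is a hedged toy version: the class names, tickets, and threshold are invented for illustration, not the team's actual model.

```python
# Toy sketch of the classification reframing: top-k articles plus a
# "no-article" class, with a confidence threshold before suggesting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tickets = [
    "ssh session drops when remoting into my desktop",
    "ssh connection fails on poor network",
    "certificate expired cannot connect",
    "machine certificate needs reinstalling",
    "printer out of toner",
    "coffee machine on floor 3 is broken",
]
labels = [
    "ssh-issues", "ssh-issues",
    "install-certs", "install-certs",
    "no-article", "no-article",
]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(tickets, labels)

def suggest(ticket, threshold=0.5):
    """Return an article class only when the model is confident enough."""
    probs = clf.predict_proba([ticket])[0]
    best = probs.argmax()
    label = clf.classes_[best]
    if label == "no-article" or probs[best] < threshold:
        return None  # stay silent rather than suggest a wrong article
    return label

print(suggest("ssh keeps disconnecting from my desktop"))
```

Raising the threshold trades recall for precision, which matches the stated goal of returning an article only when there is one to return.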
We're still short of our goals. Time to debug with a quick demo to our partner team.
Ticket: "Out of Office Notifier showing with exception"
Model says... No article
Data says... "Download a new version"
OK, that seems like a real mistake...
I can see why the model would link this ticket to the article, but OK
Ticket: "Help setting up monitors"
Model says... No article
Data says... "Choosing the correct monitor cable"
Ticket: "My machine certificate expired and I cannot connect to Google"
Model says... No article
Data says... "Installing certificates" [a very detailed article]
This seems like an error in our data.
We definitely have an under-labeling problem.
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Data Acquisition
Gather or create data, verify usability and access
2,000 total tickets
Positives: 800 (mix of solved and reasonable suggestions with coverage across articles)
Negatives: 1,200
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model                           Precision   Recall
Dual encoder, all articles      80%         21%
TF-IDF classification, top 100  80%         34%
Embeddings + CNN, top 100       80%         29%
LM no fine-tuning, top 100      80%         39%
LM with fine-tuning, top 100    80%         43%
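Comparisons like the table above hold precision fixed and read off recall at that operating point. A small sketch of how that can be computed from a precision/recall curve -- the scores and labels below are synthetic, and the helper name is ours, not from the deck:

```python
# Sketch: recall at a fixed precision target, read off the P/R curve.
# Scores and labels are synthetic stand-ins for model outputs.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(7)
y_true = rng.integers(0, 2, size=1000)  # 1 = ticket solvable by an article
scores = np.clip(y_true * 0.4 + rng.normal(0.3, 0.2, size=1000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, scores)

def recall_at_precision(precision, recall, target=0.80):
    """Highest recall achievable at precision >= target."""
    ok = precision >= target
    return float(recall[ok].max()) if ok.any() else 0.0

print(f"recall at 80% precision: {recall_at_precision(precision, recall):.2f}")
```

Fixing precision makes the models directly comparable on the metric that matters here: only suggest an article when it is likely right.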
Considerations for model productionization
Human-in-the-loop evaluation for initial A/B testing
Do production results line up with prototyping validation?
Yes! Extensive validation and a golden set definitely helped make this smooth
Model Productionization
Deploy automated pipeline and A/B test model
Monitor data drift
● Are the features changing? (Do tickets look materially different over time?)
● Are the class labels shifting? (Are more articles being added or changed?)
Monitor model drift
● How is retraining the model affecting the P/R curve?
● Is performance degrading significantly?
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
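One generic way to check whether tickets "look materially different over time" is a distribution-distance metric such as the Population Stability Index between a reference window and a live window. This is a standard technique sketched on a made-up feature (ticket description length), not the team's actual monitoring stack:

```python
# Sketch of a data-drift check: Population Stability Index (PSI) between
# a reference window and a live window of one feature. The feature and
# thresholds are illustrative, not the team's real monitoring setup.
import numpy as np

def psi(reference, live, bins=10):
    """PSI between two samples; > 0.2 is a common 'investigate' threshold."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_counts, _ = np.histogram(reference, bins=edges)
    live_counts, _ = np.histogram(live, bins=edges)
    # Convert counts to proportions, clipping to avoid log(0).
    ref_pct = np.clip(ref_counts / ref_counts.sum(), 1e-6, None)
    live_pct = np.clip(live_counts / live_counts.sum(), 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(200, 50, size=5000)  # e.g. ticket description lengths
same = rng.normal(200, 50, size=5000)       # no drift expected
shifted = rng.normal(300, 50, size=5000)    # clear drift

print(f"no drift: {psi(reference, same):.3f}")
print(f"shifted:  {psi(reference, shifted):.3f}")
```

The same check works on model outputs (e.g. the distribution of predicted classes or confidence scores) to catch model drift between retrains.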
Takeaways and other considerations
Takeaways
● Make sure to start with impactful problems that are good fits for machine learning
● Have a structured approach to your machine learning projects. However, always be asking yourself questions and be ready to iterate!
● Leverage the right tools for the job in each stage -- notebooks -> experiment managers -> full production framework