This presentation dives into the practical applications of machine learning within Google's operations, providing a comprehensive overview of how to leverage AI technologies to solve real-world business challenges.
Key Points Covered:
- Introduction to Machine Learning at Google: Discussion on the role of ML and its evolution in enhancing Google's operational efficiency.
- Experience Sharing: Insights into the team's long-term engagement with machine learning projects and the impacts on Google’s operational strategies.
- Practical Applications: Real-world examples of ML applications within Google’s daily operations, providing a blueprint to adapt similar strategies.
- Challenges and Solutions: Discussion on the challenges faced during the implementation of ML projects and the strategic solutions employed to overcome them.
- Future of ML at Google: Insights into future trends in machine learning at Google and how they plan to continue integrating AI into their ecosystem.
Added: May 21, 2024
Slides: 24 pages
Slide Content
Keyvan Azami
Enterprise AI, Google
ML for the Enterprise
UXDX Conference 2024
Applied AI research as enterprise competitive advantage
Our Mission
Google's IT landscape:
● Billions of users globally using products and services
● Millions of support inquiries raised each year
● Thousands of help articles
Self-Service Automation goal:
Identify critical self-help offerings and make them intuitive, fast, accessible, and effective
Our goal with this project:
Surface relevant support articles directly to internal users when they open a ticket, so that they can solve their problems without hands-on support
Case Study Problem Statement
Impact
Thousands of hours saved in Googler wait time
Many issues solved on first touch
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Surface relevant support articles directly to users when they open a ticket, so that they can solve their problems without hands-on support
What's the business impact?
Tens of thousands of hours of efficiency gains
Is ML a good solution for the problem?
Yes: lots of data, automated decisions, natural-language data, and a retrieval (Q&A) problem
Can we measure success for solving the problem?
Yes, because we can tie model metrics to estimated business impact
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Project                            Impact   Complexity
Article Suggestion                 High     Medium
Tech Support Trends                High     High
Anomaly Detection of Machine Logs  Medium   High
What data do we have?
● Millions of historical resolved support tickets (title, description, location)
● 10% of those tickets are resolved with a link to a help article
● Thousands of articles (title and description) in our knowledge base
What data would we want?
Validated responses from users on whether or not the linked article solved the ticket
Data Acquisition
Gather or create data, verify usability and access
Ticket: "Can't remote into my machine" -- I can't remote connect into my desktop, which is turned on, and I've checked the connection and it doesn't seem to be the problem. (3 hour wait)
Article: "Common SSH connection issues" -- My network connection is poor and ruining my SSH session. Changing the following settings may help. Run: ...
Features
Ticket
● All tickets have titles and most have descriptions
● 10% of those tickets are resolved with a link
● Some of the tickets have templates
● Some of the tickets are machine generated
Article
● Articles have meaningful titles and descriptions
● There is a long tail of articles that solve tickets
Response
90% of tickets in our data don't have a self-service solution
Data Exploration
Understand the data and domain, create features and response
Does my model not perform well enough, but I still have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Start with a simple model:
TF-IDF for both ticket and article
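A minimal sketch of this starting point, assuming a scikit-learn setup (the actual pipeline is not described in the deck): tickets and articles are embedded as TF-IDF vectors in a shared vocabulary, and articles are ranked by cosine similarity. All data here is illustrative.

```python
# Toy sketch: TF-IDF vectors for tickets and articles, ranked by cosine
# similarity. Data is illustrative, not the real knowledge base.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "Common SSH connection issues: settings that help with poor networks",
    "Installing certificates on your machine",
    "Choosing the correct monitor cable",
]
tickets = ["SSH connection keeps dropping, can't remote into my machine"]

# Fit the vocabulary on both corpora so they share one vector space.
vectorizer = TfidfVectorizer(stop_words="english")
vectorizer.fit(articles + tickets)

article_vecs = vectorizer.transform(articles)
ticket_vecs = vectorizer.transform(tickets)

# For each ticket, rank articles by cosine similarity and take the best.
scores = cosine_similarity(ticket_vecs, article_vecs)
best = scores.argmax(axis=1)
print(articles[best[0]])  # top-ranked article for the ticket
```

In practice a similarity threshold would also be needed, since most tickets have no matching article at all.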
Does my model not perform well enough because of my features?
Have I framed the problem incorrectly?
Precision: 40%, Recall: 15%
Then move to embeddings, and then to contextual embeddings
Wait...
I thought this was supposed to move linearly?
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Really focus on the impact we are trying to achieve
Ideation & Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Focus on returning the right article only when there's an article to return (precision)
Reframe as classification and reduce the number of classes that we predict
Can we simplify the problem?
Do we really need to suggest from that entire long tail of articles?
Does my model not perform well enough, but I still have significant room for experimentation?
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Then move to embeddings, and then to contextual embeddings
Does my model not perform well enough because of my features?
Precision: 50%, Recall: 25%
[Diagram: ticket embedding fed into a classifier over Article 1 ... Article k, plus a "No Article" class]
Start with a simple model:
TF-IDF for the tickets -- both summary and description (articles are now the classes, so no article encoding)
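The reframed setup can be sketched as a (k+1)-way classifier with an explicit "no article" class, only suggesting when the predicted probability clears a threshold (favoring precision). This is a hedged toy version: the class names, tickets, and threshold are invented for illustration, not the team's actual model.

```python
# Toy sketch of the classification reframing: top-k articles plus a
# "no-article" class, with a confidence threshold before suggesting.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tickets = [
    "ssh session drops when remoting into my desktop",
    "ssh connection fails on poor network",
    "certificate expired cannot connect",
    "machine certificate needs reinstalling",
    "printer out of toner",
    "coffee machine on floor 3 is broken",
]
labels = [
    "ssh-issues", "ssh-issues",
    "install-certs", "install-certs",
    "no-article", "no-article",
]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(tickets, labels)

def suggest(ticket, threshold=0.5):
    """Return an article class only when the model is confident enough."""
    probs = clf.predict_proba([ticket])[0]
    best = probs.argmax()
    label = clf.classes_[best]
    if label == "no-article" or probs[best] < threshold:
        return None  # stay silent rather than suggest a wrong article
    return label

print(suggest("ssh keeps disconnecting from my desktop"))
```

Raising the threshold trades recall for precision, which matches the stated goal of returning an article only when there is one to return.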
We're still short of our goals. Time to debug with a quick demo to our partner team.
Ticket: "Out of Office Notifier showing with exception"
Model says... No article
Data says... "Download a new version"
OK, that seems like a real mistake...
I can see why the model would link this ticket to the article, but OK
Ticket: "Help setting up monitors"
Model says... No article
Data says... "Choosing the correct monitor cable"
Ticket: "My machine certificate expired and I cannot connect to Google"
Model says... No article
Data says... "Installing certificates" [a very detailed article]
This seems like an error in our data.
We definitely have an under-labeling problem.
Ideation and Prioritization
Identify problems, estimate impact, assess feasibility, prioritize
Data Acquisition
Gather or create data, verify usability and access
Data Exploration
Understand the data and domain, create features and response
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model Productionization
Deploy automated pipeline and A/B test model
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
ML Project Lifecycle
Data Acquisition
Gather or create data, verify usability and access
2,000 total tickets
Positives: 800 (mix of solved and reasonable suggestions with coverage across articles)
Negatives: 1,200
Model Prototyping
Iterate on hypotheses, validate results against model and business metrics
Model                           Precision   Recall
Dual encoder, all articles      80%         21%
TF-IDF classification, top 100  80%         34%
Embeddings + CNN, top 100       80%         29%
LM no fine-tuning, top 100      80%         39%
LM with fine-tuning, top 100    80%         43%
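Comparisons like the table above hold precision fixed and read off recall at that operating point. A small sketch of how that can be computed from a precision/recall curve -- the scores and labels below are synthetic, and the helper name is ours, not from the deck:

```python
# Sketch: recall at a fixed precision target, read off the P/R curve.
# Scores and labels are synthetic stand-ins for model outputs.
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(7)
y_true = rng.integers(0, 2, size=1000)  # 1 = ticket solvable by an article
scores = np.clip(y_true * 0.4 + rng.normal(0.3, 0.2, size=1000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, scores)

def recall_at_precision(precision, recall, target=0.80):
    """Highest recall achievable at precision >= target."""
    ok = precision >= target
    return float(recall[ok].max()) if ok.any() else 0.0

print(f"recall at 80% precision: {recall_at_precision(precision, recall):.2f}")
```

Fixing precision makes the models directly comparable on the metric that matters here: only suggest an article when it is likely right.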
Considerations for model productionization
Human-in-the-loop evaluation for initial A/B testing
Do production results line up with prototyping validation?
Yes! Extensive validation and a golden set definitely helped make this smooth
Model Productionization
Deploy automated pipeline and A/B test model
Monitor data drift
● Are the features changing? (Do tickets look materially different over time?)
● Are the class labels shifting? (Are more articles being added or changed?)
Monitor model drift
● How is retraining the model affecting the P/R curve?
● Is performance degrading significantly?
Monitoring and Maintenance
Monitor data and model drift, report performance and impact
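One generic way to check whether tickets "look materially different over time" is a distribution-distance metric such as the Population Stability Index between a reference window and a live window. This is a standard technique sketched on a made-up feature (ticket description length), not the team's actual monitoring stack:

```python
# Sketch of a data-drift check: Population Stability Index (PSI) between
# a reference window and a live window of one feature. The feature and
# thresholds are illustrative, not the team's real monitoring setup.
import numpy as np

def psi(reference, live, bins=10):
    """PSI between two samples; > 0.2 is a common 'investigate' threshold."""
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_counts, _ = np.histogram(reference, bins=edges)
    live_counts, _ = np.histogram(live, bins=edges)
    # Convert counts to proportions, clipping to avoid log(0).
    ref_pct = np.clip(ref_counts / ref_counts.sum(), 1e-6, None)
    live_pct = np.clip(live_counts / live_counts.sum(), 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))

rng = np.random.default_rng(0)
reference = rng.normal(200, 50, size=5000)  # e.g. ticket description lengths
same = rng.normal(200, 50, size=5000)       # no drift expected
shifted = rng.normal(300, 50, size=5000)    # clear drift

print(f"no drift: {psi(reference, same):.3f}")
print(f"shifted:  {psi(reference, shifted):.3f}")
```

The same check works on model outputs (e.g. the distribution of predicted classes or confidence scores) to catch model drift between retrains.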
Takeaways and other considerations
Takeaways
● Make sure to start with impactful problems that are good fits for machine learning
● Have a structured approach to your machine learning projects. However, always be asking yourself questions and be ready to iterate!
● Leverage the right tools for the job in each stage -- notebooks -> experiment managers -> full production framework