[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf

DataScienceConferenc1 28 views 22 slides Apr 27, 2024

About This Presentation

Building a large-scale data product doesn't rely only on sophisticated modeling. It also requires an agile methodology, an iterative research & development process, a versatile big data stack, and a value-oriented mindset. I'll discuss how we at Dsquares build large-scale AI products that leverage...


Slide Content

DATA PRODUCT MANAGEMENT
Practical use cases in building valuable data products at scale

Abdelrahman Ghallab
Lead Data Scientist at Dsquares

Contents
1. What is Data Product Management?
2. Managing a Data Science Research Process
3. Creating a Healthy Innovation Process
4. Data Product Considerations
5. Data in cloud vs in-house

What is Data Product Management?
The process of building intelligent products that leverage Data & AI

Product Management is about:
Defining the Right Problem
Finding the Right Solution

Finding the Right Solution

Finding the Right Solution
Explainability?
Revealing the Mystery

Cornerstones of Explainability
Transparency: simulatability, decomposability, algorithmic transparency
Evaluation Criteria: fidelity, generality, accuracy, scalability, etc.
Explanations: visual, simplification, feature relevance, etc.
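
Feature relevance, one of the explanation styles listed above, can be illustrated with permutation importance: shuffle one feature's column and measure how much accuracy drops. This is a minimal sketch, not the method used in the talk; the toy model `predict`, the synthetic data, and all function names are assumptions made for illustration.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows the model labels correctly."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        # Rebuild each row with feature j replaced by a shuffled value.
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return drops

# Hypothetical toy model: only feature 0 influences the prediction.
def predict(row):
    return 1 if row[0] > 0.5 else 0

_rng = random.Random(42)
rows = [[_rng.random(), _rng.random()] for _ in range(200)]
labels = [predict(row) for row in rows]
importances = permutation_importance(predict, rows, labels, n_features=2)
```

In this toy setup, shuffling the unused feature leaves accuracy unchanged, while shuffling feature 0 lowers it, which is the signal a feature-relevance explanation reports.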

Managing a Data Science Research Process

Challenges with Managing a Data Science Research Process
1 - Research-Based: Most of the time, data projects attempt to solve new, industry-specific problems
2 - Open-Ended Experimentation: Most projects require multiple experiments with no promise of success
3 - Iterative Process: Partial & full re-iteration of experiments will always be there
4 - Scoping & Estimation: Scopes & estimates for data science tasks are usually tough to keep

Let's bundle it all
together in one
process!

When Building a Data Product
Follow Hypothesis-Driven Development: Validate your assumptions about persona, JTBD, & solution
Follow the process, & iterate: Test a lot & quickly engage in user testing from the early phases
Set up the necessary metrics: success criteria, definition of done, early stopping, etc.
Get Rigorously Innovative
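
The metrics named above (success criteria, definition of done, early stopping) can be made explicit per experiment. The sketch below is one possible way to encode them, assuming a single higher-is-better metric; the `Experiment` class, its fields, and the example numbers are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Experiment:
    hypothesis: str
    success_threshold: float   # success criterion: metric value that counts as done
    patience: int = 2          # early stopping: iterations allowed without improvement
    history: list = field(default_factory=list)

    def record(self, metric: float) -> str:
        """Log one iteration's metric and return the next action."""
        self.history.append(metric)
        if metric >= self.success_threshold:
            return "done"
        best = max(self.history)
        # Iterations elapsed since the best result was first reached.
        since_best = len(self.history) - 1 - self.history.index(best)
        if since_best >= self.patience:
            return "stop"
        return "iterate"

exp = Experiment("Hypothetical: new features lift recall", success_threshold=0.80)
actions = [exp.record(m) for m in [0.60, 0.65, 0.64, 0.63]]
# actions == ["iterate", "iterate", "iterate", "stop"]
```

Agreeing on the threshold and the patience value before the first run is what keeps an open-ended experiment from running indefinitely.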

Considerations for Data Projects
AI Rolling Out: How can we safely roll out our model?
AI Life Cycle: How often do we train the model? How can we avoid any model surprises?
Data Quality & Pipeline: How to consistently validate the data & assumptions? How can we avoid data surprises?
Data Governance & Privacy: How to ensure against any leak of Personally Identifiable Information?

1. Rolling out an AI model
When rolling out a new AI model that does one of the following:
  Replaces a manual process
  Replaces a legacy tech process
  Replaces an older model
choose a deployment strategy:
  Shadow Deployment
  Canary Deployment
  Blue/Green Deployment
  A/B Deployment
  Big Bang Deployment
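
Two of the strategies listed above can be sketched in a few lines. In a shadow deployment the current model keeps serving while the candidate's predictions are only logged; in a canary deployment a small share of traffic is routed to the candidate. The models, the 10% canary share, and the function names below are illustrative assumptions, not the setup described in the talk.

```python
import random

def shadow_predict(request, current_model, candidate_model, log):
    """Shadow deployment: serve the current model, log the candidate silently."""
    served = current_model(request)
    log.append({"request": request, "served": served,
                "shadow": candidate_model(request)})
    return served

def canary_predict(request, current_model, candidate_model, canary_share, rng):
    """Canary deployment: route a small share of traffic to the candidate."""
    if rng.random() < canary_share:
        return "candidate", candidate_model(request)
    return "current", current_model(request)

# Hypothetical stand-ins for the old and new models.
def current_model(x):
    return x >= 10

def candidate_model(x):
    return x >= 8

log = []
served = [shadow_predict(x, current_model, candidate_model, log) for x in range(5, 15)]
# Requests where the candidate would have answered differently.
disagreements = [entry for entry in log if entry["served"] != entry["shadow"]]

rng = random.Random(0)
routes = [canary_predict(x, current_model, candidate_model, 0.1, rng)[0]
          for x in range(1000)]
```

Reviewing `disagreements` before any traffic is switched is the main benefit of the shadow approach; the canary share can then be raised gradually if the candidate behaves well.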

2. Monitoring the AI model
Continuous assessment of model performance is required to detect:
  Data Drift
  Concept Drift
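
Data drift (a change in the input distribution) can be checked without labels by comparing a recent sample against a reference sample. The sketch below uses the Population Stability Index, one common choice among several; it is not necessarily what the speaker's team uses. The bin count, the `1e-6` floor, and the sample data are assumptions. Concept drift, a change in the relationship between inputs and target, usually needs labeled outcomes and is not covered here.

```python
import math

def psi(reference, current, n_bins=10):
    """Population Stability Index between two numeric samples."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant reference

    def shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

reference = list(range(100))            # hypothetical training-time feature values
shifted = [x + 30 for x in reference]   # hypothetical production values after a shift

no_drift = psi(reference, reference)
drift = psi(reference, shifted)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.2 as a significant shift, but the alert threshold should be tuned per feature.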

3. Monitoring the Data Pipeline
Continuous assessment of the data pipeline is necessary
Set up a data observability framework
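
At its simplest, a data observability check validates each incoming batch against expectations about types, ranges, and missing values. The following is a minimal hand-rolled sketch under assumed column names and limits; a production setup would more likely rely on a dedicated framework, which the slides do not name.

```python
def validate_batch(rows, schema, max_null_rate=0.05):
    """Return a list of human-readable issues found in a batch of dict rows."""
    if not rows:
        return ["batch is empty"]
    issues = []
    for column, (expected_type, lo, hi) in schema.items():
        values = [row.get(column) for row in rows]
        null_rate = sum(v is None for v in values) / len(rows)
        if null_rate > max_null_rate:
            issues.append(f"{column}: null rate {null_rate:.0%} exceeds {max_null_rate:.0%}")
        for v in values:
            if v is None:
                continue
            if not isinstance(v, expected_type):
                issues.append(f"{column}: unexpected type {type(v).__name__}")
                break
            if not lo <= v <= hi:
                issues.append(f"{column}: value {v} outside [{lo}, {hi}]")
                break
    return issues

# Hypothetical schema: column -> (type, min, max).
schema = {"age": (int, 0, 120), "basket_value": (float, 0.0, 1e6)}
good = [{"age": 30, "basket_value": 12.5}, {"age": 41, "basket_value": 99.0}]
bad = [{"age": 30, "basket_value": None}, {"age": 400, "basket_value": 5.0}]

good_issues = validate_batch(good, schema)
bad_issues = validate_batch(bad, schema)
```

Running such checks at every pipeline stage, and alerting on a non-empty issue list, turns "data surprises" into explicit failures.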

4. Data Governance & Privacy
Follow the country's regulations for data privacy (e.g., GDPR)
Protect privacy by design:
  Anonymize Personally Identifiable Information
  Use advanced techniques such as differential privacy
Follow guidelines for:
  Restricting & documenting data access
  Documenting data lineage, pipeline & transformations
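
Two of the privacy techniques above can be sketched briefly. The first replaces an identifier with a keyed hash; note that this is pseudonymization rather than full anonymization, since whoever holds the key can re-link records. The second adds Laplace noise to a count, the textbook mechanism for differential privacy on counting queries. The key, the records, and the epsilon value are assumptions for illustration only.

```python
import hashlib
import hmac
import random

def pseudonymize(value, secret_key):
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(records, predicate, epsilon, rng):
    """Differentially private count: a count has sensitivity 1, so scale = 1/epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical key and records; a real key would come from a secrets manager.
key = b"example-secret-key"
records = [{"email": "a@example.com", "spend": 120},
           {"email": "b@example.com", "spend": 40},
           {"email": "c@example.com", "spend": 300},
           {"email": "d@example.com", "spend": 210}]

safe_records = [{"user": pseudonymize(r["email"], key), "spend": r["spend"]}
                for r in records]
noisy_high_spenders = dp_count(records, lambda r: r["spend"] > 100,
                               epsilon=1.0, rng=random.Random(0))
```

A smaller epsilon gives stronger privacy and a noisier answer, so the value is a policy decision rather than a purely technical one.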

AI On-Cloud vs On-Premises
Depending on the use case, we might need to develop our product either on the cloud or on-premises

DSQ AI On-Cloud Solution
Cloud tools can help you build & scale quickly
Utilize out-of-the-box models and tools

DSQ AI On-Premises Solution
AI on-prem can be challenging
Infrastructure limitations have the final say

THANK YOU FOR
LISTENING!
Q&A

RESOURCES
Alex Cowan (Sep 2022). Hypothesis-Driven Development: A Guide to Smarter Product Management. Available at: https://www.alexandercowan.com/hdd-book-ref/
Evidently AI (April 2024). Machine Learning in Production Guide: Concept Drift Chapter. Available at: https://www.evidentlyai.com/ml-in-production/concept-drift
Yashaswi Nayak (July 2022). ML Model Deployment Strategies: An illustrated guide to deployment strategies for ML Engineers. Available at: https://towardsdatascience.com/ml-model-deployment-strategies-72044b3c1410
Regulation (EU) 2016/679 (General Data Protection Regulation). Available at: https://gdpr-info.eu/
Vaishak Belle and Ioannis Papantonis (May 26, 2021). Principles and Practice of Explainable Machine Learning. Frontiers. Available at: https://www.frontiersin.org/articles/10.3389/fdata.2021.688969/full
Marktab. What Is the Team Data Science Process? Azure Architecture Center, Microsoft Learn. Available at: https://learn.microsoft.com/en-us/azure/architecture/data-science-process/overview