2024-10-17 PyData-Valais-01 Raphael Luthi SHAP

RaphalLthi 198 views 23 slides Oct 16, 2024

About This Presentation

From Data to Insights With the SHAP Explainable AI Library


Slide Content

From Data to Insights With the SHAP Explainable AI Library
Raphaël Lüthi
PyData Valais Meetup
17th October 2024

Who Am I?
Machine Learning & AI Lead
Data Scientist
Data Scientist
Engineering MSE
Raphaël Lüthi
Machine Learning & AI Lead @ Groupe Mutuel
2016
2019
2020
2010
Today
LinkedIn

How Does a Data Scientist Create Value?
Value
Actionable Insights
Automation
Data Driven Strategy & Management
Tabular Data
Ingestion
Pre-Processing
Feature Engineering
Tidy Dataset
Machine Learning
MLOps
Software
Personalised Experience
CoPilot Tool boosting experts' productivity
Data Viz
Explainable AI (XAI)
Causal ML

SHAP (SHapley Additive exPlanations)

How to Extract Insights From Your Data?
Features
Target
Tidy Dataset
Insights
1. Capture complex data patterns with Machine Learning (Gradient Boosted Decision Trees)
2. Reveal the patterns with SHAP (SHapley Additive exPlanations)
3. Profit!
Tabular Data

How to Reveal the Complex Patterns Your Machine Learning Algorithm Picked Up?
SHapley Additive exPlanations
Sources: https://www.nature.com/articles/s42256-019-0138-9; https://shap.readthedocs.io/en/latest/index.html

Explainable AI (XAI)
Source: https://www.nature.com/articles/s42256-019-0138-9
Local Explanation with SHAP Values
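A local explanation splits one prediction into per-feature contributions that add up to the difference between that prediction and the average prediction. The sketch below is not from the talk or from Lundberg et al.; it computes exact Shapley values by brute force for a hypothetical three-feature house-price model (`model`, `background` and all numbers are made up for illustration). In practice the shap library computes these values far more efficiently for tree models.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy model (assumption): price from rooms, income and age.
def model(rooms, income, age):
    price = 100 + 20 * rooms + 10 * income
    if income > 4 and rooms > 3:  # interaction between income and rooms
        price += 30
    return price - 0.5 * age

# Small made-up background dataset used to "switch off" features.
background = [(2, 3, 40), (3, 5, 10), (4, 2, 30), (5, 6, 20)]

def coalition_value(x, subset):
    """Average prediction when features in `subset` take their value from x
    and all other features take their value from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(*mixed)
    return total / len(background)

def shapley_values(x):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(x)
    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                gain = coalition_value(x, s | {i}) - coalition_value(x, s)
                contribution += weight * gain
        phi.append(contribution)
    return phi

x = (5, 6, 20)                            # the instance to explain
base_value = coalition_value(x, set())    # average prediction over the background
phi = shapley_values(x)
prediction = model(*x)

print("base value:", base_value)
for name, value in zip(["rooms", "income", "age"], phi):
    print(f"{name}: {value:+.2f}")
print("base value + contributions:", base_value + sum(phi), "prediction:", prediction)
```

The last line checks the additivity property: the base value plus the three contributions reproduces the model's prediction for this instance.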
Lundberg et al. 2020 :

Lundberg et al. 2020 :
Global Explanation with SHAP Values
Source: https://www.nature.com/articles/s42256-019-0138-9
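One common way to go from local to global explanations is to average the absolute SHAP values of each feature over all rows. The sketch below assumes a hypothetical linear model with independent features, for which the SHAP value of a feature has the closed form weight × (value − feature mean); the weights and data are made up for illustration.

```python
from statistics import mean

feature_names = ["rooms", "income", "age"]
weights = [20.0, 10.0, -0.5]  # hypothetical linear model coefficients (assumption)
X = [
    [2, 3, 40],
    [3, 5, 10],
    [4, 2, 30],
    [5, 6, 20],
]

# Closed-form SHAP values for a linear model with independent features:
# phi[i][j] = weights[j] * (X[i][j] - mean of column j)
column_means = [mean(column) for column in zip(*X)]
shap_matrix = [
    [w * (value - mu) for w, value, mu in zip(weights, row, column_means)]
    for row in X
]

# Global importance: mean absolute SHAP value per feature.
global_importance = {
    name: mean(abs(row[j]) for row in shap_matrix)
    for j, name in enumerate(feature_names)
}
ranking = sorted(global_importance, key=global_importance.get, reverse=True)

for name in ranking:
    print(f"{name}: {global_importance[name]:.2f}")
```

The SHAP matrix has the same shape as the dataset, one row per observation and one column per feature, which is what makes this kind of aggregation straightforward.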

Lundberg et al. 2020 :
Global Insights with SHAP Values
Source: https://www.nature.com/articles/s42256-019-0138-9

Lundberg et al. 2020 :
SHAP Summary Plots
Source: https://www.nature.com/articles/s42256-019-0138-9

Magnitude & Direction
Prevalence
Low frequency but high impact effects
Lundberg et al. 2020 :
SHAP Summary Plots
Source: https://www.nature.com/articles/s42256-019-0138-9
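The "low frequency but high impact" annotation can be illustrated with a small sketch. It assumes a hypothetical linear model with two made-up features: `common` affects every row a little, `rare` is switched on for only 2 rows out of 100 but carries a large weight. A ranking by mean absolute SHAP value hides the rare effect, while the per-row spread shown in a summary plot would reveal it.

```python
# Made-up data (assumption): 100 rows, two features.
common = [-1.0 if i % 2 == 0 else 1.0 for i in range(100)]  # affects every row
rare = [1.0 if i < 2 else 0.0 for i in range(100)]          # active in 2 rows only
w_common, w_rare = 1.0, 10.0                                # hypothetical weights

def linear_shap(values, weight):
    """SHAP values of one feature in a linear model with independent features."""
    mu = sum(values) / len(values)
    return [weight * (v - mu) for v in values]

def mean_abs(values):
    return sum(abs(v) for v in values) / len(values)

def max_abs(values):
    return max(abs(v) for v in values)

shap_common = linear_shap(common, w_common)
shap_rare = linear_shap(rare, w_rare)

# (mean |SHAP|, max |SHAP|) per feature
summary = {
    "common": (mean_abs(shap_common), max_abs(shap_common)),
    "rare": (mean_abs(shap_rare), max_abs(shap_rare)),
}
for name, (avg, peak) in summary.items():
    print(f"{name}: mean |SHAP| = {avg:.3f}, max |SHAP| = {peak:.3f}")
```

Here `common` wins on the average, yet the largest single contribution comes from `rare`, which is the pattern a bar chart of mean importances would miss.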

Lundberg et al. 2020 :
SHAP Dependence Plot (or Scatter Plot)
Source: https://www.nature.com/articles/s42256-019-0138-9
Complex Nonlinear Interactions
Threshold
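A dependence plot shows a feature's value on the x-axis and its SHAP value on the y-axis. The sketch below generates the points of such a plot for a hypothetical two-feature model (an assumption, not the model from the talk) with a threshold on income and an income × rooms interaction. With only two features, the exact Shapley value is the average of the two possible orderings.

```python
from itertools import product

# Hypothetical model (assumption): income only matters above a threshold of 4,
# and its effect is scaled by the number of rooms (interaction).
def model(income, rooms):
    bonus = 15 * rooms if income > 4 else 0
    return 10 * rooms + bonus

# Made-up background data: every combination of a few income and room values.
background = list(product([2, 3, 5, 6], [2, 4]))

def avg(values):
    values = list(values)
    return sum(values) / len(values)

def shap_income(income, rooms):
    """Exact Shapley value of income for a two-feature model."""
    v_none = avg(model(bi, br) for bi, br in background)
    v_income = avg(model(income, br) for _, br in background)
    v_rooms = avg(model(bi, rooms) for bi, _ in background)
    v_both = model(income, rooms)
    # Average of "income added first" and "income added second".
    return 0.5 * ((v_income - v_none) + (v_both - v_rooms))

# One (x, y) point per row, plus the interacting feature that would colour the plot.
points = [(income, rooms, shap_income(income, rooms)) for income, rooms in background]
for income, rooms, value in sorted(points):
    print(f"income={income} rooms={rooms} SHAP(income)={value:+.2f}")
```

The sign of the SHAP value flips at the income threshold, and for a given income the value still varies with the number of rooms; that vertical spread is how an interaction shows up in a dependence plot.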

Lundberg et al. 2020 :
SHAP Explanation Embedding and Clustering
Matrix of Numerical Values!
Explanation Embedding
Explanation Clustering (semi-supervised)
Source: https://www.nature.com/articles/s42256-019-0138-9
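Because SHAP values form a plain numerical matrix, any standard clustering or embedding method can be applied to the explanations instead of the raw features. The sketch below runs a minimal two-cluster k-means on a made-up SHAP matrix; the numbers, the feature names and the deterministic initialisation are all assumptions chosen for illustration, not the method used in the paper.

```python
# Made-up SHAP matrix (assumption): one row per observation, one column per feature.
feature_names = ["income", "geography"]
shap_matrix = [
    [5.0, 0.2],
    [4.5, -0.1],
    [5.2, 0.0],
    [-0.3, 3.9],
    [0.1, 4.2],
    [0.0, 4.0],
]

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def centroid(rows):
    return [sum(column) / len(rows) for column in zip(*rows)]

def kmeans_two_clusters(rows, n_iterations=10):
    """Minimal k-means with k=2 and a deterministic initialisation:
    the first row, and the row farthest from it."""
    farthest = max(rows, key=lambda r: squared_distance(r, rows[0]))
    centers = [list(rows[0]), list(farthest)]
    labels = [0] * len(rows)
    for _ in range(n_iterations):
        # Assignment step: nearest centre for each explanation.
        labels = [
            min((0, 1), key=lambda k: squared_distance(row, centers[k]))
            for row in rows
        ]
        # Update step: move each centre to the mean of its members.
        for k in (0, 1):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:
                centers[k] = centroid(members)
    return labels, centers

labels, centers = kmeans_two_clusters(shap_matrix)

# Describe each cluster by the feature with the largest mean |SHAP| value.
dominant = {}
for k in (0, 1):
    members = [row for row, label in zip(shap_matrix, labels) if label == k]
    mean_abs = [sum(abs(v) for v in column) / len(members) for column in zip(*members)]
    dominant[k] = feature_names[mean_abs.index(max(mean_abs))]

print(labels, dominant)
```

Each cluster groups observations whose predictions are driven by the same feature, which is what makes explanation clusters easier to name than clusters built on raw features.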

Beyond the Black-Box ML – Going Deeper…
1. More on SHAP values:
   o Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses | Aidan Cooper
   o How Shapley Values Work | Aidan Cooper
   o Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis | Aidan Cooper
   o Using SHAP to Understand Trends in the Transient Stability Limit | Hamilton and Papadopoulos (research paper)
   o On SHAP Interpreting Machine Learning Models With SHAP | Christoph Molnar (book)
2. More on explainable AI:
   o Interpretable Machine Learning | Christoph Molnar
   o https://github.com/interpretml/interpret
10’ Discussion

BACKUP SLIDES

Lessons Learned
Valuable Business Applications
1. Diving into a new dataset
2. Explaining the drivers behind a behaviour of interest (e.g. app usage, churn, …)
3. Debugging machine learning algorithms
4. Gaining stakeholders' trust

Capture Nonlinear Patterns With Gradient Boosted Decision Trees
[Figure: a single decision tree next to gradient boosted decision trees, with example splits Rooms <= 3, Med. Income <= 4K, Age <= 20]
+ Best algorithm for most tabular datasets (benchmarks & Kaggle)
+ Captures nonlinear patterns
+ Great black-box predictors
- Difficult to explain
More info: Pedro Tabacof - Unlocking the Power of Gradient-Boosted Trees (using LightGBM) | PyData London 2022
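To make the idea of boosting concrete, here is a minimal sketch of gradient boosting with one-split decision trees (stumps) and squared error on a single feature. It is a teaching toy under stated assumptions (made-up data, hypothetical function names), not how production libraries implement it.

```python
def fit_stump(xs, residuals):
    """Find the threshold split that minimises squared error on the residuals.
    Returns (threshold, left_value, right_value)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # drop the largest value so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left) + sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def fit_gbdt(xs, ys, n_trees=100, learning_rate=0.3):
    """Each new stump is fitted to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, learning_rate = model
    return base + sum(learning_rate * (l if x <= t else r) for t, l, r in stumps)

def mse(ys, predictions):
    return sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys)

# Made-up nonlinear data (assumption): price is flat, jumps, then flattens again.
rooms = [1, 2, 3, 4, 5, 6, 7, 8]
price = [100, 100, 105, 150, 200, 210, 210, 215]

gbdt = fit_gbdt(rooms, price)
mse_baseline = mse(price, [sum(price) / len(price)] * len(price))
mse_boosted = mse(price, [predict(gbdt, x) for x in rooms])
print("baseline MSE:", mse_baseline, "boosted MSE:", mse_boosted)
```

The sum of many small step functions follows the nonlinear shape closely, but the resulting model is a long list of splits with no single readable rule, which is why an explanation method is needed on top.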
Popular open-source implementations:

Insights from XAI: SHAP Summary Plot
Source: https://shap.readthedocs.io/en/latest/index.html

Insights from XAI: SHAP Dependence Plots

Insights from XAI: SHAP Dependence Plots

Insights from XAI: SHAP Dependence Plots

Managing the Complexity & Extracting Business Relevant Insights
Brainstorming session:
Raw Insights
1. Geography is important
2. Median income interacts with house size
Resulting Story
A zone's median income changes how other variables impact house prices.
TODO: Next Iteration
1. X
2. Y
3. Z

Beyond the Black-Box ML – Going Deeper…
1. More on SHAP values:
   o Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses | Aidan Cooper
   o How Shapley Values Work | Aidan Cooper
   o Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis | Aidan Cooper
   o Using SHAP to Understand Trends in the Transient Stability Limit | Hamilton and Papadopoulos (research paper)
   o On SHAP Interpreting Machine Learning Models With SHAP | Christoph Molnar (book)
2. More on explainable AI:
   o Interpretable Machine Learning | Christoph Molnar
   o https://github.com/interpretml/interpret
3. More on causal machine learning:
   o Why Machine Learning Is Not Made for Causal Estimation | Quentin Gallea
   o Causal Inference for The Brave and True | Matheus Facure (book)
4. More on Bayesian machine learning:
   o https://github.com/pymc-devs/pymc
   o https://github.com/pymc-labs/pymc-marketing
5. More on digital twins & domain informed AI:
   o Trustworthy AI for Power Systems | Spyros Chatzivasileiadis
   o Physics-Informed Neural Networks (Wikipedia)