From Data to Insights With the SHAP Explainable AI Library
From Data to Insights With the SHAP
Explainable AI Library
Raphaël Lüthi
PyData Valais Meetup
17th October 2024
Who am I?
Raphaël Lüthi – Machine Learning & AI Lead @ Groupe Mutuel
[Career timeline: Engineering MSE (2010); Data Scientist (2016, 2019); Machine Learning & AI Lead (2020 – today); LinkedIn]
How Does a Data Scientist Create Value?
[Pipeline diagram: Tabular Data → Ingestion → Pre-Processing → Feature Engineering → Tidy Dataset → Machine Learning → Value]
Value: actionable insights, automation, data-driven strategy & management (e.g. MLOps software, a personalised experience, a CoPilot tool boosting experts' productivity)
Routes to insights: Data Viz, Explainable AI (XAI), Causal ML, …
SHAP (SHapley Additive exPlanations)
How to Extract Insights From Your Data?
[Diagram: Tabular Data → Tidy Dataset (Features + Target) → Insights]
1. Capture complex data patterns with Machine Learning (Gradient Boosted Decision Trees)
2. Reveal the patterns with SHAP (SHapley Additive exPlanations)
3. Profit!
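A minimal sketch of steps 1 and 2, assuming the California housing dataset from scikit-learn and XGBoost as the gradient boosted tree implementation (the slides do not prescribe a specific stack):

```python
# Sketch: fit gradient boosted decision trees on a tidy tabular dataset,
# then compute SHAP values to reveal the patterns the model picked up.
# Dataset and library choices are illustrative, not from the slides.
import shap
import xgboost
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Tidy dataset: features X and target y (median house value)
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 1. Capture complex data patterns with gradient boosted decision trees
model = xgboost.XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=4)
model.fit(X_train, y_train)

# 2. Reveal the patterns with SHAP (a TreeExplainer is selected automatically)
explainer = shap.Explainer(model)
shap_values = explainer(X_test)     # one additive explanation per prediction
print(shap_values.shape)            # (n_samples, n_features)
```

The `shap_values` object is reused in the later sketches for local and global plots.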
How to Reveal the Complex Patterns Your Machine Learning Algorithm Picked Up?
SHapley Additive exPlanations
Sources: https://www.nature.com/articles/s42256-019-0138-9; https://shap.readthedocs.io/en/latest/index.html
Explainable AI (XAI)
Source: https://www.nature.com/articles/s42256-019-0138-9
Local Explanation with SHAP Values
Lundberg et al. 2020:
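A hedged illustration of a local explanation, reusing the illustrative `shap_values` from the earlier sketch: a waterfall plot decomposes one prediction into additive per-feature contributions.

```python
# Local explanation: how each feature pushed one prediction away from the
# model's expected value. Continues the illustrative sketch above.
import shap

sample = shap_values[0]          # explanation for a single row
shap.plots.waterfall(sample)     # additive per-feature contributions

# Additivity: base value + sum of SHAP values equals the model output
print(sample.base_values + sample.values.sum())
```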
Global Explanation with SHAP Values
Source: https://www.nature.com/articles/s42256-019-0138-9
Lundberg et al. 2020:
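To go from local to global explanations, the per-sample SHAP values are simply aggregated over a dataset; a sketch continuing the same illustrative objects:

```python
# Global explanation: aggregate local SHAP values across the test set.
import numpy as np
import shap

shap.plots.bar(shap_values)      # mean(|SHAP value|) per feature

# Equivalent ranking computed by hand
global_importance = np.abs(shap_values.values).mean(axis=0)
print(sorted(zip(global_importance, shap_values.feature_names), reverse=True))
```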
Global Insights with SHAP Values
Source: https://www.nature.com/articles/s42256-019-0138-9
Lundberg et al. 2020:
SHAP Summary Plots
Source: https://www.nature.com/articles/s42256-019-0138-9
[Annotated summary plot highlighting: magnitude & direction of effects, their prevalence, and low-frequency but high-impact effects]
Lundberg et al. 2020:
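A sketch of the summary (beeswarm) plot described above, again reusing the illustrative `shap_values`:

```python
# Summary (beeswarm) plot: each dot is one sample for one feature.
# The x-position shows magnitude & direction of the effect, dot density
# shows prevalence, and isolated far-out dots flag low-frequency but
# high-impact effects.
import shap

shap.plots.beeswarm(shap_values)
# shap.summary_plot(shap_values)   # older, equivalent API
```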
SHAP Summary Plots
Source: https://www.nature.com/articles/s42256-019-0138-9
Lundberg et al. 2020:
SHAP Explanation Embedding and Clustering
A matrix of numerical values!
[Panels: Explanation Embedding; Explanation Clustering (semi-supervised)]
Source: https://www.nature.com/articles/s42256-019-0138-9
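Because the explanations are just an (n_samples, n_features) matrix of numbers, standard embedding and clustering tools apply directly; a sketch using scikit-learn (the exact tooling is a choice, not prescribed by the slides):

```python
# Explanation embedding & clustering ("supervised clustering"): treat the
# SHAP value matrix like any other numerical feature matrix.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA

explanation_matrix = shap_values.values            # (n_samples, n_features)

# 2-D embedding of the explanations for visual inspection
embedding = PCA(n_components=2).fit_transform(explanation_matrix)

# Group samples that are explained in similar ways
labels = AgglomerativeClustering(n_clusters=4).fit_predict(explanation_matrix)
print(np.bincount(labels))                         # cluster sizes
```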
Beyond the Black-Box ML – Going Deeper…
1. More on SHAP values:
o Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses | Aidan Cooper
o How Shapley Values Work | Aidan Cooper
o Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis | Aidan Cooper
o Using SHAP to Understand Trends in the Transient Stability Limit | Hamilton and Papadopoulos (research paper)
o Interpreting Machine Learning Models With SHAP | Christoph Molnar (book)
2. More on explainable AI:
o Interpretable Machine Learning | Christoph Molnar
o https://github.com/interpretml/interpret
10’ Discussion
BACKUP SLIDES
Return on Experience
Valuable Business Applications
1. Diving into a new dataset
2. Explaining the drivers behind a behaviour of interest (e.g. app usage, churn, …)
3. Debugging machine learning algorithms
4. Gaining stakeholders' trust
Capture Nonlinear Patterns With Gradient Boosted Decision Trees
[Diagram: a single decision tree vs. gradient boosted decision trees, with splits such as Rooms <= 3, Med. Income <= 4K, Age <= 20]
+ Best algorithm for most tabular datasets (benchmarks & Kaggle)
+ Captures nonlinear patterns
+ Great black-box predictors
- Difficult to explain
More info: Pedro Tabacof – Unlocking the Power of Gradient-Boosted Trees (using LightGBM) | PyData London 2022
Popular open-source implementations:
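A minimal LightGBM training sketch to accompany this slide (hyperparameters are illustrative, not values from the talk):

```python
# Gradient boosted decision trees with LightGBM on a tabular dataset.
import lightgbm as lgb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = lgb.LGBMRegressor(n_estimators=500, learning_rate=0.05, num_leaves=31)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # R^2 on held-out data
```

A model trained this way can be passed straight to shap.Explainer, as in the earlier sketch.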
Insights from XAI: SHAP Summary Plot
Source: https://shap.readthedocs.io/en/latest/index.html
Insights from XAI: SHAP Dependence Plots
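A hedged sketch of a dependence plot, reusing the illustrative `shap_values`; the feature names assume the California housing example:

```python
# Dependence plot: SHAP value of one feature vs. its raw value, colored by
# a second feature to expose interactions (e.g. median income modulating
# the effect of house size on predicted prices).
import shap

shap.plots.scatter(shap_values[:, "MedInc"], color=shap_values[:, "AveRooms"])
# Or let SHAP pick the strongest interacting feature for the coloring:
shap.plots.scatter(shap_values[:, "MedInc"], color=shap_values)
```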
Managing the Complexity & Extracting Business-Relevant Insights
Brainstorming session:
Raw Insights: 1. Geography is important; 2. Median income interacts with house size
Resulting Story: a zone's median income changes how other variables impact house prices.
TODO, Next Iteration: 1. X; 2. Y; 3. Z
Beyond the Black-Box ML – Going Deeper…
1. More on SHAP values:
o Explaining Machine Learning Models: A Non-Technical Guide to Interpreting SHAP Analyses | Aidan Cooper
o How Shapley Values Work | Aidan Cooper
o Supervised Clustering: How to Use SHAP Values for Better Cluster Analysis | Aidan Cooper
o Using SHAP to Understand Trends in the Transient Stability Limit | Hamilton and Papadopoulos (research paper)
o Interpreting Machine Learning Models With SHAP | Christoph Molnar (book)
2. More on explainable AI:
o Interpretable Machine Learning | Christoph Molnar
o https://github.com/interpretml/interpret
3. More on causal machine learning:
o Why Machine Learning Is Not Made for Causal Estimation | Quentin Gallea
o Causal Inference for The Brave and True | Matheus Facure (book)
4. More on Bayesian machine learning:
o https://github.com/pymc-devs/pymc
o https://github.com/pymc-labs/pymc-marketing
5. More on digital twins & domain-informed AI:
o Trustworthy AI for Power Systems | Spyros Chatzivasileiadis
o Physics-Informed Neural Networks (Wikipedia)