Der Spagat zwischen BIAS und FAIRNESS (2024)

_openknowledge · 97 slides · Apr 25, 2024

About This Presentation

SPAGAT ZWISCHEN BIAS UND FAIRNESS

AI should be fair. We all agree on that. Accordingly, a "bias" in your own AI solution has to be avoided.

Easier said than done, because bias can creep in at various points within the AI/ML lifecycle – from the ini...


Slide Content

Der Spagat zwischen
Bias und Fairness
#WISSENTEILEN
Lars Röwekamp | @mobileLarson

Lars Röwekamp | @mobileLarson
CIO New Technologies
(Architecture, Microservices, Cloud, AI & ML)
OPEN KNOWLEDGE

„The good,
the bad,
and
the ugly?“

https://www.bmc.com
„... is a phenomenon that
skews the result of an algorithm
in favor or against an idea.“
Bias

Bias-Variance Tradeoff
BIAS: Difference between the average prediction of the model and the correct value.
VARIANCE: The variability of model predictions for a given data point.

Bias-Variance Tradeoff
HIGH BIAS & LOW VARIANCE: Oversimplified and under-fitted model. High error on training and high error on test data.

Bias-Variance Tradeoff
LOW BIAS & HIGH VARIANCE: Complex and over-fitted model. Performs well on training data but has high error rates on test data.

Bias-Variance Tradeoff
LOW BIAS & LOW VARIANCE: Sweet spot where the model achieves a near-optimal error rate without being overly complex.

Bias-Variance Tradeoff
(chart: Prediction Error over Model Complexity; the Training Sample error keeps falling while the Test Sample error forms a U-shape)
High Bias / Low Variance on the left: under-fitted. Low Bias / High Variance on the right: over-fitted.
„You want to be here": at the minimum of the Total Error curve.
( Total Error = Bias² + Variance + Irreducible Error )
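As a quick illustration of this tradeoff (not part of the original slides), the following sketch fits polynomials of increasing degree to noisy toy data and compares training vs. test error; data, degrees and parameters are made up for illustration:

```python
# Illustrative sketch: low degree -> under-fitting (high bias), high degree -> over-fitting (high variance).
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 60)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    train_err = mean_squared_error(y_tr, model.predict(X_tr))
    test_err = mean_squared_error(y_te, model.predict(X_te))
    print(f"degree {degree:>2}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```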

Conclusion:
Bias is not bad in general, but …
Bias-Variance Tradeoff

https://preply.com/en/question/what-does-bias-mean-in-english-49251#
“… simply means inclination or prejudice
for or against one person or group,
especially in a way
considered to be unfair.“
Bias in sense of „Fairness“

„A model should not make its decisions
based on sensitive attributes!“*
*Ethnic or social origin, gender, age, income, marital status,
sexual orientation, educational background or religious affiliation.
Bias in sense of „Fairness“

„A model should not make its decisions
based on sensitive attributes!"*
*Ethnic or social origin, gender, age, income, marital status,
sexual orientation, educational background or religious affiliation.
Bias in sense of „Fairness"
Also be aware of proxy attributes, e.g. zip code or height & weight!
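One simple heuristic for spotting proxy attributes (a sketch, not from the slides; file and column names are hypothetical): check how well a candidate feature alone predicts the sensitive attribute. If it does so far better than chance, it acts as a proxy and deserves the same scrutiny.

```python
# Hypothetical example: does "zip_code", "height" or "weight" reveal the sensitive attribute "gender"?
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("loan_applications.csv")        # assumed example file
candidate_proxies = ["zip_code", "height", "weight"]
sensitive = "gender"

for col in candidate_proxies:
    X = pd.get_dummies(df[[col]], columns=[col]) if df[col].dtype == "object" else df[[col]]
    score = cross_val_score(RandomForestClassifier(n_estimators=50, random_state=0),
                            X, df[sensitive], cv=5).mean()
    print(f"{col}: predicts {sensitive} with accuracy {score:.2f}")
```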

Rule #1: „Be aware of the possibility of bias."

https://research.aimultiple.com/ai-bias/
“… is an anomaly in the output of
machine learning algorithms, due
to the prejudiced assumptions made during
the algorithm development process or
prejudices in the training data.“
Bias

ML-based Process
AI/ML Model → Outcome → Action

ML-based Process. Is it fair?
AI/ML Model → Outcome → Action

Fair compared to … ?
Current (Human) Decision → Outcome → Action

Fair compared to … ?
Unfair (Human) Decision → Unfair Outcome → Unfair Action

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action

Example: grant / deny loan
Goal: Provide loans while balancing repayment rates for bank loans
Adult income data*
• size: 48,843
• task: income > $50K / year
• sensitive attributes: gender, race
*https://archive.ics.uci.edu/ml/datasets/Adult
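One possible way to load the referenced Adult income data in Python is via OpenML (a sketch; the exact column names and target encoding depend on the OpenML version used):

```python
# Load the UCI Adult dataset through OpenML and peek at the sensitive attributes.
from sklearn.datasets import fetch_openml

adult = fetch_openml("adult", version=2, as_frame=True)
X, y = adult.data, adult.target                      # y is "<=50K" / ">50K"
print(X.shape)                                       # roughly 48,000 rows
print(X["sex"].value_counts(normalize=True))         # distribution of sensitive attribute "sex"
print(X["race"].value_counts(normalize=True))        # distribution of sensitive attribute "race"
```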

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Goal: Provide loans while balancing repayment rates for bank loans
World: Current and potential bank customers
Data: Historical loans and payments, credit reporting data, background checks
AI/ML Pipeline: Build model to predict risk of not repaying on time
Human Review: Manual review by the personal customer advisor
Action: Deny loan or increase interest rate and/or penalties

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Historical Bias: In the past, the husband kept the salary account and the wife the household account.

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Historical Bias (see above)
Sample Bias: Grouping of nationality by German/EU and others.
Measurement Bias: Overdraft as a proxy for credit risk.
Label Bias: Same scale for feature "income" (low/medium/high) for men and women.

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Historical, Sample, Measurement, Label Bias (see above)
Feature Bias: Feature „income" is less meaningful for people with irregular income.
Learning Bias: Model is not suitable for the data and amplifies the bias effect.
Evaluation Bias: Data is not representative for small online loans (wrong benchmark).

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Historical, Sample, Measurement, Label, Feature, Learning, Evaluation Bias (see above)
Supportive Bias: Clerk overrules AI based on "experience" or "I know him, he's creditworthy".
Conservatism Bias: Clerk cannot evaluate the risk of a new business idea and rejects the loan.
Zero-Risk Bias: Clerk only approves loans with zero risk because his bonus depends on it.

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Historical, Sample, Measurement, Label, Feature, Learning, Evaluation, Supportive, Conservatism, Zero-Risk Bias (see above)
Action Bias: Not approving a credit can have different implications for different income groups.
Intervention Bias: Loan disbursement delays can cause problems.
Deployment Bias: The credit AI is also used to decide on account opening / account fees individually.

Sources of Bias
World → Data → AI/ML Pipeline → Human Review → Action
Data: Historical Bias, Sample Bias, Measurement Bias, Label Bias
AI/ML Pipeline: Feature Bias, Learning Bias, Evaluation Bias
Human Review: Supportive Bias, Conservatism Bias, Zero-Risk Bias
Action: Action Bias, Intervention Bias, Deployment Bias

Lots of Bias ;-(
How do we make the overall system and outcomes (more) fair?
DEFINE (equity), DETECT (bias), MITIGATE (impact), MONITOR (effect)

Rule #2: „Define fairness in context of your goal."

Define Fairness
There are many different ways of defining what constitutes a fair machine learning (ML) model.
Typically, an ML model cannot be fair in all aspects at the same time.

ACCURACY Parity
DEMOGRAPHIC Parity
EQUAL Opportunity
EQUALIZED Odds
GROUP Unaware

Define Fairness
(scatter plot over Income: POSITIVE = grant loan, NEGATIVE = don't grant loan; classes: paid loan in full vs. defaulted)

Define Fairness
Rows = TRUE (approved / denied), columns = PREDICTED (approved / denied):
TP (True Positive): people who should be approved and are approved by the model.
FN (False Negative): people who should be approved and are denied by the model.
FP (False Positive): people who should be denied and are approved by the model.
TN (True Negative): people who should be denied and are denied by the model.
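A minimal sketch (toy data, not from the slides) showing how TP, FN, FP and TN fall out of scikit-learn's confusion matrix, with 1 = approve and 0 = deny:

```python
# Derive TP / FN / FP / TN from ground truth and model predictions.
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([1, 1, 1, 0, 0, 1, 0, 1, 0, 1])   # who should get a loan
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 0])   # who the model approves

tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
print(f"TP={tp}  FN={fn}  FP={fp}  TN={tn}")
```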

Accuracy Parity
P(Ŷ=Y | A=0) = P(Ŷ=Y | A=1)
Group A (TRUE × PREDICTED, approve/deny): [[6, 3], [0, 1]] → ACC = 7/10 = 70%
(Income scatter plot; paid loan in full vs. defaulted)

Accuracy Parity
P(Ŷ=Y | A=0) = P(Ŷ=Y | A=1)
Group A: [[6, 3], [0, 1]] → ACC = 7/10 = 70%
Group B: [[2, 0], [3, 5]] → ACC = 7/10 = 70%

Accuracy Parity
P(Ŷ=Y | A=0) = P(Ŷ=Y | A=1)
„What could go wrong?"
„Lost opportunities in group A vs. high risk in group B!"
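A minimal sketch of checking accuracy parity in code, reproducing the toy matrices above (1 = approve, 0 = deny):

```python
# Accuracy parity: P(Y_hat = Y | A = a) should be (roughly) equal for both groups.
# Group A reproduces the matrix [[6, 3], [0, 1]], Group B [[2, 0], [3, 5]] -> both 70 %.
import numpy as np

def accuracy(y_true, y_pred):
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

y_true_a = [1]*9 + [0]*1;  y_pred_a = [1]*6 + [0]*3 + [0]*1
y_true_b = [1]*2 + [0]*8;  y_pred_b = [1]*2 + [1]*3 + [0]*5
print(accuracy(y_true_a, y_pred_a), accuracy(y_true_b, y_pred_b))   # 0.7 0.7
```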

Demographic Parity
P(Ŷ | A=0) = P(Ŷ | A=1)
Group A (TRUE × PREDICTED, approve/deny): [[3, 0], [1, 4]] → PR = 4/8 = 50%
(Income scatter plot; paid loan in full vs. defaulted)

Demographic Parity
P(Ŷ | A=0) = P(Ŷ | A=1)
Group A: [[3, 0], [1, 4]] → PR = 4/8 = 50%
Group B: [[1, 0], [3, 4]] → PR = 4/8 = 50%

Demographic Parity
P(Ŷ | A=0) = P(Ŷ | A=1)
„What could go wrong?"
„Oops! Do we have a plan for this?"
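A corresponding sketch for demographic parity: only the predictions matter, i.e. the approval rate per group, here reproducing the toy numbers above:

```python
# Demographic parity: equal positive (approval) rate per group, regardless of the true outcome.
import numpy as np

def positive_rate(y_pred):
    return np.mean(np.asarray(y_pred))

y_pred_a = [1, 1, 1, 1, 0, 0, 0, 0]     # Group A: 4 of 8 approved
y_pred_b = [1, 1, 1, 1, 0, 0, 0, 0]     # Group B: 4 of 8 approved
print(positive_rate(y_pred_a), positive_rate(y_pred_b))   # 0.5 0.5
```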

Equal Opportunity
P(Ŷ=1 | A=0, Y=1) = P(Ŷ=1 | A=1, Y=1)
Group A (TRUE × PREDICTED, approve/deny): [[2, 2], [1, 4]] → TPR = 2/4 = 50%
(Income scatter plot; paid loan in full vs. defaulted)

Equal Opportunity
P(Ŷ=1 | A=0, Y=1) = P(Ŷ=1 | A=1, Y=1)
Group A: [[2, 2], [1, 4]] → TPR = 2/4 = 50%
Group B: [[1, 1], [2, 3]] → TPR = 1/2 = 50%

Equal Opportunity
P(Ŷ=1 | A=0, Y=1) = P(Ŷ=1 | A=1, Y=1)
„What could go wrong?"
TPR = 9/12 = 75% vs. TPR = 3/4 = 75%
„Wow! Lots of False Positives!"
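A corresponding sketch for equal opportunity: the true positive rate per group, reproducing the toy matrices above:

```python
# Equal opportunity: equal TPR, i.e. P(Y_hat = 1 | Y = 1, A = a), across groups.
# Group A: matrix [[2, 2], [1, 4]] -> TPR = 2/4; Group B: [[1, 1], [2, 3]] -> TPR = 1/2.
import numpy as np

def tpr(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return y_pred[y_true == 1].mean()

y_true_a = [1]*4 + [0]*5;  y_pred_a = [1, 1, 0, 0] + [1, 0, 0, 0, 0]
y_true_b = [1]*2 + [0]*5;  y_pred_b = [1, 0] + [1, 1, 0, 0, 0]
print(tpr(y_true_a, y_pred_a), tpr(y_true_b, y_pred_b))   # 0.5 0.5
```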

Equalized Odds
P(Ŷ=1 | A=0, Y=y) = P(Ŷ=1 | A=1, Y=y), y ∈ {0,1}
Group A (TRUE × PREDICTED, approve/deny): [[2, 2], [1, 3]] → TPR = 2/4 = 50%, FPR = 1/4 = 25%
(Income scatter plot; paid loan in full vs. defaulted)

Equalized Odds
P(Ŷ=1 | A=0, Y=y) = P(Ŷ=1 | A=1, Y=y), y ∈ {0,1}
Group A: [[2, 2], [1, 3]] → TPR = 2/4 = 50%, FPR = 1/4 = 25%
Group B: [[1, 1], [1, 3]] → TPR = 1/2 = 50%, FPR = 1/4 = 25%

Equalized Odds
P(Ŷ=1 | A=0, Y=y) = P(Ŷ=1 | A=1, Y=y), y ∈ {0,1}
„What could go wrong?"
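A corresponding sketch for equalized odds: both the true positive rate and the false positive rate have to match across groups, reproducing the toy matrices above:

```python
# Equalized odds: TPR and FPR must both be equal across groups.
# Group A: matrix [[2, 2], [1, 3]] -> TPR = 2/4, FPR = 1/4; Group B: [[1, 1], [1, 3]] -> TPR = 1/2, FPR = 1/4.
import numpy as np

def rates(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tpr = y_pred[y_true == 1].mean()     # true positive rate
    fpr = y_pred[y_true == 0].mean()     # false positive rate
    return tpr, fpr

y_true_a = [1]*4 + [0]*4;  y_pred_a = [1, 1, 0, 0] + [1, 0, 0, 0]
y_true_b = [1]*2 + [0]*4;  y_pred_b = [1, 0] + [1, 0, 0, 0]
print(rates(y_true_a, y_pred_a), rates(y_true_b, y_pred_b))   # (0.5, 0.25) for both groups
```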

What is fair?
Accuracy Parity?
Demographic Parity?
Equal Opportunity?
Equalized Odds?

What is fair?
Accuracy Parity?
Demographic Parity?
Equal Opportunity?
Equalized Odds!

What is fair?
When to use Equalized Odds …
• strong emphasis on predicting the positive outcome correctly
  e.g.: correctly identifying who should get a loan drives profit
• strongly care about minimising costly False Positives
  e.g.: reducing loans granted to people who would not be able to pay them back
• the reward function of the model is not heavily compromised
  e.g.: revenue or profit function for the business remains high
• the target variable is not considered subjective

Rule #3: „Measure and interpret bias-related KPIs."

Measure & interpret KPIs
Step 1: Look for unbalanced datasets
(charts: skewed class distributions, e.g. 86% and a 68% / 32% split)

Measure & interpret KPIs
Step 1: Look for unbalanced datasets, the WHY
(charts: 68% / 32% class split; correlation matrix of the unbalanced dataset vs. correlation matrix of the rebalanced dataset)
https://towardsdatascience.com/why-feature-correlation-matters-a-lot-847e8ba439c4

Measure & interpret KPIs
Step 1: Look for unbalanced datasets, the WHY
(chart: 68% / 32% class split)
https://imbalanced-learn.org/stable/introduction.html
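A minimal sketch for step 1 (file and column names are placeholders): quantify how unbalanced the label distribution actually is before anything else:

```python
# Check class balance of the target column; a heavy skew (e.g. 68% / 32% or worse)
# is the trigger for the rebalancing techniques discussed later.
import pandas as pd

df = pd.read_csv("training_data.csv")                # assumed example file
print(df["label"].value_counts(normalize=True))      # e.g. 0: 0.68, 1: 0.32
print(df.groupby("label").mean(numeric_only=True))   # per-class feature means as a quick sanity check
```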

Measure & interpret KPIs
Step 2: Define protected features
Privileged Group = 1
Unprivileged Group = 0

Measure & interpret KPIs
Step 3: Use prevalence as metric*
Share earning more than $50K / year (privileged group = 1, unprivileged group = 0):
Race: 26.2% (1) vs. 15.8% (0); Sex: 31.2% (1) vs. 11.4% (0)
Intersection of Race × Sex: 32.4% / 22.4% and 12.2% / 7.6%
*overall prevalence = 24.8%
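A minimal sketch for step 3, computing prevalence per protected group on the Adult data loaded via OpenML (column names and target encoding are assumptions about that OpenML version):

```python
# Prevalence P(income > 50K) overall and per protected group.
from sklearn.datasets import fetch_openml

adult = fetch_openml("adult", version=2, as_frame=True)
y = (adult.target == ">50K").astype(int)
df = adult.data.assign(high_income=y)

print(df["high_income"].mean())                     # overall prevalence, roughly 0.24
print(df.groupby("sex")["high_income"].mean())      # prevalence per sex group
print(df.groupby("race")["high_income"].mean())     # prevalence per race group
```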

DIY? NO!
AIF360: https://aif360.mybluemix.net
Aequitas: https://aequitas.dssg.io
What-If: https://pair-code.github.io/what-if-tool/
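Instead of doing it yourself, a library such as AIF360 can compute these group metrics. The following is a hedged sketch with toy data; the exact API may differ between AIF360 versions:

```python
# Wrap a DataFrame into a BinaryLabelDataset and read off group fairness metrics.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

df = pd.DataFrame({                       # toy data; 1 = privileged / favorable
    "sex":   [1, 1, 1, 1, 0, 0, 0, 0],
    "label": [1, 1, 1, 0, 1, 0, 0, 0],
})
ds = BinaryLabelDataset(df=df, label_names=["label"],
                        protected_attribute_names=["sex"],
                        favorable_label=1, unfavorable_label=0)
metric = BinaryLabelDatasetMetric(ds,
                                  privileged_groups=[{"sex": 1}],
                                  unprivileged_groups=[{"sex": 0}])
print(metric.statistical_parity_difference())   # P(Y=1 | unprivileged) - P(Y=1 | privileged)
print(metric.disparate_impact())                # ratio of the two favorable-outcome rates
```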

Rule #4: „Mitigate to a fair outcome."

Mitigate Bias
Quantitative Approach: Making adjustments to data, model, or predictions.
Non-Quantitative Approach: Stepping back from the computer and looking at bias / fairness with a wider lens.

Quantitative Approach
Pre-Processing: modify data before processing
In-Processing: modify the algorithm that is trained
Post-Processing: modify the predictions of the model

Quantitative Approach: Pre-Processing
Pre-Processing methods try to remove bias in data before it is used to train the model.

Quantitative Approach: Pre-Processing
#1: Handle unbalanced datasets
See also … for implementations

Quantitative Approach: Pre-Processing
#1: … via Undersampling / Oversampling
Undersampling: remove samples of the majority class from the original dataset.
Oversampling: add copies or synthetic samples of the minority class to the original dataset.
(links: Oversampling with different Methods, Oversampling Pitfalls)
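A minimal sketch of both techniques using imbalanced-learn (the library behind the links above); the data is synthetic and the parameters are illustrative:

```python
# SMOTE generates synthetic minority samples, RandomUnderSampler drops majority samples.
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print("original:    ", Counter(y))

X_over, y_over = SMOTE(random_state=0).fit_resample(X, y)
print("oversampled: ", Counter(y_over))

X_under, y_under = RandomUnderSampler(random_state=0).fit_resample(X, y)
print("undersampled:", Counter(y_under))
```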

Quantitative Approach: Pre-Processing
#1: … via Grouping
e.g. White, Black, Asian-Pacific, Hispanic → White, Others

Quantitative Approach: Pre-Processing
#2: Disparate impact removal (DIR)
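A hedged sketch of disparate impact removal with AIF360's DisparateImpactRemover (toy data and column names are made up; the API may differ between versions): it "repairs" feature distributions so that the protected attribute becomes harder to infer, while the labels stay untouched.

```python
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.algorithms.preprocessing import DisparateImpactRemover

df = pd.DataFrame({                          # toy data for illustration only
    "sex":    [1, 1, 1, 1, 0, 0, 0, 0],
    "income": [5, 4, 4, 3, 3, 2, 2, 1],
    "label":  [1, 1, 1, 0, 1, 0, 0, 0],
})
ds = BinaryLabelDataset(df=df, label_names=["label"], protected_attribute_names=["sex"],
                        favorable_label=1, unfavorable_label=0)

dir_ = DisparateImpactRemover(repair_level=1.0, sensitive_attribute="sex")
ds_repaired = dir_.fit_transform(ds)         # dataset with repaired feature values
print(ds_repaired.convert_to_dataframe()[0].head())
```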

Quantitative Approach: Pre-Processing
#3: Elimination of proxy variables

Quantitative Approach: Pre-Processing
#4: Fair Representation Learning: „A fair & rich Z"
(diagram: the input vector (X, A) is encoded into a fair representation Z; a Predictor learns Ŷ from Z while an Adversary tries to reconstruct Â and feeds back its negative gradient; objectives: max I(X;Z), min I(A;Z))
-- Rich Zemel --
https://www.cs.toronto.edu/~toni/Papers/icml-final.pdf

Quantitative Approach: In-Processing
In-Processing methods work by adjusting the objective to also consider fairness. This can be done by changing the cost function or by imposing constraints on the model.
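As an illustrative in-processing example (not taken from the slides), fairlearn's reductions API retrains a standard scikit-learn estimator under a fairness constraint such as demographic parity; data and parameters below are synthetic:

```python
# Train a constrained model: the reduction searches for a classifier that
# (approximately) satisfies demographic parity across the sensitive attribute A.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from fairlearn.reductions import ExponentiatedGradient, DemographicParity

X, y = make_classification(n_samples=500, random_state=0)
A = np.random.default_rng(0).integers(0, 2, size=len(y))   # toy sensitive attribute

mitigator = ExponentiatedGradient(LogisticRegression(max_iter=1000),
                                  constraints=DemographicParity())
mitigator.fit(X, y, sensitive_features=A)
y_pred = mitigator.predict(X)
```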

Quantitative Approach: In-Processing
https://towardsdatascience.com/approaches-for-addressing-unfairness-in-machine-learning-a31f9807cf31

Quantitative Approach: Post-Processing
Post-Processing methods work by changing the predictions made by a model if they are indicated to be unfair, e.g. by setting different thresholds for privileged and unprivileged groups.
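A minimal sketch of group-specific thresholds (scores, groups and threshold values are made up; in practice the thresholds are tuned so that the chosen fairness metric is equalized across groups, for instance with fairlearn's ThresholdOptimizer):

```python
# Apply a different decision threshold per protected group to the model's raw scores.
import numpy as np

scores = np.array([0.81, 0.62, 0.55, 0.40, 0.73, 0.58, 0.47, 0.30])   # model scores
group  = np.array(["A",  "A",  "A",  "A",  "B",  "B",  "B",  "B"])    # protected attribute

thresholds = {"A": 0.60, "B": 0.50}     # assumed per-group cut-offs
approved = np.array([s >= thresholds[g] for s, g in zip(scores, group)])
print(approved)
```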

Quantitative Approach: Post-Processing
https://towardsdatascience.com/approaches-for-addressing-unfairness-in-machine-learning-a31f9807cf31

Non-Quantitative Approach
Awareness of the problem
Don't use ML at all
Limit the use of ML
Address the root cause
Understand the model
Give explanations
Give opportunities
Support team diversity

AIF360

Rule #5: „Monitor for silent failures."

Monitoring
What to look for?
•Performance Drift

Monitoring
Best-Case: Ground truth is immediately accessible

Monitoring
Not-so-good-Case: Ground truth is delayed in time

Monitoring
Worst-Case: Absent ground truth

Monitoring
What to look for?
•Performance Drift*
•Input Data Drift
•Prediction Drift
•Concept Drift
*ground truth required

Monitoring
How to find out?
• Model Performance Metrics
  (e.g. confusion matrix, accuracy, recall, F1 score, ROC-AUC, …)
• Descriptive Statistics
  (e.g. min, max, mean, uniqueness, correlations, …)
• Distribution Change
  (e.g. PSI, MMD, LSDD, KL Divergence, Jensen-Shannon, …)
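A minimal sketch of one of the listed distribution-change metrics, the Population Stability Index (PSI), comparing a reference feature distribution against live data; bin count and data are illustrative:

```python
# PSI close to 0 means no drift; values around 0.2 and above are commonly treated as significant drift.
import numpy as np

def psi(reference, current, bins=10):
    edges = np.histogram_bin_edges(reference, bins=bins)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_pct = np.histogram(current, bins=edges)[0] / len(current)
    ref_pct = np.clip(ref_pct, 1e-6, None)     # avoid log of / division by zero
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct))

rng = np.random.default_rng(0)
print(psi(rng.normal(0, 1, 5000), rng.normal(0, 1, 5000)))     # near 0: no drift
print(psi(rng.normal(0, 1, 5000), rng.normal(0.5, 1, 5000)))   # noticeably larger: drift
```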

Monitoring
Which tool to use for free?
•Evidently
•Alibi Detect
•NannyML
•whyLogs
•…

Monitoring
Which tool to use for money?
•Neptune.ai
•Arize AI
•WhyLabs
•Qualdo
•…

Rules
1. Be aware
2. Define
3. Measure
4. Mitigate
5. Monitor

Time for questions?
Always!

Thank you very much!
#WISSENTEILEN
by open knowledge GmbH
@_openKnowledge | @mobileLarson
Lars Röwekamp, CIO New Technologies

IMAGE CREDITS
Slide 01: © brizmaker, iStockphoto.com
All other pictures, drawings and icons originate from
• pexels.com, pixabay.com, unsplash.com,
• flaticon.com
or were created by myself.