Receiver Operating Characteristics SIB-ROC.ppt

ssuserb53446 14 views 27 slides Sep 04, 2024


Darlene Goldstein 29 January 2003
Receiver Operating Characteristic
Methodology

Outline
•Introduction
•Hypothesis testing
•ROC curve
•Area under the ROC curve (AUC)
•Examples using ROC
•Concluding remarks

Introduction to ROC curves
•ROC = Receiver Operating Characteristic
•Started in electronic signal detection
theory (1940s - 1950s)
•Has become very popular in biomedical
applications, particularly radiology and
imaging
•Also used in machine learning applications to
assess classifiers
•Can be used to compare tests/procedures

ROC curves: simplest case
•Consider diagnostic test for a disease
•Test has 2 possible outcomes:
–‘positive’ = suggesting presence of disease
–‘negative’ = suggesting absence of disease
•An individual can test either positive or
negative for the disease
•Prof. Mean...

Hypothesis testing refresher
•2 ‘competing theories’ regarding a population
parameter:
–NULL hypothesis H (‘straw man’)
–ALTERNATIVE hypothesis A (‘claim’, or
theory you wish to test)
•H: NO DIFFERENCE
–any observed deviation from what we
expect to see is due to chance variability
•A: THE DIFFERENCE IS REAL

Test statistic
•Measure how far the observed data are from
what is expected assuming the NULL H by
computing the value of a test statistic (TS)
from the data
•The particular TS computed depends on the
parameter
•For example, to test the population mean μ,
the TS is the sample mean (or standardized
sample mean)
•The NULL is rejected if the TS falls in a
user-specified ‘rejection region’
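
As a minimal sketch of the test-statistic idea, the standardized sample mean can be compared with a rejection region as follows. The slide does not specify the details, so a known population standard deviation, a one-sided rejection region, and a 5% level are assumed here; the function names and the readings are hypothetical.

```python
import math
from statistics import NormalDist, mean

def z_statistic(sample, mu0, sigma):
    """Standardized sample mean: (xbar - mu0) / (sigma / sqrt(n))."""
    n = len(sample)
    return (mean(sample) - mu0) / (sigma / math.sqrt(n))

def reject_null(sample, mu0, sigma, alpha=0.05):
    """One-sided test (assumed): reject H when the TS exceeds the
    upper-alpha quantile of the standard normal distribution."""
    critical_value = NormalDist().inv_cdf(1 - alpha)
    return z_statistic(sample, mu0, sigma) > critical_value

# Hypothetical readings; H: mean = 5.0 with known sigma = 1.0 (assumed values).
readings = [5.9, 6.1, 5.4, 6.3, 5.8, 6.0, 5.7, 6.2, 5.5]
ts = z_statistic(readings, mu0=5.0, sigma=1.0)
decision = reject_null(readings, mu0=5.0, sigma=1.0)
print(f"TS = {ts:.3f}, reject H: {decision}")
```

Here the rejection region is the set of TS values above the critical value; a two-sided region would use the absolute value of the TS instead.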

True disease state vs. Test result
                      H not rejected (test −)         H rejected (test +)
No disease (D = 0)    Correct; specificity            X Type I error (False +), α
Disease (D = 1)       X Type II error (False −), β    Correct; Power 1 − β; sensitivity
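
The quantities on this slide can be computed from the four cell counts of a disease-by-test table. The sketch below is illustrative only: the function name and the counts are hypothetical, not taken from the presentation.

```python
def error_rates(tp, fp, tn, fn):
    """Conditional rates for a 2x2 table of true disease state vs. test result."""
    diseased = tp + fn        # D = 1
    nondiseased = tn + fp     # D = 0
    return {
        "sensitivity": tp / diseased,              # power, 1 - beta
        "specificity": tn / nondiseased,           # 1 - alpha
        "false_positive_rate": fp / nondiseased,   # Type I error rate, alpha
        "false_negative_rate": fn / diseased,      # Type II error rate, beta
    }

# Hypothetical counts, chosen only for illustration.
rates = error_rates(tp=80, fp=30, tn=170, fn=20)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```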

Specific Example
[Figure: distributions of Test Result for pts with the disease and pts without the disease]

Threshold
[Figure: the same two distributions with a Threshold on the Test Result axis; call patients below it “negative” and patients above it “positive”]

Some definitions ...
[Figures: each panel shades one region of the two Test Result distributions (without the disease, with the disease) on either side of the threshold]
•True Positives: patients with the disease called “positive”
•False Positives: patients without the disease called “positive”
•True negatives: patients without the disease called “negative”
•False negatives: patients with the disease called “negative”

Moving the Threshold: right
[Figure: Test Result distributions without the disease and with the disease; the ‘−’/‘+’ threshold is shifted right]

Moving the Threshold: left
[Figure: the same distributions with the ‘−’/‘+’ threshold shifted left]
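
The effect of moving the threshold can be checked numerically. This is a small sketch with made-up test results, assuming (as in the figures) that larger values are called “positive”; the function name is hypothetical.

```python
def rates_at_threshold(diseased, nondiseased, threshold):
    """Call a result 'positive' when it exceeds the threshold; return (TPR, FPR)."""
    tpr = sum(x > threshold for x in diseased) / len(diseased)
    fpr = sum(x > threshold for x in nondiseased) / len(nondiseased)
    return tpr, fpr

# Hypothetical test results for the two groups.
diseased = [4.1, 5.0, 5.6, 6.2, 6.8, 7.5]
nondiseased = [2.0, 2.9, 3.5, 4.4, 5.2, 5.9]

low = rates_at_threshold(diseased, nondiseased, threshold=4.0)   # threshold moved left
high = rates_at_threshold(diseased, nondiseased, threshold=6.0)  # threshold moved right
print("left: ", low)
print("right:", high)
```

With these values, moving the threshold right lowers both the true positive rate and the false positive rate; moving it left raises both.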

ROC curve
[Figure: ROC curve; vertical axis True Positive Rate (sensitivity), 0% to 100%; horizontal axis False Positive Rate (1 − specificity), 0% to 100%]
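
An ROC curve of this kind can be traced by sweeping the threshold across all observed test results and recording the (False Positive Rate, True Positive Rate) pair at each step. The following is a sketch under the assumption that results at or above the threshold are called positive; the data and the function name are hypothetical.

```python
def roc_points(diseased, nondiseased):
    """Empirical ROC: (FPR, TPR) pairs obtained by sweeping the threshold
    over every observed value, from strictest to most lenient."""
    thresholds = sorted(set(diseased) | set(nondiseased), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every observation: nobody is positive
    for c in thresholds:
        tpr = sum(x >= c for x in diseased) / len(diseased)
        fpr = sum(x >= c for x in nondiseased) / len(nondiseased)
        points.append((fpr, tpr))
    return points

# Hypothetical test results.
diseased = [3.0, 4.0, 5.0, 6.0]
nondiseased = [1.0, 2.0, 3.5, 4.5]
curve = roc_points(diseased, nondiseased)
for fpr, tpr in curve:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```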

ROC curve comparison
[Figure: two panels, “A good test” and “A poor test”, each plotting True Positive Rate against False Positive Rate from 0% to 100%]

ROC curve extremes
[Figure: two panels plotting True Positive Rate against False Positive Rate, each from 0% to 100%]
•Best Test: the distributions don’t overlap at all
•Worst test: the distributions overlap completely

‘Classical’ estimation
•Binormal model:
–X ~ N(0,1) in nondiseased population
–X ~ N(a/b, 1/b²) in diseased population (mean a/b, variance 1/b²)
•Then
$\mathrm{ROC}(t) = \Phi(a + b\,\Phi^{-1}(t))$ for $0 < t < 1$, where $\Phi$ is the standard normal cdf
•Estimate a, b by ML using readings from
sets of diseased and nondiseased patients
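
To make the binormal ROC formula concrete, the sketch below evaluates ROC(t) for given a and b. It does not perform the ML estimation mentioned on the slide; the values of a and b are assumed for illustration. The closed-form area Φ(a / √(1 + b²)) is a standard result for the binormal model and is not stated in the presentation.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def binormal_roc(t, a, b):
    """ROC(t) = Phi(a + b * Phi^{-1}(t)) for 0 < t < 1."""
    return _STD.cdf(a + b * _STD.inv_cdf(t))

def binormal_auc(a, b):
    """Area under the binormal ROC curve: Phi(a / sqrt(1 + b^2))."""
    return _STD.cdf(a / math.sqrt(1 + b * b))

# Assumed parameter values, not estimates from any data set.
a, b = 1.2, 0.8
for t in (0.1, 0.25, 0.5):
    print(f"ROC({t}) = {binormal_roc(t, a, b):.3f}")
print(f"AUC = {binormal_auc(a, b):.3f}")
```

With a = 0 and b = 1 the two populations coincide and the curve reduces to the diagonal ROC(t) = t.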

ROC curve estimation with
continuous data
•Many biochemical measurements are in fact
continuous, e.g. blood glucose vs. diabetes
•Can also do ROC analysis for continuous (rather
than binary or ordinal) data
•Estimate ROC curve (and smooth) based on
empirical ‘survivor’ function (1 – cdf) in
diseased and nondiseased groups
•Can also do regression modeling of the test
result
•Another approach is to model the ROC curve
directly as a function of covariates
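
The survivor-function approach can be sketched as ROC(t) = S_D(S_0⁻¹(t)), where S_D and S_0 are the empirical survivor functions in the diseased and nondiseased groups. The code below is an unsmoothed illustration only; the readings are hypothetical values loosely styled as glucose measurements, and the function names are assumptions.

```python
import math

def survivor(sample):
    """Empirical survivor function S(c) = proportion of the sample exceeding c."""
    n = len(sample)
    return lambda c: sum(x > c for x in sample) / n

def empirical_roc(diseased, nondiseased, t):
    """ROC(t) = S_D(S_0^{-1}(t)): take the smallest threshold whose false
    positive rate does not exceed t and read off the true positive rate."""
    s_d, s_0 = survivor(diseased), survivor(nondiseased)
    candidates = [-math.inf] + sorted(set(diseased) | set(nondiseased))
    for c in candidates:
        if s_0(c) <= t:
            return s_d(c)
    return 0.0  # only reached for t < 0

# Hypothetical continuous readings for the two groups.
diseased = [110, 126, 140, 155, 180]
nondiseased = [80, 90, 95, 105, 130]
for t in (0.0, 0.2, 1.0):
    print(f"ROC({t}) = {empirical_roc(diseased, nondiseased, t):.2f}")
```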

Area under ROC curve (AUC)
•Overall measure of test performance
•Comparisons between two tests based on
differences between (estimated) AUC
•For continuous data, AUC equivalent to Mann-
Whitney U-statistic (nonparametric test of
difference in location between two
populations)
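
The equivalence with the Mann-Whitney U-statistic can be checked on a small example: the trapezoidal area under the empirical ROC curve equals U divided by the number of diseased/nondiseased pairs. This is a sketch with hypothetical data and function names; ties are given half credit, which is the usual convention.

```python
def auc_mann_whitney(diseased, nondiseased):
    """AUC = U / (m * n), where U counts pairs in which the diseased
    result is larger than the nondiseased one (ties count one half)."""
    u = 0.0
    for x in diseased:
        for y in nondiseased:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u / (len(diseased) * len(nondiseased))

def auc_trapezoid(diseased, nondiseased):
    """Area under the empirical ROC curve by the trapezoidal rule."""
    thresholds = sorted(set(diseased) | set(nondiseased), reverse=True)
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for c in thresholds:
        tpr = sum(x >= c for x in diseased) / len(diseased)
        fpr = sum(y >= c for y in nondiseased) / len(nondiseased)
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area

# Hypothetical test results.
diseased = [3.0, 4.0, 5.0, 6.0]
nondiseased = [1.0, 2.0, 3.5, 4.5]
print(auc_mann_whitney(diseased, nondiseased))
print(auc_trapezoid(diseased, nondiseased))
```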

AUC for ROC curves
[Figure: four ROC curves (True Positive Rate against False Positive Rate, each from 0% to 100%), labeled AUC = 50%, AUC = 90%, AUC = 65%, and AUC = 100%]

Interpretation of AUC
•AUC can be interpreted as the probability
that the test result from a randomly chosen
diseased individual is more indicative of
disease than that from a randomly chosen
nondiseased individual: $P(X_i \ge X_j \mid D_i = 1, D_j = 0)$
•So can think of this as a nonparametric
distance between disease/nondisease test
results
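
The probability interpretation follows from a change of variables in the area integral. The derivation below is a sketch assuming continuous, independent test results, with larger values more indicative of disease; the symbols $S_0$, $S_1$ (survivor functions in the nondiseased and diseased populations) and $f_0$ (density in the nondiseased population) are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\mathrm{AUC}
  &= \int_0^1 \mathrm{ROC}(t)\,dt
   = \int_0^1 S_1\!\bigl(S_0^{-1}(t)\bigr)\,dt \\
  &= \int_{-\infty}^{\infty} S_1(c)\,f_0(c)\,dc
   \qquad \text{(substituting } t = S_0(c),\ dt = -f_0(c)\,dc\text{)} \\
  &= \int_{-\infty}^{\infty} P(X_i > c \mid D_i = 1)\,f_0(c)\,dc \\
  &= P(X_i > X_j \mid D_i = 1,\ D_j = 0).
\end{aligned}
```

For continuous results ties have probability zero, so the strict and non-strict inequalities give the same value.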

Problems with AUC
•No clinically relevant meaning
•A lot of the area comes from the range of
large false positive rates, where no one cares
what’s going on (need to examine restricted
regions)
•The curves might cross, so that there might
be a meaningful difference in performance
that is not picked up by AUC
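
Both issues can be illustrated by restricting the area to a range of small false positive rates, often called a partial AUC (a name not used in the presentation). In this sketch the two curves are hypothetical, constructed so that they cross and have the same full AUC; the function name is an assumption.

```python
def partial_auc(points, max_fpr):
    """Trapezoidal area under an ROC curve restricted to FPR in [0, max_fpr].
    `points` is a list of (FPR, TPR) pairs sorted by increasing FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Two hypothetical ROC curves that cross and share the same full AUC.
curve_a = [(0.0, 0.0), (0.2, 0.8), (1.0, 1.0)]
curve_b = [(0.0, 0.0), (0.2, 0.5), (0.4, 1.0), (1.0, 1.0)]

for name, curve in (("A", curve_a), ("B", curve_b)):
    full = partial_auc(curve, 1.0)
    restricted = partial_auc(curve, 0.2)
    print(f"test {name}: AUC = {full:.2f}, area for FPR <= 0.2 = {restricted:.2f}")
```

The full AUC is identical for the two hypothetical tests, while the area over the low false positive range differs.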

Examples using ROC analysis
•Threshold selection for ‘tuning’ an already
trained classifier (e.g. neural nets)
•Defining signal thresholds in DNA microarrays
(Bilban et al.)
•Comparing test statistics for identifying
differentially expressed genes in replicated
microarray data (Lönnstedt and Speed)
•Assessing performance of different protein
prediction algorithms (Tang et al.)
•Inferring protein homology (Karwath and King)

Homology Induction ROC

Concluding remarks – remaining
challenges in ROC methodology
•Inference for ROC curve when no ‘gold standard’
•Role of ROC in combining information?
•Incorporating time into ROC analysis
•Alternatives to ROC for describing test
accuracy?
•Generalization of positive/negative predictive
value to continuous test?
(+/-) predictive value = proportion of patients with
(+/-) result who are correctly diagnosed
= True/(True + False)
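
For a binary test, the True/(True + False) definition can be written out directly. The counts below are hypothetical and the function name is an assumption; the sketch covers only the binary case, not the generalization to a continuous test raised in the bullet.

```python
def predictive_values(tp, fp, tn, fn):
    """(+) predictive value = TP / (TP + FP): share of positive results that are correct.
    (-) predictive value = TN / (TN + FN): share of negative results that are correct."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# Hypothetical counts, chosen only for illustration.
ppv, npv = predictive_values(tp=80, fp=30, tn=170, fn=20)
print(f"(+) predictive value = {ppv:.3f}")
print(f"(-) predictive value = {npv:.3f}")
```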