ROC & AUC, LIFT

Post on 07-Jan-2016


Dr. Avi Rosenfeld

Introduction to ROC curves

• ROC = Receiver Operating Characteristic

• Started in electronic signal detection theory (1940s - 1950s)

• Has become very popular in biomedical applications, particularly radiology and imaging

• Also used in data mining

False Positives / Negatives

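Confusion matrix 1 (rows: actual class, columns: predicted class)

              Predicted P   Predicted N
Actual P      20            10
Actual N      30            90

Confusion matrix 2 (rows: actual class, columns: predicted class)

              Predicted P   Predicted N
Actual P      10            20
Actual N      15            105

In each matrix, the top-right cell (actual P, predicted N) holds the false negatives (FN) and the bottom-left cell (actual N, predicted P) holds the false positives (FP).

For confusion matrix 1, class P:

• Precision (P) = 20 / 50 = 0.4
• Recall (P) = 20 / 30 = 0.667
• F-measure = 2 × 0.4 × 0.667 / (0.4 + 0.667) = 0.5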

Different Cost Measures

• The confusion matrix (easily generalizes to multi-class)

• Machine learning methods usually minimize FP + FN

• TPR (True Positive Rate): TP / (TP + FN) = Recall

• FPR (False Positive Rate): FP / (TN + FP) = 1 - Specificity

                Predicted: Yes         Predicted: No
Actual: Yes     TP: True positive      FN: False negative
Actual: No      FP: False positive     TN: True negative
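As a rough illustration, the rates can be computed directly from the four cells of a confusion matrix. This is only a sketch: the function name `rates` and the dictionary keys are made up for this example, and the counts are the ones from the two confusion matrices on the earlier slide.

```python
def rates(tp, fn, fp, tn):
    """Derive the basic rates from the four cells of a 2x2 confusion matrix.

    `rates` is a hypothetical helper written for this example.
    """
    tpr = tp / (tp + fn)            # True Positive Rate = Recall
    fpr = fp / (tn + fp)            # False Positive Rate
    precision = tp / (tp + fp)
    f_measure = 2 * precision * tpr / (precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f_measure": f_measure}


# Confusion matrix 1: TP = 20, FN = 10, FP = 30, TN = 90
m1 = rates(tp=20, fn=10, fp=30, tn=90)
# Confusion matrix 2: TP = 10, FN = 20, FP = 15, TN = 105
m2 = rates(tp=10, fn=20, fp=15, tn=105)

for name, m in (("matrix 1", m1), ("matrix 2", m2)):
    print(name, {key: round(value, 3) for key, value in m.items()})
```

Both matrices have the same precision (0.4), but matrix 1 has the higher recall and the higher false positive rate, which is why a single measure rarely tells the whole story.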

Specific Example

[Figure: two overlapping distributions along the "Test Result" axis, one for people without the disease and one for people with the disease.]

Threshold

[Figure: a threshold on the "Test Result" axis. Patients on one side are called "negative"; patients on the other side are called "positive".]

Some definitions ...

• True positives: patients with the disease who are called "positive"

• False positives: patients without the disease who are called "positive"

• True negatives: patients without the disease who are called "negative"

• False negatives: patients with the disease who are called "negative"
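The four groups can be counted for any threshold with a few lines of code. The sketch below assumes that higher test results indicate disease and that a result at or above the threshold is called "positive"; the scores and labels are invented toy data, not values from the slides.

```python
def confusion_counts(scores, has_disease, threshold):
    """Count (TP, FP, TN, FN) when results >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for score, sick in zip(scores, has_disease):
        called_positive = score >= threshold
        if called_positive and sick:
            tp += 1
        elif called_positive and not sick:
            fp += 1
        elif not called_positive and sick:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical test results and true disease status for eight patients.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
has_disease = [False, False, False, True, False, True, True, True]

print(confusion_counts(scores, has_disease, threshold=5))  # (3, 1, 3, 1)
print(confusion_counts(scores, has_disease, threshold=3))  # (4, 2, 2, 0)
```

Lowering the threshold from 5 to 3 catches every patient with the disease, at the price of one more false positive.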

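Moving the Threshold: left

[Figure: the distributions for patients without the disease and with the disease on the "Test Result" axis, with the threshold moved to the left. Results on one side of the threshold are marked "-", results on the other side "+".]

ROC curve

[Figure: ROC curve. X axis: False Positive Rate (1 - specificity), 0% to 100%. Y axis: True Positive Rate (Recall), 0% to 100%.]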

The effect of changing the threshold on the graph

Figure 5.2 A sample ROC curve.
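One way to see how the threshold traces out the curve is to sweep it over every observed score and record one (FPR, TPR) point per setting. This is a minimal sketch under the assumption that higher scores mean "more likely positive"; `roc_points` and the toy data are hypothetical names and values chosen for illustration.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points, sweeping the threshold from high to low."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores and true classes (1 = positive, 0 = negative).
scores = [1, 2, 3, 4, 5, 6, 7, 8]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Each step of the loop lowers the threshold, so the points move from the bottom-left corner (0, 0) to the top-right corner (1, 1).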

Different types of ROC graphs

Area under ROC curve (AUC)

• A general measure

• The area under the ROC graph

• 0.50 is random choice, 1.0 is perfect

AUC for ROC curves

[Figure: four ROC curves, each plotting True Positive Rate (0% to 100%) against False Positive Rate (0% to 100%), with AUC = 50%, AUC = 65%, AUC = 90%, and AUC = 100%.]
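The area can be approximated from a list of ROC points with the trapezoid rule. The sketch below is one common way to do it, not necessarily how any particular tool computes AUC; `auc_trapezoid` and the two example curves are assumptions made for illustration.

```python
def auc_trapezoid(points):
    """Approximate the area under a ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then by TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid
    return area


# The diagonal corresponds to random choice.
random_curve = [(0.0, 0.0), (1.0, 1.0)]
# A perfect classifier reaches TPR = 1 while FPR is still 0.
perfect_curve = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(auc_trapezoid(random_curve))   # 0.5
print(auc_trapezoid(perfect_curve))  # 1.0
```

The two extremes reproduce the reference values for AUC: 0.50 for random choice and 1.0 for a perfect classifier.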

Lift Charts

• X axis is sample size: (TP + FP) / N

• Y axis is TP

[Figure: lift chart comparing the "Model" curve with the "Random" baseline.]

• 40% of responses for 10% of cost: lift factor = 4

• 80% of responses for 40% of cost: lift factor = 2

[Figure: "Lift factor" chart. X axis: Sample Size, 5 to 95. Y axis: Lift Value, 0 to 4.5.]
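The lift factor in the two examples is the fraction of responses captured divided by the fraction of the sample contacted. The sketch below reproduces those two numbers and then computes lift from ranked scores; `lift_factor`, `lift_at`, and the ten-customer data set are hypothetical and only meant to illustrate the idea.

```python
def lift_factor(response_fraction, sample_fraction):
    """Lift = share of all responses captured / share of the sample used."""
    return response_fraction / sample_fraction


def lift_at(scores, responded, sample_fraction):
    """Lift when only the top-scoring `sample_fraction` of cases is selected."""
    ranked = sorted(zip(scores, responded), reverse=True)  # best scores first
    k = max(1, round(sample_fraction * len(ranked)))       # size of the selected sample
    captured = sum(r for _, r in ranked[:k])               # responders in that sample
    return lift_factor(captured / sum(responded), k / len(ranked))


# The two examples from the slide.
print(lift_factor(0.40, 0.10))  # 40% of responses for 10% of cost
print(lift_factor(0.80, 0.40))  # 80% of responses for 40% of cost

# Hypothetical model scores and outcomes (1 = responded) for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
responded = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
print(lift_at(scores, responded, 0.2))
print(lift_at(scores, responded, 1.0))
```

Selecting the whole sample always gives a lift of 1, which is the "Random" baseline the model curve is compared against.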

The relationship between the measures

Toward the exercise ...

Right-click on a model, then Cost / Benefit Analysis for Wood

You can change the threshold and also see the confusion matrix

You can also see the lift as well as the effect of cost