Segmentation uncertainty and error estimation in medical ... · Manual delineation study...

Prof. Leo Joskowicz CASMIP Lab, School of Computer Science and Engineering The Hebrew University of Jerusalem, ISRAEL Joint with: D. Cohen, Dr. N. Caplan, Prof. J. Sosna Dept. of Radiology, Hadassah Univ. Medical Center, Jerusalem, Israel Segmentation uncertainty and error estimation in medical imaging @ Copyright L. Joskowicz 2018

Transcript of Segmentation uncertainty and error estimation in medical ... · Manual delineation study...

Page 1: Segmentation uncertainty and error estimation in medical imaging

Prof. Leo Joskowicz, CASMIP Lab, School of Computer Science and Engineering, The Hebrew University of Jerusalem, Israel

Joint with: D. Cohen, Dr. N. Caplan, Prof. J. Sosna, Dept. of Radiology, Hadassah Univ. Medical Center, Jerusalem, Israel

© Copyright L. Joskowicz 2018

Page 2: Why do we measure?

“Measure what you can measure, and make measurable what is not measurable”

Galileo Galilei

“A science is as mature as its measurement tools”

Louis Pasteur

Page 3: Motivation

Segmentation of anatomical structures and pathologies in medical images is important!

Clinical
• Identification and quantification of structures

Figure panels: Liver tumors, Kidney structure, Carotid stenosis

Page 4: Motivation

Segmentation of anatomical structures and pathologies in medical images is important!

Clinical
• Diagnosis and disease progression

Figure panels: Baseline (Jan 25, 2017), Follow-up (July 8, 2018)

Page 5: Motivation

Segmentation of anatomical structures and pathologies in medical images is important!

Clinical
• Treatment planning

Figure panels: Radiosurgery, Femur fracture surgery (external fixation, internal fixation)

Page 6: Motivation

Segmentation of anatomical structures and pathologies in medical images is important!

Clinical
• Treatment delivery: surgery

Figure panels: 3D printed patient-specific surgical guides, Robotic surgery (SpineAssist)

Page 7: Motivation

Segmentation of anatomical structures and pathologies in medical images is important!

Technical
Fundamental problem in medical image processing
• Structure modeling
• Atlas construction
• Registration
• Simulation
• Big data radiology
• …

Page 8: Motivation

Segmentation is a hard problem!

Fuzzy/non-existing boundaries, low contrast, partial volume effect, …

Figure label: segmentation leak

Segmentation errors and their correction significantly hamper the use of 3D models in the clinic

Page 9: Motivation

Clinical
• Definition of segmentation quality w.r.t. ideal/consensus
• Segmentation results validation is time-consuming!
• No confidence level is provided
• Ground-truth generation is tedious and time-consuming
• Observer variability quantification

Technical
• Algorithm development: essential measure!
• Comparison between methods
• Segmentation error detection and correction

Segmentation validation is essential!

Requires ground truth!

Page 10: Segmentation evaluation

Current practice
• Ground truth given
• Only error estimation

Diagram labels: Image, Segmentation, Ground truth, Error, Segmentation Error, Segmentation Quality

Quality = f(Error)

How do we create the ground truth?

Page 11: Segmentation variability: example

Figure panels: CT scan, Radiologist 1, Radiologist 2, Radiologist 3, Radiologist 4, Radiologist 5, All Radiologists

Tumor volume, tumor boundary?

Page 12: Segmentation variability: example

Manual delineations by 10 radiologists

Figure labels: Low variability, High variability (one color per radiologist)

Page 13: Observation

There is no single gold standard or ground truth!

• Measurement/delineation is intrinsically uncertain!
• The observer delineation variability is the uncertainty
• Sources of observer delineation variability:
  o Subjective, observer-dependent
    • Manual hand-eye coordination skills
    • Attentiveness and thoroughness
    • Expertise and knowledge
  o Objective, observer-independent
    • Imaging: scanning protocol, resolution, contrast, ...
    • Intrinsic: structure characteristics, fuzzy contours due to partial volume effects, neighboring structures, …

Page 14: Segmentation quality evaluation

Several observers

Diagram labels: Image, Segmentation, Observers (Observer 1, Observer 2, …, Observer n), Mean contour, Ground truth, Variability, Error, Segmentation Error, Segmentation Quality

Quality = f(Error, Variability)

Page 15: Segmentation quality evaluation

Several observers
• Variability estimation from several observers
• Impractical!

Diagram labels: Image, Segmentation, Observer 1, Observer 2, …, Observer n, Mean contour, Ground truth, Observer Variability, Segmentation Error, Segmentation Quality

Quality = f(Error, Variability)

Page 16: Previous work

Clinical
• Literature on delineation observer variability is limited!
• Measures are generic, not scan- and structure-specific
• No variability range is provided, e.g. tumor volume 23±4 cc

Technical
• A dozen relevant works on segmentation evaluation with no ground truth, 2010-18 (Top 2011, Grady 2014, Saad 2010, …)
• No generic segmentation variability model; too specific
• Require ad-hoc models, e.g. probabilistic segmentation
• Small-scale validation

Page 17: Previous work

Page 18: Goals

Quantify and estimate segmentation variability for common structures and pathologies

1. Understand: large-scale manual delineation study to quantify observer variability

2. Estimate: automatic method for estimating the variability of a delineation without ground truth

3. Detect and correct: automatic methods for segmentation error detection and correction ← Not in this talk

Page 19: Segmentation quality evaluation: no ground truth, no observers!

Variability estimation
• Automatic variability estimation
• No Error → only Variability!

Diagram labels: Image, Segmentation, Segmentation priors, Observer Variability, Variability, Segmentation Error, Error, Segmentation Quality

Quality = f(Error, Variability)

Page 20: Segmentation variability: illustration

Estimated variability without ground truth

Figure labels: Low variability, High variability

Is this possible at all?

Page 21: Segmentation variability: definition

• $S(I) = \{s_1(I), \ldots, s_N(I)\}$: set of N delineations of image I
• Set of voxels inside a delineation for which:
  o At least one annotator agrees → Possible
  o All annotators agree → Consensus
  o Difference between Possible and Consensus → Variability

Possible: $\bigcup_{i=1}^{N} s_i(I)$

Consensus: $\bigcap_{i=1}^{N} s_i(I)$

Variability (difference = union - intersection): $\bigcup_{i=1}^{N} s_i(I) \setminus \bigcap_{i=1}^{N} s_i(I)$

Figure panels: Delineations, Consensus, Possible, Variability
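The three definitions above map directly onto set operations. The following is a minimal sketch, assuming each delineation is represented as a Python set of voxel coordinates; the function names and the toy 2D voxels are illustrative and do not come from the talk.

```python
def possible(delineations):
    """Voxels inside at least one delineation (union)."""
    return set().union(*delineations)


def consensus(delineations):
    """Voxels inside every delineation (intersection)."""
    return set.intersection(*map(set, delineations))


def variability(delineations):
    """Voxels in Possible but not in Consensus (union minus intersection)."""
    return possible(delineations) - consensus(delineations)


# Toy example: three hypothetical delineations given as (row, col) voxel sets.
s1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
s2 = {(0, 1), (1, 1), (1, 2)}
s3 = {(1, 1), (1, 0), (2, 1)}
delineations = [s1, s2, s3]

print(len(possible(delineations)))     # number of Possible voxels
print(consensus(delineations))         # Consensus voxels
print(len(variability(delineations)))  # number of Variability voxels
```

The list of delineations must be non-empty. With a single delineation the union and intersection coincide, so the variability is empty.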

Page 22: Segmentation variability: properties

• Patient-, structure-, and scan-specific
• Depends on which annotators and how many of them
• One annotator → no variability
• As the number of annotators increases:
  o Possible increases: voxels are added
  o Consensus decreases: voxels are removed
  o Variability increases: voxels are added to the difference
• After sufficient annotators, no more voxels are added/deleted ⇒ Variability converges
• What is the actual variability across all annotators?
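The growth of variability with the number of annotators can be illustrated by averaging over all annotator groups of each size. This is a sketch under assumptions: delineations are sets of voxel indices, variability is measured as a raw voxel count, and the four 1D "delineations" are made-up toy data rather than study data.

```python
from itertools import combinations


def variability_size(group):
    """Number of voxels in union minus intersection for one annotator group."""
    union = set().union(*group)
    inter = set.intersection(*group)
    return len(union - inter)


def mean_variability_by_group_size(delineations):
    """Map k -> mean variability over all groups of k annotators."""
    curve = {}
    for k in range(1, len(delineations) + 1):
        sizes = [variability_size(g) for g in combinations(delineations, k)]
        curve[k] = sum(sizes) / len(sizes)
    return curve


# Toy 1D delineations: each annotator marks a slightly different interval.
annotators = [set(range(0, 10)), set(range(1, 11)),
              set(range(2, 9)), set(range(0, 12))]
curve = mean_variability_by_group_size(annotators)
for k, v in curve.items():
    print(k, round(v, 2))
```

Because adding an annotator can only grow the union and shrink the intersection, the mean curve never decreases with k, which matches the properties listed above.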

Page 23: Manual delineation study

Quantify delineation observer variability for common structures and pathologies

• Collected 18 representative CT scans from 4 structures
• Recruited 11 annotators: 4 residents, 2 mid-career, 4 experts, 1 neuro-radiologist. Paid them by the hour.
• Performed 3,193 manual CT slice annotations
• Protocol to produce expert-validated unbiased delineations

Structures: Liver tumors (5 cases), Lung tumors (5 cases), Kidneys (6 cases), Brain hematomas (2 cases)

CT resolution: 512×512×102-449, 0.5-0.98 × 0.5-0.98 × 1.0-3.3 mm³

Slice spacing: 14 scans at 1.5 mm, 2 scans at 3 mm, 2 scans at 1 mm; hematomas 0.5×0.5×1.5 mm³

Page 24: Manual delineation study

Quantify delineation observer variability for common structures and pathologies

Figure panels: Liver tumors (5 cases), Lung tumors (5 cases), Kidneys (6 cases), Brain hematomas (2 cases)

Page 25: Experimental results

Variability by
1. Manual tracing
2. Pairs of annotators
3. Groups of annotators
4. Disagreement between annotators
5. Case type and difficulty
6. Expertise of annotators
7. Surface distance difference

Page 26: 1. Manual tracing

How much variability comes from manual skills alone?

Kidney contours: no medical knowledge required!

Slice with the smallest variability for kidney contour:

Kidney contours: 16% [-8,+8]% (~4 pixels)

Lowest variability of 16% for scans of 0.75×0.75×1.5 mm³

Page 27: 2. Pairs of annotators

Volume Overlap Difference

Chart: x-axis Case number, y-axis Variability %

Liver tumors: 18% [-6,+7]%
Brain hematomas: 18% [-6,+6]%
Kidney contours: 9% [-1,+1]%
Lung tumors: 21% [-9,+10]%

Maximum difference between two annotators (chart labels, in extraction order): 40%, 37%, 13%, 31%

Significant mean differences between structures and cases

Very significant discrepancies for liver and lung tumors

Discrepancy ranges from 5% to 57%
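A pairwise comparison of this kind can be sketched as follows. The slides do not spell out the formula for Volume Overlap Difference, so the code assumes a common definition: the symmetric difference of two delineations divided by their union, as a percentage. The three 1D delineations are made-up toy data.

```python
from itertools import combinations


def volume_overlap_difference(a, b):
    """Assumed VOD in percent: |A xor B| / |A or B| * 100 (0 for two empty sets)."""
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a ^ b) / len(union)


def pairwise_vod(delineations):
    """VOD for every unordered pair of annotators."""
    return [volume_overlap_difference(a, b)
            for a, b in combinations(delineations, 2)]


# Toy 1D delineations from three hypothetical annotators.
annotators = [set(range(0, 10)), set(range(2, 10)), set(range(0, 8))]
vods = pairwise_vod(annotators)
print(min(vods), max(vods), sum(vods) / len(vods))
```

Reporting the mean together with the minimum and maximum over pairs mirrors the "mean [-x,+y]%" and "maximum difference between two annotators" figures on this slide.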

Page 28: 3. Groups of annotators

Possible, Consensus, and Variability volume % as a function of the # of annotators (Volume Overlap Difference)

Liver tumors

Chart: x-axis # of annotators, y-axis Variability %; curves: Possible (+31%), Consensus (-26%); markers: one annotator (no variability), maximum diff. two annotators, minimum diff. two annotators

Mean variability of 2 annotators is much smaller than that of 10:
Maximum: 10% vs. 31%
Minimum: 7% vs. 26%

The variability of 5 vs. 10 annotators is also significant!

Page 29: 3. Groups of annotators

Possible, Consensus, and Variability volume % as a function of the # of annotators

Liver tumors

Chart: x-axis # of annotators, y-axis Variability %; curves: Possible (+31%), Consensus (-26%)

Variability range for <5 annotators is 20%!

Variability range for k annotators decreases slowly

Page 30: 3. Groups of annotators

Liver tumors: [-24,+27]%
Lung tumors: [-25,+31]%
Kidney contours: [-12,+13]%
Brain hematomas: [-24,+29]%

Similar progression rate for all structures

Significant differences by structure: [-12,+23]% to [-25,+31]%

Page 31: 3. Groups of annotators: normalized variability

Chart: x-axis # of annotators, y-axis Normalized Variability %

Two: 37%
Three: 53%
Five: 72%
Eight: 82%
All: 100%
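Normalized variability can be read as the share of the all-annotator variability that a smaller group captures on average. The sketch below assumes that reading, represents delineations as sets of voxel indices, and uses made-up 1D toy data; `annotators_needed` is a hypothetical helper, not something named in the talk.

```python
from itertools import combinations


def variability_size(group):
    """Voxel count of union minus intersection for one annotator group."""
    return len(set().union(*group) - set.intersection(*group))


def normalized_variability(delineations):
    """Map k -> mean variability of k annotators as % of all-annotator variability."""
    total = variability_size(delineations)
    if total == 0:
        return {k: 0.0 for k in range(1, len(delineations) + 1)}
    curve = {}
    for k in range(1, len(delineations) + 1):
        sizes = [variability_size(g) for g in combinations(delineations, k)]
        curve[k] = 100.0 * (sum(sizes) / len(sizes)) / total
    return curve


def annotators_needed(curve, target_percent):
    """Smallest group size whose mean normalized variability reaches the target."""
    return min(k for k, v in curve.items() if v >= target_percent)


annotators = [set(range(0, 10)), set(range(1, 11)),
              set(range(2, 9)), set(range(0, 12))]
curve = normalized_variability(annotators)
print({k: round(v, 1) for k, v in curve.items()})
print(annotators_needed(curve, 80.0))
```

On real data the same curve would show how many annotators are required before most of the variability is captured, which is the question this slide answers with 37%, 53%, 72%, and 82% for two, three, five, and eight annotators.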

Page 32: 5. Cases by type

Volume agreement with mean

Chart: x-axis Case number, y-axis Mean Dice coefficient

Liver tumors: 93% [-3,+2]%
Kidney contours: 96% [-1,+1]%
Lung tumors: 91% [-4,+4]%
Brain hematomas: 93% [-2,+2]%

Very good agreement with the mean delineation for all structures, with low variability: 91-96 [-4,+4]%

Of course, the mean delineation is unknown in practice…
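Agreement with the mean delineation can be sketched with the Dice coefficient, 2|A∩B| / (|A| + |B|). How the mean delineation was built in the study is not stated on the slide; the code below assumes a majority vote over voxels, and the three 1D delineations are toy data.

```python
from collections import Counter


def majority_vote(delineations):
    """Assumed mean delineation: voxels marked by more than half of the annotators."""
    counts = Counter(v for s in delineations for v in s)
    return {v for v, c in counts.items() if 2 * c > len(delineations)}


def dice(a, b):
    """Dice coefficient 2|A and B| / (|A| + |B|); defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


annotators = [set(range(0, 10)), set(range(1, 11)), set(range(2, 9))]
mean_delineation = majority_vote(annotators)
scores = [dice(s, mean_delineation) for s in annotators]
print(sorted(mean_delineation))
print([round(x, 3) for x in scores])
```

A high mean Dice score with a narrow range, as reported on this slide, indicates that each annotator is individually close to the mean even when the union-minus-intersection variability across the whole group is large.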

Page 33: 6. Annotators expertise

Volume Overlap

Chart panels: Liver tumors, Kidney contours, Brain hematomas, Lung tumors; groups on each x-axis: novices, residents, mid-career, experts

Chart values (in extraction order): 0.92 [-0.04,+0.03], 0.94 [-0.02,+0.01], 0.98 [-0.01,+0.01], 0.94 [-0.01,+0.02]

Some statistical differences between structures

No statistical difference between groups!

Page 34: Summary of the study

• Significant volume variability differences between cases (easy/hard) and structures (liver/lung tumors): 27-78%
• Wide volume variability range for 2 annotators: 5-57%
• Mean volume variability range of 2 annotators is much smaller than for 10 annotators: 7-10% vs. 26-31%
• 40% of the variability is due to 1 annotator; 60% to 2
• Annotators' disagreement is similar in % and trends for all
• 37%, 53%, 72% of the variability captured by 2, 3, 5 annotators → about 10 annotators are necessary!
• No statistical difference between annotators' expertise!

Page 35: Automatic estimation of variability

Automatic method for estimating the variability of a given segmentation without ground truth

INPUT: CT scan, Delineation
Pipeline: Segmentation priors library, Contour analysis, Sensitivity
OUTPUT: Variability

Diagram annotations:
• Manually compiled for each structure, scan protocol, task
• Local and global intensity, texture, shape, …

Page 36:
Page 37: Segmentation priors

• Segmentation priors for each property
• Quality estimate: function of segmentation priors
• Function is similar to the objective/energy function in optimization-based segmentation
• BUT: it is used for evaluation, not for optimization → can be richer/more complex, no search!
• Variability estimation by sensitivity analysis of F(f(v))

f_i(v): voxel → quality(v) = f(error(v), variability(v))

f_1(v), …, f_k(v): image-, structure-, and task-specific feature priors

F(f_1(v), …, f_k(v)) = F(f(v))
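To make the notation concrete, here is a toy sketch of a priors function F built from per-voxel feature priors f_i. Everything specific in it is an assumption for illustration: the two priors (an expected intensity range and a neighbourhood compactness score), the equal weights, and the weighted-mean aggregation are not taken from the talk, which does not define its priors at this level of detail.

```python
def intensity_prior(image, lo, hi):
    """Hypothetical f_1: 1.0 if the voxel intensity lies in [lo, hi], else 0.0."""
    return lambda v: 1.0 if lo <= image[v] <= hi else 0.0


def compactness_prior(segmentation):
    """Hypothetical f_2: fraction of a voxel's two 1D neighbours inside the segmentation."""
    return lambda v: sum(n in segmentation for n in (v - 1, v + 1)) / 2


def priors_score(segmentation, priors, weights):
    """Assumed F: weighted mean of the priors per voxel, averaged over all voxels."""
    total_weight = sum(weights)
    per_voxel = [sum(w * f(v) for f, w in zip(priors, weights)) / total_weight
                 for v in segmentation]
    return sum(per_voxel) / len(per_voxel)


def score(image, segmentation):
    """Evaluate one segmentation with both toy priors and equal weights."""
    priors = [intensity_prior(image, 70, 100), compactness_prior(segmentation)]
    return priors_score(segmentation, priors, [1.0, 1.0])


# Toy 1D "scan": a bright structure on a dark background.
image = [10, 10, 80, 90, 85, 80, 10, 10]
good = {2, 3, 4, 5}      # covers exactly the bright voxels
leak = {2, 3, 4, 5, 6}   # leaks one voxel into the background
print(score(image, good), score(image, leak))
```

Since F is only evaluated and never optimized, it is free to combine many such priors without regard for search cost, which is the point made in the bullets above.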

Page 38: Variability estimation by sensitivity analysis

• $F(s)$: segmentation priors function
• $s_0 \in S$: initial segmentation
• $\varepsilon$: sensitivity threshold
• $\Delta s_0$: segmentation variability range

$\forall s_i \in \Delta s_0,\ |F(s_i) - F(s_0)| < \varepsilon$

Plot labels: $F(s)$, $s_0$, $\Delta s_0$, $\varepsilon$
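The sensitivity criterion can be tried out on a one-dimensional toy problem. This is a sketch under stated assumptions, not the method from the talk: the segmentation is an interval [a, b) on a 1D intensity profile, F is a hypothetical contrast score (mean inside minus mean outside), and candidate segmentations are generated by shifting each endpoint by at most one voxel.

```python
def contrast_score(image, a, b):
    """Hypothetical F(s): mean intensity inside [a, b) minus mean intensity outside."""
    inside = image[a:b]
    outside = image[:a] + image[b:]
    return sum(inside) / len(inside) - sum(outside) / len(outside)


def variability_range(image, s0, eps, max_shift=1):
    """Candidate intervals s with |F(s) - F(s0)| < eps, near the initial interval s0."""
    a0, b0 = s0
    f0 = contrast_score(image, a0, b0)
    accepted = []
    for da in range(-max_shift, max_shift + 1):
        for db in range(-max_shift, max_shift + 1):
            a, b = a0 + da, b0 + db
            # Keep only valid intervals that leave at least one voxel outside.
            if 0 <= a < b <= len(image) and (a > 0 or b < len(image)):
                if abs(contrast_score(image, a, b) - f0) < eps:
                    accepted.append((a, b))
    return accepted


# Toy profile: sharp edge on the left of the structure, fuzzy edge on the right.
image = [0, 0, 0, 100, 100, 100, 100, 60, 40, 0, 0, 0]
accepted = variability_range(image, s0=(3, 8), eps=5.0)
voxel_sets = [set(range(a, b)) for a, b in accepted]
uncertain = set().union(*voxel_sets) - set.intersection(*voxel_sets)
print(accepted)
print(sorted(uncertain))
```

In this toy case only the fuzzy right-hand boundary can move without changing F by more than the threshold, so the estimated variability concentrates there, while the sharp left-hand boundary stays fixed.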

Page 39: Automatic variability estimation

Estimation validated with the same data of the manual delineation study

Figure panels: Lung tumor (Actual, Estimated), Lung tumor (Actual, Estimated)

Variability volume difference < 6%

Variability volume agreement > 70%

High-quality prediction of volume variability!

Page 40: Take-home messages

• There is no single segmentation ground truth!
• Significant manual segmentation variability 5-57% by type of structure, case, and observer: 15-45%
• Annotator delineation variability can be quantified
• Delineation and variability estimation can be reliably computed automatically for many structures/pathologies

Page 41: Acknowledgments

Many thanks to: N. Caplan (coordinator), M. Awad, K. Azzam, A. Beinshtein, E. Ben-David, D. Halevi, N. Lev-Cohen, N. Simanovsky, A. Soto

Thanks for your attention!