Using Compilation/Decompilation to Enhance Clone Detection
IWSC ‘17
CREST, University College London, UK
Using Compilation/Decompilation to Enhance Clone Detection
Chaiyong Ragkhitwetsagul, Jens Krinke
[Figure: F1 score (axis from 0.5 to 1) of each tool on original (Orig.) and decompiled (Dec.) code. Source: Ragkhitwetsagul et al., 2016]
Clone det.: ccfx, deckard, iclones, nicad, simian
Plag det.: jplag-java, jplag-text, plaggie, sherlock, simjava, simtext
Comp.: 7zncd-BZip2, 7zncd-LZMA, 7zncd-LZMA2, 7zncd-Deflate, 7zncd-Deflate64, 7zncd-PPMd, bzip2ncd, gzipncd, icd, ncd-bzlib, ncd-zlib, xz-ncd
Others: bsdiff, diff, py-difflib, py-fuzzywuzzy, py-jellyfish, py-ngram, py-sklearn
Compiling Clones: What Happens? (O. Kononenko, C. Zhang, and M. W. Godfrey, ICSME '14)
Compiling and Decompiling Clones: What Happens?
Missing Source
Existing tools
Source Code
C. Ragkhitwetsagul J. Krinke CREST, UCL, UK
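Experimental Framework
software -> compiler -> decompiler -> decompiled software
software -> clone detector -> original clones
decompiled software -> clone detector -> decomp. clones -> clone mapper -> decomp. & mapped clones
original clones + decomp. & mapped clones -> common clones, disjoint clones
disjoint clones -> manual investigation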
Software Systems
System           Ver.    Original             Decompiled
                         Files    SLOC        Files    SLOC
JUnit            4.1.3   203      9,777       311      11,233
JFreeChart       1.5.0   644      96,711      669      85,251
Apache Tomcat®   9.0     1,688    241,924     2,603    256,974
Tools
Compiler: javac
Decompiler: Procyon
Clone Detector: NiCad

Tool    Config.   Parameters
NiCad   Type-1    UPI=0.0, renaming=none
        Type-2    UPI=0.0, renaming=consistent
        Type-3    UPI=0.3, renaming=consistent
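To make the UPI parameter more concrete, here is a minimal sketch of a line-based dissimilarity check. It assumes UPI is the fraction of lines in a fragment that fall outside the longest common subsequence shared with the other fragment, and that a pair is reported when both fragments stay at or below the threshold. This is a simplified reading for illustration, not NiCad's actual implementation; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class UpiSketch {
    // Length of the longest common subsequence of two line sequences.
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    // Assumed definition: share of lines in the fragment not covered by the LCS.
    static double upi(List<String> fragment, int lcs) {
        if (fragment.isEmpty()) {
            return 0.0;
        }
        return (fragment.size() - lcs) / (double) fragment.size();
    }

    // A pair counts as a clone when both fragments stay within the threshold.
    static boolean isClonePair(List<String> a, List<String> b, double threshold) {
        int lcs = lcsLength(a, b);
        return upi(a, lcs) <= threshold && upi(b, lcs) <= threshold;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList(
                "if (x == null) {", "return null;", "}", "int y = f(x);", "return y;");
        List<String> b = Arrays.asList(
                "if (x == null) {", "return null;", "}", "int y = g(x);", "return y;");
        // One of five lines differs, so each fragment has UPI 0.2.
        System.out.println(isClonePair(a, b, 0.0)); // prints false
        System.out.println(isClonePair(a, b, 0.3)); // prints true
    }
}
```

Under this reading, UPI=0.0 admits only pairs whose normalized lines match exactly, while UPI=0.3 tolerates up to 30% differing lines in each fragment.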
Clone Mapper
software: set of methods (M) = m1, m2, m3, m4, …, mn, mo
decompiled clone report (decompiled clone pairs):
DCP1(dm1, dm2)
DCP2(dm1, dm3)
DCP3(dm2, dm4)
…
DCPn(dmm, dmo)
decompiled-and-mapped clone report (decompiled-and-mapped clone pairs):
DCP*1((dm1,dm2),(m1,m2))
DCP*2((dm1,dm3),(m1,m3))
DCP*3((dm2,dm4),(m2,m4))
…
DCP*n((dmm,dmo),(mm,mo))
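The mapping step could be sketched as follows. The slides do not say how a decompiled method dm is matched to its original method m, so this sketch assumes a precomputed lookup table from decompiled method IDs to original method IDs; the table, the class names, and the string IDs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CloneMapperSketch {
    // A clone pair between two methods, identified by plain string IDs.
    static final class Pair {
        final String first;
        final String second;

        Pair(String first, String second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public String toString() {
            return "(" + first + "," + second + ")";
        }
    }

    // A decompiled pair together with the original pair it maps to (DCP* on the slide).
    static final class MappedPair {
        final Pair decompiled;
        final Pair original;

        MappedPair(Pair decompiled, Pair original) {
            this.decompiled = decompiled;
            this.original = original;
        }

        @Override
        public String toString() {
            return "(" + decompiled + "," + original + ")";
        }
    }

    // Attach the original methods to every decompiled clone pair.
    static List<MappedPair> map(List<Pair> decompiledPairs, Map<String, String> toOriginal) {
        List<MappedPair> mapped = new ArrayList<>();
        for (Pair p : decompiledPairs) {
            String m1 = toOriginal.get(p.first);
            String m2 = toOriginal.get(p.second);
            // Assumption: pairs without a known original method are skipped.
            if (m1 == null || m2 == null) {
                continue;
            }
            mapped.add(new MappedPair(p, new Pair(m1, m2)));
        }
        return mapped;
    }

    public static void main(String[] args) {
        Map<String, String> toOriginal = new HashMap<>();
        toOriginal.put("dm1", "m1");
        toOriginal.put("dm2", "m2");
        toOriginal.put("dm3", "m3");

        List<Pair> report = new ArrayList<>();
        report.add(new Pair("dm1", "dm2"));
        report.add(new Pair("dm1", "dm3"));

        // Prints ((dm1,dm2),(m1,m2)) and ((dm1,dm3),(m1,m3)), one per line.
        for (MappedPair mp : map(report, toOriginal)) {
            System.out.println(mp);
        }
    }
}
```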
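Common & Disjoint Clone Pairs
[Venn diagram of clone pairs from the Original and Decompiled code]
Corig-only: found only in Original
Ccommon: found in both Original and Decompiled
Cdecomp-only: found only in Decompiled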
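The three regions are plain set operations once clone pairs from both reports are expressed over the same (original) method IDs. The sketch below assumes each pair is unordered and encodes it as a canonical string key; the key format and all names are illustrative choices, not taken from the slides.

```java
import java.util.Set;
import java.util.TreeSet;

class DisjointClonesSketch {
    // Canonical key for an unordered pair, so (m1,m2) and (m2,m1) compare equal.
    static String key(String a, String b) {
        return a.compareTo(b) <= 0 ? a + "|" + b : b + "|" + a;
    }

    // Pairs reported on both sides (Ccommon).
    static Set<String> common(Set<String> orig, Set<String> decomp) {
        Set<String> result = new TreeSet<>(orig);
        result.retainAll(decomp);
        return result;
    }

    // Pairs in 'a' that are absent from 'b' (Corig-only or Cdecomp-only).
    static Set<String> only(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> orig = new TreeSet<>();
        orig.add(key("m1", "m2"));
        orig.add(key("m1", "m3"));

        Set<String> decomp = new TreeSet<>();
        decomp.add(key("m2", "m1")); // same pair as m1|m2, reported in reverse order
        decomp.add(key("m2", "m4"));

        System.out.println(common(orig, decomp)); // prints [m1|m2]
        System.out.println(only(orig, decomp));   // prints [m1|m3]
        System.out.println(only(decomp, orig));   // prints [m2|m4]
    }
}
```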
Results
[Figure: Venn diagrams of Original vs. Decompiled clone pairs for Type-1, Type-2, and Type-3, one set per system]
JUnit, values shown: 6, 3
JFreeChart, values shown: 159, 155, 33, 15, 48, 1, 17, 27, 3
Tomcat, values shown: 217, 608, 20, 25, 141, 22, 3, 23, 1
Manual Investigation
[Figure: bar charts of the number of clone pairs per clone type (Type-1, Type-2, Type-3), Candidates vs. TP]
JFreeChart, Cforig-only: Candidates 48, 15, 1; TP 47, 15, 1
JFreeChart, Cfdecomp-only: Candidates 27, 17, 3; TP 27, 17, 3
Tomcat, Cforig-only: Candidates 141, 25, 22; TP 141, 25, 22
Tomcat, Cfdecomp-only: Candidates 23, 3, 1; TP 23, 3, 1
Characteristics of Disjoint Clones
Clone set        Reasons
Cforig-only      Too small after decomp.
                 Too diff. after decomp.
                 Smaller after decomp., higher dissimilarity
                 Unknown
Cfdecomp-only    Having deleted/added stmt., type cast, package name.
                 Different if-else statements
                 Different loop statements
                 Inner class methods
                 Unknown
[Figure: JFreeChart, number of disjoint clone pairs per reason, split by clone type (T1, T2, T3). Cforig-only (axis 0 to 50), bar values: 5, 11, 32, 6, 9. Cfdecomp-only (axis 0 to 16), bar values: 12, 4, 3, 8, 12, 5, 3]
[Figure: Tomcat, number of disjoint clone pairs per reason, split by clone type (T1, T2, T3). Cforig-only (axis 0 to 140), bar values: 16, 5, 120, 19, 6. Cfdecomp-only (axis 0 to 30), bar values: 20, 2, 1, 3, 2]
ORIGINAL

    @Override
    public Range findRangeBounds(XYDataset dataset) {
        if (dataset != null) {
            Range r = DatasetUtilities.findRangeBounds(dataset, false);
            if (r == null) {
                return null;
            } else {
                return new Range(r.getLowerBound() + this.yOffset,
                        r.getUpperBound() + this.blockHeight + this.yOffset);
            }
        } else {
            return null;
        }
    }

    @Override
    public Range findDomainBounds(XYDataset dataset) {
        if (dataset == null) {
            return null;
        }
        Range r = DatasetUtilities.findDomainBounds(dataset, false);
        if (r == null) {
            return null;
        }
        return new Range(r.getLowerBound() + this.xOffset,
                r.getUpperBound() + this.blockWidth + this.xOffset);
    }
DECOMPILED

    @Override
    public Range findDomainBounds(final XYDataset dataset) {
        if (dataset == null) {
            return null;
        }
        final Range r = DatasetUtilities.findDomainBounds(dataset, false);
        if (r == null) {
            return null;
        }
        return new Range(r.getLowerBound() + this.xOffset,
                r.getUpperBound() + this.blockWidth + this.xOffset);
    }

    @Override
    public Range findRangeBounds(final XYDataset dataset) {
        if (dataset == null) {
            return null;
        }
        final Range r = DatasetUtilities.findRangeBounds(dataset, false);
        if (r == null) {
            return null;
        }
        return new Range(r.getLowerBound() + this.yOffset,
                r.getUpperBound() + this.blockHeight + this.yOffset);
    }
ORIGINAL

    public void clearRangeMarkers() {
        if (this.backgroundRangeMarkers != null) {
            Set<Integer> keys = this.backgroundRangeMarkers.keySet();
            for (Integer key : keys) {
                clearRangeMarkers(key);
            }
            this.backgroundRangeMarkers.clear();
        }
        if (this.foregroundRangeMarkers != null) {
            Set<Integer> keys = this.foregroundRangeMarkers.keySet();
            for (Integer key : keys) {
                clearRangeMarkers(key);
            }
            this.foregroundRangeMarkers.clear();
        }
        fireChangeEvent();
    }

    public void clearRangeMarkers() {
        if (this.backgroundRangeMarkers != null) {
            Set keys = this.backgroundRangeMarkers.keySet();
            Iterator iterator = keys.iterator();
            while (iterator.hasNext()) {
                Integer key = (Integer) iterator.next();
                clearRangeMarkers(key.intValue());
            }
            this.backgroundRangeMarkers.clear();
        }
        if (this.foregroundRangeMarkers != null) {
            Set keys = this.foregroundRangeMarkers.keySet();
            Iterator iterator = keys.iterator();
            while (iterator.hasNext()) {
                Integer key = (Integer) iterator.next();
                clearRangeMarkers(key.intValue());
            }
            this.foregroundRangeMarkers.clear();
        }
        fireChangeEvent();
    }
DECOMPILED

    public void clearRangeMarkers() {
        if (this.backgroundDomainMarkers != null) {
            final Set<Integer> keys = this.backgroundDomainMarkers.keySet();
            for (final Integer key : keys) {
                this.clearDomainMarkers(key);
            }
            this.backgroundDomainMarkers.clear();
        }
        if (this.foregroundDomainMarkers != null) {
            final Set<Integer> keys = this.foregroundDomainMarkers.keySet();
            for (final Integer key : keys) {
                this.clearDomainMarkers(key);
            }
            this.foregroundDomainMarkers.clear();
        }
        this.fireChangeEvent();
    }

    public void clearRangeMarkers() {
        if (this.backgroundRangeMarkers != null) {
            final Set keys = this.backgroundRangeMarkers.keySet();
            for (final Integer key : keys) {
                this.clearRangeMarkers(key);
            }
            this.backgroundRangeMarkers.clear();
        }
        if (this.foregroundRangeMarkers != null) {
            final Set keys = this.foregroundRangeMarkers.keySet();
            for (final Integer key : keys) {
                this.clearRangeMarkers(key);
            }
            this.foregroundRangeMarkers.clear();
        }
        this.fireChangeEvent();
    }
Using Compilation/Decompilation to Enhance Clone Detection
Study on 3 real-world systems: JUnit, JFreeChart, Tomcat
Findings:
1 Clone pairs before and after decompilation are mostly similar for all three clone types.
2 One can complement the original clone results by incorporating clones after decompilation.
3 Characteristics of disjoint clones
C. Ragkhitwetsagul, J. Krinke
cragkhit.github.io/crjk-iwsc17