Automatic Comment Analysis

Transcript of Automatic Comment Analysis

Source Code Comments: The Forgotten Software Check
How Comments Help to Find Problems
ICSSEA '15
J. Dern, Valeo

Code Comment Analysis

Jérôme Dern, Software Quality Manager

Contact: +33 1 48 84 56 85

[email protected]

@jeromedern

https://fr.linkedin.com/in/jeromede


Source Code Comments: The forgotten check

Introduction

Source code comments


A lot of code tools exist

Source Code Comments: The forgotten check

Code-related tools are numerous: IDEs (editors), unit test tools, static analysis tools, control flow analysis tools, data flow analysis tools, runtime-oriented tools, naming rule checkers, timing and stack tools, reverse documentation builders…

Source Code Comments: The forgotten check

But no tool focuses on source code comments
Only two very limited freeware tools are available (one for Java and the other for C++)
They are very limited and specific: they don't cover the problematic categories presented later in this presentation

Why this idea of a comment analysis tool?

During a peer review, I discovered that a particular source code included a lot of TODOs
I decided to search for all occurrences of TODO in the whole source code
I found some other arguable practices

Source Code Comments: The forgotten check

But searching without a dedicated tool was very limited and time consuming…

Source Code Comments: The forgotten check

Why don't tools consider comments?

Source Code Comments: The forgotten check

We all focus on executable lines of code

Bugs are only located on executable lines

Comments are "inactive", so they can't contain bugs

Nobody imagines that problems can be found by looking in comments

Analyzing comments is not always easy


Source Code Comments: The forgotten check


The forgotten check

But comments are essential for
Maintenance
Code understanding
Code documentation

They may also reveal a lot of potential code problems

The forgotten check

Comments are essential for


Handling comments

First, capturing comments is quite easy


Handling comments

In C, comments begin with /* and end with */ and are not nested, as defined by the ISO C standard ISO/IEC 9899:1990

Handling comments

In C++, C, C#, D, Go, Java, JavaScript, PHP, Object Pascal, ActionScript, Objective-C and Swift, comments start with // and end at the end of the line
The /* and */ C notations are supported by all of them except Object Pascal

Handling comments

In most assembly languages, comments start with ";" and end at the end of the line. This notation is also used in Lisp, Scheme, Clojure and AutoIt

Handling comments

In all languages, comments have a clear start and end and are not nested (except in Swift)

So it is easy to capture them, whatever language is used
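To illustrate the point, here is a minimal capture sketch in Python (not the presented tool); the regular expression is an assumption for illustration only and deliberately ignores comment markers that appear inside string literals.

import re

# Minimal sketch: capture /* ... */ block comments and // line comments
# in C-family source. A production tool would also have to skip string
# literals such as "http://..." that merely contain the markers.
C_COMMENT_RE = re.compile(
    r"/\*.*?\*/"      # block comment, non-greedy, may span several lines
    r"|//[^\n]*",     # line comment, up to the end of the line
    re.DOTALL,
)

def capture_comments(source: str) -> list:
    """Return every comment found in a C-family source string."""
    return C_COMMENT_RE.findall(source)

sample = "int x; /* loop counter */\n// TODO: check overflow\n"
print(capture_comments(sample))  # ['/* loop counter */', '// TODO: check overflow']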


Handling comments

But some specific practices, like conditional compilation directives, may be "used" to add comments…

Example in C, C++ and Objective-C:
#if 0
This is also a comment
#endif
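One possible way to treat such regions as comments, sketched in Python under the assumption that nested #if/#ifdef directives inside the dead block are balanced before the closing #endif:

import re

def capture_if0_blocks(source: str) -> list:
    """Return the bodies of '#if 0' ... matching '#endif' regions."""
    blocks, depth, current = [], 0, []
    for line in source.splitlines():
        stripped = line.strip()
        if depth == 0:
            if re.match(r"#\s*if\s+0\b", stripped):
                depth, current = 1, []
        else:
            if re.match(r"#\s*(if|ifdef|ifndef)\b", stripped):
                depth += 1              # nested conditional inside the dead block
            elif re.match(r"#\s*endif\b", stripped):
                depth -= 1
                if depth == 0:          # closing #endif of the '#if 0' region
                    blocks.append("\n".join(current))
                    continue
            current.append(line)
    return blocks

print(capture_if0_blocks("#if 0\nThis is also a comment\n#endif\n"))
# ['This is also a comment']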


Handling comments

If the difficulty is not related to comment capture, where is it?

Handling comments

Comment processing may be a little bit complex

Natural language handling is not so easy

Commented code detection can be tricky

Handling comments


The difficulties of handling comments

To fully understand why comments may be seen as difficult to analyze, we must clarify what can be expected from a comment analysis

Categories

Categories of checks

Defects detected by comment analysis


Defects detected by comment analysis

Comments not in English [NotEnglish]

Comments not in English

May happen when
Development teams are located in a country where English is not natively spoken
Several development teams work at the same time, but not in the same country
Using legacy code

Comments not in English

Why is [NotEnglish] important?
Comments not in English are not understandable by all development teams
Example: it is a major issue for a team in China to understand a source code with comments in French

Defects detected

Bad practices [Bad practice]


Bad practices

What can be considered as [Bad practice]?
Abuses, slang & other inappropriate comments
Comments that are not related to code activities, like jokes and personal life comments
Fancy comments, emoticons and personal opinions on the code

Bad practices

But also…
Presence of keywords to disable static analysis rules

Defects detected

Profanity in source code [Profanity]

Profanity

Is there really [Profanity] in source code?
In Europe it is rare to find profanity in source code, but this is a very common practice elsewhere
Surprisingly, it depends on the computer language used…

Profanity

(data coming from open source analysis)


Defects detected by comment analysis

Licensed & Open Source usage [Licensed]

Licensed

Why can [Licensed] be a problem?
Using Open Source for commercial products may lead to opening the product source code
Commercial tools can detect open source software, but they are expensive, and it is often preferable to detect such a case as soon as possible

Licensed

Well-known example:
The French ISP Free had a big problem with infringement of the GPL license by using "Busybox" and "Iptables"

Licensed

Comments may reveal use of Open Source software: license, algorithm, web links, emails…
These kinds of comments may reveal that some portions of code are copied, adapted or inspired from source code found on the web

Defects detected by comment analysis

Questioning comments [Problematic]

Defects detected by comment analysis

What are [Problematic] comments?
Comments that may reveal developers' interrogations about
– Source code correctness,
– Algorithm chosen,
– Data, strategy, etc.

Defects detected by comment analysis

Unfinished software [UnFinished]

Defects detected by comment analysis

What are [UnFinished] comments?
Comments that give clues that the software is not finished
For example, special keywords used by developers to indicate that the code
– Is not finished,
– May be optimized,
– Has to be updated

Defects detected by comment analysis

"Commented out" code [CommentedCode]

Defects detected by comment analysis

Why should [CommentedCode] be pointed out?
A very bad practice that should not be allowed, because it is the symptom of
– Unfinished source code
– Code removed to ease some R&D tests, like integration tests, but that may not be removed for a released version
– A poor practice difficult to justify to customers
– Forbidden by MISRA-C 2004 rule 2.4, MISRA-C 2012 Dir 4.4, and MISRA C++ rule 2-7-3

Defects detected by comment analysis

Why should [CommentedCode] be pointed out?

Defects detected by comment analysis

Commented code is difficult to detect because the code may be
– Unfinished (not understandable by a compiler)
– Not working anymore (using old or deleted definitions)
– A mix of code and human language

Defects detected by comment analysis

Missing best practices [BestPractice]

Defects detected by comment analysis

Lack of respect of [BestPractice] can be
Copyright header missing
Mandatory keywords for development tools missing
– Configuration Management tool keywords missing (used to automatically store the history of file modifications in a comment)
– Documentation management tool keywords (used to automatically generate software documentation)

Defects detected by comment analysis

Comment/Code ratio violations
– Important for measuring code reusability & maintainability
– Can be calculated by some commercial tools like Static Analyzers
– A real ratio eliminates all mandatory comments (company headers, function headers…), "commented out" code and trivial comments
Additional statistics can include the percentage of each comment problem category

Comments checking

Automatic checking of comments

Automating comment checks

Comments can be checked manually
During code cross-review, if the cross-reader is aware of the defect categories seen previously
During some Agile practices, like pair programming

Automating comment checks

But
This is a huge amount of work, a sort of « Augean stables »
Not all teams are doing cross-reviews and pair programming on all source code
Human checks may fail more easily than a tool, even if humans can see more clever points

Automating comment checks

Main challenges of automatic comment analysis
Foreign language detection
Commented code detection

How can these two challenges be resolved?

Automating comment checks

Foreign language detection
This may be done by complex natural-language semantic analysis, but that is really not needed

Automating comment checks

Since Zipf and his "Selected studies of the principle of relative frequency in language", we all know that some words are more frequently used than others in a given human language

A list of the most used words that do not exist in English can be used to detect a foreign language

A list of 30 to 70 words may be enough to detect a particular language
– Accurate French language detection can be achieved with 60 words, including technical words for short sentence detection
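A minimal sketch of this word-list idea in Python; the handful of French words and the two-hit threshold below are illustrative assumptions, not the tool's actual list or tuning.

import re

# A few very frequent French words that do not occur in English.
FRENCH_MARKERS = {
    "le", "la", "les", "des", "une", "est", "sur", "pour", "avec",
    "dans", "pas", "donc", "si", "et", "entrée", "sortie",
}
WORD_RE = re.compile(r"[a-zA-Zàâéèêëîïôöùûüç]+")

def looks_french(comment: str, min_hits: int = 2) -> bool:
    """Flag a comment as French when enough marker words are present."""
    words = [w.lower() for w in WORD_RE.findall(comment)]
    return sum(w in FRENCH_MARKERS for w in words) >= min_hits

print(looks_french("/* si le polling est en cours */"))  # True
print(looks_french("/* timeout on ACK 1 */"))            # False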


Automating comment checks

Another similar method is to check the presence of a sufficient amount of English (and technical) words in a comment
It may generate more false positives, but this method is not tied to any language other than English
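A companion sketch of that variant, again in Python; the tiny word list and the 40% threshold are illustrative assumptions rather than the tool's tuning.

import re

# Known English and technical words; a real list would be much larger.
ENGLISH_WORDS = {
    "the", "a", "and", "or", "not", "for", "is", "are", "set", "reset",
    "return", "value", "error", "timeout", "flag", "buffer", "init", "todo",
}

def enough_english(comment: str, min_ratio: float = 0.4) -> bool:
    """Return True when enough of the comment's words look like English."""
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", comment)]
    if not words:
        return True  # nothing to judge, do not flag
    return sum(w in ENGLISH_WORDS for w in words) / len(words) >= min_ratio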


Automating comment checks

"Commented out" code detection
A complex solution would be to embed a specific language analyzer in the tool
Specific, because we have to deal with incomplete code, old code, partial code, and mixed code & human language (pseudo code, comments, …)

Automating comment checks

As for human language detection, there is a simple solution
– Code detection can be done partially, but nearly well enough, by capturing typical computer language grammar and tokens
– This can be done by using simple regular expressions
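A sketch of such token capture in Python; the hint patterns and the "one match is enough" decision are illustrative assumptions, not the tool's actual rule set.

import re

# Token patterns that are frequent in C code and rare in prose.
CODE_HINTS = [
    re.compile(r";\s*$", re.MULTILINE),                      # trailing semicolon
    re.compile(r"\b(if|else|for|while|switch|return)\s*[({]"),
    re.compile(r"==|!=|&&|\|\||->|\+\+"),                    # C operators
    re.compile(r"#\s*(define|include|ifdef|ifndef|endif)\b"),
    re.compile(r"\w+\s*\([^)]*\)\s*;"),                      # call statement
]

def looks_like_code(comment_body: str) -> bool:
    """Flag a comment body as probable 'commented out' code."""
    return any(p.search(comment_body) for p in CODE_HINTS)

print(looks_like_code("(WRP_GetData(flg, EvtBCMEmgcStop) == TRUE)"))  # True
print(looks_like_code("game over for the windowed mode"))            # False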


Automating comment checks

All other kinds of checks
As foreign languages are already detected and reported, we can assume that all other categories of checks are done in English

Automating comment checks

All other kinds of checks can be done by checking the presence or combination of some keywords in comments
Example for Unfinished: TBD, TBC, TODO, …
Some keywords may be checked case-insensitively, some others must not
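A sketch of such keyword matching in Python; the category names follow the presentation, but which exact keywords are matched, and which ones case-sensitively, are illustrative assumptions.

import re

CATEGORY_KEYWORDS = {
    "UnFinished": [
        re.compile(r"\btodo\b", re.IGNORECASE),  # any case
        re.compile(r"\bTBD\b"),                  # upper case only
        re.compile(r"\bTBC\b"),
    ],
    "Problematic": [
        re.compile(r"\bbug\b", re.IGNORECASE),
        re.compile(r"!!!"),
    ],
    "Licensed": [
        re.compile(r"\b(GPL|LGPL|copyleft)\b"),
    ],
}

def classify(comment: str) -> list:
    """Return every category whose keyword list matches the comment."""
    return [cat for cat, patterns in CATEGORY_KEYWORDS.items()
            if any(p.search(comment) for p in patterns)]

print(classify("/* TODO: confirm filtering */"))  # ['UnFinished']
print(classify("/* Index overflow !!! */"))       # ['Problematic']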


Automating comment checks

We need some exclusion rules to avoid catching too many false positives

Example:
– else /* if (u8Locking == InProcess) */
– Even if this is, strictly speaking, commented code, it is more a documentation of which "if" the "else" refers to… The same applies to:
– #endif /* #ifdef NVRAM_MODULE */
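A sketch of such an exclusion list in Python, applied before a [CommentedCode] warning is reported; both patterns are illustrative assumptions modelled on the two examples above.

import re

# Comments that merely document which "if" / "#ifdef" a closing
# "else" or "#endif" belongs to should not be reported.
EXCLUSIONS = [
    re.compile(r"^\s*else\s*/\*\s*if\b.*\*/\s*$"),         # else /* if (...) */
    re.compile(r"#\s*endif\s*/\*\s*#\s*if(def|ndef)?\b"),  # #endif /* #ifdef X */
]

def is_excluded(line: str) -> bool:
    """True when the line matches a known harmless pattern."""
    return any(p.search(line) for p in EXCLUSIONS)

print(is_excluded("else /* if (u8Locking == InProcess) */"))    # True
print(is_excluded("/*LOC_u8ComCID_State = u8COMCID_BUSY; */"))  # False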


Automating comment checks

Obtained results

Automating comment checks

Managing false positives
In foreign language detection, false positives were around 15%, but the detection was very accurate, catching even a few foreign words

Automating comment checks

Managing false positives
Commented code false positives are around 10%
Lowering this leads to more missed true positives

Automating comment checks

False positives were reduced
By choosing only strategic keywords
By implementing an exclusion list composed of regular expressions
By not reporting company header comments
By adding a list of exclusion paths for COTS, compiler libraries, legacy code…

Automating comment checks

Missed true positives
Difficult to evaluate, except for foreign language and commented code
On foreign language it is 5% for sentences of 2 or more words, but starting with 4 words it is less than 0,5%
On commented code it is around 3% (due to partial lines of code, "typedef" and "macro" usage that may not be captured by regular expressions)

Automating comment checks

Missed true positives
On the other kinds of checks, missed positives are less than 10% and can be reduced by adding new keywords
This can be achieved during cross-review, when seeing a problematic comment reported by the tool

Automating comment checks

Automating comment checks in several computer languages
Capturing comments is quite easy in several computer languages (small configuration)
Analyzing natural-language comments does not depend on the computer language used
Detecting "commented out" code is computer-language dependent, and new languages can be added with a new small set of regular expressions and an exclusion list

Automating comment checks

Automating comment analysis
The tool must capture comments & process them
The tool must be easy to use, with a minimum need of configuration
The tool must detect the computer language used
The tool must be configurable for
– Adding computer languages by just adding some configuration files
– Categories of problematic comments

Automating comment checks

Automating comment analysis
– The tool must produce an "easy to use" report
– The tool must produce a real comment ratio metric
– The tool must provide statistics for the project

Automating comment checks

Automating comment analysis
This tool exists: the Comment Analyser tool can parse C and ASM and can be easily extended by configuration to manage
– other human languages (detection),
– other computer languages,
– other categories,
– new keywords in existing categories

Global results

45 projects analyzed in C and ASM

4 million SLOC analyzed

3% of comments reveal problems

Mean comment/code ratio is 2,83

Most common issues are "Not in English" and "commented out" code

Results global view
45 various projects from 2,3 to 644 KSLOC of C & ASM automotive embedded source code, from 2002 to 2015

Comment Analysis 2002 to 2015 (share of reported comments per category):
[NOTENGLISH] 59%
[COMMENTEDCODE] 18%
[BESTPRACTICE] 12%
[BADPRACTICE] 7%
[PROBLEMATIC] 3%
[UNFINISHED] 1%
[LICENSED] 0%
[PROFANITY] 0%

Results global: recent projects
22 various projects of C & ASM from 2011 to 2015: reported comments dropped from 4,72% to 1,22%

Comment Analysis 2011 to 2015 (share of reported comments per category):
[BADPRACTICE] 27%
[BESTPRACTICE] 26%
[COMMENTEDCODE] 24%
[NOTENGLISH] 15%
[PROBLEMATIC] 6%
[UNFINISHED] 1%
[LICENSED] 1%
[PROFANITY] 0%

Results global view

Comment ratio comparison between QAC and "Comment Analyser"
The QAC Static Analysis tool computes the STCDN metric as
– the number of visible characters in comments, divided by the number of visible characters outside comments
– Comment delimiters are ignored
– Whitespace characters in strings are treated as visible characters

Results global view

The Comment Analyser tool computes this metric as
– NBCMT = number of useful characters in comments: multiple whitespace characters are treated as one character, trivial comments (e.g. /*******************/) are simplified (=> /*[…]*/), and commented code is removed
– NBPRG = number of useful characters outside comments
– STCDN² = NBCMT/NBPRG
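A rough Python sketch of that adjusted ratio; collapsing whitespace and reducing trivial banner comments follows the description above, while the helper names and exact regular expressions are illustrative assumptions, and removal of commented code and company headers is assumed to happen in the earlier checks.

import re

def useful_chars(text: str) -> int:
    """Count characters after collapsing whitespace and trivial banners."""
    text = re.sub(r"/\*\*{2,}.*?\*{2,}\*/", "/*[...]*/", text, flags=re.DOTALL)
    text = re.sub(r"\s+", " ", text)   # runs of whitespace count as one character
    return len(text.strip())

def stcdn2(comment_text: str, code_text: str) -> float:
    """STCDN2 = NBCMT / NBPRG (useful comment chars per useful code char)."""
    nbcmt = useful_chars(comment_text)   # characters kept from all comments
    nbprg = useful_chars(code_text)      # characters kept from code outside comments
    return nbcmt / nbprg if nbprg else 0.0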


Results global view

Comment ratio comparison between QAC and "Comment Analyser"
On a project of 194K lines, QAC gives a mean STCDN value of 3,15 (3,15 characters in comments for 1 character not in comments), while the Comment Analyser tool gives 2,4 (2,4 characters in comments for 1 character not in comments)
The STCDN² metric is significantly different from the one calculated by QAC
This means that the classic way of calculating the Comment/Code ratio is not accurate

Results global view
Example of tool output for a given project

Project: XXXX Comments analysis report
Project base directory: E:\User\XXXX\04-Software\04-Coding\01-Sources

E:\User\XXXX\04-Software\04-Coding\01-Sources\Application\ACC_IG\ACCIG_Manage.c (Rev. 1.23 by [email protected])
000894 [COMMENTEDCODE] //(WRP_GetData(flg, EvtBCMEmgcStop) == TRUE)
000901 [COMMENTEDCODE] //(WRP_GetData(flg, EvtBCMEmgcStop) == TRUE)

E:\User\XXXX\04-Software\04-Coding\01-Sources\Application\ACC_IG\ACCIG_Manage.h (Rev. 1.4 by [email protected])
000099 [COMMENTEDCODE] /* #ifndef ACCIG_MANAGE_H */
000266 [COMMENTEDCODE] /*LOC_u8ComCID_State = u8COMCID_BUSY; */

E:\User\XXXX\04-Software\04-Coding\01-Sources\Application\DIA\RC\DIA_SendBF\DIA_RCSendBF_loc.h (Rev. 1.4 by [email protected])
000118 [COMMENTEDCODE] /* define u8REQ_LF_EM_ANT_INS_ALL ((uint8) 0x40) */
000134 [COMMENTEDCODE] /* define u8LF_ALL_EXT_ANT ((uint8) 0x80) */
001415 [UNFINISHED] /* TODO: remove ASIC activation */
001764 [COMMENTEDCODE] /* CanACC_BDB: 0 = OFF; 1 = ON */
001863 [COMMENTEDCODE] /*EVT_strEntries.u8EvtBSTPE = FALSE;*/
001869 [COMMENTEDCODE] /*EVT_strEntries.bEvtBDB1S01st = FALSE;*/
001873 [UNFINISHED] // TODO: confirm filtering
001925 [COMMENTEDCODE] /* &&(WRP_GetData(u8,EsclState) == HFS_u8ESCL_UNL) */

Results detailed view

Example of warning per type: commented code

/* CAR_u8ComTypeInProgress == CAR_u8COM_TYP_NO_COM */

// level = BATT_u8BattLevel

#if 0 /* Not used in 128 bit key */ if( (((uint8) u8NB_COL_KEY) > ((uint8) 6)) && ( ((uint8) (u8Index % u8NB_COL_KEY)) == ((uint8) 4) )) {

/* { _CLI } */

//UTL_u8Memcpy( &strTrpConfig.au8TrpSecretKey[0], WRP_GetData(au8,StartKeys),AUT_u8eSIZEOF_ISK_CODE)

/*(uint8)*/

/*ASM_vidChecksStack() */

Results detailed view

Example of warning per type: Bad practice

/* PRQA S 3198 -- */ (Disabling the static analysis tool in source code)

/* game over for the windowed mode, return to immediate mode and counter to 0 */

Missing PVCS $Workfile:\$Revision:\$Log:\$Modtime: or $Date: keywords!

Results detailed view

Example of warning per type: Unfinished

/* TODO */

/*TBD: in case of end of stop field success/error/timeout*/

//todo :LINVHStart used ?

#if 0 case u8LCK_END_ST : { // TODO: Remove ??

// TODO: verify if ( LNK[...]

/* xxx x xxxx [...]*/

/* Toff = xxxx ms (unit = 13,5115 µs) */

// TODO: CALL DET

/* TBC */

Results detailed view
Example of warning per type: Problematic

If (mode /*=*/=RUNNING) […] <= May be a bug! And for sure a very bad practice

// BUG : Result is written although not expected + Null pointer provided + Pointer not tested

/* Bug : one extra command was sent => LF carrier of length 0 !!! */

/* Temporary workaround: due to a bug in the ASIC software, the ASIC is not woken up by the request [...]*/

/* Compiler bug: __transponder_reset must be used else address RESET_SUBCOMMAND is not linked */

/* direct write in EVT struct !! */

/* Index overflow !!! */

/* !ONLY USED FOR TESTS! */

/* Reset can never be prevented => Problem !!! */

/* ERROR !!! */

Results detailed view

Example of warning per type: Not English

/* timeout sur ACK 1 */

/* si le polling est en cours => signal à EVT pour activation fonction */

/* Accès ML avec calibrage */

/* eomA et eomB */

/* > Plus rien ne doit figurer après ce point. */

#if 0 /* Lecture entrées directes */ LOC_au8TabEntriesBruts[INDEX_ACC] = [...]Dio_ReadChannel(DIO_UC_INFO_ACC) ^ (DIO_UC_INFO_ACC_MASK) OCLOC_au8TabEnt[...]

// 0 pour +1 minimum

/* Durée du filtrage des entrées capteurs x période tâche = 6 * 2ms = 8 ms + latence prise en compte soit x ms en veille => filtrage de X ms et y ms e[...]

Automating comment checks

Future Work

Automating comment checks

Possible future work
Today this is a standalone tool, but it can also be provided as an add-on to some static analysis tools like QAC-C or QAC-C++
A link between executable source code and comments can be added in order to
– Ensure the presence of comments for strategic lines of code (structure members, "if" statements, "while" statements, …)
– Etc…

Automating comment checks

Creating an add-on to the configuration management tool to check that modified code also includes comment additions or modifications
– To ensure that comments are always up to date

Understand Automotive Software

References

References

Early, previous and related works

G. Kingsley Zipf: "Selected studies of the principle of relative frequency in language", Harvard University Press, Cambridge, MA, USA, 1932. 51 pp. LCCN P123 .Z5.

Lin Tan, Ding Yuan and Yuanyuan Zhou: "HotComments: How to Make Program Comments More Useful?", Department of Computer Science, University of Illinois at Urbana-Champaign, {lintan2, dyuan3, yyzhou}@cs.uiuc.edu

W. E. Howden: "Comments analysis and programming errors", IEEE Trans. Softw. Eng., 1990.

Z. Li and Y. Zhou: "PR-Miner: Automatically extracting implicit programming rules and detecting violations in large software code", FSE '05.

D. Steidl, B. Hummel, E. Juergens: "Quality Analysis of Source Code Comments", CQSE GmbH, Garching b. München, Germany.

References

N. Khamis, R. Witte, and J. Rilling: "Automatic Quality Assessment of Source Code Comments: the JavadocMiner", NLDB '10, 2010.

M.-A. Storey, J. Ryall, R. I. Bull, D. Myers, and J. Singer: "TODO or To Bug: Exploring How Task Annotations Play a Role in the Work Practices of Software Developers", ICSE '08, 2008.

A. T. T. Ying, J. L. Wright, and S. Abrams: "Source code that talks: an exploration of Eclipse task comments and their implication to repository mining", MSR '05, 2005.

L. Tan, D. Yuan, and Y. Zhou: "HotComments: How to Make Program Comments More Useful?", HOTOS '07, 2007.

D. J. Lawrie, H. Feild, and D. Binkley: "Leveraged Quality Assessment using Information Retrieval Techniques", ICPC '06, 2006.

References

Z. M. Jiang and A. E. Hassan: "Examining the Evolution of Code Comments in PostgreSQL", MSR '06, 2006.

B. Fluri, M. Wursch, and H. C. Gall: "Do Code and Comments Co-Evolve? On the Relation between Source Code and Comment Changes", WCRE '07, 2007.

J. Tang, H. Li, Y. Cao, and Z. Tang: "Email data cleaning", KDD '05, 2005.

A. Bacchelli, M. D'Ambros, and M. Lanza: "Extracting Source Code from E-Mails", ICPC '10, 2010.

Automating comment checks

"Comments" and Questions?

Comments and Questions

Any questions?

Any comments? "Human language" comments, of course…

Automatic Comment Analysis

THANK YOU!
