An Empirical Study of Goto in C Code from GitHub Repositories

An Empirical Study of Goto in C Code from GitHub Repositories. Mei Nagappan, Romain Robbes, Yasutaka Kamei, Éric Tanter, Shane McIntosh, Audris Mockus, Ahmed E. Hassan

Transcript of An Empirical Study of Goto in C Code from GitHub Repositories

Page 1: An Empirical Study of Goto in C Code from GitHub Repositories

An Empirical Study of Goto in C Code from GitHub Repositories

Mei Nagappan, Romain Robbes, Yasutaka Kamei, Eric Tanter, Shane McIntosh, Audris Mockus, Ahmed E. Hassan

Page 2: An Empirical Study of Goto in C Code from GitHub Repositories

Go To Statement Considered Harmful (1968)

Page 3: An Empirical Study of Goto in C Code from GitHub Repositories

A Case against the GO TO Statement (1968)

Page 4: An Empirical Study of Goto in C Code from GitHub Repositories

Do Practitioners care?

Page 5: An Empirical Study of Goto in C Code from GitHub Repositories

Do Practitioners care?

https://xkcd.com/292/

Page 6: An Empirical Study of Goto in C Code from GitHub Repositories

Do Practitioners care?

Page 7: An Empirical Study of Goto in C Code from GitHub Repositories

Do Practitioners care?

Page 8: An Empirical Study of Goto in C Code from GitHub Repositories

Do Practitioners care?

Page 9: An Empirical Study of Goto in C Code from GitHub Repositories

Impact of Article on Research Community

Page 10: An Empirical Study of Goto in C Code from GitHub Repositories

Impact of Article on Research Community

Page 11: An Empirical Study of Goto in C Code from GitHub Repositories

Impact of Article on Research Community

Page 12: An Empirical Study of Goto in C Code from GitHub Repositories

Impact of Article

Page 13: An Empirical Study of Goto in C Code from GitHub Repositories

Koenig: Dijkstra provides strong logical evidence for why goto statements can introduce problems in software.

Page 14: An Empirical Study of Goto in C Code from GitHub Repositories

Well informed and often well argued opinions

Page 15: An Empirical Study of Goto in C Code from GitHub Repositories

Shift from opinion to facts

Page 16: An Empirical Study of Goto in C Code from GitHub Repositories

Our Goal: Empirically Examine the use of goto

Shift from opinion to facts

Page 17: An Empirical Study of Goto in C Code from GitHub Repositories

Example

Label

Goto Statement
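The code shown on this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch of the two constructs the slide names. The function `parse_positive` and its behaviour are hypothetical, chosen only to show one label and the goto statements that target it.

```c
#include <stddef.h>

/* Hypothetical example: parses a string of decimal digits.
 * Returns 0 and stores the value in *out on success,
 * returns -1 and stores -1 in *out on failure. */
int parse_positive(const char *text, int *out)
{
    int value = 0;

    if (text == NULL || *text == '\0')
        goto fail;                 /* goto statement: jump to the label below */

    for (const char *p = text; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            goto fail;             /* a second goto targeting the same label */
        value = value * 10 + (*p - '0');
    }

    *out = value;
    return 0;

fail:                              /* label: marks the jump target */
    *out = -1;
    return -1;
}
```

Calling `parse_positive("123", &v)` takes the straight-line path, while `parse_positive("12a", &v)` jumps to `fail`.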

Page 18: An Empirical Study of Goto in C Code from GitHub Repositories

PRELIMINARY ANALYSIS

Do developers use goto statements in their source code?

Page 19: An Empirical Study of Goto in C Code from GitHub Repositories

Experiment

11,627 Projects -> Goto Miner

Page 20: An Empirical Study of Goto in C Code from GitHub Repositories

11,627 Projects -> Goto Miner -> 3,093 Projects

Despite the popularity of Dijkstra’s case against goto…

Page 21: An Empirical Study of Goto in C Code from GitHub Repositories

11,627 Projects -> Goto Miner -> 3,093 Projects

246,657 out of the 2,150,387 files (11.47%)

Extent of use of goto statements by Developers is non-trivial

Page 22: An Empirical Study of Goto in C Code from GitHub Repositories

Two Research Questions

Page 23: An Empirical Study of Goto in C Code from GitHub Repositories

Two Research Questions

RQ1: What are goto statements used for?

Page 24: An Empirical Study of Goto in C Code from GitHub Repositories

Two Research Questions

RQ1: What are goto statements used for?

RQ2: Do developers remove/modify goto statements?

Page 25: An Empirical Study of Goto in C Code from GitHub Repositories

Qualitative Analysis

246,669 C files -> Sample Selection -> 384 files -> Iterative Tagging -> Properties and Purposes of Goto
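The slide does not say how the sample of 384 files was chosen. The number is consistent with a common statistical rule of thumb: Cochran's sample-size formula with a finite population correction, assuming a 95% confidence level (z = 1.96), maximum variability (p = 0.5) and a 5% margin of error. Those parameters are assumptions, not figures from the slides, and `sample_size` is a hypothetical helper.

```c
/* Sketch under assumed parameters: Cochran's sample-size formula
 * with a finite population correction, rounded up to a whole file. */
long sample_size(long population, double z, double p, double margin)
{
    /* Sample size for an effectively infinite population. */
    double n0 = (z * z * p * (1.0 - p)) / (margin * margin);

    /* Finite population correction. */
    double n = n0 / (1.0 + (n0 - 1.0) / (double)population);

    /* Round up without needing <math.h>. */
    long rounded = (long)n;
    if ((double)rounded < n)
        rounded++;
    return rounded;
}
```

With the population on the slide, `sample_size(246669, 1.96, 0.5, 0.05)` evaluates to 384.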

Page 26: An Empirical Study of Goto in C Code from GitHub Repositories

Basic Bean Counting

Page 27: An Empirical Study of Goto in C Code from GitHub Repositories

Num of Goto Statements per Function

Num of Goto Labels per Function

Not many goto statements per function

Page 28: An Empirical Study of Goto in C Code from GitHub Repositories

Num of Goto Statements per Function

Num of Goto Labels per Function

Not many goto statements per function

Very few lines of code in label blocks

Median = 4 LOC

Page 29: An Empirical Study of Goto in C Code from GitHub Repositories

Properties

Page 30: An Empirical Study of Goto in C Code from GitHub Repositories

Context

Why did Dijkstra think goto statements would have ‘disastrous effects’?

‘The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one’s program.’

Page 31: An Empirical Study of Goto in C Code from GitHub Repositories

Desire to make programs verifiable

Page 32: An Empirical Study of Goto in C Code from GitHub Repositories

Desire to make programs verifiable

Page 33: An Empirical Study of Goto in C Code from GitHub Repositories

3 ways to reach the code under the label “back”

Page 34: An Empirical Study of Goto in C Code from GitHub Repositories

First way: Top down execution

Page 35: An Empirical Study of Goto in C Code from GitHub Repositories

Second way: condition2 == true

Page 36: An Empirical Study of Goto in C Code from GitHub Repositories

Third way: done == false
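The code these slides step through is not preserved in the transcript. The sketch below is a hypothetical reconstruction that reuses only the names mentioned on the slides (`back`, `condition2`, `done`). Everything else, including the `routes` array used to record how the label was reached, is an illustrative assumption.

```c
#include <stdbool.h>

enum route { FALL_THROUGH = 1, FORWARD_JUMP = 2, BACKWARD_JUMP = 3 };

/* Hypothetical reconstruction: records in routes[] how control arrived
 * at the label `back` each time, and returns the number of arrivals. */
int trace_back(bool condition2, int work_items, int routes[], int capacity)
{
    int count = 0;
    int route = FALL_THROUGH;      /* first way: top-down execution */
    bool done;

    if (condition2) {
        route = FORWARD_JUMP;
        goto back;                 /* second way: condition2 == true */
    }

    work_items += 1;               /* setup that the forward jump bypasses */

back:
    if (count < capacity)
        routes[count++] = route;

    work_items--;
    done = (work_items <= 0);

    if (!done) {
        route = BACKWARD_JUMP;
        goto back;                 /* third way: done == false */
    }
    return count;
}
```

Code under `back` cannot assume that the setup line ran, which is the verification difficulty the slides attribute to Dijkstra.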

Page 37: An Empirical Study of Goto in C Code from GitHub Repositories

Difficult to know what has been executed so far.

Hence go to statements must be avoided!

Page 38: An Empirical Study of Goto in C Code from GitHub Repositories

Our Findings

Page 39: An Empirical Study of Goto in C Code from GitHub Repositories

Multiple goto jumps to the same label

This is what Dijkstra feared!

62%

Page 40: An Empirical Study of Goto in C Code from GitHub Repositories

Single point of entry into label block!

9%

But not all use of goto is like what Dijkstra feared!

Page 41: An Empirical Study of Goto in C Code from GitHub Repositories

Stacking labels at the bottom of functions is prevalent

27%
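A minimal sketch of what stacked labels at the bottom of a function can look like. The function `build_buffers` and its `fail_at` parameter are hypothetical, introduced only so that each failure path can be exercised. Each label releases one resource and falls through to the labels beneath it.

```c
#include <stdlib.h>

/* Hypothetical example: acquires three buffers in order.
 * fail_at (1..3) forces that allocation to fail; 0 forces none.
 * Returns 0 on success, otherwise the number of the failing step. */
int build_buffers(int fail_at)
{
    int status = 0;
    char *a = NULL, *b = NULL, *c = NULL;

    a = (fail_at == 1) ? NULL : malloc(16);
    if (a == NULL) { status = 1; goto out; }

    b = (fail_at == 2) ? NULL : malloc(16);
    if (b == NULL) { status = 2; goto free_a; }

    c = (fail_at == 3) ? NULL : malloc(16);
    if (c == NULL) { status = 3; goto free_b; }

    /* ... work with a, b and c would go here ... */

    free(c);
free_b:                            /* stacked labels: release in reverse */
    free(b);                       /* order of acquisition, falling      */
free_a:                            /* through from one label to the next */
    free(a);
out:
    return status;
}
```

A failure at step N jumps to the label that undoes steps N-1 down to 1, so no path leaks a buffer or frees one that was never allocated.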

Page 42: An Empirical Study of Goto in C Code from GitHub Repositories

Spaghetti code is uncommon

6%

Page 43: An Empirical Study of Goto in C Code from GitHub Repositories

Most of the jumps are forward, not backwards

Forward: 90%

Backward: 14%

Page 44: An Empirical Study of Goto in C Code from GitHub Repositories

Purposes

Page 45: An Empirical Study of Goto in C Code from GitHub Repositories

Most goto usage is for error handling and cleanup

Error Handling = 80%

Cleanup = 40%

Page 46: An Empirical Study of Goto in C Code from GitHub Repositories

Most goto usage is for error handling and cleanup

Error Handling = 80%

Goto and labels used to emulate a try/catch or finally mechanism

Cleanup = 40%
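One way to read "emulate a try/catch or finally mechanism" is sketched below. The function `double_all` is hypothetical. Each `goto cleanup` plays the role of a throw, and the `cleanup` label plays the role of a finally block because every path, success or failure, passes through it.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: doubles each value via a scratch buffer.
 * Returns 0 on success, -1 on allocation failure or negative input.
 * On failure, out is left untouched. */
int double_all(const int *values, size_t count, int *out)
{
    int rc = -1;                   /* pessimistic default */
    int *scratch;

    if (count == 0)
        return 0;                  /* nothing to do, nothing to release */

    scratch = malloc(count * sizeof *scratch);
    if (scratch == NULL)
        goto cleanup;              /* "throw": allocation failed */

    for (size_t i = 0; i < count; i++) {
        if (values[i] < 0)
            goto cleanup;          /* "throw": invalid input */
        scratch[i] = values[i] * 2;
    }

    memcpy(out, scratch, count * sizeof *scratch);
    rc = 0;                        /* success also falls into cleanup */

cleanup:                           /* "finally": runs on every path */
    free(scratch);                 /* free(NULL) is a no-op */
    return rc;
}
```

The release logic lives in one place, so adding another error check does not require repeating `free(scratch)` before each early return.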

Page 47: An Empirical Study of Goto in C Code from GitHub Repositories

Less intuitive usages such as control-exit and loop-create are less common

10%

Page 48: An Empirical Study of Goto in C Code from GitHub Repositories

Less intuitive usages such as control-exit and loop-create are less common

9%

Most usages of goto statements appear to be for error handling and cleanup
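The slides do not define the two less common categories. The sketch below assumes that "control-exit" means leaving a nested control structure in one jump and that "loop-create" means building a loop from a backward goto. Both functions are hypothetical illustrations of that reading.

```c
#include <stddef.h>

/* Assumed reading of "control-exit": leave two nested loops at once.
 * grid is a rows x cols matrix stored row by row.
 * Returns 1 and sets *row, *col if target is found, otherwise 0. */
int find_in_grid(const int *grid, int rows, int cols, int target,
                 int *row, int *col)
{
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < cols; c++) {
            if (grid[r * cols + c] == target) {
                *row = r;
                *col = c;
                goto found;        /* exits both loops in one jump */
            }
        }
    }
    return 0;

found:
    return 1;
}

/* Assumed reading of "loop-create": a backward goto acting as a loop.
 * Returns the number of decimal digits in n (1 for n == 0). */
int count_digits(unsigned int n)
{
    int digits = 0;

again:
    digits++;
    n /= 10;
    if (n != 0)
        goto again;                /* backward jump forms the loop */
    return digits;
}
```

`count_digits` behaves like a `do { ... } while (n != 0);` loop, which suggests why a structured construct is usually preferred for this purpose.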

Page 49: An Empirical Study of Goto in C Code from GitHub Repositories

Quantitative Analysis

Mine Commits

6 OSS Projects

Page 50: An Empirical Study of Goto in C Code from GitHub Repositories

Quantitative Analysis

Mine Commits

6 OSS Projects

Link to Bugs

180 Days Of Commit History

Page 51: An Empirical Study of Goto in C Code from GitHub Repositories

Quantitative Analysis

Mine Commits

6 OSS Projects

Link to Bugs

180 Days Of Commit History

Bug Fix Commits

Extract Goto

Page 52: An Empirical Study of Goto in C Code from GitHub Repositories

Almost no Gotos are removed/modified in the post-release phase of a project

Chart: gotos Removed and Modified per project (clamav-devel, ghostpdl, gimp, openldap, postgresql, VTK); axis 0 to 16

Page 53: An Empirical Study of Goto in C Code from GitHub Repositories

Even fewer Gotos are removed/modified in the post-release bug fixes

Chart: gotos Removed and Modified in bug-fix commits per project (clamav-devel, ghostpdl, gimp, openldap, postgresql, VTK); axis 0 to 6

Page 54: An Empirical Study of Goto in C Code from GitHub Repositories

Even fewer Gotos are removed/modified in the post-release bug fixes

Chart: gotos Removed and Modified in bug-fix commits per project (clamav-devel, ghostpdl, gimp, openldap, postgresql, VTK); axis 0 to 6

Developers did not remove/modify goto statements in the post-release phase.

Page 55: An Empirical Study of Goto in C Code from GitHub Repositories

Summary
