Spotting Working Code Examples (ICSE 2014)
![Page 1: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/1.jpg)
SPOTTING
WORKING CODE EXAMPLES
Iman Keivanloo Juergen Rilling Ying Zou
1
![Page 2: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/2.jpg)
Code Completion
_File_>
public static void test() {
    FileInputStream fStream = new FileInputStrea…
    try {
        String everything = IOUtils.toString(fStream);
    } finally {
        fStream.close();
    }
2
![Page 3: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/3.jpg)
Code Recommendation
_FileInputStream_>
3
• Limited query
• Usage pattern
![Page 4: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/4.jpg)
4
Spotting Working Code Examples
_Read file line by line FileInputStream_ > Real-time search
100 ms < response time < 400 ms
![Page 5: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/5.jpg)
Challenges in Spotting Working Code Example
Correctness
Correct + Complete + Concise
FileInputStream fis = null;
File file = new File("foo.txt");
fis = new FileInputStream(file);
int content;
while ((content = fis.read()) != -1) {
    System.out.print((char) content);
}
Send SMS …
5
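The fragment on this slide leaves out resource handling and exception declarations. A minimal self-contained sketch of the same byte-by-byte read is given here; the class name `ReadFileExample`, the method `readAll`, and the use of try-with-resources are assumptions for illustration and do not come from the slides.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadFileExample {
    // Reads the file one byte at a time, as in the slide's loop, and returns the content.
    // Casting each byte to char is only safe for single-byte text such as ASCII.
    static String readAll(String path) throws IOException {
        StringBuilder out = new StringBuilder();
        // try-with-resources closes the stream even if read() throws.
        try (FileInputStream fis = new FileInputStream(path)) {
            int content;
            while ((content = fis.read()) != -1) {
                out.append((char) content);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical input: a temporary file standing in for "foo.txt".
        Path tmp = Files.createTempFile("foo", ".txt");
        Files.write(tmp, "hello\nworld\n".getBytes(StandardCharsets.US_ASCII));
        System.out.print(readAll(tmp.toString()));
        Files.delete(tmp);
    }
}
```

The loop body is the one from the slide; everything around it is what a reader would still need to add before the fragment compiles and runs.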
![Page 6: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/6.jpg)
Challenges in Spotting Working Code Example
Query: {read, file}
549,750
6
![Page 7: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/7.jpg)
7
![Page 8: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/8.jpg)
8
Why NOT Vector Space Model?
• e.g.,
test(readFile("f1.txt"));
test(readFile("f2.txt"));
test(readFile("f3.txt"));
VSM + Bag-of-words + Cosine similarity
VSM does not search for patterns
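One way to read the slide's example is that repeated call lines inflate term counts without forming a usable example. The sketch below scores two snippets against the query {read, file} with bag-of-words cosine similarity. The tokenizer (camel-case splitting, lowercasing) and all names are assumptions for illustration, not the authors' implementation.

```java
import java.util.HashMap;
import java.util.Map;

class BagOfWordsCosine {
    // Splits camelCase, lowercases, and counts term frequencies.
    static Map<String, Integer> bag(String text) {
        String spaced = text.replaceAll("([a-z])([A-Z])", "$1 $2");
        Map<String, Integer> counts = new HashMap<>();
        for (String tok : spaced.toLowerCase().split("[^a-z0-9]+")) {
            if (!tok.isEmpty()) {
                counts.merge(tok, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Cosine of the angle between two term-frequency vectors.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        for (int v : b.values()) {
            nb += v * v;
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    static final String REPETITIVE =
        "test(readFile(\"f1.txt\")); test(readFile(\"f2.txt\")); test(readFile(\"f3.txt\"));";
    static final String WORKING =
        "FileInputStream fis = new FileInputStream(file); int content; "
        + "while ((content = fis.read()) != -1) { System.out.print((char) content); }";

    public static void main(String[] args) {
        Map<String, Integer> query = bag("read file");
        System.out.println("repetitive: " + cosine(query, bag(REPETITIVE)));
        System.out.println("working:    " + cosine(query, bag(WORKING)));
    }
}
```

With this tokenizer, the three repetitive calls score higher than the snippet that actually reads a file, because the measure only sees term frequencies and not the structure of the code.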
![Page 9: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/9.jpg)
9
Search Space
Search Algorithm
![Page 10: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/10.jpg)
Similarity
Search Space
Content Similarity
int temp = 1; → {int, temp}
int temp = 0; → {int, temp}
float var = 3; → {float, var}
*Bag-of-words model
10
![Page 11: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/11.jpg)
Our Approach
Search Space
Content Similarity (*Bag-of-words model)
int temp = 1; → {int, temp}
int temp = 0; → {int, temp}
float var = 3; → {float, var}
Pattern Similarity (*p-strings [Baker, B. S. 1993])
int temp = 1; → ρ ρ = ρ ;
int temp = 0; → ρ ρ = ρ ;
float var = 3; → ρ ρ = ρ ;
ρ ρ = ρ ; + {int, temp, float, var}
ρ ρ = ρ ; + {int, temp, float, var}
ρ ρ = ρ ; + {int, temp, float, var}
11
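The abstraction on this slide can be sketched as follows: every word-like token (type, identifier, literal) is replaced by the parameter symbol ρ while punctuation is kept, and the alphabetic terms are collected separately as a bag of words. This is a simplified reading of the slide, not Baker's full p-string construction; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PStringSketch {
    // A token is either a run of word characters or a single punctuation character.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\s\\w]");
    static final String RHO = "\u03C1"; // the parameter symbol shown on the slide

    // Replaces each word-like token with RHO and keeps punctuation as-is.
    static String abstractLine(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {
            String tok = m.group();
            char first = tok.charAt(0);
            boolean wordLike = Character.isLetterOrDigit(first) || first == '_';
            out.add(wordLike ? RHO : tok);
        }
        return String.join(" ", out);
    }

    // Collects the alphabetic terms of all lines, ignoring numeric literals.
    static Set<String> terms(List<String> lines) {
        Set<String> bag = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = TOKEN.matcher(line);
            while (m.find()) {
                String tok = m.group();
                if (Character.isLetter(tok.charAt(0))) {
                    bag.add(tok);
                }
            }
        }
        return bag;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("int temp = 1;", "int temp = 0;", "float var = 3;");
        for (String line : lines) {
            System.out.println(abstractLine(line));
        }
        System.out.println(terms(lines));
    }
}
```

All three input lines collapse to the same pattern, which is what lets them be grouped even though `float var = 3;` shares no terms with the other two.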
![Page 12: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/12.jpg)
12
Offline Code Snippet Processing
![Page 13: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/13.jpg)
13
Discarding Unnecessary Details …
![Page 14: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/14.jpg)
14
{int, temp, float, var}
Representation without Ordering Data
![Page 15: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/15.jpg)
Mining Abstract Solutions
15
abstract programming solution (clone)
![Page 16: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/16.jpg)
16
Search Space
Search Algorithm
![Page 17: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/17.jpg)
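The Proposed Greedy Algorithm
[Figure: search pipeline. Stage labels: query {read, file}; top-k lines (imaginary snippet); top-k abstract clones; 1st abstract clone; top-k lines; top snippet. Element labels: $l_{q,1}, l_{q,2}, \ldots, l_{q,n}$; $p_{c,1}, p_{c,2}, \ldots, p_{c,n}$; $c_{p,1}, c_{p,2}, \ldots, c_{p,n}$.]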
17
![Page 18: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/18.jpg)
Spotting Working Code Examples
1. Free-form querying
2. Self-contained code examples
query= { JFreeChart, JPEG}
18
![Page 19: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/19.jpg)
Spotting Working Code Examples
3. Less dependency on term matching
4. No limitation on query’s terms
query= { bubblesort }
19
![Page 20: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/20.jpg)
Case Study
1. Feasibility (e.g., no data/control flow data!)
2. Scalability
3. Performance:
• RQ1: Ranking schema?
• RQ2: Our approach vs. code search engines?
20
![Page 21: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/21.jpg)
Corpus
~12 million Java classes
~25,000
~3 million unique Java classes
~300 million LOC
-----------------
5.5 million fragments
~15.5 million abstract clones
21
![Page 22: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/22.jpg)
RQ1 – What is the best ranking schema for spotting working code examples?
• Features for ranking:
1. Similarity (S)
2. Popularity (P)
3. Size (A)
• Re-ranking (feature X, Top-K):
4. Combination of P and S
5. Combination of A and S
22
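The slide does not spell out how two features are combined. One plausible reading of "Top-K" followed by "Re-ranking" is to keep the K most similar candidates and then reorder them by popularity. The sketch below implements only that assumed reading for the P and S combination; the `Candidate` type, its fields, and the method `rank` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ReRankSketch {
    // Hypothetical candidate: an identifier plus two of the ranking features.
    static final class Candidate {
        final String id;
        final double similarity;
        final int popularity;

        Candidate(String id, double similarity, int popularity) {
            this.id = id;
            this.similarity = similarity;
            this.popularity = popularity;
        }
    }

    // Keeps the k most similar candidates, then orders them by popularity (descending).
    static List<Candidate> rank(List<Candidate> all, int k) {
        List<Candidate> sorted = new ArrayList<>(all);
        sorted.sort(Comparator.comparingDouble((Candidate c) -> c.similarity).reversed());
        List<Candidate> top = new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
        top.sort(Comparator.comparingInt((Candidate c) -> c.popularity).reversed());
        return top;
    }

    public static void main(String[] args) {
        List<Candidate> all = new ArrayList<>();
        all.add(new Candidate("a", 0.9, 1));
        all.add(new Candidate("b", 0.8, 10));
        all.add(new Candidate("c", 0.1, 100));
        for (Candidate c : rank(all, 2)) {
            System.out.println(c.id);
        }
    }
}
```

In this toy input, candidate `c` is the most popular but is cut by the similarity filter, so popularity only decides the order among candidates that already match the query well.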
![Page 23: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/23.jpg)
RQ1 – What is the best ranking schema?
• Recall is misleading
• The first answer matters
• WTA (Winner Takes All)
23
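A Winner-Takes-All style measure looks only at the first result of each query. The sketch below computes the fraction of queries whose top-ranked answer is judged relevant. The input representation (one list of boolean judgments per query, best rank first) and the names are assumptions, and the slides do not state how the authors aggregate the measure.

```java
import java.util.Arrays;
import java.util.List;

class WinnerTakesAll {
    // Each inner list holds the relevance judgments of one query's results, best rank first.
    // Returns the fraction of queries whose first result is relevant.
    static double score(List<List<Boolean>> judgmentsPerQuery) {
        if (judgmentsPerQuery.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (List<Boolean> ranked : judgmentsPerQuery) {
            if (!ranked.isEmpty() && ranked.get(0)) {
                hits++;
            }
        }
        return (double) hits / judgmentsPerQuery.size();
    }

    public static void main(String[] args) {
        List<List<Boolean>> judgments = Arrays.asList(
            Arrays.asList(true, false),   // first answer correct
            Arrays.asList(false, true),   // correct answer only at rank 2: counts as a miss
            Arrays.asList(true));         // first answer correct
        System.out.println(score(judgments));
    }
}
```

The second query shows why this differs from recall: a relevant answer exists, but it does not count because it is not ranked first.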
![Page 24: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/24.jpg)
RQ1 – What is the best ranking schema?
Is the top-ranked answer correct?
[Chart: Coverage and Precision (axis ticks 60, 70, 80, 90) for ranking schemas S, P, P+S, A, A+S]
Legend: Similarity (S), Popularity (P), Size (A)
24
![Page 25: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/25.jpg)
RQ1 – What is the best ranking schema?
Is the top-ranked answer a good code example?
[Charts: Completeness (axis ticks 100, 60, 30) and Conciseness (axis ticks 100, 60, 20) for ranking schemas S, P, A, P+S, A+S]
Legend: (S) Similarity, (P) Popularity, (A) Size
Popularity + Similarity leads to the best ranking schema for spotting working code examples
![Page 26: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/26.jpg)
RQ2 – Can our approach outperform
Internet-scale code search engines?
Our approach
~25,000
26
![Page 27: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/27.jpg)
27
RQ2 – Our approach vs. Ohloh Code?
[Charts: Best Hit's Rank (axis ticks 40, 20, 2, 1) and NDCG (axis ticks 0.7, 0.5), each comparing our approach with Ohloh Code]
The proposed real-time search is feasible and outperforms Ohloh Code
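For readers unfamiliar with NDCG, the sketch below shows one common formulation: the discounted cumulative gain of a ranking, with a log2(rank + 1) discount, divided by the gain of the ideal ordering of the same judgments. The slides do not say which NDCG variant or relevance scale the authors used, so the formula choice and the names here are assumptions.

```java
import java.util.Arrays;

class NdcgSketch {
    // DCG = sum over 1-based ranks r of gain_r / log2(r + 1).
    static double dcg(double[] gains) {
        double sum = 0.0;
        for (int i = 0; i < gains.length; i++) {
            int rank = i + 1;
            sum += gains[i] / (Math.log(rank + 1) / Math.log(2));
        }
        return sum;
    }

    // NDCG = DCG of the given order divided by DCG of the best possible order.
    static double ndcg(double[] gains) {
        double[] ideal = gains.clone();
        Arrays.sort(ideal); // ascending
        for (int i = 0, j = ideal.length - 1; i < j; i++, j--) {
            double tmp = ideal[i]; // reverse in place to get descending order
            ideal[i] = ideal[j];
            ideal[j] = tmp;
        }
        double idcg = dcg(ideal);
        return idcg == 0.0 ? 0.0 : dcg(gains) / idcg;
    }

    public static void main(String[] args) {
        // Hypothetical graded judgments for one query's ranked results.
        System.out.println(ndcg(new double[] {3, 2, 1})); // already ideally ordered
        System.out.println(ndcg(new double[] {1, 3, 2})); // best result not ranked first
    }
}
```

A ranking that places the most relevant result first reaches the maximum of 1, and pushing it down the list lowers the score, which complements the Best Hit's Rank measure on the same slide.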
![Page 28: Spotting Working Code Examples (ICSE 2014)](https://reader033.fdocuments.in/reader033/viewer/2022051112/559489341a28abfa7c8b465f/html5/thumbnails/28.jpg)
28
Summary