Live API Documentation
Subramanian, S., Inozemtseva, L., & Holmes, R. (2014, May). Live API documentation. In ICSE (pp. 643-652).
Presenter: Hossein Mobasher
Course: Software Evolution
Contents
• Introduction
• Previous works
• What’s new?
• Scenario
• Oracle Generation
• Problems
• Approach
• Example
• Evaluation
• Conclusion
Introduction
• APIs enable complex functionality to be used by client programs
• Understanding how to use an API can be difficult
• API documentation is often insufficient on its own.
• Writing documentation and keeping it up to date is very difficult
• Developers ignore the documentation that does exist and declare that “code is king”
Introduction (continued)
• Online sites fill the gap between traditional API documentation and more example-based resources
• StackOverflow
• GitHub Gists
• Unfortunately, these two important classes of documentation are independent
• Baker links source code examples to API documentation
Previous works
• Identify source code references within non-code resources.
• These approaches have several limitations:
• Some systems explicitly ignored external references.
• Others only returned partially qualified names (PQN), which are insufficient for documentation linking.
• None of them worked for dynamically-typed languages.
What’s new?
• A constraint-based, iterative approach for determining the fully qualified names of code elements in source code snippets.
• A prototype tool that implements this approach and uses the results to automatically create bidirectional links between documentation and source code examples.
Scenario 1
• Code snippet posted to StackOverflow to assist a developer who didn’t understand how to manipulate the state of History objects.
• Baker can uniquely link the bolded elements to the API.
• These are the elements for which it can determine a fully qualified name (FQN).
Scenario 2
• Code snippet from a developer trying to build a web app that can take a photo and inject it into an element in an HTML document.
• Baker can also identify the API that the bolded elements come from.
Oracle Generation
• Baker’s oracle is key to its success.
• It is generally impossible to identify the FQNs of the code elements from a snippet alone.
• FQNs are essential to documentation linking tasks.
• The oracles are implemented as web services.
• This allows the Java and JS oracles to be updated dynamically by any user or program.
Oracle Generation (continued)
• Java Oracle
• Contains class, method and field signatures.
• Uses Neo4j to represent the hierarchies between code elements that an object-oriented language like Java offers.
• The oracle includes full type information in the database: the types of classes, fields, return types, and parameters.
• The Java oracle can be dynamically updated by adding an appropriate JAR file.
Oracle Generation (continued)
• JS Oracle
• Built by statically analyzing the source files of the libraries to be included.
• Uses ESPRIMA to parse the source code of each library.
• ESPRIMA returns a JSON representation of the AST.
• The JS oracle identifies all of the ‘Function Expression’ and ‘Function Declaration’ nodes.
• Parsing problems:
• JavaScript is a dynamically typed language. It is difficult to identify all method declarations by static analysis of source code.
• JavaScript is not annotated with visibility (e.g. public and private).
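The node-collection step can be sketched as a recursive walk over a JSON AST. This is a minimal illustration, not Baker's implementation: it assumes the AST uses the node type names "FunctionDeclaration" and "FunctionExpression", and the AST fragment below is hand-written for the example rather than produced by a parser. The helper name collect_function_nodes is hypothetical.

```python
import json

FUNCTION_NODE_TYPES = ("FunctionDeclaration", "FunctionExpression")


def collect_function_nodes(node, found=None):
    """Recursively collect every function node in a JSON AST."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("type") in FUNCTION_NODE_TYPES:
            found.append(node)
        # Visit all child values; nested functions are collected too.
        for child in node.values():
            collect_function_nodes(child, found)
    elif isinstance(node, list):
        for child in node:
            collect_function_nodes(child, found)
    return found


# Hand-written AST fragment (assumed shape) for: function on(events, handler) {}
ast_json = """
{"type": "Program", "body": [
  {"type": "FunctionDeclaration",
   "id": {"type": "Identifier", "name": "on"},
   "params": [{"type": "Identifier", "name": "events"},
              {"type": "Identifier", "name": "handler"}],
   "body": {"type": "BlockStatement", "body": []}}
]}
"""

functions = collect_function_nodes(json.loads(ast_json))
# Record (name, parameter count); anonymous function expressions have no id.
signatures = [
    (f["id"]["name"] if f.get("id") else None, len(f["params"]))
    for f in functions
]
```

Because JavaScript carries no visibility annotations, a walk like this cannot tell public functions from internal helpers, which is one of the parsing problems listed above.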
Problems
• Parsing code snippets is more difficult than full files.
• Code snippets can be ambiguous.
• Kinds of ambiguity:
• Declaration Ambiguity
• External Reference Ambiguity
Approach
• Deductive Linking
• Handles declaration and external reference ambiguity.
• The goal is identifying the sole FQN that a given identifier can represent.
• Generating AST for code snippets.
• Uses information from the oracle to deduce facts about the AST.
Example (JavaBaker)
• History: 58 candidate types are recorded for History in oracle.
• addHistoryListener: Test the 58 candidate types to see which ones contain a method called addHistoryListener(…) that takes a single object parameter.
• This results in 4 candidate methods.
• The History node is also updated (reduced from 58 to 4).
• History.getToken: Test the 4 candidates, reduced from 4 to 2.
• …
• At the end, Baker iterates again.
• Baker can identify History as com.google.gwt.user.client.History
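The narrowing in this example can be sketched as repeated filtering of a candidate set until nothing changes. This is a toy illustration under stated assumptions, not Baker's algorithm: the oracle is a small in-memory dictionary, only com.google.gwt.user.client.History comes from the slides, the other two types are invented, and methods are matched by name and parameter count only. The names ORACLE, candidates_for, narrow and link are hypothetical.

```python
# Toy oracle: fully qualified type name -> {method name: parameter count}.
# Only the GWT entry is taken from the slides; the rest is made up.
ORACLE = {
    "com.google.gwt.user.client.History": {
        "addHistoryListener": 1,
        "getToken": 0,
    },
    "example.browser.History": {"addHistoryListener": 1, "back": 0},
    "example.audit.History": {"record": 2},
}


def candidates_for(simple_name):
    """All oracle types whose simple name matches the identifier."""
    return {fqn for fqn in ORACLE if fqn.rsplit(".", 1)[-1] == simple_name}


def narrow(candidates, method, arity):
    """Keep only candidates declaring `method` with `arity` parameters."""
    return {fqn for fqn in candidates if ORACLE[fqn].get(method) == arity}


def link(simple_name, calls):
    """Apply every observed call as a constraint until a fixed point."""
    candidates = candidates_for(simple_name)
    changed = True
    while changed:
        changed = False
        for method, arity in calls:
            reduced = narrow(candidates, method, arity)
            if reduced != candidates:
                candidates, changed = reduced, True
    return candidates


result = link("History", [("addHistoryListener", 1), ("getToken", 0)])
```

With a single identifier the second pass changes nothing; iterating again matters once facts deduced for one AST node constrain another, as in the slide where updating the method candidates also shrinks the History node.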
Example (JSBaker)
• $(…).on(…)
• $ matches only with jQuery.
• on matches with jQuery’s on method (even though there are three on methods in the oracle).
• useGetPicture
• The oracle doesn’t contain a result for it.
• Baker records this function as locally defined, rather than as an external function.
• …
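The decisions in this example can be sketched as a lookup with three outcomes. This is a simplified illustration, assuming a toy oracle that maps a function name to the libraries defining it; libA and libB are invented stand-ins for the other two libraries with an on method, and classify_call is a hypothetical helper, not part of Baker.

```python
# Toy JS oracle: function name -> libraries that define it.
# "libA" and "libB" are placeholders invented for this sketch.
JS_ORACLE = {
    "$": {"jQuery"},
    "on": {"jQuery", "libA", "libB"},
}


def classify_call(name, receiver_library=None):
    """Classify a call as external, local, or still ambiguous."""
    libraries = JS_ORACLE.get(name)
    if not libraries:
        # Unknown to the oracle: treat it as defined in the snippet itself.
        return ("local", None)
    if receiver_library in libraries:
        # The receiver was already resolved, so it picks the library.
        return ("external", receiver_library)
    if len(libraries) == 1:
        return ("external", next(iter(libraries)))
    return ("ambiguous", sorted(libraries))


dollar = classify_call("$")
on_call = classify_call("on", receiver_library=dollar[1])
local_call = classify_call("useGetPicture")
```

Resolving $ first is what disambiguates on: without a known receiver, the same lookup would stay ambiguous across all three libraries.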
Evaluation
• Two research questions:
• Can Baker accurately identify API elements in code snippets?
• Does Baker work on a variety of systems, or is it limited to just a few libraries?
Linker Accuracy
• Precision is much more important than recall.
• Five Java systems (libraries) were chosen for analysis: Android, GWT, Hibernate, Joda Time, and XStream.
• 50 code elements were manually examined for each system to determine whether the result returned by Baker was a True Positive (TP), False Positive (FP), or False Negative (FN).
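Precision and recall follow from these three counts in the standard way: precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses made-up counts purely to show the arithmetic; they are not the figures from the study.

```python
def precision(tp, fp):
    """Fraction of the links returned that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of the correct links that were actually returned."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for one system (not taken from the paper).
tp, fp, fn = 40, 1, 9
p = precision(tp, fp)  # 40 / 41
r = recall(tp, fn)     # 40 / 49
```

A tool tuned for precision prefers to return no link (a false negative) over a wrong link (a false positive), which matches the priority stated above.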
Linker Accuracy (continued)
• Baker’s overall Java precision (0.98) and recall (0.83). Only exact matches (cardinality = 1) were considered.
Linker Accuracy (continued)
• Baker’s overall JavaScript precision (0.97) and recall (0.96). Only exact matches (cardinality = 1) were considered.
Example Diversity
• JavaBaker parsed 4000 source code examples.
• Identified over 30000 links to 4500 unique elements.
• JSBaker parsed 1000 source code examples.
• Identified over 10000 links to 500 unique elements.
Qualifying High-cardinality Match
• Linking methods may return more than one match when there isn’t enough information to determine the FQN of a method or type.
• This is relatively rare.
• Graph shows the cardinality of the result for each of the 4,000 snippets.
• JDK types and methods have been removed.
• The majority (69%) of elements can be precisely identified.
Conclusion
• Maintaining API documentation is a challenging, time-consuming task.
• The documentation is frequently out of date.
• Baker automatically generates links between API documentation and source code examples.
• Baker has high precision (0.97).
Questions?