Icws 2016 v1
What Concerns Do Client Developers Have When Using Web APIs?
An Empirical Study of Developer Forums and Stack Overflow
Pradeep K. Venkatesh, Ying Zou, Ahmed E. Hassan, Shaohua Wang, Feng Zhang
Agenda
• Introduction/Background of the work
• Research problem
• What we do?
• Dataset
• Results - RQs
• Future work
Service Oriented Computing
Organizations Client Applications
• The popularity of service-oriented computing leads companies
to offer their services through Web APIs
• Client applications extensively use these Web APIs to offer
more value-added services
Research Problem
Web API Evolution
• Fix Bugs
• Optimize Performance
• Introduce New Features
More frequently
Grace period: 3 to 6 months
Client developers often have concerns and difficulties adapting to Web API changes.
We Want!
• Know the discussions related to Web APIs & understand
the common challenges faced by client developers
• Make smart recommendations/suggestions to providers &
client developers
What We Do?
Mine and Analyze the Developer Discussions
• Developer Discussions → Topic Modelling Technique → Identify Common & Dominant Discussions
• Time Series Clustering → Patterns of Dominant Discussions
Dataset
Consists of 32 popular Web APIs from 7 different domains
• Business/eCommerce
• Data Storage/Sharing
• Location Based Services
• Mapping Services
• Platform/Tools/Utilities
• Social Media/Network
• Video/Audio Streaming
Total Number of Discussions: 92,471
Research Questions
(RQ1.) What are the most discussed topics related to Web APIs among client developers?
(RQ2.) What are the evolution patterns of the most discussed topics?
RQ1. Analysis Approach
Apply Popular Topic Modelling Technique
• Latent Dirichlet Allocation
• Pre-process:
  • Removing any code elements <code>….</code>
  • Removing all HTML tags, e.g., <a>, <i>, <a href="…"/>
  • Removing all stop words, e.g., "a", "is", "was"
  • Applying stemming on English words to get their base form, e.g., "programming", "programmer" to the base form "program"
• Extract 40 Discussion Topics from developer discussions
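The pre-processing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the stop-word list is a tiny sample, and `naive_stem` is a hypothetical stand-in for a real stemmer, written so that the slide's "programming"/"programmer" example works.

```python
import re

# Tiny illustrative stop-word list (assumption; a real list is much longer).
STOP_WORDS = {"a", "is", "was", "the", "to", "in"}

CODE_RE = re.compile(r"<code>.*?</code>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z]+")


def naive_stem(word):
    """Hypothetical toy stemmer: strip one common suffix, then undouble."""
    for suffix in ("ing", "er", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a doubled final consonant, e.g. "programm" -> "program".
    if len(word) > 3 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def preprocess(post_html):
    """Turn one forum post (HTML) into a list of stemmed tokens."""
    # Drop code elements first so their contents are removed with them.
    text = CODE_RE.sub(" ", post_html)
    # Then drop all remaining HTML tags, keeping the text between them.
    text = TAG_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


post = ('<p>The programmer was programming <code>x = 1</code> '
        'in <a href="u">a</a> <i>client</i></p>')
print(preprocess(post))  # ['program', 'program', 'client']
```

The resulting token lists would then be fed to an LDA implementation to extract the topics.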
RQ1. Results
• Only a limited number of topics dominate developer discussions
• Some dominant discussion topics are shared across the Web APIs of each category
• Few dominant discussion topics are shared across categories
  • E.g., Authentication/Authorization
On average, the five most discussed topics account for over 50% of the discussions of each Web API.
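The "top five topics cover over 50%" figure is a simple share computation. A minimal sketch, assuming each discussion has already been assigned a single dominant topic label (the labels and counts below are made up for illustration and are not data from the study):

```python
from collections import Counter


def top_k_share(topic_labels, k=5):
    """Fraction of discussions that fall into the k most frequent topics.

    Expects a non-empty list with one topic label per discussion.
    """
    counts = Counter(topic_labels)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(topic_labels)


def average_top_k_share(labels_by_api, k=5):
    """Average the top-k share over all Web APIs."""
    shares = [top_k_share(labels, k) for labels in labels_by_api.values()]
    return sum(shares) / len(shares)


# Hypothetical labels for one Web API: 10 discussions, 4 topics.
labels = ["auth"] * 4 + ["quota"] * 3 + ["maps"] * 2 + ["misc"]
print(top_k_share(labels, k=2))  # 0.7
```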
RQ2. Analysis Approach
Apply Hierarchical Time Series Clustering Algorithm
1) We sum up the number of discussions per month for every LDA topic of a Web API;
2) We create a matrix where a row label is a topic of a Web API and a column label is a time unit (i.e., one month);
3) We convert the matrix to a distance matrix using the autocorrelation;
4) We apply a hierarchical time series clustering algorithm.
We identify patterns of discussion topics of Web APIs
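The four steps can be sketched in plain Python as follows. The slides do not state the exact distance formula or linkage method, so this sketch assumes one common reading: the Euclidean distance between the first few autocorrelation coefficients of two series, combined with average-linkage agglomerative clustering. All function names and the sample series are illustrative.

```python
from collections import defaultdict
from itertools import combinations
import math


def monthly_matrix(discussions, months):
    """Steps 1-2: rows are (api, topic) pairs, columns are months."""
    index = {m: i for i, m in enumerate(months)}
    counts = defaultdict(lambda: [0] * len(months))
    for api, topic, month in discussions:
        counts[(api, topic)][index[month]] += 1
    return dict(counts)


def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return [0.0] * max_lag
    return [
        sum((series[t] - mean) * (series[t + lag] - mean)
            for t in range(n - lag)) / var
        for lag in range(1, max_lag + 1)
    ]


def acf_distance(a, b, max_lag=3):
    """Step 3 (assumed form): distance between autocorrelation vectors."""
    return math.dist(acf(a, max_lag), acf(b, max_lag))


def cluster(rows, k, max_lag=3):
    """Step 4: average-linkage agglomerative clustering down to k clusters."""
    clusters = [[key] for key in rows]

    def linkage(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(acf_distance(rows[x], rows[y], max_lag) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        # Merge the two closest clusters (i < j, so popping j keeps i valid).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] += clusters.pop(j)
    return clusters


# Made-up monthly counts: two declining topics and one spiky topic.
rows = {
    "decline_a": [8, 6, 4, 2, 1, 0, 0, 0],
    "decline_b": [16, 12, 8, 4, 2, 0, 0, 0],
    "spiky": [5, 0, 5, 0, 5, 0, 5, 0],
}
print(cluster(rows, k=2))
```

Because autocorrelation ignores scale, the two declining series land in the same cluster even though one has twice the volume; this is why an autocorrelation-based distance groups topics by the shape of their evolution rather than by their size.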
RQ2. Results
• Discover five evolution patterns of discussion topics among client developer discussions
(P1) Persistent topics with the number of discussions DECLINING QUICKLY
(P2) Persistent topics with the number of discussions DECLINING SLOWLY
(P3) Occasional topics with the number of discussions DECLINING QUICKLY
(P4) Occasional topics with the number of discussions DECLINING SLOWLY
(P5) Reoccurring topics
• We conjecture that P2-type topics are the hardest to deal with
Only a small proportion (i.e., 4.94%) of the discussions are in P2, and the majority (i.e., 75.45%) of the discussions are in P3.
Implications of Our Results
Through the findings in RQ1 and RQ2, we observe:
• Most of the discussions from client developers are limited to very few topics for a Web API (i.e., on average, over 50% of the discussions are linked with only 5 topics).
• More importantly, some dominant topics appear throughout the timeline (i.e., across different releases of a Web API), either persistently or as reoccurring topics.
Client developers:
• may feel very frustrated and quit using certain Web APIs.
Implications of Our Results
Web API providers:
• can optimize their on-line resources (e.g., documentation, tutorials, and videos),
• can focus updates in future API releases, in a time-efficient way, on the dominant topics that appear persistently (especially the ones regarding developers’ concerns and challenges).
Client developers:
• can better prepare for the dominant topics when using a Web API if they already know the history and evolution patterns of those topics.
Future work
An immediate follow-up would be an in-depth analysis of
the five patterns:
• Map the release cycles to the five patterns and find which
types of modifications in Web APIs trigger changes in
the concerns of client developers.
What Concerns Do Client Developers Have When Using Web APIs?
An Empirical Study of Developer Forums and Stack Overflow
Pradeep K. Venkatesh
Dr. Ying (Jenny) Zou Dr. Ahmed E. Hassan
Shaohua (David) Wang Feng Zhang