Icws 2016 v1

What Concerns Do Client Developers Have When Using Web APIs? An Empirical Study of Developer Forums and Stack Overflow

Pradeep K. Venkatesh, Ying Zou, Ahmed E. Hassan, Shaohua Wang, Feng Zhang

Transcript of Icws 2016 v1

Page 1: Icws 2016 v1

What Concerns Do Client Developers Have When Using Web APIs?

An Empirical Study of Developer Forums and Stack Overflow

Pradeep K. Venkatesh, Ying Zou, Ahmed E. Hassan, Shaohua Wang, Feng Zhang

Page 2: Icws 2016 v1

Agenda

• Introduction/Background of the work

• Research problem

• What we do?

• Dataset

• Results - RQs

• Future work

Page 3: Icws 2016 v1

Service Oriented Computing

Organizations Client Applications

• The popularity of service-oriented computing leads companies to offer their services through Web APIs

• Client applications extensively use these Web APIs to offer more value-added services

Page 7: Icws 2016 v1

Research Problem

Web API Evolution

• Fix Bugs

• Optimize Performance

• Introduce New Features

More Frequently

Grace period: 3 to 6 months

Client developers often have concerns and difficulties adapting to evolving Web APIs.

Page 9: Icws 2016 v1

We Want!

• Know the discussions related to Web APIs & understand the common challenges faced by client developers

• Make smart recommendations/suggestions to providers & client developers

Page 12: Icws 2016 v1

What We Do?

Developer Discussions

Topic Modelling Technique

Mine and Analyze the Developer Discussions

Identify Common & Dominant Discussions

Time Series Clustering

Patterns of Dominant Discussions

Page 14: Icws 2016 v1

Dataset

Consists of 32 popular Web APIs from 7 different domains:

• Business/eCommerce

• Data Storage/Sharing

• Location Based Services

• Mapping Services

• Platform/Tools/Utilities

• Social Media/Network

• Video/Audio Streaming

Total Number of Discussions: 92,471

Page 15: Icws 2016 v1

Research Questions

(RQ1.) What are the most discussed topics related to Web APIs among client developers?

(RQ2.) What are the evolution patterns of the most discussed topics?

Page 19: Icws 2016 v1

RQ1. Analysis Approach

Apply Popular Topic Modelling Technique

• Latent Dirichlet Allocation (LDA)

• Pre-process:

  • Removing any code elements <code>….</code>

  • Removing all HTML tags <a>, <i>, <a href=“…”/>

  • Removing all stop words, e.g., “a”, “is”, “was”

  • Applying stemming to English words to reduce them to their base form, e.g., “programming” and “programmer” to the base form “program”

• Extract 40 discussion topics from developer discussions
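The pre-processing steps listed above can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the names `preprocess`, `naive_stem`, and `STOP_WORDS` are hypothetical, the stop-word list is deliberately tiny, and the suffix-stripping stemmer is a crude stand-in for a real stemmer such as Porter's. The LDA step itself is not shown.

```python
import html
import re

# Tiny illustrative stop-word list (an assumption; a real study would use a full list).
STOP_WORDS = {"a", "an", "the", "is", "was", "are", "to", "of", "in", "i"}


def naive_stem(word):
    """Crude suffix stripper standing in for a real stemmer (e.g., Porter)."""
    for suffix in ("ing", "er", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Collapse a doubled final consonant: "programm" -> "program".
            if len(stem) > 3 and stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return word


def preprocess(post):
    """Turn one raw forum post (HTML) into a list of stemmed tokens."""
    # 1) Drop code elements together with their contents.
    text = re.sub(r"<code>.*?</code>", " ", post, flags=re.DOTALL | re.IGNORECASE)
    # 2) Drop every remaining HTML tag but keep the text between tags.
    text = re.sub(r"<[^>]+>", " ", text)
    text = html.unescape(text)
    # 3) Tokenize on letters, remove stop words, then 4) stem.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


sample = '<p>I was <i>programming</i> a client: <code>api.get("/x")</code> <a href="http://e.com">programmer</a></p>'
print(preprocess(sample))
```

The resulting token lists would then be fed to a topic model; the choice of tokenizer and stemmer affects which words end up sharing a base form.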

Page 20: Icws 2016 v1

RQ1. Results

• Only a limited number of topics dominate developer discussions

• Some dominant discussion topics are shared across the Web APIs within each category

• A few dominant discussion topics are shared across categories

• E.g., Authentication/Authorization

On average, the five most discussed topics account for over 50% of the discussions of each Web API.
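As a rough illustration of how a "top five topics" share could be computed for one Web API, the sketch below assumes that each discussion has already been assigned a single dominant topic label. That assignment is an assumption made here for simplicity (LDA produces a topic mixture per document), and `top_k_share` is a hypothetical helper, not something taken from the study.

```python
from collections import Counter


def top_k_share(topic_labels, k=5):
    """Fraction of discussions that fall into the k most frequent topics."""
    counts = Counter(topic_labels)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for _, n in counts.most_common(k)) / total


# Made-up labels for ten discussions of one API (illustrative data only).
labels = ["auth"] * 4 + ["quota"] * 3 + ["sdk"] * 2 + ["billing"]
print(top_k_share(labels, k=2))
```

Averaging this share over all Web APIs with `k=5` gives a figure comparable in kind to the one reported on the slide.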

Page 21: Icws 2016 v1

Research Questions

(RQ1.) What are the most discussed topics related to Web APIs among client developers?

(RQ2.) What are the evolution patterns of the most discussed topics?

Page 24: Icws 2016 v1

RQ2. Analysis Approach

Apply Hierarchical Time Series Clustering Algorithm

We identify patterns of discussion topics of Web APIs:

1) we sum up the number of discussions per month for every LDA topic of a Web API;

2) we create a matrix where a row label is a topic of a Web API and a column label is a time unit (i.e., one month);

3) we convert the matrix to a distance matrix using autocorrelation;

4) we apply a hierarchical time series clustering algorithm.
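The four steps above can be sketched in plain Python as follows. Several details are assumptions, since the slides do not specify them: the distance is taken to be the Euclidean distance between autocorrelation vectors, the number of lags (`max_lag=3`) and the average linkage are arbitrary choices, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def monthly_matrix(discussions, months):
    """Steps 1-2: rows are topics, columns are months, cells are discussion counts."""
    counts = Counter(discussions)  # discussions: iterable of (topic, "YYYY-MM")
    topics = sorted({topic for topic, _ in counts})
    return {t: [counts[(t, m)] for m in months] for t in topics}


def acf(series, max_lag):
    """Sample autocorrelation of a series at lags 1..max_lag (max_lag < len(series))."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return [0.0] * max_lag
    return [
        sum((series[i] - mean) * (series[i + k] - mean) for i in range(n - k)) / var
        for k in range(1, max_lag + 1)
    ]


def acf_distance(a, b, max_lag=3):
    """Step 3 (assumed form): Euclidean distance between autocorrelation vectors."""
    return math.dist(acf(a, max_lag), acf(b, max_lag))


def cluster(matrix, n_clusters, max_lag=3):
    """Step 4: agglomerative clustering with average linkage (an assumed choice)."""
    dist = {
        frozenset((s, t)): acf_distance(matrix[s], matrix[t], max_lag)
        for s, t in combinations(matrix, 2)
    }

    def linkage(c1, c2):
        return sum(dist[frozenset((s, t))] for s in c1 for t in c2) / (len(c1) * len(c2))

    clusters = [[t] for t in matrix]
    while len(clusters) > n_clusters:
        # Merge the two closest clusters; i < j, so deleting j keeps i valid.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] += clusters[j]
        del clusters[j]
    return clusters


# Made-up monthly counts: two steadily declining topics and two oscillating ones.
toy = {
    "a": [10, 9, 8, 7, 6, 5, 4, 3],
    "b": [20, 18, 16, 14, 12, 10, 8, 6],
    "c": [1, 5, 1, 5, 1, 5, 1, 5],
    "d": [2, 8, 2, 8, 2, 8, 2, 8],
}
print(cluster(toy, n_clusters=2))
```

Because autocorrelation ignores scale and offset, topics with very different discussion volumes can still fall into the same pattern when their shapes over time are alike.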

Page 27: Icws 2016 v1

RQ2. Results

• Discover five evolution patterns of discussion topics among client developer discussions

(P1) Persistent topics with the number of discussions DECLINING QUICKLY

(P2) Persistent topics with the number of discussions DECLINING SLOWLY

(P3) Occasional topics with the number of discussions DECLINING QUICKLY

(P4) Occasional topics with the number of discussions DECLINING SLOWLY

(P5) Reoccurring topics

• We conjecture that P2-type topics are the hardest to deal with

Only a small proportion (i.e., 4.94%) of the discussions belong to P2 topics, and the majority (i.e., 75.45%) belong to P3 topics.

Page 28: Icws 2016 v1

Implications of Our Results

Through the findings in RQ1 and RQ2, we observe:

• Most of the discussions from client developers are limited to very few topics for a Web API (i.e., on average, over 50% of the discussions are linked to only 5 topics).

• More importantly, some dominant topics appear throughout the timeline (i.e., across different releases of a Web API), either persistently or recurrently.

Client developers may feel very frustrated and quit using certain Web APIs.

Page 29: Icws 2016 v1

Implications of Our Results

Web API providers:

• can optimize their online resources (e.g., documentation, tutorials, and videos),

• can focus updates in future API releases on the dominant topics that appear persistently (especially the ones regarding developers’ concerns and challenges), addressing them in a time-efficient way.

Client developers:

• can better prepare for the dominant topics when using a Web API if they already know the history and evolution patterns of those topics.

Page 30: Icws 2016 v1

Future work

An immediate follow-up would be an in-depth analysis of the five patterns.

• to map the release cycle to the five patterns and find what types of modifications in Web APIs trigger the change in the concerns of client developers.

Page 31: Icws 2016 v1
Page 32: Icws 2016 v1

What Concerns Do Client Developers Have When Using Web APIs?

An Empirical Study of Developer Forums and StackOverflow

Pradeep K. Venkatesh

Dr. Ying (Jenny) Zou Dr. Ahmed E. Hassan

Shaohua (David) Wang Feng Zhang