Transcript of:

Emmanuel Cecchet, Robert Sims,

Prashant Shenoy and Xin He

University of Massachusetts Amherst

mBenchLab: Measuring QoE of Web Applications Using Mobile Devices

http://benchlab.cs.umass.edu/

mBenchLab – cecchet@cs.umass.edu

THE WEB CIRCA 2000

[Figure: logos of popular Web sites with their launch years, 1995–2004, including Wikipedia (2001)]


THE WEB TODAY…

Amazon.com home page:
- Tablet: 225 requests (30 .js, 10 .css, 10 HTML, 175 multimedia files)
- Phone: 15 requests (1 .js, 1 .css, 10 HTML, 8 multimedia files)


THE WEB TODAY…

Wikipedia article on Montreal: 226 requests, 3 MB

WEB SITES ARE INCREASINGLY COMPLEX


BENCHMARKING TOOLS HAVE NOT KEPT UP

Traditional approach (TPC-W, RUBiS…): a workload definition drives a Web emulator that replays an HTTP trace against the application under test.

BenchLab approach: real devices + real browsers + real networks replay HTTP traces against the application under test.

MOBILES ADD COMPLEXITY


BROWSERS MATTER FOR QOE?

QoE measurement:
- Old way: QoE = Server + Network
- Modern way: QoE = Servers + Network + Browser

Browsers are smart:
- Parallelism on multiple connections
- JavaScript execution can trigger additional queries
- Rendering introduces delays in resource access
- Caching and pre-fetching

HTTP replay cannot approximate real Web browser access to resources.

[Figure: request waterfall for a Wikipedia page load. (1) GET /wiki/page; the server generates the page (0.90s) and sends it. (2) The browser analyzes the page (0.06s) and fetches a batch of CSS/JS resources (combined.min.css, jquery.min.js, wikibits.js, mwsuggest.js, …). (3) Rendering and JavaScript execution trigger further GETs (ExtraTools.js, Navigation.js, EditToolbar.js, …) followed by batches of images (page-base.png, bullet-icon.png, tab-break.png, …). (4) Final rendering (0.28s) fetches the last images (arrow-down.png, portal-break.png, arrow-right.png). Browser total: 3.86s network time + 2.21s rendering time; a plain HTTP replay of the same trace takes only 1.88s of total network time.]


USE CASES

Researchers:
- Provide an open source infrastructure for accurate QoE measurement on mobile devices
- Benchmarking of real applications & workloads, over real networks, using real devices & browsers

Users:
- Provide QoE on relevant Web sites
- Different from a basic SpeedTest


OUTLINE

The mBenchLab approach

Experimental results

What’s next?


MBENCHLAB ARCHITECTURE

BenchLab Dashboard:
- Web frontend and experiment scheduler
- Traces (HAR or access_log) and results (HAR or latency)
- Experiment configs and benchmark VMs

An experiment client app (mBA) runs on each mobile device and talks to the Dashboard for experiment start/stop, trace download, browser registration, and results upload. The devices replay HTTP traces against the backend application under test and report devices, metrics, HTML snapshots…


RUNNING AN EXPERIMENT WITH MBENCHLAB

1. Start the BenchLab Dashboard and create an experiment: upload traces, define the experiment, start the experiment
2. Start the mBenchLab App on the mobile device
3. The browser issues HTTP requests while playing the trace, and the App performs QoE measurements (detailed network and browser timings)
4. Upload collected results to the Dashboard
5. Analyze results: view results, repeat the experiment, export setup/traces/VMs/results
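The play-trace step above can be sketched as a simple timing loop. The names `replay_trace` and `fetch` below are illustrative, not mBenchLab's actual API; in the real app the Selenium Android driver loads each URL in the native browser, which here is abstracted as any callable that returns once the page is done.

```python
import time

def replay_trace(urls, fetch):
    """Replay a recorded URL trace, timing each page load.

    `fetch` stands in for the real browser driver: it loads one URL
    and returns when the page load is complete.
    """
    results = []
    for url in urls:
        start = time.monotonic()
        fetch(url)  # load the page in the (abstracted) browser
        elapsed = time.monotonic() - start
        results.append({"url": url, "load_time_s": elapsed})
    return results

# Example with a stand-in fetch function (no real browser involved):
trace = ["http://example.org/a", "http://example.org/b"]
timings = replay_trace(trace, fetch=lambda url: None)
```

In the actual system the per-request detail comes from the HAR-recording proxy rather than from this coarse wall-clock timing, which only captures whole-page latency.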


MBENCHLAB ANDROID APP (MBA)

- Issues HTTP requests in the native Android browser
- Collects QoE measurements in HAR format
- Uploads results to the BenchLab Dashboard when done

Components: the native Android browser driven through the Selenium Android driver, a HAR-recording proxy, and the mBenchLab runtime (GPS, network, and storage holding the trace, HAR files, and screen snapshots), connected over WiFi or 3G/4G to cloud Web services.


MBA MEASUREMENTS

QoE:
- Overall page loading time, including JavaScript execution and rendering time
- Request failure/success/cache hit rate
- HTML correctness
- Rendering overview with screen snapshots

Network:
- DNS resolution time
- Connection establishment time
- Send/wait/receive time on network connections

Device:
- Hardware and software configurations
- Location (optional)
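A rough sketch of how the request failure/success/cache-hit rates can be derived from a recorded HAR file. The `har` sample and the `qoe_summary` helper are hypothetical; note also that the overall page load time comes from HAR's `pageTimings` (e.g. `onLoad`), not from summing per-entry times, since requests overlap in a real browser.

```python
# A minimal HAR-like structure; the real HAR 1.2 format has many more
# fields, but the paths used here (log.entries[].time and
# log.entries[].response.status) follow the spec.
har = {
    "log": {
        "entries": [
            {"time": 120, "response": {"status": 200}},
            {"time": 15,  "response": {"status": 304}},  # cache hit
            {"time": 0,   "response": {"status": 404}},  # failed request
        ]
    }
}

def qoe_summary(har):
    """Aggregate per-request HAR entries into simple QoE rates."""
    entries = har["log"]["entries"]
    statuses = [e["response"]["status"] for e in entries]
    total = len(statuses)
    return {
        "requests": total,
        "success_rate": sum(200 <= s < 300 for s in statuses) / total,
        "cache_hit_rate": sum(s == 304 for s in statuses) / total,
        "failure_rate": sum(s >= 400 for s in statuses) / total,
        # Sum of entry times over-counts parallel requests; it is NOT
        # the page load time, just total per-request network time.
        "total_time_ms": sum(e["time"] for e in entries),
    }

summary = qoe_summary(har)
```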


OUTLINE

The mBenchLab approach

Experimental results

What’s next?


EXPERIMENTAL SETUP

- Desktop: MacBook Pro using Firefox
- Tablets: Trio Stealth Pro ($50 Android 4 tablet), Kindle Fire
- Phones: Samsung S3 GT-I9300 (3G), Motorola Droid Razr (4G), HTC Desire C
- Traces: Amazon, Craigslist, Wikipedia, Wikibooks


QOE ON DIFFERENT WEB SITES

Web sites can be very dynamic:
- Amazon content is very dynamic (hurts repeatability)
- Craigslist is very simple and similar across platforms


INSTRUMENTATION OVERHEAD

Hardware matters:
- Single-core underpowered hardware shows ~3s instrumentation overhead on Wikipedia pages
- Modern devices are powerful enough to instrument with negligible overhead


QOE ON DIFFERENT MOBILE NETWORKS

Quantify QoE variation between WiFi, Edge, 3G, and 4G. Performance varies based on location, network provider, hardware…


IDENTIFYING QOE ISSUES

Why is the page loading time so high?


QOE BUGS: THE SAMSUNG S3 PHONE

The number of HTTP requests and the page sizes are off for Wikipedia pages.


QOE BUGS: THE SAMSUNG S3 PHONE

Bug in the srcset implementation:

<img src="pear-desktop.jpeg" srcset="pear-mobile.jpeg 720w, pear-tablet.jpeg 1280w" alt="The pear">
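To illustrate what a correct width-descriptor `srcset` implementation should do, here is a simplified, hypothetical selection sketch. Real browsers also consider the `sizes` attribute, caching, and other heuristics; this only captures the core idea of picking the smallest candidate that covers the required width in device pixels.

```python
def pick_srcset_candidate(srcset, viewport_css_px, device_pixel_ratio=1.0):
    """Pick an image URL from a width-descriptor srcset string.

    Simplified: compute the required width in device pixels and take
    the smallest candidate wide enough, falling back to the largest.
    """
    candidates = []
    for part in srcset.split(","):
        url, descriptor = part.split()          # e.g. "pear-mobile.jpeg", "720w"
        candidates.append((int(descriptor.rstrip("w")), url))
    candidates.sort()                           # ascending by declared width
    needed = viewport_css_px * device_pixel_ratio
    for width, url in candidates:
        if width >= needed:
            return url
    return candidates[-1][1]                    # nothing wide enough: use largest

srcset = "pear-mobile.jpeg 720w, pear-tablet.jpeg 1280w"
```

On a 360 CSS-pixel viewport at 2x density this selects the 720w mobile image; a buggy implementation that ignores density or descriptors would pick the wrong file, which is consistent with the request-count and page-size anomalies observed on the S3.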


OUTLINE

The mBenchLab approach

Experimental results

What’s next?


RELATED WORK

Benchmarking tools have not kept up:
- RUBiS, TPC-*: obsolete backends
- Httperf: unrealistic replay

Commercial benchmarks:
- SPEC-*: not open nor free

Mobility adds complexity:
- Networks: Hossein et al., A first look at traffic on smartphones, IMC’10
- Hardware: Hyojun et al., Revisiting Storage for Smartphones, FAST’12; Thiagarajan et al., Who killed my battery?, WWW’12
- Location: Nordström et al., Serval: An End-Host Stack for Service Centric Networking, NSDI’12


SUMMARY AND FUTURE WORK

Real devices, browsers, networks, and application backends are needed for modern WebApp benchmarking.

mBenchLab provides:
- For researchers: an infrastructure for Internet-scale benchmarking of real applications with mobile devices
- For users: insight on QoE with real Web sites rather than a simple SpeedTest

Larger scale experiments:
- More users, devices, locations
- More Web applications and traces


Q&A

SOFTWARE, DOCUMENTATION, RESULTS: http://benchlab.cs.umass.edu/

WATCH TUTORIALS AND DEMOS ON YOUTUBE

BONUS SLIDES


RELATED WORK

Hossein et al., A first look at traffic on smartphones, IMC’10:
- Majority of traffic is Web browsing
- 3G performance varies according to network provider
- Mobile proxies improve performance

Hyojun et al., Revisiting Storage for Smartphones, FAST’12:
- Device storage performance affects browsing experience

Thiagarajan et al., Who killed my battery?, WWW’12:
- Battery consumption can be reduced with better JS and CSS
- Energy savings with JPG

Nordström et al., Serval: An End-Host Stack for Service Centric Networking, NSDI’12:
- Transparent switching between networks
- Location-based performance


WEB APPLICATIONS HAVE CHANGED

Web 2.0 applications:
- Rich client interactions (AJAX, JS…)
- Multimedia content
- Replication, caching…
- Large databases (a few GB to multiple TB)

Complex Web interactions:
- HTTP 1.1, CSS, images, Flash, HTML 5…
- WAN latencies, caching, Content Delivery Networks…


EVOLUTION OF WEB APPLICATIONS

Application      HTML  CSS   JS  Multimedia  Total
RUBiS               1    0    0           1      2
eBay.com            1    3    3          31     38
TPC-W               1    0    0           5      6
amazon.com          6   13   33          91    141
CloudStone          1    2    4          21     28
facebook.com        6   13   22         135    176
wikibooks.org       1   19   23          35     78
wikipedia.org       1    5   10          20     36

Number of interactions to fetch the home page of various web sites and benchmarks


TYPING SPEED MATTERS

- Auto-completion in search fields is common
- Each keystroke can generate a query
- Text searches use a lot of resources

GET /api.php?action=opensearch&search=W
GET /api.php?action=opensearch&search=Web
GET /api.php?action=opensearch&search=Web+
GET /api.php?action=opensearch&search=Web+2
GET /api.php?action=opensearch&search=Web+2.
GET /api.php?action=opensearch&search=Web+2.0
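The per-keystroke query pattern above can be sketched as follows. `autocomplete_queries` is a hypothetical helper that emits one URL-encoded opensearch query per prefix of the typed text; a real client may debounce or skip some keystrokes, as the trace above does.

```python
from urllib.parse import quote_plus

def autocomplete_queries(text, base="/api.php?action=opensearch&search="):
    """One opensearch query per keystroke: every prefix of the input."""
    return [base + quote_plus(text[:i]) for i in range(1, len(text) + 1)]

queries = autocomplete_queries("Web 2.0")
```

Seven keystrokes thus produce seven server-side search queries, which is why fast typists (or slow search backends) make auto-completion surprisingly expensive.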


STATE SIZE MATTERS

Does the entire DB of Amazon or eBay fit in the memory of a cell phone?
- TPC-W DB size: 684 MB
- RUBiS DB size: 1022 MB

Impact of CloudStone database size on performance:

Dataset size   State size (GB)   Database rows   Avg CPU load with 25 users
25 users             3.2              173745              8%
100 users             12              655344             10%
200 users             22             1151590             16%
400 users             38             1703262             41%
500 users             44             1891242             45%

CloudStone Web application server load observed for various dataset sizes using a workload trace of 25 users replayed with Apache HttpClient 3.


OUTLINE

What has changed in WebApps

Benchmarking real applications with BenchLab

Experimental results

Demo


RECORDING HTTP TRACES

Three options to record traces in HTTP Archive (HAR) format:
- directly in the Web browser
- at the HAProxy load balancer level
- using Apache httpd logs

[Figure: Internet → frontend/load balancer → app. servers → databases, with recorders at the browser, HAProxy, and httpd levels]
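For the httpd-log option, a sketch of turning Common Log Format lines into HAR-style entries. The converter below is hypothetical, and the mapping is necessarily lossy: timing breakdowns (DNS, connect, wait) cannot be recovered from an access_log, so only method, URL, status, and body size are filled in.

```python
import re

# Common Log Format, e.g. from Apache httpd's access_log (simplified
# parser; real deployments often use the Combined format instead).
LOG_RE = re.compile(r'\S+ \S+ \S+ \[[^\]]+\] "(\S+) (\S+) \S+" (\d{3}) (\d+|-)')

def access_log_to_har_entries(lines):
    """Convert access_log lines into a minimal HAR-like document."""
    entries = []
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not parse
        method, url, status, size = m.groups()
        entries.append({
            "request": {"method": method, "url": url},
            "response": {"status": int(status),
                         "bodySize": 0 if size == "-" else int(size)},
        })
    return {"log": {"version": "1.2", "entries": entries}}

sample = ['127.0.0.1 - - [10/Oct/2012:13:55:36 -0700] '
          '"GET /wiki/Montreal HTTP/1.1" 200 3145']
har = access_log_to_har_entries(sample)
```

Recording in the browser or at the HAProxy level avoids this loss, which is why those options yield richer traces for replay.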


WIKIMEDIA FOUNDATION WIKIS

- Wikimedia wiki open source software stack: lots of extensions, very complex to set up/install
- Real database dumps (up to 6 TB): 3 months to create a dump, 3 years to restore with default tools
- Multimedia content: images, audio, video; generators (dynamic or static) to avoid copyright issues
- Real Web traces from Wikimedia
- Packaged as virtual appliances


COMPLEXITY BEHIND THE SCENES

Browsers are smart:
- Caching, prefetching, parallelism…
- JavaScript can trigger additional requests

Real network latencies vary a lot: too complex to simulate if you want accurate QoE.


BENCHLAB DASHBOARD

- Upload traces / VMs
- Define and run experiments
- Compare results
- Distribute benchmarks, traces, configs, and results

A JEE WebApp with an embedded database:
- Repository of benchmarks and traces
- Schedules and controls experiment execution
- Results repository
- Can be used to distribute/reproduce experiments and compare results


OPEN VS CLOSED

Open Versus Closed: A Cautionary Tale – B. Schroeder, A. Wierman, M. Harchol-Balter – NSDI’06:
- The response time difference between open and closed systems can be large
- Scheduling is more beneficial in open systems


SUMMARY AND FUTURE WORK

Larger scale experiments:
- More users, devices, locations
- More Web applications and traces

Social and economic aspects:
- User privacy
- Cost of mobile data plans
- Experiment feedback (to and from the user)

Automated result processing:
- Anomaly detection
- Performance comparison

Server-side measurements with Wikipedia:
- Virtual appliances
- Mobile proxy

Software distribution for other researchers to set up their own BenchLab infrastructure.