Sacriﬁcing Efﬁciency for Quality of Experience: YouTube’s … · 2016. 4. 12. ·...

Sacrificing Efficiency for Quality of Experience: YouTube’s Redundant Traffic Behavior

Christian Sieber∗, Poul Heegaard†, Tobias Hoßfeld‡, Wolfgang Kellerer∗

∗Chair of Communication Networks, Technical University of Munich, Germany
c.sieber, [email protected]

†Norwegian University of Science and Technology, Trondheim, Norway
[email protected]

‡Universität Duisburg-Essen, Germany
[email protected]

Abstract: Internet traffic reports show that YouTube is one of the major sources of data traffic world-wide. Furthermore, the data traffic shifts from mostly fixed landlines to cellular data connections where bandwidth is scarce and expensive. Previous studies revealed that YouTube uses a user-friendly HTTP Adaptive Streaming (HAS) strategy which sacrifices bandwidth efficiency to increase the average playback quality for the user. As a result, the same video segment can be transmitted in two or more quality levels, although only one can be shown to the user. We denote this as redundant traffic, and this work is dedicated to understanding the influence factors on the amount of redundant traffic. This paper presents the results of a large-scale study with over 12,000 video views over a bottleneck link shaped to various bandwidths. We first evaluate the playback characteristics and show that YouTube’s HAS algorithm linearly increases the average playback quality with the available bandwidth while at the same time video buffering is sub-linearly decreased. Furthermore, we identify video-dependent bandwidths which optimize the playback time on a quality level. Afterward, we show that this is achieved by discarding lower layer segments and is therefore paid for with redundant traffic of up to 40 %. We evaluate the overall efficiency of the system and show that YouTube is able to improve the average quality level by up to 0.7 quality levels by using this adaptation strategy. However, a penalty of 0.5 quality levels is paid for it due to the discarded data of the lower quality segments.

I. INTRODUCTION

Video streaming services in the Internet are gaining more and more importance. The offered content and services for video on-demand (VoD), including live streaming, exhibit a strong increase in popularity. By now, Internet traffic reports identify VoD as the largest single type of consumer traffic in the Internet for landline and cellular access [1].

HTTP over TCP has become the de-facto standard for delivery of VoD content. The HTTP protocol is firewall-friendly and easy to implement. Furthermore, large content delivery networks (CDN) are in place to distribute HTTP content globally and provide the content close to the users. At the beginning, VoD over HTTP was implemented by progressive download [2]. With progressive download, an HTTP server provides a single file with the content in a specific quality level. The player, e.g., the browser, starts to download the file and at the same time, or after an initial buffering phase, begins to play the video to the user. However, in case of insufficient network bandwidth, the video buffer depletes and the video stalls, which significantly decreases the Quality of Experience (QoE) of the end user [3].

Progressive download does not allow adaptation to current network conditions or to specific devices, e.g., phone or TV screen. By now, progressive download is being replaced by HTTP Adaptive Streaming (HAS). See [2] for a survey on HAS. HAS encodes the content into different quality levels, segments it into small chunks of a few seconds and makes it available over HTTP. A manifest file describing the content, e.g., video codec, number of quality levels and chunk location, is placed alongside the segments on the HTTP server. The client first requests the manifest file and afterward downloads the content segments in a quality level chosen by the client. Dynamic Adaptive Streaming over HTTP (DASH) is an ISO standard describing the structure of the manifest file and is in use by some of the major content providers. However, the quality level adaptation logic, which dictates how the client should adapt the quality, is not part of the standard and is left to be decided by the implementation of the streaming client. Factors to consider for the adaptation are, for example, the viewing device, the available download bandwidth and the video bit-rates of the quality levels. As every client can implement its own adaptation algorithm, the QoE of the user differs between content providers and streaming clients. The QoE of adaptive streaming is an active research topic [2]. QoE studies show that stalling must be avoided [3], [2] and that the quality switching rate should be minimized [4]. The average quality level is a major influence factor for the QoE in adaptive streaming [4].

In this paper we take a closer look at one of the major sources of video streaming traffic [5] in the Internet, YouTube. In a first pilot study [6] we have shown that YouTube’s adaptation algorithm replaces previously downloaded lower quality segments. However, this introduces overhead, denoted as redundant traffic, as the lower quality segments are discarded. Hence, there is a trade-off between network efficiency and average playback quality that also affects the Quality of Experience of the user. Due to the limited scope of the study (four videos in total, 150 views), no reliable conclusions could be drawn with respect to possible influence factors.

ISBN 978-3-901882-83-8 © 2016 IFIP

In this paper, we tackle two important questions. First, when does redundant traffic occur and how much redundant traffic is transferred over the network? Second, what is the overall efficiency of this approach, i.e., how much of the additional traffic can actually be used to improve the average quality level and how much has to be paid as penalty for replacing lower layer segments? To answer this, over 12,000 video views were recorded in our test-bed under different emulated network bottlenecks. We first describe the playback characteristics as observed for the different network conditions from the end user’s point of view. Afterward we relate the playback characteristics to the recorded redundant traffic and show that the amount of redundant traffic can be estimated based on the playback characteristics. At the end, we evaluate the overall efficiency by comparing the quality gain due to the adaptation strategy with an estimation of the maximum achievable quality level based on the amount of total downloaded data in the session.

This paper is structured as follows. We first describe the background and related work in Section II. Here we also explain in detail the behavior which leads to the redundant traffic. In Section III we introduce the measurement methodology and explain the measurement set-up. In Section IV we discuss general playback characteristics observed in the study. In Section V we first evaluate the relationship between the observed playback characteristics and the overhead introduced by discarding lower quality segments. Afterward we evaluate the overall efficiency of the adaptation strategy considering the overhead. In Section VI we conclude the work and give future research directions.

II. BACKGROUND & RELATED WORK

In the following we first introduce HTTP adaptive streaming in general. Afterward we discuss the results of our previous study, in which we gave a first characterization of the adaptation behavior of YouTube. At the end we review the related research on this topic.

A. HTTP Adaptive Streaming

HAS enables client-driven adaptation of the quality level to device capabilities such as screen size, resolution and stereoscopic capabilities. Furthermore, by switching to different quality levels, the client can adapt the video bit-rate to the available bandwidth and therefore adapt to dynamically changing network conditions such as those observed, for example, in cellular environments. HAS is implemented by encoding the content into multiple representations, e.g., different quality levels, segmenting it into small chunks (for YouTube: 10 s to 20 s, depending on the video) and making the segments available to the client through the HTTP protocol. The ISO standard MPEG-DASH is a widely accepted HAS standard adopted by YouTube. DASH defines an XML-based Media Presentation Description (MPD) file which describes the representations, e.g., average bit-rate or codec used, and gives the URL to the individual segments. At the beginning of a streaming session, the client requests the MPD file. Afterward the implementation on the client side decides which chunks to request from which representation. As every content provider can implement its own adaptation strategy, the experience for the user depends not only on the offered representations, but also on the ability of the adaptation strategy to request chunks in a user-friendly way. Evaluations show that the adaptation algorithm has a strong influence on the resulting playback behavior and therefore also on the perceived QoE of the user [7]. It has to be noted that YouTube also allows the end user to manually select a fixed quality level, which deactivates HAS even if it results in frequent buffering. However, the default setting is the automated quality adaptation.
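To make the client-driven selection more concrete, the following is a minimal sketch of a generic throughput-based rule. It is not YouTube's adaptation logic, which is neither public nor part of the DASH standard. The function name `select_quality`, the `safety_factor` parameter and the constant `LADDER` are assumptions made for illustration; the ladder values are the mean bit-rates reported for the selected videos in Figure 3.

```python
from typing import Sequence


def select_quality(bitrates_mbps: Sequence[float],
                   throughput_mbps: float,
                   safety_factor: float = 1.0) -> int:
    """Return the index of the highest representation whose average
    bit-rate fits into the estimated throughput.

    Hypothetical generic rule for illustration only. Falls back to the
    lowest representation (index 0) if no representation fits.
    """
    if safety_factor <= 0:
        raise ValueError("safety_factor must be positive")
    budget = throughput_mbps / safety_factor
    chosen = 0
    for index, bitrate in enumerate(bitrates_mbps):
        if bitrate <= budget:
            chosen = index
    return chosen


# Mean bit-rates (Mbit/s) for 144p, 240p, 360p and 480p from Figure 3.
LADDER = [0.11, 0.24, 0.37, 0.72]
```

A client would call `select_quality(LADDER, estimated_throughput)` before each chunk request. A `safety_factor` above one makes the rule more conservative, which trades average quality for a lower risk of buffering.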

B. YouTube’s Quality Adaptation Strategy

In our previous study [6] we describe the behavior of YouTube where the streaming client replaces previously downloaded chunks with higher quality ones. Figure 1 illustrates a request schedule for one of the experiment runs. At first, the YouTube player requests four segments of quality level 144p covering up to one minute of playback time, and all four requests are made in the first 20 seconds of the experiment. At about 21 seconds into the experiment, the player revises its decision and replaces the segment containing the playback time 30 s to 45 s with a quality level of 240p. Afterward the player switches back to 144p and downloads playback time 60 s to 90 s in quality level 144p. Later in the experiment, the player switches up to a quality level of 480p by replacing 360p and 144p segments. The figure illustrates that some chunks of content are downloaded more than twice.

[Figure 1: time of request (s) over requested video interval (s) for the quality levels 144p, 240p, 360p and 480p; κ = 1]

Figure 1. Example request schedule. The shaded areas indicate where lower quality segments are replaced by higher quality segments. For the first minute into the playback, 30 seconds of 144p are replaced with 15 seconds of 240p and 15 seconds of 360p to increase the average playback quality.

From the request schedule it becomes obvious that the adaptation strategy tries to optimize the average playback quality shown to the user. Furthermore, by constantly downloading chunks, the algorithm prevents stalling of the underlying TCP connection. In a previous study [7] we show that by constantly utilizing the TCP connection a HAS strategy can keep the fair share of the network bandwidth provided by TCP’s adaptive behavior. Furthermore, adaptation strategies that underutilize the TCP connection, e.g., if the playback buffer is full, have a distinct disadvantage compared to strategies which constantly utilize the TCP connection.

C. Related Work

In [8], Anorga et al. present a recent study about YouTube’s HAS adaptation and review previous studies. Their findings show that YouTube uses a large playback buffer and therefore reacts only slowly to changing bandwidth conditions (13 s to 40 s). In [9], Yao et al. evaluate different sources of redundant traffic in video streaming services. They show how the iOS YouTube player requests overlapping segments to smooth the playback. The amount of redundant traffic is not evaluated. In [10], Lui et al. evaluate the difference between Android and iOS-based YouTube media streaming. They conclude that buffering is not dependent on the playback time, but on the amount of data buffered. Furthermore, they quantify a redundant traffic of 15 % and attribute it to re-downloading the beginning of the video. Mansy et al. [11] analyze the streaming behavior of three streaming providers including YouTube in terms of their playback characteristics, redundant traffic and bandwidth utilization. The authors observe that YouTube aggressively discards segments of lower quality levels to download higher bit-rate segments when the bandwidth increases. In the study, one video is evaluated in a wireless scenario with varying bandwidths (every 2 minutes a new link bandwidth is set). For this scenario, the authors report a percentage of redundant traffic of 16 %. The study also shows that other content providers deploy similar adaptation strategies. Nam et al. [12] show that in a mobile scenario more than 35 % of the data transferred by YouTube is redundant. The authors attribute the redundant traffic to frequent termination of TCP connections and discarded in-flight packets. Rao et al. [13] and Ito et al. [14] model the traffic patterns produced by a YouTube streaming session. Their findings suggest that the implementation of the adaptation strategy, and therefore the resulting behavior, varies with the type of the viewing device. Alcock et al. [15] describe the initial burst phase deployed by YouTube. In the initial burst phase, 32 s of playback time are sent to the client as fast as possible. We also observe this initial burst phase and consider it a source of redundant traffic, as a low quality initial burst phase is in some cases later replaced by higher quality segments.

This work extends the state of the art and presents the first large-scale study which statistically quantifies the amount of redundant traffic. Further, we relate redundant traffic to QoE influence factors and are able to quantify the QoE loss caused by downloading redundant traffic instead of higher quality levels.

III. METHODOLOGY

In the following we first discuss the experimental set-up used to collect the results. Afterward we introduce the notation and metrics evaluated in this work. At the end we describe how we selected the content and which characteristics the selected content exhibits.

[Figure 2: block diagram with the components Virtual Host, Virtual Machine, Virtual Network, Application Monitor, Network Monitoring, Proxy, Shaping, Service and Internet]

Figure 2. Experimental set-up based on a virtual machine (VM) and virtual network with browser-based and in-network-based monitoring.

A. Experiment Set-up

Figure 2 illustrates the experiment set-up. The set-up consists of a virtual machine with Xubuntu 14.04 64-bit running a browser (Firefox 21) with the YouTube player, an HTTPS proxy inside the virtual machine and a virtual network which limits the available bandwidth. The set-up is connected to the Internet through a lightly utilized lab network and through the university’s Internet connection. Internal traffic reports and monitoring of the experiment download speed rule out external bottlenecks. Since the beginning of 2015, YouTube uses the encrypted HTTPS protocol for video delivery. To still be able to decrypt the traffic in the network, we inject a custom certificate into the browser and mark it as trustworthy for the domains used by YouTube. Furthermore, we use a customized version of mitmproxy [16] to intercept, capture and decrypt the YouTube traffic. The proxy was customized to quickly forward all incoming traffic without buffering and tested to rule out any performance issue as an influence factor on the adaptation behavior.

In order to capture the player state and the HTTP requests in the network, we use browser-based and network-based monitoring. We implemented browser-based monitoring by embedding the YouTube video into a custom web page and using an extension for the browser [17] to access the API of the player to monitor buffering events and quality switches. For network monitoring, we use the described proxy to capture and decrypt the HTTPS requests. In order to translate between the byte range requests and playtime seconds, we download the videos [18] in all evaluated quality levels and afterward decode the downloaded mp4 containers [19].
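The translation from byte ranges to playtime can be sketched as a lookup in a per-quality segment index. The sketch below assumes that such an index has already been extracted from the decoded mp4 container; the tuple layout, the function name `byte_range_to_playtime` and the values in `EXAMPLE_INDEX` are hypothetical and only serve as illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

# One entry per segment: (first_byte, start_time_s, duration_s),
# sorted by first_byte. Hypothetical layout, values made up.
SegmentIndex = List[Tuple[int, float, float]]


def byte_range_to_playtime(index: SegmentIndex,
                           first: int,
                           last: int) -> Tuple[float, float]:
    """Map an inclusive byte range to the (start, end) playtime in
    seconds covered by the segments that the range touches."""
    offsets = [entry[0] for entry in index]
    # Last segment starting at or before the given byte offset.
    lo = max(bisect_right(offsets, first) - 1, 0)
    hi = max(bisect_right(offsets, last) - 1, 0)
    start = index[lo][1]
    end = index[hi][1] + index[hi][2]
    return start, end


EXAMPLE_INDEX: SegmentIndex = [
    (1000, 0.0, 15.0),
    (250000, 15.0, 15.0),
    (520000, 30.0, 15.0),
    (790000, 45.0, 15.0),
]
```

With one index per quality level, every captured range request can be assigned to a playtime interval and a quality level, which is the information needed to detect segments that were requested more than once.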

Before every experiment run, the virtual environment is reset to a default state. During the playback, the status of the experiment is constantly monitored, and if the video was not played out until the end, the experiment run is discarded. A detailed description of the whole experimental set-up is available online together with the raw and summarized traces [20]. We encourage others to work with the data and do their own evaluations. In total, the traces and summarized results for 12,075 runs are available online (35 videos × 27 bandwidth values × 15 replications).

Table I. Key variables and notations used in the paper.

notation   meaning
B          downloaded and played bytes
B_T        total downloaded bytes
ρ          Redundant Traffic Ratio (RTR)
Q          number of quality levels
𝒬          set of quality levels; 𝒬 = {q_0, ..., q_{Q−1}}
q          quality level, q ∈ 𝒬
V          number of videos
v          video index
R          number of replicated downloads (of each video v)
r          replication index
F          number of bandwidth levels
f          bandwidth level
f∗         rescaled bandwidth level
n_v        number of segments in video v
τ_v        (fixed) segment play time of video v
x_ij       quality index indicator; 1 if segment i is downloaded at quality level j, 0 otherwise
s_ij       size of segment i for video quality level j
J          average quality level
J+_i       maximum quality level downloaded of segment i; J+_i = max{j | x_ij > 0}
J−_i       minimum quality level downloaded of segment i; J−_i = min{j | x_ij > 0}
b_i        number of buffering events in segment i
ψ          buffering rate for bandwidth f
t_fv       buffering time for video v and bandwidth f
T_fq       average relative time/probability on quality level q in a video sequence with bandwidth f
γ          average bit-rate
φ_i        1 if quality level has switched from segment i−1 to i, 0 otherwise
w          average number of switches
ε          overall efficiency
κ          buffering ratio

B. Metrics

The notation used in the paper is summarized in Table I. In the following we define the key performance metrics as used in our analysis. Efficiency ε is defined in Section V.

The average bit-rate per quality level q of video v:

\[
\gamma_{qv} = \frac{1}{n_v \tau} \sum_{i=1}^{n_v} s_{iqv} \tag{1}
\]

Total downloaded bytes in video sequence v:

\[
B_T = \sum_{i=1}^{n_v} \sum_{j=0}^{Q-1} x_{ij} s_{ij} \tag{2}
\]

Total played bytes in a session under the assumption that always the maximum quality level downloaded J+ is played:

\[
B = \sum_{i=1}^{n_v} s_{i J^{+}_{i}} \tag{3}
\]

Each bandwidth level f, replication r and video v has a unique B_frv and a corresponding average quality level J_frv. The average quality level with played bytes B is denoted J_B and is the average over the video sequences with B_frv = B.

The average played quality level for bandwidth f:

\[
J_f = \frac{1}{RV} \sum_{r=1}^{R} \sum_{v=1}^{V} \frac{1}{n_v} \sum_{i=1}^{n_v} J^{+}_{ifrv} \tag{4}
\]

The average buffering event rate for bandwidth f:

\[
\psi_f = \frac{1}{\tau RV} \sum_{r=1}^{R} \sum_{v=1}^{V} \frac{1}{n_v} \sum_{i=1}^{n_v} b_{ifrv} \tag{5}
\]

The average quality switching rate for bandwidth f:

\[
w_f = \frac{1}{\tau RV} \sum_{r=1}^{R} \sum_{v=1}^{V} \frac{1}{n_v} \sum_{i=1}^{n_v} \varphi_{ifrv} \tag{6}
\]

The estimated probability of playtime on quality level q in a video sequence for bandwidth f:

\[
T_{fq} = \frac{1}{RV} \sum_{r=1}^{R} \sum_{v=1}^{V} \frac{1}{n_v} \sum_{i=1}^{n_v} x_{iqfrv} \tag{7}
\]

The buffering ratio κ (8) is the ratio between the experiment run-time and the duration of the video:

\[
\kappa_{frv} = \frac{n_v \tau_v + t_{frv}}{n_v \tau_v} \tag{8}
\]

Buffering ratio κ = 1 means that no buffering happened during the experiment. κ for a specific bandwidth f is defined as

\[
\kappa_f = \frac{1}{RV} \sum_{r=1}^{R} \sum_{v=1}^{V} \kappa_{frv}
\]
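As an illustration of Eqs. (4) to (8), the per-run terms inside the sums can be computed from a per-segment trace as sketched below. The function name `run_metrics` and its input layout are assumptions, not part of the published tooling. Two further assumptions are made: the segment play time is given in seconds and the rates are converted to events per minute, the unit used in the figures, and the playtime fractions are derived from the played (maximum downloaded) level of each segment so that they sum to one.

```python
from collections import Counter
from typing import Dict, Sequence


def run_metrics(played_levels: Sequence[int],
                buffer_events: Sequence[int],
                tau_s: float,
                buffering_time_s: float) -> Dict[str, object]:
    """Per-run playback metrics for one (f, r, v) combination.

    played_levels[i]  played quality level J+_i of segment i
    buffer_events[i]  number of buffering events b_i in segment i
    tau_s             fixed segment play time in seconds
    buffering_time_s  total buffering time t_frv in seconds
    """
    n = len(played_levels)
    if n == 0 or n != len(buffer_events):
        raise ValueError("need one buffer-event count per segment")
    duration_min = n * tau_s / 60.0
    # phi_i = 1 if the level changed from segment i-1 to segment i.
    switches = sum(1 for i in range(1, n)
                   if played_levels[i] != played_levels[i - 1])
    counts = Counter(played_levels)
    return {
        "J": sum(played_levels) / n,
        "psi_per_min": sum(buffer_events) / duration_min,
        "w_per_min": switches / duration_min,
        "T": {q: c / n for q, c in sorted(counts.items())},
        "kappa": (n * tau_s + buffering_time_s) / (n * tau_s),
    }
```

Averaging these per-run values over all replications r and videos v for a fixed bandwidth f then yields the aggregated metrics used in the evaluation.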

C. Redundant Traffic Ratio (RTR)

The redundant traffic is defined as the ratio between the downloaded bytes that are discarded (B_T − B) and the bytes that are played (B). This gives the overhead of the playback with the data volume of only the played out segments as reference. The Redundant Traffic Ratio (RTR) for one single download is then

\[
\rho = \frac{B_T - B}{B} \tag{9}
\]

For a download at a specific bandwidth factor f, the index is added to B_T in Eq. (2) and to B in Eq. (3), and the redundant traffic ratio ρ is updated accordingly.
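The following minimal sketch evaluates Eqs. (2), (3) and (9) for one download. The function name `redundant_traffic_ratio` and the example matrices `EXAMPLE_X` and `EXAMPLE_S` are hypothetical; the example sizes are arbitrary units chosen for easy arithmetic and are not measured values.

```python
from typing import Sequence


def redundant_traffic_ratio(x: Sequence[Sequence[int]],
                            s: Sequence[Sequence[float]]) -> float:
    """RTR of one download.

    x[i][j] is 1 if segment i was downloaded at quality level j,
    s[i][j] is the size of segment i at quality level j.
    """
    total = 0.0   # B_T, Eq. (2)
    played = 0.0  # B, Eq. (3): highest downloaded level is played
    for x_i, s_i in zip(x, s):
        levels = [j for j, flag in enumerate(x_i) if flag > 0]
        if not levels:
            raise ValueError("every segment must be downloaded at least once")
        total += sum(s_i[j] for j in levels)
        played += s_i[max(levels)]
    return (total - played) / played  # Eq. (9)


# Four segments; the last two are first fetched at level 0 and later
# replaced by level 1 and level 2, similar to the pattern in Figure 1.
EXAMPLE_S = [[1.0, 2.0, 3.0, 6.0]] * 4
EXAMPLE_X = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]]
```

In this example 9 units are downloaded and 7 units are played, so the RTR is 2/7, i.e., about 29 %.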

D. Content

In order to represent the variety of videos uploaded to YouTube, we use the YouTube API provided by Google to automatically select suitable videos. We define 5 categories, "minecraft", "music", "funny cats", "gopro" and "game", by using the category name as search query string. Furthermore, we filter the query results by the following criteria: 1) embeddable, i.e., use in an HTML iframe is allowed, 2) syndication is allowed, 3) available in high resolutions and 4) published between 1 and 9 months ago. The query result is sorted by popularity, i.e., view count, and from the result we select videos with a duration of 1, 2, ..., 10 minutes with an allowed deviation of 5 seconds. In total, 35 different videos were accessible during the whole experiment time and are included in the evaluation. The average duration of the selected videos is 5.3 minutes.

[Figure 3: CDF over video bit-rate (Mbps) for the quality levels 144p, 240p, 360p and 480p]

Figure 3. CDF of the bit-rates of the quality levels of the selected video sequences. Means: 144p: 0.11 Mbit/s, 240p: 0.24 Mbit/s, 360p: 0.37 Mbit/s, 480p: 0.72 Mbit/s.
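The duration criterion of the selection can be expressed as a small predicate. This is only a sketch of that single criterion; the function name `target_minutes` and its parameters are hypothetical, and the query against the YouTube API itself is not shown.

```python
from typing import Optional


def target_minutes(duration_s: float,
                   tolerance_s: float = 5.0,
                   max_minutes: int = 10) -> Optional[int]:
    """Return k if the duration is within tolerance_s of k minutes for
    some k in 1..max_minutes, otherwise None."""
    k = round(duration_s / 60.0)
    if 1 <= k <= max_minutes and abs(duration_s - 60 * k) <= tolerance_s:
        return k
    return None
```

A video of 183 s would be accepted for the 3-minute slot, while a video of 190 s would be rejected because it deviates by 10 s from the nearest full minute.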

Figure 3 shows the cumulative distribution function (CDF) of the bit-rates of the four quality levels, 𝒬 = {144p, 240p, 360p, 480p}, for the selected videos. Note that the client player uses some decision logic to select a subset of the resolutions available on YouTube’s servers based on the type of device, e.g., based on the screen size. In our environment this was the subset 𝒬. The average bit-rate for each quality level is calculated by Eq. (1), which gives γ_144p = 0.11 Mbit/s, γ_240p = 0.24 Mbit/s, γ_360p = 0.37 Mbit/s, and γ_480p = 0.72 Mbit/s. Note that although 360p has a higher number of pixels than 240p, approximately 18 % of the videos in quality level 360p are encoded with a lower average bit-rate than the average bit-rate for quality level 240p. YouTube’s encoding of (cover art) music videos leads to a higher data volume for lower resolutions than for higher resolutions. The reason for this is not clear as it depends on the YouTube-internal encoding of those videos. Please note that we nevertheless include those videos in the first part of the evaluation as they are part of the content mix observed on the platform. For the evaluation of the efficiency, we exclude those videos.

IV. VIDEO PLAYOUT AND ADAPTATION CHARACTERISTICS

This section presents key characteristics of the adaptation observed in our measurements for different bandwidths f. This includes the average quality level J_f, quality switching rate w_f, buffering rate ψ_f, and the probability of a playback T_fq at quality q in a session. If not otherwise stated, error bars in the figures depict the 95 % confidence interval.

Figures 4(a) and 4(b) illustrate how the average quality J_f, the buffering rate ψ_f, the quality switching rate w_f and the redundant traffic ratio ρ_f develop for increasing bandwidth f for video v = CbhnuRhbC, which shows scenes from a video game with picture-in-picture commentary. The average bit-rates of the four quality levels are γ = 0.11 Mbps, 0.25 Mbps, 0.43 Mbps and 0.84 Mbps.

Figure 4(a) shows the average quality J and the buffering rate ψ. The buffering rate decreases from 0.8 [min−1] at 0.8 Mbps to 0 at 2.5 Mbps. The buffering rate has two peaks for ρ. The average quality level increases approximately linearly from 0.5 at 0.8 Mbps to the maximum of 3 at 2.5 Mbps.

[Figure 4, panel (a) Buffering rate and average quality: average quality J and buffering events ψ [min−1] over bandwidth f (Mbps); panel (b) Quality switching rate and RTR: quality switches w [min−1] and RTR ρ (%) over bandwidth f (Mbps)]

Figure 4. Average playback quality J_f, buffering rate ψ_f [min−1], quality switch rate w_f [min−1] and RTR ρ_f for video v = CbhnuRhbX5I. 95 % confidence intervals over 15 runs are indicated.

In Figure 4(b) we observe that for bandwidth f ≤ 1.6 Mbps, the RTR is approximately ρ = 20 %, except for two peaks of 30 % at both 0.9 Mbps and 1.5 Mbps. With f > 1.6 Mbps, the RTR decreases linearly until it reaches zero at about 2.5 Mbps. The quality switching rate has a linearly decreasing trend from approximately 1.5 [min−1] down to 0 [min−1] at 2.5 Mbps. The same two peaks at 0.9 Mbps and 1.5 Mbps as for the RTR ρ_f are also observed for w_f. Next we summarize the results over all videos per bandwidth f.

[Figure 5: switching and buffering events [min−1] and buffering ratio κ_f over bandwidth f (Mbps)]

Figure 5. Buffering events [min−1], switching events [min−1] and buffering ratio κ_f. κ_f and the buffering rate decrease non-linearly. The switching rate decreases linearly.

Figure 5 shows the buffering rate, the switching rate and the average buffering ratio κ for the different bandwidth values f, averaged over all videos. The buffering ratio κ (8) is the ratio between the experiment run-time and the duration of the video; κ = 1 means that no buffering happened during the experiment. In our set-up, the browser start-up time is included in the experiment time. The minimum value κ can reach is about 1.05, depending on the video. The figure shows that the buffering rate and the average buffering ratio decrease non-linearly with increasing f, while the corresponding decrease in quality switching rate is approximately linear. At the lowest evaluated bandwidth of 0.4 Mbps, we observe an average buffering rate of 1.4 [min−1] and an average buffering time close to two times the duration of the video. The quality switching rate is on average between 1.75 [min−1] and 2.0 [min−1] for 0.4 Mbps to 0.5 Mbps. At about 2.6 Mbps, the three metrics reach their minimum of zero for the buffering and switching events and about 1.07 for the ratio between experiment and video duration. From the figure we conclude that for a bandwidth of 2.6 Mbps all videos in our result set are, on average, played back without buffering events and quality switches. Furthermore, we see that switching events are more frequent than buffering events and decrease more slowly for increasing f.

[Figure 6: relative playback time T_fq for 144p, 240p, 360p and 480p (left axis) and average quality J_f (right axis) over bandwidth f (Mbps)]

Figure 6. Average quality level and probability of playout time on quality level T_fq (Σ_{q∈𝒬} T_fq = 1).

In Figure 6, the average quality level is depicted on the axis to the right. The axis on the bottom gives the bottleneck bandwidth in the range of f = (0.4, 2.2) Mbps. The shaded areas indicate the average fraction of time spent on each of the quality levels for a specific bandwidth, i.e., the relative playback time (scale on the left axis). Two key observations can be made from the figure. First, for each quality level there is a distinct bandwidth range where this quality level dominates. This is illustrated by T_fq for q = 0, 1, 2, 3, which have their maximum at different bandwidths f. Second, even at a data rate as low as f = 0.5 Mbps, T_{0.5,3} = 0.2, which means that for 20 % of the time the highest quality level is shown to the user. This can be explained by the fact that some of the videos, especially music videos with a static cover, have a low average bit-rate for the higher quality levels and can therefore be selected in the highest quality level even if the available download rate is low. The figure also illustrates for which range of f the different qualities dominate and when we switch from one level to the next higher or lower quality level. Based on Figure 6 we cannot identify a clear relationship between the bandwidth and the probability of selecting a specific quality level, as the regions are heavily overlapping. This is due to the high variance in video bit-rates as shown in Figure 3.

To align the probabilities of the four quality levels, T_vq, we scale the bandwidth f with a quality and video dependent bandwidth factor f∗_vq = f/γ_vq. In Figure 7 we plot the relative playback time T_vq for each experiment run and for all four quality levels q using the rescaled bandwidth f∗_vq. The error bars indicate the 95 % confidence interval for each of the bins. From this we observe that q_1 = 240p, q_2 = 360p and q_3 = 480p reach their maximum T_fvq at approximately f∗_q = 3. This is indicated by the vertical (red) line. For 240p and 360p the maximums are close to 60 %, while for 480p the maximum is 100 %. For 144p, the maximum of about 55 % is reached at f∗_0 = 3.8. There are no samples available for f∗_0 < 3.8, which corresponds approximately to a bandwidth of 0.4 Mbps. Hence, the maximum is reached for all quality levels at approximately f∗ = 3, independent of the quality level q. This can be read as follows: when three times the average bit-rate of a certain video v at a certain quality level q is available, this quality level dominates. If more bandwidth is available, the player switches to a higher level, if it exists. Note that we compare here the link bandwidth with the raw video bit-rate, without any overheads (e.g., IP, HTTP, TCP or video container overhead) and without the audio stream. Therefore, the absolute value where T_vq reaches its maximum may vary depending on the scenario. However, the results show that there is a (narrow) range of bandwidths for each video where each quality level reaches a maximum probability of being played out to the user. This suggests that the adaptation algorithm of YouTube uses the average bit-rate of a quality level of a video and compares it with the available network bandwidth to determine which quality level to select next. The results also reveal that, although the network bandwidth remains constant, quality switches occur and the video is watched in different quality levels. In the next section we discuss how playback characteristics affect the redundant traffic.
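The rescaling can be illustrated with a short sketch. The function names and the constant `MEAN_BITRATES` are chosen for illustration; the bit-rates are the per-quality means from Figure 3, whereas the analysis above uses the per-video bit-rate γ_vq. The factor of 3 is the empirically observed peak position and is treated here as a parameter, not as a known constant of YouTube's algorithm.

```python
def rescaled_bandwidth(f_mbps: float, gamma_mbps: float) -> float:
    """f* = f / gamma for one video and quality level."""
    if gamma_mbps <= 0:
        raise ValueError("bit-rate must be positive")
    return f_mbps / gamma_mbps


def dominance_bandwidth(gamma_mbps: float, peak: float = 3.0) -> float:
    """Bandwidth at which a quality level is expected to dominate,
    assuming the peak lies at f* = peak."""
    return peak * gamma_mbps


# Mean bit-rates (Mbit/s) of the selected videos per quality level.
MEAN_BITRATES = {"144p": 0.11, "240p": 0.24, "360p": 0.37, "480p": 0.72}
```

With these mean values, the peaks at f∗ = 3 correspond to roughly 0.72 Mbps for 240p, 1.11 Mbps for 360p and 2.16 Mbps for 480p, while the lowest evaluated bandwidth of 0.4 Mbps corresponds to f∗ ≈ 3.6 for 144p. Because the bit-rates vary strongly between videos, these numbers are only rough orientation values.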

V. EVALUATION OF ADAPTATION EFFICIENCY

In the previous section we characterized the playback behavior from the perspective of the user (average quality level, buffering and switching events) and discussed how the available bandwidth influences the playback behavior. Next, we put the evaluated playback characteristics in perspective to the amount of redundant traffic downloaded in the background, unnoticed by the viewer, and subsequently discuss the effectiveness of this adaptation approach. We do this by introducing a metric which summarizes the quality gain due to the adaptation on the one side and the penalty introduced by the redundant traffic on the other side.

[Figure 7: relative playback time T_q for 144p, 240p, 360p and 480p over rescaled bandwidth f∗_vq]

Figure 7. The probability of each playback quality level T_f∗vq as a function of the rescaled bandwidth f∗_vq. The vertical line at f∗_vq = 3 indicates where we find the peak for all quality levels.

A. Relationship of Playback Characteristics to Redundancy

In Figure 8 we take a closer look at the peak of relative playback time shown in Figure 7 from the perspective of the RTR, based on the rescaled bandwidth f∗_vq. The average bit-rate of each of the quality levels q is given as a reference and can be used to get an estimate of the bandwidth f based on f∗_q (f = γ_q × f∗_q). However, this neglects the standard deviation of the bit-rates of the videos. The figure shows that for 240p and 360p, high T_q translates to about 30 % of RTR. For 144p, an RTR of 20 % is observed at a rescaled bandwidth of 3.6. The highest quality level 480p exhibits an RTR of 10 % at the point where it reaches close to 100 % of the playback time. We conclude from the figure that the peak of the estimated probability of playtime on a quality level q, T_fq, does not translate to a stable, i.e., sequential, quality selection behavior. On the contrary, the RTR for 240p and 360p shows that the player downloaded up to 40 % more data than it showed to the user.

Figure 8. RTR for the rescaled bandwidth $f^*$ relative to the four quality levels: (a) $q_0$ = 144p ($\gamma_0$ = 0.11 Mbps), (b) $q_1$ = 240p ($\gamma_1$ = 0.24 Mbps), (c) $q_2$ = 360p ($\gamma_2$ = 0.37 Mbps), (d) $q_3$ = 480p ($\gamma_3$ = 0.72 Mbps). Note that the shown range of $f^*$ is wider for 144p compared to the other quality levels. Axes: rescaled bandwidth $f^*$ versus RTR (%).

Figure 9. Switching and buffering events for increasing RTR. The shaded area illustrates the probability density function of the RTR values in the collected result set. Axes: RTR ρ (%) versus events per minute (switches, buffering) and density g(ρ).

Figure 9 depicts the switching and buffering rate for increasing values of RTR. K-means clustering is used to generate the bins of RTR in the figure. The error bars indicate the 95 % confidence interval for a cluster. The shaded area in the background shows the probability density function g(ρ) of the observed RTR values. The density shows that approximately 25 % of the experiment runs in the result set exhibit no or only a minor percentage of redundant traffic, and the second highest density of samples can be observed between an RTR of 15 % and 50 %. The figure illustrates that there is a close relationship between the switching/buffering rate and the amount of redundant traffic. While the buffering rate increases only slowly for increasing RTR, the switching rate increases more steeply. For an RTR of 20 %, the buffering rate is approximately one buffering event per two minutes, while the switching rate is up to 1.2 switches per minute. By implication, this shows that buffering events quickly translate to a high amount of redundant traffic. Buffering events are a sign of a sudden drop in bandwidth (or, equivalently, a sudden increase in the video bit-rate) or an effect of a poor decision by the adaptation logic. Hence, it can be concluded that higher values of redundant traffic can be observed when the playout buffer is often depleted. Furthermore, from the (approximately) strictly monotonic increase it follows that it is possible to estimate the RTR of a viewing session based on the playback characteristics.
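The binning behind Figure 9 (k-means over the RTR values, then a 95 % confidence interval per cluster) can be sketched as follows. The text does not state the number of clusters, the initialization, or how the interval is computed, so the quantile-based initialization and the normal approximation used here are assumptions, and all function names are illustrative.

```python
import math
import statistics


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars; returns the k cluster centers.

    Initial centers are evenly spaced quantiles of the data (an assumption).
    Requires len(values) >= k.
    """
    data = sorted(values)
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Keep the old center if a cluster ends up empty.
        updated = [statistics.mean(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers


def bin_event_rates(samples, k):
    """samples: (rtr_percent, events_per_minute) pairs, one per run.

    Returns one (center, mean_rate, ci_half_width) tuple per RTR bin, using
    the normal approximation 1.96 * s / sqrt(n) for the 95 % interval.
    """
    centers = kmeans_1d([rtr for rtr, _ in samples], k)
    bins = [[] for _ in centers]
    for rtr, rate in samples:
        nearest = min(range(k), key=lambda j: abs(rtr - centers[j]))
        bins[nearest].append(rate)
    result = []
    for center, rates in zip(centers, bins):
        if len(rates) < 2:
            continue  # an interval needs at least two samples
        half = 1.96 * statistics.stdev(rates) / math.sqrt(len(rates))
        result.append((center, statistics.mean(rates), half))
    return result
```

Calling `bin_event_rates` once with the switching rates and once with the buffering rates of all runs would yield the two series of points with error bars, under the stated assumptions.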

B. Efficiency of Adaptation Strategy

Next, we discuss the efficiency of the observed adaptationstrategy. In particular we want to answer the following ques-tions: How much of the additionally downloaded data wasactually used to improve the average playback quality? Andhow much average quality was lost, compared to an optimumwhere the segments are downloaded without overlaps. Forcalculation of the efficiency we exclude videos where higherquality levels have a lower bit-rate than lower quality levels.

Figure 10. Approximated function θ for two of the videos: (a) $v_1$ = pjBVUyq530, (b) $v_2$ = vbLLqaa9ksw. Axes: played bytes B (MB) versus quality level J.

Figure 11. The worst case average quality $J^-_f$, the played average quality $J^+_f$ and the estimated optimum quality $\theta_f(B^T)$ for increasing bandwidth f. Axes: bandwidth f (Mbps) versus average quality J.

The translation between the average playback quality and a certain downloaded volume of bytes z is estimated by a function θ(z), which is determined by regression on the observations of the quality levels as a function of the bandwidth f. We apply isotonic regression [21] to fit the played bytes B to the average quality level. Isotonic regression approximates monotonic functions and does not assume a specific shape of the underlying function. The translation between the bytes B of the viewed quality level $J^+$ and the quality level $J^+$ itself is (mostly) monotonic; therefore, isotonic regression is applicable to the problem at hand.
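To make the fitting step concrete, the following is a minimal sketch of isotonic regression via the pool-adjacent-violators algorithm, written in plain Python. It is not the implementation used in the study, which the text does not specify; the function name and the sample values (played bytes in MB, average quality level) are hypothetical.

```python
def isotonic_fit(xs, ys):
    """Least-squares non-decreasing fit of ys over xs (pool adjacent violators).

    Returns the x values in ascending order and the fitted y values.
    All samples have equal weight; ties in x are not treated specially.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    # Each block holds [sum of y, count]; its fitted value is the block mean.
    blocks = []
    for i in order:
        blocks.append([ys[i], 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return [xs[i] for i in order], fitted


# Hypothetical (B, J+) samples for one video.
played_mb = [10, 20, 30, 40, 50]
quality = [0.5, 1.5, 1.0, 2.0, 3.0]
print(isotonic_fit(played_mb, quality)[1])  # [0.5, 1.25, 1.25, 2.0, 3.0]
```

The second and third samples violate monotonicity and are pooled to their mean, which is how the regression copes with the "mostly" monotonic relation between played bytes and quality.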

Figure 10 illustrates the isotonic regression result for two of the videos, $v_1$ and $v_2$, used in the result set. The red dots depict the samples collected for the video, $(B, J^+)$. The connected green dots show the approximated function $\theta(y)$, $y \in (0, B^+)$, where $B^+$ is the maximum observed B. Two observations can be made from the figure. First, $\theta_{v_2}$ is flatter than $\theta_{v_1}$. Second, $v_2$ shows a larger deviation for the bytes required for a specific average quality level. The difference in slope is a result of the different bit-rates for quality level $q_3$ = 480p for the two videos. $q_3$ = 480p of video $v_1$ has an average bit-rate of $\beta_{v_1}$ = 0.49 Mbps, while for $v_2$ the average bit-rate is $\beta_{v_2}$ = 0.78 Mbps. The deviation for the same quality level indicates that the standard deviation for the video bit-rate for video $v_1$ is higher than for $v_2$. Next, we use θ to estimate the optimum average quality level for a given $B^T$ and compare it to the observed average playback quality and to the worst case quality level $J^-$, where no lower quality segments were replaced by segments with a higher quality.
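Evaluating the fitted curve at the total downloaded volume $B^T$ can be sketched as a piecewise-linear lookup between the fitted points, matching the connected dots in Figure 10. The clamping outside the observed range and the knot values below are assumptions made for illustration; the text only defines θ up to the maximum observed B.

```python
from bisect import bisect_right


def theta(z, knots_b, knots_j):
    """Piecewise-linear evaluation of a fitted monotone curve.

    knots_b: ascending byte volumes, knots_j: fitted quality levels.
    Values outside the observed range are clamped (an assumption).
    """
    if z <= knots_b[0]:
        return knots_j[0]
    if z >= knots_b[-1]:
        return knots_j[-1]
    i = bisect_right(knots_b, z)  # first knot strictly greater than z
    b0, b1 = knots_b[i - 1], knots_b[i]
    j0, j1 = knots_j[i - 1], knots_j[i]
    return j0 + (j1 - j0) * (z - b0) / (b1 - b0)


# Hypothetical fitted curve for one video (MB, quality level).
knots_b = [15, 30, 45, 60]
knots_j = [0.5, 1.5, 2.5, 3.0]
print(theta(37.5, knots_b, knots_j))  # 2.0
```

With such a lookup, the estimated optimum for a run is obtained by passing the run's total downloaded volume instead of its played volume.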

Figure 12. Efficiency $\varepsilon_f$, difference between played and worst case quality level $J^+_f - J^-_f$ and difference between estimated optimum and worst case quality $\theta_f(B^T) - J^-_f$ over bandwidth f. Axes: bandwidth f (Mbps) versus efficiency ε and quality level J.

Figure 11 depicts the worst case quality level $J^-_f$, the played quality level $J^+_f$ and the estimated optimum quality level based on the total downloaded amount of data, $\theta(B^T)$, for bandwidths up to f = 2.6 Mbps. It can be seen in the figure that for low bandwidths, the absolute difference between the three quality levels is larger than for higher bandwidth values. For example, at 0.5 Mbps the adaptation strategy increases the quality level by 0.7 quality levels. However, based on the amount of downloaded data, the redundant traffic introduces a penalty of 0.5 quality levels. For higher bandwidth values, the absolute difference between the three quality levels $J^-_f$, $J^+_f$ and $\theta(B^T)$ becomes smaller, but the ratio between them stays roughly the same. For a bandwidth of 2.6 Mbps, no difference can be observed.

Next, we define a metric for the efficiency of the adaptation strategy with respect to the amount of downloaded and played data. In the best case, an adaptation strategy allows the player to utilize all the bytes downloaded. However, this is only possible if $\theta(B^T) = J^+_f = J^-_f$, as each difference between $J^+_f$ and $J^-_f$ introduces redundant traffic and therefore also increases the difference between the estimated optimum $\theta(B^T)$ and $J^+_f$. We define the efficiency ε of the algorithm as the ratio between the measured quality gain $J^+ - J^-$ and the estimated maximal quality gain $\theta(B^T) - J^-$ (see Eq. (10)).

$$\varepsilon = \frac{J^+ - J^-}{\theta(B^T) - J^-} \tag{10}$$

If ε = 1, the algorithm was able to utilize all the additionally downloaded data to improve the average quality level. If ε = 0, the algorithm requested additional segments, but was not able to improve the average quality level by doing so.
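Eq. (10) can be checked against the numbers reported for 0.5 Mbps, where the strategy gains 0.7 quality levels and the redundancy costs 0.5 quality levels. The sketch below assumes a hypothetical worst case level of $J^-$ = 1.0, which cancels out in the ratio; the function name and the handling of a zero denominator are illustrative choices, not part of the original definition.

```python
def efficiency(j_plus, j_minus, theta_bt):
    """Eq. (10): measured quality gain divided by estimated maximal gain.

    Returns None when theta(B^T) does not exceed J-, since the ratio is
    then undefined (no additional data was downloaded).
    """
    max_gain = theta_bt - j_minus
    if max_gain <= 0:
        return None
    return (j_plus - j_minus) / max_gain


j_minus = 1.0                 # hypothetical worst case quality level
j_plus = j_minus + 0.7        # gain of 0.7 levels reported for 0.5 Mbps
theta_bt = j_plus + 0.5       # penalty of 0.5 levels reported for 0.5 Mbps
print(round(efficiency(j_plus, j_minus, theta_bt), 2))  # 0.58
```

The result of roughly 0.58 is in line with the efficiency of about 60 % reported for low bandwidths.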

Figure 12 depicts the efficiency $\varepsilon_f$, the difference between the played and the worst case quality level, $J^+_f - J^-_f$, and the difference between the estimated optimum and the worst case quality, $\theta_f(B^T) - J^-_f$, over bandwidth f. The figure shows that for low bandwidths (f < 0.8 Mbps), the adaptation strategy is able to utilize 60 % of the additionally downloaded data to increase the average quality level J. For bandwidths from 0.8 Mbps to 1.5 Mbps, the efficiency drops to 55 %. For bandwidths from 1.5 Mbps up to 2 Mbps, the efficiency increases up to 65 %. For bandwidths larger than 2 Mbps, the width of the confidence interval does not allow a reliable conclusion. For overcapacity bandwidths f, $\varepsilon_f$ looks "noisy" because there is no adaptation, and hence no redundancy, and therefore $\theta_f(B^T)$, $J^-_f$ and $J^+_f$ are (almost) equal. To summarize the figure, the algorithm is on average able to utilize 60 % of the additionally downloaded data to improve the average quality level compared to $J^-$, where no replacement of lower quality segments takes place. The remaining 40 % are the penalty the algorithm has to pay for discarding previously downloaded segments.

VI. CONCLUSION & OUTLOOK

The traffic share of video streaming in the Internet continues to increase. Major content providers such as YouTube provide a global infrastructure to serve commercial and user-generated content to every end-user’s device. YouTube alone accounts for about 15 % of the total downstream Internet traffic in the US. Furthermore, reports show that video traffic also increases for cellular access, where the available data-rate is scarce. Therefore, it is important to understand how YouTube adapts to the available bandwidth and how efficiently it uses the available network resources. Previous studies reveal that the adaptation strategy used by the YouTube client tries to optimize the average playback quality by replacing previously downloaded lower quality segments with higher quality segments. This increases the average quality shown to the user. However, it decreases the efficiency with respect to the used network resources.

In this paper we present the results of a large-scale study with more than 12,000 video views of different contents while the downstream traffic was shaped to emulate a bottleneck link with a certain bandwidth. First, we show how YouTube adapts to the available bandwidth from the perspective of the user in terms of quality switching, buffering events and playback quality. Higher available bandwidth increases the average playback quality almost linearly, while buffering ratio and buffering event rate decrease sub-linearly. The switching frequency decreases almost linearly. The results show that for every video and quality level there is a specific network bandwidth which maximizes the time spent on that quality level. This network bandwidth is related to the video bit-rate of that quality level. However, that specific bandwidth does not force the player to select a specific quality level with a probability larger than 0.6 for any quality level below the maximum, except for the trivial case of bandwidth overprovisioning of more than 300 % of the video bit-rate. In any other case, quality switching always occurs.

In the second part of the study we discuss the efficiency of the adaptation strategy from a networking viewpoint. The results show that by replacing lower quality segments, the YouTube player is able to significantly increase the average playback quality by up to 0.7 quality levels. However, 30 % redundant data has to be downloaded for this. Based on the amount of redundant traffic, we estimate the optimal, i.e., highest, average quality level which could be achieved by downloading the same total amount of data but avoiding redundant traffic. The results show that without redundancy the amount of downloaded data could have been used to increase the average quality level by up to 1.3 quality levels. Therefore, about half a quality level is unnecessarily downloaded and discarded due to the redundancy.

In the future, we plan to include highly varying bandwidth conditions, e.g., as found in cellular access. Furthermore, the maximum achievable playback quality for a certain data volume can be formulated as an optimization problem instead of a regression based on historic data. This can give insight into an upper bound for potential improvements to YouTube’s adaptation algorithm.

REFERENCES

[1] “Cisco visual networking index: Forecast 2014 - 2019.” [Online]. Available: http://goo.gl/0iqjo9

[2] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hossfeld, and P. Tran-Gia, “A survey on quality of experience of http adaptive streaming,” Communications Surveys Tutorials, IEEE, 2014.

[3] T. Hoßfeld, R. Schatz, E. Biersack, and L. Plissonneau, “Internet video delivery in youtube: from traffic measurements to quality of experience,” in Data Traffic Monitoring and Analysis. Springer, 2013, pp. 264–301.

[4] T. Hoßfeld, M. Seufert, C. Sieber, T. Zinner, and P. Tran-Gia, “Identifying qoe optimal adaptation of http adaptive streaming based on subjective studies,” Computer Networks, vol. 81, pp. 320–332, 2015.

[5] Sandvine, “Global Internet Phenomena Report 1H 2014,” 2014.

[6] C. Sieber, A. Blenk, M. Hinteregger, and W. Kellerer, “The cost of aggressive HTTP adaptive streaming: Quantifying YouTube’s redundant traffic,” in 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM). IEEE, May 2015, pp. 1261–1267.

[7] C. Sieber, T. Hoßfeld, T. Zinner, P. Tran-Gia, and C. Timmerer, “Implementation and User-centric Comparison of a Novel Adaptation Logic for DASH with SVC,” in IFIP/IEEE International Workshop on Quality of Experience Centric Management (QCMan), Ghent, Belgium, May 2013.

[8] J. Anorga, S. Arrizabalaga, B. Sedano, M. Alonso-arce, and J. Mendizabal, “YouTube’s DASH implementation analysis,” in 19th International Conference on Circuits, Systems, Communications and Computers (CSCC), 2015, pp. 61–66.

[9] L. Yao, W. Qi, G. Lei, S. Bo, C. Songqing, and L. Yingjie, “Investigating Redundant Internet Video Streaming Traffic on iOS Devices: Causes and Solutions,” Multimedia, IEEE Transactions on, vol. 16, no. 2, 2014.

[10] Y. Liu, F. Li, L. Guo, B. Shen, and S. Chen, “A comparative study of android and ios for accessing internet streaming services,” in Passive and Active Measurement. Springer, 2013, pp. 104–114.

[11] A. Mansy, M. Ammar, J. Chandrashekar, and A. Sheth, “Characterizing Client Behavior of Commercial Mobile Video Streaming Services,” in Proceedings of Workshop on Mobile Video Delivery, MoViD’14, 2014.

[12] H. Nam, B. H. Kim, D. Calin, and H. G. Schulzrinne, “Mobile video is inefficient: A traffic analysis,” 2013.

[13] A. Rao, A. Legout, Y.-s. Lim, D. Towsley, C. Barakat, and W. Dabbous, “Network characteristics of video streaming traffic,” in Proceedings of the Seventh COnference on emerging Networking EXperiments and Technologies. ACM, 2011.

[14] M. Ito, R. Antonello, D. Sadok, and S. Fernandes, “Network level characterization of adaptive streaming over http applications,” in IEEE Symposium on Computers and Communication (ISCC), June 2014.

[15] S. Alcock and R. Nelson, “Application flow control in youtube video streams,” SIGCOMM Comput. Commun. Rev., vol. 41, no. 2, Apr. 2011.

[16] “mitmproxy.” [Online]. Available: https://mitmproxy.org/

[17] B. Staehle, M. Hirth, R. Pries, F. Wamser, and D. Staehle, “YoMo: A YouTube Application Comfort Monitoring Tool,” in New Dimensions in the Assessment and Support of Quality of Experience for Multimedia Applications, 2010.

[18] “youtube-dl.” [Online]. Available: https://rg3.github.io/youtube-dl/

[19] “ffprobe.” [Online]. Available: https://ffmpeg.org/ffprobe.html

[20] “Traces and programs used in this paper.” [Online]. Available: http://git.io/vRSSW/

[21] R. E. Barlow, D. J. Bartholomew, J. Bremner, and H. D. Brunk, Statistical inference under order restrictions: the theory and application of isotonic regression. Wiley New York, 1972.
