
Oboe: Auto-tuning Video ABR Algorithms to Network Conditions

Zahaib Akhtar, University of Southern California

Yun Seong Nam, Purdue University

Ramesh Govindan, University of Southern California

Sanjay Rao, Purdue University

Jessica Chen, University of Windsor

Ethan Katz-Bassett, Columbia University

Bruno Ribeiro, Purdue University

Jibin Zhan, Conviva

Hui Zhang, Conviva

ABSTRACT
Most content providers are interested in providing good video delivery QoE for all users, not just on average. State-of-the-art ABR algorithms like BOLA and MPC rely on parameters that are sensitive to network conditions, so may perform poorly for some users and/or videos. In this paper, we propose a technique called Oboe to auto-tune these parameters to different network conditions. Oboe pre-computes, for a given ABR algorithm, the best possible parameters for different network conditions, then dynamically adapts the parameters at run-time for the current network conditions. Using testbed experiments, we show that Oboe significantly improves BOLA, MPC, and a commercially deployed ABR. Oboe also betters a recently proposed reinforcement learning based ABR, Pensieve, by 24% on average on a composite QoE metric, in part because it is able to better specialize ABR behavior across different network states.

CCS CONCEPTS
• Information systems → Information systems applications; Multimedia streaming

KEYWORDS
Video delivery, Adaptive bitrate algorithms

ACM Reference Format:
Zahaib Akhtar, Yun Seong Nam, Ramesh Govindan, Sanjay Rao, Jessica Chen, Ethan Katz-Bassett, Bruno Ribeiro, Jibin Zhan, and Hui Zhang. 2018. Oboe: Auto-tuning Video ABR Algorithms to Network Conditions. In SIGCOMM '18: ACM SIGCOMM 2018 Conference, August 20–25, 2018, Budapest, Hungary. 15 pages. https://doi.org/10.1145/3230543.3230558

Both authors contributed equally to this paper and can be contacted at the following addresses: zakhtar@usc.edu, nam21@purdue.edu.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SIGCOMM '18, August 20–25, 2018, Budapest, Hungary
© 2018 Association for Computing Machinery.
ACM ISBN 978-1-4503-5567-4/18/08...$15.00
https://doi.org/10.1145/3230543.3230558

1 INTRODUCTION
Internet video forms a major fraction of Internet traffic today [13], and delivering high quality of experience (QoE) is critical since it correlates with user engagement and revenue [6, 23, 31]. To deliver high quality video across diverse network conditions, most Internet video delivery uses adaptive bitrate (ABR) algorithms [32, 48, 59], combined with HTTP chunk-based streaming protocols (e.g., Apple's HTTP Live Streaming, Adobe's HTTP Dynamic Streaming). ABR algorithms (a) chop a video into chunks, each of which is encoded at a range of bitrates (or qualities), and (b) choose which bitrate level to fetch a chunk at based on conditions such as the amount of video the client has buffered and the recent throughput achieved by the client. Within this general framework, ABR algorithms differ in how bitrate level selection decisions are made, and these decisions impact metrics such as the average bitrate or the rebuffering ratio. We call these QoE metrics because they have been shown to correlate well with QoE [23], but other perceptual video quality metrics [2] may also influence QoE.

ABR algorithm design remains an active research area because content providers continue to be interested in improving the performance of video delivery. Current ABR algorithms perform well on average, but some users can experience poor delivery performance as measured by the QoE metrics. These users suffer because ABR algorithms have limited dynamic range: they do not perform uniformly well across the range of network conditions seen in practice because their parameters are sensitive to throughput variability (§2).

Contributions. In this paper, we present the design of Oboe¹ (§3), a system that takes the first step towards overcoming these hurdles. Oboe improves the dynamic range of ABR algorithms by automatically tuning ABR behavior to the current network state of a client connection, specifically to throughput and throughput variability.

¹ In orchestras, all instruments tune to the oboe.

Oboe's design is based on the observation, made by prior work [17, 35, 38, 52, 60], that TCP connections are well-modeled as traversing a piecewise-stationary sequence of network states (§3.1): the connection consists of multiple non-overlapping segments, where each segment is in a distinct stationary network state. For each possible network state, Oboe pre-computes, offline, the best parameter configuration for a given ABR algorithm (§3.2). It does this by subjecting the algorithm, for each state, to different parameter values and picking the one that results in the best performance. Then, during video playback, Oboe continuously uses a change-point detection algorithm to detect changes in network state and selects the parameter identified by the offline analysis as best for the current state. Thus, if a video session encounters varying network state during its lifetime, Oboe automatically specializes the ABR parameter to each state (§3.3).

We have implemented Oboe and demonstrated several aspects of its performance through testbed experiments and trace-driven simulations. First, Oboe significantly improves performance of QoE metrics for three qualitatively different ABR algorithms: one that makes bitrate switching decisions on buffer occupancy alone (BOLA) [48], another that uses both throughput and buffer occupancy (HYB, a widely deployed algorithm), and a third that also optimizes decisions across a finite lookahead horizon (RobustMPC) [59]. In each of these cases, Oboe results in significant improvement. For instance, Oboe reduces sessions with rebuffering from 33.3% to 5.3% relative to RobustMPC, while also significantly improving a composite QoE metric.

Oboe, when applied to RobustMPC, also performs significantly better than a newly proposed approach called Pensieve, which learns from real traces (using reinforcement learning) how to adapt to a variety of network conditions. For nearly 80% of the sessions in our dataset, Oboe improves the same composite metric, with benefits exceeding 20% for 25% of the traces. Compared to Oboe, which can specialize parameters to individual network states, Pensieve is unable to specialize across the entire range of network throughputs. We have tried training specialized Pensieve models for different ranges of network throughputs and dynamically switching models based on estimated session throughput. This helps, but a significant gap between the two approaches still remains (§4.4).

While a variety of viable pathways exist to deploying Oboe, we focus on an architecture where Oboe and the entire ABR logic are deployed on the cloud, which enables rapid evolution and fine-grain customizability. We show the viability of this architecture with results from a pilot deployment.

2 BACKGROUND AND MOTIVATION
The Internet video delivery ecosystem consists of hundreds of content publishers and hundreds of client-side applications that stream video content to diverse user devices. Publishers, content delivery networks, and users all seek to improve user quality of experience (QoE). There are many factors that affect QoE, including start-up latency, the average bitrate for a video session, and the rebuffering ratio (the percentage of time playback is stalled because of a drained buffer) [23]. Video players improve QoE using adaptive bitrate (ABR) algorithms, which select bitrates for each chunk while (1) ensuring the bitrate seen by the user is as high as possible, and (2) avoiding rebuffering events at the client. Some ABR algorithms may also try to minimize the number of bitrate switches to make the playback smooth.

Content publishers serve different types of content, including VoD (Video on Demand) or Live broadcasts. They may also serve streams of different qualities, ranging from HD (high definition) to SD (standard definition). These differences impact how they serve videos. For example, publishers who serve VoD content can use player buffers as large as 4 minutes [32], whereas publishers serving live content may have a time-to-live² requirement of 15 to 45 seconds. Similarly, based on the quality of streams they serve, publishers may use different bitrate levels or chunk sizes. Further, publishers may have different QoE objectives. For example, some may strictly prefer to minimize rebuffering, and others may relax their tolerance for rebuffering to prioritize higher bitrates. We use the term publisher specifications to denote their choice of bitrate levels, chunk sizes, content type, and rebuffering tolerance.

2.1 Background on ABR Algorithms
ABR algorithms fall in two broad categories: (i) those that use both prediction of network throughput and buffer occupancy [34, 51, 59], and (ii) those that are primarily based on buffer occupancy [32, 48]. Within the above two categories, ABR algorithms can be designed using approaches ranging from heuristics to stochastic optimization. In §4, we discuss a recently proposed ABR algorithm based on a qualitatively different approach, reinforcement learning [39].

² For live content, the time between the event and its broadcast to users. This bounds the maximum buffer that a player streaming a live event can build.

Figure 1: Performance of ABR algorithms using different configurations for two sessions with different throughput behaviors.

Figure 2: Illustrating how policy for setting discount factors in MPC impacts performance for different traces.

MPC: Throughput prediction and buffer occupancy with look-ahead. Selects bitrate by solving an optimization problem. MPC [59] predicts throughput of future chunk downloads based on throughput samples of recently downloaded chunks, then uses this predicted throughput to select bitrates to optimize a given QoE function (§4) over a look-ahead window of 5 future chunks. The aggressive version of the algorithm (FastMPC) directly uses a throughput estimate obtained using a harmonic mean predictor. To compensate for throughput prediction errors, a more conservative version, RobustMPC, reduces predicted throughput by a discount factor $1+d$, where $d$ is the maximum error in throughput predictions experienced in the last five chunk downloads.
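The two predictors can be illustrated with a minimal Python sketch. It is not the implementation from [59]: the function names are hypothetical, the prediction error is assumed to be the relative error |predicted - actual| / actual, and the two input lists are assumed to be aligned chunk by chunk and of equal length.

```python
from statistics import harmonic_mean


def harmonic_prediction(actual_kbps, window=5):
    """FastMPC-style estimate: harmonic mean of the most recent samples."""
    return harmonic_mean(actual_kbps[-window:])


def robust_prediction(actual_kbps, predicted_kbps, window=5):
    """RobustMPC-style estimate: harmonic-mean prediction divided by 1 + d.

    predicted_kbps[k] is assumed to be the prediction that was made for
    chunk k, whose measured throughput is actual_kbps[k].
    """
    pairs = zip(predicted_kbps[-window:], actual_kbps[-window:])
    # Assumption: "error" means relative error of each past prediction.
    errors = [abs(pred - act) / act for pred, act in pairs]
    d = max(errors, default=0.0)
    return harmonic_prediction(actual_kbps, window) / (1.0 + d)
```

With perfect past predictions, $d = 0$ and the two estimates coincide; the larger the recent worst-case error, the more the RobustMPC-style estimate is discounted.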

BOLA: Buffer occupancy. Selects bitrate by solving an optimization problem. BOLA, used in Dash.js [7], is a buffer-based algorithm, so it does not employ throughput prediction in making bitrate decisions [48]. It also models bitrate selection as an optimization problem, which it solves for a given value of the buffer. It uses a parameter $\gamma$, which is a ratio of (i) a minimum buffer threshold, below which it downloads the lowest bitrate, and (ii) a target buffer threshold, which it tries to maintain. Conceptually, $\gamma$ controls how strongly the ABR should avoid rebuffering [48]. Higher values of $\gamma$ make the algorithm conservative.

HYB: Throughput prediction without lookahead. Selects bitrate using a simple heuristic. An algorithm widely used in production (§5), HYB considers both the predicted throughput and current buffer occupancy (HYB is short for hybrid). For each chunk, HYB picks the highest bitrate that can avoid rebuffering. Specifically, if $S_j(i)$ denotes the size of chunk $j$ encoded at bitrate $i$, $B$ is the predicted throughput based on past samples, and $L$ the length of the buffer, HYB picks the largest bitrate $i$ such that $\frac{S_j(i)}{B} < L \times \beta$. Here, $\beta$ can have values between 0 and 1 (higher values represent aggressive ABR behavior). $\beta$ can be tuned to offset prediction errors in throughput and to compensate for the greedy nature of the approach, which may make it susceptible to future buffering events owing to unexpected throughput dips.
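The HYB rule can be sketched in a few lines of Python. This is an illustration only, not the production algorithm: the function name and units are hypothetical, and falling back to the lowest level when no bitrate satisfies the inequality is an assumption the text does not state.

```python
def hyb_select_level(chunk_sizes_kbit, predicted_kbps, buffer_sec, beta):
    """Return the index of the highest bitrate level i with S_j(i)/B < L*beta.

    chunk_sizes_kbit: sizes S_j(i) of the next chunk, one per bitrate level,
                      in ascending bitrate order (kilobits).
    predicted_kbps:   predicted throughput B.
    buffer_sec:       current buffer length L in seconds.
    beta:             HYB parameter in (0, 1]; higher is more aggressive.
    """
    best = 0  # assumption: use the lowest level if nothing qualifies
    for level, size in enumerate(chunk_sizes_kbit):
        if size / predicted_kbps < buffer_sec * beta:
            best = level
    return best
```

For example, with 4-second chunks at 300, 750, 1200, 1850, 2850, and 4300 kbps, a predicted throughput of 2000 kbps, and a 10-second buffer, $\beta = 0.25$ allows a download time of up to 2.5 seconds and selects the 1200 kbps level, while $\beta = 0.5$ selects the 1850 kbps level.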

2.2 Ensuring High QoE for All Users
Despite widespread deployment, ABR algorithms continue to be an active area of research [32, 34, 39, 48, 51, 59]. This is because, while deployed ABR algorithms work well on average, they do not work uniformly well across all network conditions. A key reason for this is that ABR algorithms have parameters (which we henceforth refer to as configurations) that must be set in a manner sensitive to network conditions. ABR algorithms need to run on many different networks, ranging from cellular and WiFi networks at one end to high-speed broadband connections at the other. Given this diversity, network conditions can vary significantly. Packet loss conditions can vary by an order of magnitude or more across the globe [25]. Network throughputs can also vary widely: for 90% of traces in a large dataset, the trace's maximum throughput is more than twice its average throughput. Yet, unfortunately, most ABR algorithms today either employ fixed configurations or simple heuristics to adapt these configurations (§2.1).

Figures 1(a) and 1(b) show how the choice of ABR configuration depends on network conditions. Figure 1(a) shows the bitrate and rebuffering ratio for two client sessions with the HYB algorithm for three different values of its $\beta$ parameter: Cons (Conservative), Mod (Moderate), and Aggr (Aggressive). The throughput behavior of the two sessions is shown in Figure 1(c). If a publisher prefers to eliminate rebuffering, Mod is suitable for session A, but Cons is better for session B. Figure 1(b) shows that BOLA behaves similarly, with Mod being the preferred setting for session A and Cons for session B to avoid rebuffering.

Figures 2(a) and 2(b) show the difficulty in setting the discount factor with MPC by comparing the performance of FastMPC (no discount factor) and RobustMPC (discount factor set by local heuristic) for two throughput traces with different characteristics. In each figure, the top subgraphs show the available throughput (green curve) and the throughput estimate of FastMPC (red) and RobustMPC (blue). For the left graph, although the throughput is generally good, the sudden variations force RobustMPC to make overly conservative bitrate decisions, as well as incur more bitrate switches (bottom subgraph). In contrast, in Figure 2(b), the quicker and more frequent throughput changes (top subgraph) result in FastMPC experiencing rebuffering (middle subgraph), while RobustMPC does not. This is just one example illustrating the difficulty in picking parameters: in our evaluations (§4), we found that RobustMPC was itself too aggressive when selecting discount factors for some traces.

While this section uses synthetic traces for illustrative purposes, our evaluations with real traces (§4) more extensively demonstrate the limitations of current approaches with respect to selecting parameters, and the benefits of automatically tuning ABR parameters to network conditions.

3 OBOE DESIGN
Oboe aims to ensure good QoE for all users by enabling ABR algorithms to perform better across a wide range of network conditions. The configurations of many ABR algorithms are sensitive to network state, specifically to the value and variability of the available throughput between the client and the video server. For example, $\beta$ in HYB should be smaller when available throughput is highly variable, while $\gamma$ in BOLA should be higher. This explains why the algorithms perform differently for different values of parameters on a given client trace (§2.2). However, a line of prior work [17, 35, 38, 52, 60] has observed that network connections are piecewise stationary, that is, connections can be in one of several distinct states (§3.1), where each state is distinguished by stationarity in the statistical sense (informally, a process is stationary if its statistical properties, including mean and variance, do not change over time; see [43] for a more formal definition).

Oboe leverages the piecewise stationarity of network connections to address the key challenge of sensitivity of configurations to network conditions. It does so using a two-stage design: (a) an offline stage, where it pre-computes the best configuration choice for each (stationary) network state (§3.2), and (b) an online stage, where, during a session, it detects changes in network state and applies the pre-computed best configuration for the current (stationary) state (§3.3). Oboe can also accommodate publisher specifications such as session type (live vs. video-on-demand, time-to-live requirements), bitrate levels, or any explicit QoE tradeoffs (e.g., preference between rebuffering and average bitrate) (§3.2) by using these to influence the selection of the best configuration for each (stationary) network state in the offline stage.

3.1 Representing Network State
Most ABR algorithms today adapt bitrates based on the throughput (more precisely, goodput) achieved by recently downloaded chunks. This perceived throughput already accounts for network delays and loss rates, as well as the dynamics of the underlying transport protocol.

Figure 3: The logical diagram of the offline pipeline used by Oboe.

The network throughput along a path is not necessarily a stationary process [17, 35, 38, 52, 60]: flows at the bottleneck along a path may change over time, resulting in changes to available throughput, or the bottleneck itself may shift [35]. An analysis of the throughput traces used in our evaluations (§4) confirms the lack of stationarity when applied to the entire trace. We analyze throughput traces using the Augmented Dickey-Fuller (ADF [26]) test, a hypothesis test to check for stationarity in a time series. Our evaluations on a dataset of 15,000 video streaming throughput traces show that 59.5% were non-stationary (see §4.2 for details of the dataset), implying the presence of distinct mean and/or variance in different segments of the traces.

However, prior work [17, 35, 38, 52, 60] shows that TCP connection throughput can be modeled as a piecewise stationary process: the connection consists of multiple non-overlapping segments, where each segment is stationary and often lasts for tens of seconds or minutes (e.g., Figure 8). Moreover, Zhang et al. [60] show that the throughput in each segment may be modeled as an i.i.d. process.

Motivated by these observations, Oboe defines network state $s$ by a tuple $\langle \mu_s, \sigma_s \rangle$, where $\mu_s$ is the mean and $\sigma_s$ the standard deviation of the client-perceived throughput in a (stationary) segment of the underlying TCP connection.

3.2 Offline Mapping of Network States
To map network states to their optimal ABR configurations, Oboe uses a pipeline (Figure 3) consisting of three components: the ConfigEvaluator, the VirtualPlayer, and the ConfigSelector. The ConfigEvaluator takes a stationary throughput trace as input, which represents a particular network state, and drives the exploration of different ABR configurations over this trace. It does so by using the VirtualPlayer, which models the dynamics of an actual video player. The VirtualPlayer interfaces with the ABR algorithm implementation and outputs the performance of different configurations of the ABR. Finally, the ConfigSelector compares the performance of different configurations and builds a ConfigMap, which maps a given network state to the best configuration.


Generating throughput traces for ConfigEvaluator. To explore the configuration space of an ABR algorithm on each network state $s$, ConfigEvaluator needs a stationary throughput trace to represent $s$. To generate such a trace, we explored two different approaches. In one approach, we extracted stationary segments from real traces using offline change point detection ([10], described in §3.3). Change points capture points where the distribution changes. However, because we are not guaranteed coverage (i.e., not all states might be observable in real traces), we also explored a second approach, which involved generating a synthetic trace for each $s$ with $s$'s mean and standard deviation, assuming a Gaussian distribution for the throughput samples. This was motivated by Dinda et al. [38], who showed that the throughput of TCP flows of the same size in a given stationary segment may be modeled as a Gaussian distribution (also see §3.1). More recent work also shows that TCP throughput is well modeled as a Markov process, each of whose states may be modeled as a Gaussian distribution [49]. We found that Oboe with synthetic traces performed comparably to stationary segments from real traces. So, ConfigEvaluator uses synthetic traces.
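As a rough sketch of the second approach, a synthetic stationary trace for a state can be drawn from a Gaussian with that state's mean and standard deviation. The function name, the fixed seed, and the clamping of samples to a small positive floor (so that a simulated download always finishes) are assumptions made for this illustration; the text does not say how non-positive samples are handled.

```python
import random


def synthetic_trace(mean_kbps, std_kbps, n_samples, seed=0, floor_kbps=1.0):
    """Draw n_samples i.i.d. Gaussian throughput samples for one network state."""
    rng = random.Random(seed)  # seeded so the offline stage is reproducible
    return [
        max(floor_kbps, rng.gauss(mean_kbps, std_kbps))  # assumed clamp
        for _ in range(n_samples)
    ]
```

A call such as `synthetic_trace(2000, 200, 300)` would yield 300 samples representing the state with a mean of 2 Mbps and a standard deviation of 200 Kbps.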

Specifically, ConfigEvaluator quantizes both the mean and the standard deviation of throughput using a quantum (50 Kbps in our experiments), resulting in a set of states (10,000 in our experiments) spread over a two-dimensional space of throughput and standard deviation (0.05–10 Mbps in our experiments). For each state, we generate a synthetic stationary trace. We found that the benefits of finer quantization are marginal.
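The quantization step can be sketched as follows. The helper names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: values are rounded to the nearest multiple of the quantum, and the population standard deviation is used.

```python
import math
from statistics import mean, pstdev

QUANTUM_KBPS = 50  # quantum reported for the experiments


def quantize(value_kbps, quantum=QUANTUM_KBPS):
    """Snap a value to the nearest multiple of the quantum (ties round up)."""
    return int(math.floor(value_kbps / quantum + 0.5)) * quantum


def network_state(samples_kbps, quantum=QUANTUM_KBPS):
    """Map throughput samples of one segment to a quantized (mean, std) state."""
    return (
        quantize(mean(samples_kbps), quantum),
        quantize(pstdev(samples_kbps), quantum),
    )
```

For instance, the samples 1000, 1100, 900, and 1000 Kbps have a mean of 1000 Kbps and a standard deviation of about 70.7 Kbps, which this sketch maps to the state (1000, 50).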

Estimating ABR performance with VirtualPlayer and publisher specifications. Oboe uses VirtualPlayer, a trace-based simulator that mimics the behavior of an actual video player without downloading or rendering actual videos. It takes as input a throughput trace and outputs the QoE performance metrics of a video session for a specified ABR algorithm. We have validated VirtualPlayer in §4.7. In designing VirtualPlayer, we have decoupled ABR logic (Figure 3), so the same implementation of the ABR logic can be used in Oboe's offline and online stage. Further, this design provides an interface to the ABR designer through which they can easily integrate their ABR algorithm with Oboe without having to know about Oboe's internals.

The VirtualPlayer also takes into account publisher specifications for bitrate levels, player buffer sizes (determined by time-to-live requirements), and chunk size. These specifications are used by VirtualPlayer when it executes ABR algorithms on the input traces, ensuring that the resulting ConfigMap meets the publisher specifications. Finally, Oboe also allows the publisher to optionally express an explicit QoE tradeoff, such as maintaining the rebuffering under a desired threshold $x$. Oboe derives a ConfigMap that meets the rebuffering threshold in a best-effort manner. We evaluate the efficacy of this flexibility in §4.7.
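To make the idea of a trace-based simulator concrete, the following is a heavily simplified sketch, not Oboe's VirtualPlayer. All names are hypothetical. It assumes one throughput sample per chunk download (held constant for that download), a non-empty trace with positive samples, that the startup delay of the first chunk is not counted as rebuffering, and that the player idles when the buffer would exceed its cap. The ABR logic is passed in as a callback, mirroring the decoupling described above.

```python
def virtual_play(chunk_tput_kbps, bitrates_kbps, choose_level,
                 chunk_sec=4.0, max_buffer_sec=120.0):
    """Simulate one session and return average bitrate and rebuffering ratio.

    choose_level(buffer_sec, last_tput_kbps, bitrates_kbps) -> level index
    is the pluggable ABR logic; last_tput_kbps is None for the first chunk.
    """
    buffer_sec = 0.0
    stall_sec = 0.0
    chosen = []
    last_tput = None
    for i, tput in enumerate(chunk_tput_kbps):
        level = choose_level(buffer_sec, last_tput, bitrates_kbps)
        download_sec = bitrates_kbps[level] * chunk_sec / tput
        if i > 0:  # assumption: startup delay is not rebuffering
            stall_sec += max(0.0, download_sec - buffer_sec)
        # Playback drains the buffer during the download, then a chunk arrives.
        buffer_sec = max(0.0, buffer_sec - download_sec) + chunk_sec
        # Publisher-specified buffer cap: idle until the buffer fits.
        buffer_sec = min(buffer_sec, max_buffer_sec)
        chosen.append(bitrates_kbps[level])
        last_tput = tput
    played_sec = chunk_sec * len(chosen)
    return {
        "avg_bitrate_kbps": sum(chosen) / len(chosen),
        "rebuffer_ratio": stall_sec / (played_sec + stall_sec),
    }
```

Publisher specifications enter through `bitrates_kbps`, `chunk_sec`, and `max_buffer_sec`; an ABR configuration (such as a value of $\beta$) would be bound into `choose_level` before the call.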

Building the ConfigMap using ConfigSelector. To build the ConfigMap, the ConfigEvaluator drives the exploration of different configurations for an ABR algorithm. For a given network state $s$, ConfigEvaluator sweeps through possible configurations of the ABR algorithm using the VirtualPlayer. For example, the $\beta$ parameter in HYB can take values from 0 to 1, so ConfigEvaluator plays the trace for state $s$ for multiple values of $\beta$ (quantized for efficiency, see below) in this range.

For each parameter value $c_i$, VirtualPlayer outputs a performance vector $V_i = \langle v_1, v_2, \ldots, v_m \rangle$, where each $v_k$ corresponds to the value achieved by $c_i$ for a QoE metric (e.g., bitrate, rebuffering ratio, and, more generally, join time and frequency of switching bitrates [23]). This set of performance vectors, with the corresponding parameter values, is then sent to ConfigSelector for picking the best configuration.

ConfigSelector takes the set of performance vectors and determines the best configuration from them using vector dominance. A configuration $c_i$ is said to dominate $c_j$ if $V_i$ element-wise dominates $V_j$ (i.e., each element of $V_i$ is better than or equal to the corresponding element of $V_j$). This step also takes into account any rebuffering tolerance, and ConfigSelector applies this tolerance to select the maximal performance vector. Deferring the selection of the maximal vector for a given rebuffering tolerance to this stage (instead of filtering vectors in the previous step) is beneficial: it minimizes recomputation by allowing Oboe to quickly compute a new maximal vector if the publisher changes the rebuffering tolerance. At the end of this stage, Oboe obtains the ConfigMap, a complete mapping of each network state to its corresponding optimal ABR configuration.
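A small sketch of dominance-based selection follows. It is illustrative only: the performance vector is restricted to two metrics (average bitrate, where higher is better, and rebuffering ratio, where lower is better), and the tie-breaking policy is an assumption, namely to pick the highest-bitrate non-dominated configuration within the rebuffering tolerance and to fall back to the lowest-rebuffering one when none qualifies. All function names are hypothetical.

```python
def dominates(a, b):
    """True if vector a = (bitrate, rebuffer_ratio) is at least as good as b."""
    return a[0] >= b[0] and a[1] <= b[1]


def non_dominated(results):
    """Keep configurations whose vector is not dominated by a different vector.

    results: dict mapping configuration -> (avg_bitrate, rebuffer_ratio).
    """
    return {
        cfg: vec for cfg, vec in results.items()
        if not any(dominates(other, vec) and other != vec
                   for other in results.values())
    }


def select_config(results, rebuffer_tolerance=0.0):
    """Apply the rebuffering tolerance to the non-dominated set."""
    front = non_dominated(results)
    within = {c: v for c, v in front.items() if v[1] <= rebuffer_tolerance}
    if within:
        return max(within, key=lambda c: within[c][0])
    # Best effort (assumption): no candidate meets the tolerance.
    return min(front, key=lambda c: front[c][1])
```

Because the non-dominated set is computed independently of the tolerance, changing the tolerance only requires re-running the final selection, which matches the motivation for deferring that step.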

Two optimizations can be used to quicken the rate of exploration of the ConfigEvaluator. The first is to quantize the parameter sweep so that configurations are evaluated at a coarser granularity. This trades off some performance for lower computational complexity. The second optimization is based on the observation that there is generally a monotonic relationship between parameter values and performance. For instance, for HYB (§2.1), the rebuffering ratio and average bitrate are monotonically non-decreasing with the parameter $\beta$. Based on this observation, we can use an $O(\log n)$ binary search of the configuration space instead of doing a full $O(n)$ sweep of all configurations.
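The second optimization can be sketched as a standard binary search. The sketch assumes, as the text does for HYB, that the rebuffering ratio is non-decreasing in the parameter, so the most aggressive value within a rebuffering tolerance is the last one that satisfies it. The function and parameter names are hypothetical; `rebuffer_ratio_of` stands in for one VirtualPlayer run.

```python
def most_aggressive_within_tolerance(candidates, rebuffer_ratio_of,
                                     tolerance=0.0):
    """Binary-search sorted candidates for the largest value that keeps
    rebuffering within tolerance; returns None if no candidate does.

    Requires rebuffer_ratio_of to be monotonically non-decreasing over
    the (ascending) candidates.
    """
    lo, hi = 0, len(candidates) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if rebuffer_ratio_of(candidates[mid]) <= tolerance:
            best = candidates[mid]  # feasible: try more aggressive values
            lo = mid + 1
        else:
            hi = mid - 1            # infeasible: back off
    return best
```

With 20 quantized values of $\beta$, this needs at most 5 simulated sessions per network state rather than 20.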

3.3 Online ABR Tuning
Oboe uses the ConfigMap generated offline and live throughput measurements from the video player to dynamically change ABR configurations during video playback. It does this by using an online change point detection algorithm [14]. This algorithm identifies, in an online fashion, if the distribution of the throughput samples has changed significantly, signaling a state transition. When a change point is detected, the algorithm also provides the new state $s$'s mean and standard deviation. Oboe's ChangeDetector (Figure 4) implements the change point detection algorithm, and the ReconfEngine is responsible for updating the ABR configuration based on a new network state and the ConfigMap.

Figure 4: Logical diagram of Oboe's online pipeline.

Change point detection algorithms. Such algorithms analyze a time series and check if there are regions in the time series where the underlying distribution of the data changes to a different set of parameters. Offline change-point methods require the full time series to be available, whereas online methods work with a continuous stream of samples as they become available. We focus on online methods, since Oboe identifies change points for an in-progress session and dynamically changes configurations.

While several techniques exist for change point detection [22, 33, 36, 44, 54, 58], we focus on probabilistic methods [14, 18, 24, 57]. Further, we use a Bayesian online probabilistic change-point detector [14] for two reasons. First, in [14], a sequence of observations can be partitioned into non-overlapping states such that the observations are i.i.d. conditioned on a given network state $s$. This view aligns well with the way we have defined a network state (§3.1). Further, the algorithm is fast and requires no prior knowledge about the data stream, matching our scenario. We use the implementation provided in [10] and integrate it with the ChangeDetector.
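To give a feel for Bayesian online change-point detection, here is a compact, simplified sketch of the run-length recursion. It is not the implementation from [10] and makes simplifying assumptions that the detector used by Oboe does not need: samples in a segment are Gaussian with an unknown mean but a known variance `obs_var`, the hazard (change) probability per sample is constant, and run-length hypotheses are never pruned. The class and attribute names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


class OnlineChangeDetector:
    """Tracks a posterior over the run length (samples since the last change)."""

    def __init__(self, hazard=0.01, prior_mean=0.0, prior_var=1e6, obs_var=1.0):
        self.hazard = hazard
        self.prior = (prior_mean, prior_var)
        self.obs_var = obs_var          # assumed known (simplification)
        self.run_probs = [1.0]          # P(run length = r | samples so far)
        self.params = [self.prior]      # posterior (mean, var) per run length

    def update(self, x):
        """Consume one sample; return (most likely run length, segment mean)."""
        # Predictive probability of x under each run-length hypothesis.
        pred = [gaussian_pdf(x, m, v + self.obs_var) for m, v in self.params]
        weighted = [p * q for p, q in zip(self.run_probs, pred)]
        growth = [w * (1.0 - self.hazard) for w in weighted]  # run continues
        change = sum(weighted) * self.hazard                  # run resets
        probs = [change] + growth
        total = sum(probs)
        self.run_probs = [p / total for p in probs]
        # Conjugate update: each surviving run absorbs x; run 0 uses the prior.
        updated = []
        for m, v in self.params:
            new_v = 1.0 / (1.0 / v + 1.0 / self.obs_var)
            updated.append((new_v * (m / v + x / self.obs_var), new_v))
        self.params = [self.prior] + updated
        best = max(range(len(self.run_probs)), key=self.run_probs.__getitem__)
        return best, self.params[best][0]
```

A sharp drop in the most likely run length signals a change point, and the posterior mean of the new run gives an estimate of the new state's mean throughput.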

Detecting changes in network state. During a video session, ChangeDetector is continually fed with a series of observations of the network throughput, which it uses to detect state changes. ChangeDetector calculates throughput and standard deviation by only considering those samples which belong to the current state. To generate inputs to ChangeDetector, one approach is to use each downloaded chunk to obtain a single throughput sample. However, this may be too coarse-grained and prevent detection of changes in network state that occur during the chunk download. Instead, we use fine-grained samples recorded at periodic intervals (tens of milliseconds) during the download of each chunk. Players such as Dash.js already periodically log intermediate throughput samples during a chunk download, so obtaining these samples does not incur any additional overhead. We only need to modify players to report these samples to Oboe. The set of samples is provided to ChangeDetector after the chunk download, and any change in state is only detected at the end of the chunk download. This is acceptable since any action that can be taken by the ABR algorithm (such as a bitrate switch) only impacts subsequent chunks. In the rarer case that an ABR algorithm abandons the download of a chunk that takes too long, the report is sent when the chunk download is abandoned. §4.8 evaluates the overheads of ChangeDetector.

An alternative approach to changing configurations is to use an exponentially weighted moving average (EWMA) of the mean and standard deviation of throughput samples, and to look up the corresponding configuration. We experimented with such an approach and found its performance unsatisfactory. The approach can result in continual and unnecessary reconfigurations, since throughput may vary across samples even when the network is (statistically) stationary. Damping these changes can result in slow reaction times when a reconfiguration is actually beneficial. In contrast, Oboe (i) models the underlying TCP connection as a sequence of states, (ii) does not make changes to the configuration within a given network state, and (iii) only reconfigures when a state change is observed.
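The reconfiguration churn of the EWMA alternative can be illustrated with a toy sketch that counts how often the quantized (mean, standard deviation) lookup key changes. The function names, the smoothing factor, and the incremental variance update are choices made for this illustration, not details from our experiments.

```python
import math


def quantize(value, quantum=50):
    return int(math.floor(value / quantum + 0.5)) * quantum


def ewma_reconfigurations(samples_kbps, alpha=0.2, quantum=50):
    """Count changes of the quantized EWMA (mean, std) key over a trace."""
    mean = float(samples_kbps[0])
    var = 0.0
    last_state = None
    changes = 0
    for x in samples_kbps:
        diff = x - mean
        mean += alpha * diff                            # EWMA of the mean
        var = (1.0 - alpha) * (var + alpha * diff * diff)  # EW variance
        state = (quantize(mean, quantum), quantize(math.sqrt(var), quantum))
        if last_state is not None and state != last_state:
            changes += 1                                # would trigger a lookup
        last_state = state
    return changes
```

On a toy trace that simply alternates between 800 and 1250 Kbps, whose underlying behavior never changes, the smoothed mean settles into oscillating between roughly 1000 and 1050 Kbps, so the lookup key keeps flipping between neighboring states on nearly every sample.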

Reconfiguring ABR Algorithm. When a change in the network state is detected, the ChangeDetector signals the change and the new network state $s$ to the ReconfEngine. The ReconfEngine then searches a neighborhood of radius $r$ in the ConfigMap to select the configuration to use for state $s$. Specifically, if state $s$ is a point in a 2-dimensional space of average throughput and standard deviation of throughput, then it picks the most conservative ABR configuration within a search radius $r$ around $s$. It does this for two reasons. First, because Oboe quantizes the network states, it might not have precomputed the best configuration for $s$. Second, the estimated new network state $s$ may have some error, for example, due to inefficiencies in the client download stack [27]. Given these sources of uncertainty, Oboe chooses to be safe in its selection of the best configuration for $s$. Finally, ReconfEngine configures the ABR algorithm, and the reconfigured ABR algorithm is ready to compute the bitrate decision to be used for the next chunk at this point.
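The neighborhood search can be sketched as below. The sketch assumes a ConfigMap stored as a dictionary keyed by quantized (mean, standard deviation) pairs, Euclidean distance in that 2-dimensional space, and HYB-style configurations where a smaller $\beta$ is more conservative (hence the default `most_conservative=min`). Returning `None` when no precomputed state lies within the radius is also an assumption; all names are hypothetical.

```python
import math


def reconfigure(config_map, state, radius_kbps, most_conservative=min):
    """Pick the most conservative configuration within radius of state.

    config_map: dict mapping (mean_kbps, std_kbps) -> configuration value.
    state:      (mean_kbps, std_kbps) reported by the change detector.
    """
    mean, std = state
    nearby = [
        cfg for (m, s), cfg in config_map.items()
        if math.hypot(m - mean, s - std) <= radius_kbps
    ]
    if not nearby:
        return None  # assumption: caller keeps the current configuration
    return most_conservative(nearby)
```

For BOLA, where a higher $\gamma$ is more conservative, the same function would be called with `most_conservative=max`.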

4 EVALUATION
In this section, we demonstrate Oboe's ability to auto-tune three existing algorithms: RobustMPC, BOLA, and HYB. We also compare an Oboe-tuned RobustMPC to Pensieve [39].

4.1 Metrics
The performance of a video session depends on multiple factors. Average bitrate and rebuffering ratio were found to have the most impact on user quality of experience [23], though other factors, such as changes in bitrates during a session, can play a role [23]. There is no consensus on how to best capture a user's QoE. Consequently, ABR algorithms today are designed to optimize different metrics. For instance, HYB and BOLA primarily maximize average bitrate subject to low rebuffering. In contrast, other algorithms [39, 59] have been designed to optimize a QoE metric which is a linear combination of bitrate, rebuffering, and bitrate changes (smoothness).

Figure 5: A scatter plot of average bitrate and rebuffering ratio between the VirtualPlayer and real Dash.js player.

With Oboe, our primary evaluation goal is to demonstrate the extent to which it can improve the underlying metrics that an ABR algorithm is designed for. Thus, our evaluations with BOLA and HYB focus on average bitrate and rebuffering, while those with MPC+Oboe focus on the linear combination of QoE metrics (which we refer to as QoE-lin [59]), defined as follows. For a video with $N$ chunks, let $R_i$ be the bitrate chosen for chunk $i$. Then the magnitude of bitrate changes $M$ may be defined as

$$M = \sum_{i=1}^{N-1} |R_{i+1} - R_i|.$$

If the session experiences a total of $T$ seconds of rebuffering, then

$$\text{QoE-lin}(p, c) = \frac{1}{N} \left( \sum_{i} R_i - pT - cM \right),$$

where $p$ and $c$ represent scaling penalties applied to rebuffering and changes in the session. This function may be viewed as the session QoE averaged over the number of chunks. For our videos, which had a maximum bitrate of 4.3 Mbps, we use $p = 4.3$ and $c = 1$ as our default parameters (following previous work that set the default rebuffering penalty equal to the maximum bitrate value [39, 59]).
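As a worked illustration, QoE-lin can be computed as follows. The sketch reads the definition as the session QoE (total bitrate minus the rebuffering and switching penalties) divided by the number of chunks, with bitrates in Mbps and rebuffering in seconds; the function name is hypothetical.

```python
def qoe_lin(bitrates_mbps, rebuffer_sec, p=4.3, c=1.0):
    """Per-chunk average of (sum of bitrates - p*T - c*M)."""
    n = len(bitrates_mbps)
    # M: total magnitude of bitrate changes between consecutive chunks.
    m = sum(abs(b - a) for a, b in zip(bitrates_mbps, bitrates_mbps[1:]))
    return (sum(bitrates_mbps) - p * rebuffer_sec - c * m) / n
```

For a four-chunk session at 1.2, 1.2, 2.85, and 2.85 Mbps with 2 seconds of rebuffering, the bitrate sum is 8.1, $M = 1.65$, and the rebuffering penalty is $4.3 \times 2 = 8.6$, giving $(8.1 - 8.6 - 1.65)/4 = -0.5375$.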

Even when an algorithm optimizes a metric such as QoE-lin, it is important to understand the distributions of the underlying factors. These factors represent concrete application performance that publishers understand how to reason about. Moreover, a unified metric like QoE-lin can obscure important differences. For example, two sessions may have the same QoE-lin but different performance in the underlying metrics, leading to varied user experience. So, we also present graphs of these metrics.

4.2 Methodology

Implementation. For RobustMPC, we used the implementation available at [11]. Our implementation of BOLA [bola] is from the Dash.js player. The implementation of HYB is a variant of the algorithm used in a large-scale deployment. These ABR algorithms and Oboe's online stage (change point detection and ABR reconfiguration) run on the server in our experiments. Our client runs the Dash.js video player (version 12), a reference player implemented in JavaScript by the MPEG-DASH forum [7]. We modified Dash.js to send client player state information (e.g., buffer length, video play state, and throughput measurements) to Oboe (§5). This player runs on the Google Chrome browser (version 61) in our experiments. In §5, we show that Oboe can also be run as a cloud service.

Testbed setup. Our evaluations measure ABR performance by delivering a video stream (the "EnvivioDash3" video from the MPEG-DASH reference videos [12]) from a video hosting server to a client, while varying network conditions using throughput traces from real user sessions. We use bitrates {300, 750, 1200, 1850, 2850, 4300} kbps, with a 4 second chunk duration and a total length of 192 seconds. We focus on this video as it has been used in prior work [39], and we do not consider videos of longer duration because we only have throughput traces available for a video publisher that serves short music videos (as we discuss below). The video is hosted on an Apache server. Both the server and client software run on the same 8-core, 4 GHz Intel i7 commodity desktop with 12 GB RAM running Ubuntu 16.04. Between server and client, we emulate different network conditions using the Chrome DevTools API [9]. This allows us to control the upload/download throughput as well as latency using the Chrome-Remote-Interface, based on throughput traces [5]. We use 571 throughput traces (footnote 3) from our dataset (discussed below) for this emulation. All our testbed experiments use a client buffer of 2 minutes.

Datasets. We use throughput traces from real user sessions collected over a three-month period. Each trace contains the individual chunk sizes and their download times for on-demand video sessions from a publisher that serves short (4-6 minute) music videos. We derive throughput by dividing the chunk sizes by their download durations. The traces contain sessions that used desktops with wired connections and also sessions on mobile devices using WiFi or cellular connections. Like previous work [39, 59], we primarily focus on traces that have less than 6 Mbps average throughput, since this is the regime where bitrate switching decisions are likely to have QoE impact. We filtered out traces which were too short for playing our entire 192 second video, after which we obtained 5K traces from wired desktops and 4K sessions from WiFi or 3G/4G mobile devices. Our testbed experiments use a subset of 571 traces, with roughly the same number of traces sampled from each of desktop and mobile clients.
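The throughput derivation and the 6 Mbps filter can be sketched as below. This is an illustration only: the trace layout (a list of `(size_bytes, download_seconds)` pairs), the function names, and the definition of a trace's average throughput as total bits over total download time are all assumptions, not details given in the text.

```python
from typing import List, Tuple

Chunk = Tuple[int, float]  # (size in bytes, download time in seconds); assumed layout


def chunk_throughputs_mbps(chunks: List[Chunk]) -> List[float]:
    """Per-chunk throughput: chunk size divided by its download duration."""
    return [size * 8 / secs / 1e6 for size, secs in chunks if secs > 0]


def average_throughput_mbps(chunks: List[Chunk]) -> float:
    """Assumed definition: total bits delivered over total download time."""
    total_bits = sum(size * 8 for size, _ in chunks)
    total_secs = sum(secs for _, secs in chunks)
    if total_secs <= 0:
        raise ValueError("trace has no download time")
    return total_bits / total_secs / 1e6


def in_focus_regime(chunks: List[Chunk], max_avg_mbps: float = 6.0) -> bool:
    """Keep traces whose average throughput is below the 6 Mbps cutoff."""
    return average_throughput_mbps(chunks) < max_avg_mbps
```

A trace with chunks of 1 MB in 2 s and 0.5 MB in 1 s yields 4 Mbps for each chunk and 4 Mbps on average, so it would be kept.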

VirtualPlayer setup. Recall that Oboe uses the VirtualPlayer to obtain a ConfigMap for any ABR algorithm. Since the majority of our results use an actual testbed with the Dash.js player, the benefits of Oboe in our evaluation results already arise despite any inaccuracies in building the ConfigMap on account of using the VirtualPlayer. That said, we have also verified that the VirtualPlayer does a good job of tracking the performance of the actual ABR algorithms. For instance, Figures 5(a) and 5(b) demonstrate this for the HYB algorithm. The figures show the correlation for the average bitrate and rebuffering ratio for 100 throughput traces randomly sampled from our dataset, using HYB on the VirtualPlayer compared to using an actual Dash.js player. For both metrics, the graph closely tracks the $y = x$ line, indicating close correlation. Given these close correlations, we use the VirtualPlayer in §4.7 to explore Oboe's performance over a larger range of diverse settings and our entire set of traces.

Footnote 3: Available at https://github.com/USC-NSL/Oboe

Figure 6: The percentage improvement in QoE-lin of MPC+Oboe over RobustMPC for the Testbed experiment. The distribution of average bitrate, rebuffering ratio, and bitrate change magnitude for the schemes is also shown.

Figure 7: QoE-lin of MPC+Oboe compared to RobustMPC.

4.3 Oboe with RobustMPC

We now demonstrate that Oboe can be used to auto-tune RobustMPC, the best performing variant of the MPC algorithms. The resulting MPC+Oboe uses the best value of the discount parameter $d$ corresponding to the current network state, replacing RobustMPC's online adaptation based on throughput estimates obtained over the past 5 chunks (§2).

Figure 6(a) shows the CDF of the percentage improvement in QoE-lin of MPC+Oboe over RobustMPC (footnote 4). MPC+Oboe improves QoE-lin for 71% of sessions, with an overall average QoE-lin improvement of 17.62% across all sessions. In particular, for 19% of the sessions, QoE-lin improves by more than 20%. For the sessions where MPC+Oboe is unable to improve on RobustMPC, its performance degradation is mostly under 8%. Figures 6(b), 6(c), and 6(d) show the constituent QoE metrics. While MPC+Oboe achieves distributionally similar bitrates to RobustMPC, as shown in 6(b), it significantly reduces rebuffering across sessions: the number of sessions with rebuffering reduces from 332 to 53. Further, it also achieves better playback smoothness by improving the median per chunk change magnitude by 38 (Figure 6(d)). Finally, Figure 7 shows the CDF of QoE-lin for MPC+Oboe and RobustMPC, and indicates MPC+Oboe distributionally performs better.

Footnote 4: The increase in QoE-lin over RobustMPC relative to the absolute QoE-lin value of RobustMPC, expressed as a percentage.

Figure 8: An example session showing how MPC+Oboe is able to outperform RobustMPC by reconfiguring the discount parameter when a network state change is detected.
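The per-session percentage improvement plotted in Figure 6(a) is defined in footnote 4 as the increase relative to the absolute baseline value. A small sketch of that definition, with an illustrative function name, is shown here. Dividing by the absolute value keeps the sign meaningful when the baseline QoE-lin is negative.

```python
def pct_improvement(new_qoe: float, baseline_qoe: float) -> float:
    """Increase over the baseline, relative to |baseline|, as a percentage."""
    if baseline_qoe == 0:
        raise ValueError("percentage improvement is undefined for a zero baseline")
    return 100.0 * (new_qoe - baseline_qoe) / abs(baseline_qoe)
```

For example, moving from a baseline QoE-lin of -1.0 to -0.5 counts as a 50% improvement rather than a 50% degradation.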

Figure 8 illustrates, using a single session, why MPC+Oboe performs better than RobustMPC. The top graph shows throughput as a function of time, which includes an initial stable state followed by a drop in throughput. The middle graph shows how the discount factor $d$ of both RobustMPC and MPC+Oboe varies (the predicted throughput for each system is reduced by a factor of $\frac{1}{1+d}$, where $d$ is shown on the y-axis). During the initial stable state, when prediction errors are low, RobustMPC steadily lowers its discount factor, leading to more aggressive bitrate selections (not shown). This results in a rebuffering event 44 seconds into the session (the lowest graph shows buffer occupancy, with 0 indicating a rebuffering event). In contrast, MPC+Oboe does not incur a rebuffering event and maintains a fixed $d$ during the initial stable state. At 29 sec, it detects a change in the network state and adapts its discount factor, leading to more conservative bitrate selections.
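The role of the discount factor can be illustrated with a short sketch. It assumes the harmonic-mean predictor over the past 5 chunks mentioned elsewhere in this section and simply scales the prediction by $\frac{1}{1+d}$. How $d$ is chosen (RobustMPC's error-based adaptation or a lookup in Oboe's ConfigMap) is outside the sketch, and the function names are hypothetical.

```python
from typing import Sequence


def harmonic_mean(samples: Sequence[float]) -> float:
    """Harmonic mean of positive throughput samples."""
    if not samples or any(s <= 0 for s in samples):
        raise ValueError("need at least one positive throughput sample")
    return len(samples) / sum(1.0 / s for s in samples)


def discounted_prediction(history: Sequence[float], d: float, window: int = 5) -> float:
    """Predicted throughput, scaled by 1 / (1 + d).

    history: past per-chunk throughput samples, oldest first.
    d: discount factor supplied by the caller (assumed non-negative).
    """
    recent = history[-window:]
    return harmonic_mean(recent) / (1.0 + d)
```

A larger $d$ yields a lower predicted throughput and therefore more conservative bitrate choices, which matches the behavior described for MPC+Oboe after the detected state change.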

4.4 Oboe vs. Pensieve

Pensieve [39] uses deep reinforcement learning [41, 42], a combination of deep learning with reinforcement learning [50], and has been shown to outperform existing ABRs, including RobustMPC [39], in some settings. Since MPC+Oboe outperforms RobustMPC as well, we explore how MPC+Oboe performs relative to Pensieve. Our experiments use the Pensieve implementation provided by the authors [11].

Figure 9: The percentage improvement in QoE-lin of MPC+Oboe over Pensieve for the 0-6 Mbps throughput region. The distribution of average bitrate, rebuffering ratio, and bitrate change magnitude for the schemes is also shown.

Figure 10: Validation of our training methodology for Pensieve.

Figure 11: QoE-lin of MPC+Oboe compared to Pensieve.

Figure 12: Benefits of specializing Pensieve models. Each curve shows the QoE improvement of MPC+Oboe relative to each Pensieve model.

Figure 13: QoE improvement of MPC+Oboe over two ways of dynamically selecting from specialized Pensieve models.

Pensieve Re-Training and Validation. Before evaluating Pensieve on our dataset, we retrain Pensieve using the source code on the trace dataset provided by the Pensieve authors [11]. This helps us validate our retraining, given that deep reinforcement learning results are not easy to reproduce [29].

We experimented with five different initial entropy weights in the author-suggested range of 1 to 5, and linearly reduced their values in a gradual fashion using plateaus, with five different decrease rates, until the entropy weight eventually reached 0.1. This rate scheduler follows best practice [55]. From the trained set of models, we then selected the best performing model (an initial entropy weight of 1, reduced every 800 iterations until it reaches 0.1 over 100K iterations) and compared its performance to the pre-trained Pensieve model provided by the authors. Figure 10 shows CDFs of QoE-lin for the pretrained (Original) model and the model trained by us (Retrained). The performance distributions of the two models are almost identical over the test traces provided by the Pensieve authors, thereby validating our retraining methodology.
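One plausible reading of the selected schedule is a staircase that holds the entropy weight constant for 800 iterations and then lowers it by a fixed amount, reaching the floor at 100K iterations. The sketch below encodes that reading. The equal step size is an assumption (the text gives only the endpoints and the plateau length), and the function is not taken from the Pensieve code base.

```python
def entropy_weight(iteration: int,
                   initial: float = 1.0,
                   final: float = 0.1,
                   plateau_iters: int = 800,
                   total_iters: int = 100_000) -> float:
    """Staircase decay: constant on each plateau, linear across plateaus."""
    n_steps = total_iters // plateau_iters          # 125 plateaus with the defaults
    decrement = (initial - final) / n_steps          # assumed equal-sized steps
    steps_taken = min(iteration // plateau_iters, n_steps)
    # Clamp at the floor so the weight never drops below `final`.
    return max(final, initial - decrement * steps_taken)
```

With the defaults, the weight stays at 1.0 for iterations 0 through 799, drops to about 0.9928 at iteration 800, and sits at 0.1 from iteration 100K onward.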

Having validated our retraining methodology, we trained Pensieve on our dataset with the same complete strategy described above. For this, we pick 1600 traces randomly from our dataset with average throughput in the 0-6 Mbps range. The number of training traces, the number of iterations per trace, and the range of throughput are similar to [39]. We then compare Pensieve and MPC+Oboe over a separate test set of traces, also in the range of 0-6 Mbps (§4.2).

Comparison with Pensieve. Figure 9(a) shows the CDF of the percentage improvement in QoE-lin for MPC+Oboe over Pensieve. MPC+Oboe outperforms Pensieve for 81% of the sessions, with a QoE-lin improvement of 23.9% on average across all sessions. 25% of the sessions achieve more than 20% QoE-lin improvement. For the sessions where MPC+Oboe is unable to improve over Pensieve, the performance difference is mostly less than 5%. Figures 9(b), 9(c), and 9(d) show that MPC+Oboe distributionally outperforms Pensieve with respect to all underlying metrics. It reduces the number of sessions with rebuffering from 107 to 53, reduces the median per chunk change magnitude by 439, and improves median and 95th percentile average bitrate by 26 and 47, respectively. Finally, Figure 11 shows the CDF of QoE-lin for MPC+Oboe and Pensieve, and indicates MPC+Oboe performs distributionally better.

Analyzing Pensieve performance. To understand where these performance improvements were coming from, we examined the relative performance of these two schemes in the 0-3 Mbps range (i.e., traces having an average throughput between 0-3 Mbps). In this more constrained range of network conditions, we found that MPC+Oboe achieves bigger gains over Pensieve (average QoE-lin improvement in 0-3 Mbps is 46.23%). We hypothesize that this performance difference stems from the fact that Pensieve builds a single model which does not specialize to different throughput ranges.

To test this, we trained a separate Pensieve model only with traces that have an average throughput in the 0-3 Mbps range, and compared it with MPC+Oboe. Figure 12 shows the per-session QoE-lin improvement of MPC+Oboe compared to Pensieve models trained for 0-3 Mbps (which we refer to as Pens-Specialized) and for 0-6 Mbps. The median QoE-lin improvement with MPC+Oboe over Pens-Specialized is 10.49%, while the median improvement over Pensieve is 19.9%. This indicates specializing the model does improve Pensieve's performance.

Figure 14: Percentage improvement in bitrate and rebuffering of BOLA+Oboe over BOLA (a), (b) and HYB+Oboe over HYB (c), (d).

Figure 15: Average QoE-lin of MPC+Oboe with various throughput predictors.

Thus, Pensieve's model is as yet unable to create specialized versions of itself based on the session characteristics. By contrast, Oboe specializes parameters for every network state and therefore performs better. We have also validated Pensieve's inability to specialize in several other ways: building a model for the 3-6 Mbps range and showing that it performs better with test data in that range compared to a 0-6 Mbps model; checking that a 0-6 Mbps model performs better for data in that range compared to a 0-100 Mbps model; and ensuring that these results hold even when the training set is doubled. It is hard to pinpoint exactly why Pensieve is unable to learn to be more conservative in the 0-3 Mbps range: deep neural network models remain a black box despite efforts by the machine learning community to make these models more transparent [45], and obtaining such understanding may need further advances in interpretable deep learning models.

A model selector with Pensieve. One way to improve Pensieve's specialization might be to train different models for different throughput ranges and use the model more suited to the network conditions. To test the efficacy of this approach, we used two models (specialized for 0-3 Mbps and 3-6 Mbps) and tried two different model selectors. Pens-SelMultiple switches models throughout the session using the average throughput of the past 5 chunks. Pens-SelOnce starts with the 0-6 Mbps model, selects either the 0-3 Mbps or 3-6 Mbps model based on the average throughput of the first 5 chunks, and does not switch thereafter.
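The two selectors can be sketched as pure functions of the throughput history. Several details here are assumptions made for illustration: the use of an arithmetic mean, the 3 Mbps split point between the two specialized models, averaging over fewer than 5 samples early in the session for Pens-SelMultiple, and falling back to the 0-6 Mbps model when no history exists. Model labels are plain strings rather than real Pensieve models.

```python
from statistics import mean
from typing import Sequence


def specialized_model(avg_mbps: float, split_mbps: float = 3.0) -> str:
    """Pick the specialized model whose training range contains the average."""
    return "0-3" if avg_mbps < split_mbps else "3-6"


def sel_multiple(history: Sequence[float], window: int = 5) -> str:
    """Pens-SelMultiple: re-select before every chunk from the past `window` chunks."""
    if not history:
        return "0-6"  # assumed fallback before any throughput sample exists
    return specialized_model(mean(history[-window:]))


def sel_once(history: Sequence[float], window: int = 5) -> str:
    """Pens-SelOnce: general model for the first `window` chunks, then a fixed choice."""
    if len(history) < window:
        return "0-6"
    return specialized_model(mean(history[:window]))
```

With a history of five 1 Mbps samples followed by three 9 Mbps samples, `sel_once` stays on the 0-3 Mbps model while `sel_multiple` has already moved to the 3-6 Mbps model.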

Figure 13 shows CDFs of per-session QoE-lin improvement of MPC+Oboe over these selectors. MPC+Oboe is able to outperform both Pens-SelMultiple and Pens-SelOnce, with average QoE-lin improvements of 14.2% and 24.32%, respectively. Even though one of the model selection schemes offers some improvements over the 0-6 Mbps Pensieve model, the benefits are modest. We hypothesize that this behavior is due to the dynamic selection of distinct Pensieve models, which can interfere with reinforcement learning's decision choices, since during training the reinforcement learning algorithm assumes there is no such third-party intervention.

4.5 Oboe with other ABR Algorithms

Oboe can also improve other existing ABR algorithms, such as BOLA and HYB, which are designed to maximize average bitrate while minimizing rebuffering.

BOLA. BOLA+Oboe tunes $\gamma$ (§2), which determines how much the algorithm strives to avoid rebuffering. BOLA, as implemented in Dash.js, uses a fixed default value of $\gamma = -1028$. Figures 14(a) and 14(b) show CDFs of per-session performance improvement over BOLA with respect to average bitrate and rebuffering ratio. BOLA+Oboe maintains the rebuffering ratio of BOLA while improving average bitrates for more than 83% of sessions, with an overall increase of 72 on average across all sessions. For sessions where BOLA+Oboe does not outperform BOLA, its degradation is less than 31.

HYB. The performance of HYB is sensitive to the choice of the $\beta$ parameter, which HYB+Oboe tunes. In production, HYB uses $\beta = 0.25$, determined using A/B tests in a large-scale deployment. Figures 14(c) and 14(d) show CDFs of per-session performance improvement of average bitrate and rebuffering ratio over HYB. As with BOLA, HYB+Oboe maintains similar rebuffering ratios, as shown in 14(d), but improves bitrates for 98% of sessions, with an overall average bitrate improvement of 832 across all sessions.

4.6 Sensitivity experiments

Alternative throughput traces. To understand how Oboe works on throughput datasets beyond those discussed in §4.2, we evaluated Oboe on two other datasets, FCC [8] and HSDPA [46], that have been used in recent work [39, 59]. FCC is a broadband dataset, while HSDPA contains throughput traces collected from video streaming sessions over 3G networks in Norway using mobile devices. Our comparisons use the traces and a Pensieve model pre-trained for those traces, available at [11]. We focus our evaluations on MPC+Oboe and Pensieve, given that Pensieve has been shown to outperform existing ABR schemes, including RobustMPC, on these traces. Our results show that MPC+Oboe continues to perform better than RobustMPC on these traces. Further, relative to Pensieve, MPC+Oboe improves QoE-lin by an average of 6.94% across the FCC dataset and 10.92% across the HSDPA dataset. These improvements are more modest than those in Figure 9(a). The vast majority of traces in the FCC and HSDPA sets have an average throughput under 3 Mbps (over 95% for FCC and 98% for HSDPA). The results corroborate Figure 12, which indicates that MPC+Oboe provides more modest gains over Pensieve when the latter is trained and evaluated on datasets with a narrow throughput range. MPC+Oboe provides larger gains in settings like the traces discussed in §4.2, where only 41% of traces are under 3 Mbps and 59% are in the 3-6 Mbps range.

Figure 16: Comparing HYB with multiple fixed configurations and HYB+Oboe for various settings.

Alternative throughput prediction methods. Our experiments with RobustMPC rely on throughput prediction based on the harmonic mean of prior throughput samples (following earlier work [39, 59]), with Oboe tuning the configuration to compensate for prediction errors. We next consider whether Oboe's benefits hold if RobustMPC were to have more accurate throughput predictions, potentially by using alternate prediction methods [49]. Rather than using a specific prediction technique, we consider an ideal (and unachievable) approach that we denote as Ideal(T), which can exactly predict the average throughput over the next T seconds. Our experiments were conducted in simulation using the VirtualPlayer and the testbed experiment traces (§4.2).
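In a trace-driven simulation, an Ideal(T) oracle only needs to look ahead in the trace. A minimal sketch follows, assuming the trace is stored as one throughput sample per second and that the look-ahead window is simply truncated near the end of the trace. Both are illustrative choices rather than details stated in the text.

```python
from typing import Sequence


def ideal_prediction(per_second_mbps: Sequence[float], now_s: int, horizon_s: int) -> float:
    """Ideal(T): exact average throughput over the next `horizon_s` seconds.

    per_second_mbps[t] is the throughput during second t of the trace (assumed layout).
    """
    window = per_second_mbps[now_s:now_s + horizon_s]
    if not window:
        raise ValueError("no future samples left in the trace")
    return sum(window) / len(window)
```

For a trace of three seconds at 1 Mbps followed by three seconds at 5 Mbps, `ideal_prediction(trace, 1, 4)` averages two samples from each regime and returns 3.0.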

Figure 15 shows the average QoE-lin across the traces for RobustMPC and MPC+Oboe using both the default harmonic mean approach and Ideal(T) for different values of T. Although RobustMPC performs better with an ideal predictor, Oboe still provides benefits, achieving an average improvement in QoE-lin of 6.34% for Ideal(5) and of 1.8% for Ideal(10), compared to a 16.1% improvement with the harmonic mean estimator. While the magnitude of benefits is smaller with the ideal prediction approach, in practice Oboe will likely result in larger benefits, since even more sophisticated schemes [49] cannot achieve the ideal predictions, and the errors are likely to grow with larger T.

Oboe can improve performance over RobustMPC even when an Ideal(T) prediction method is used, for two reasons. First, $T$ may not match the duration of chunk downloads with RobustMPC, which depends on the exact sequence of bitrates chosen during the look-ahead window. The duration is not known a priori, since RobustMPC itself determines the bitrates based on a provided prediction. Second, the decisions made by RobustMPC are over a small look-ahead window, which may not guarantee optimality over the entire session duration.

4.7 Oboe Across Various Settings

In §4.5, we have shown that Oboe outperforms other ABR algorithms when compared to their default configurations. We now explore, for HYB, whether Oboe outperforms all parameter settings of HYB and whether it can tune ABRs based on content type and publisher specifications. For these experiments, we use the VirtualPlayer described in §3.2.

Comparison against all fixed configurations. To explore different fixed configurations, we run HYB with 10 different fixed $\beta$s and compare with HYB+Oboe. We summarize the performance for each configuration by considering (i) the median of the average bitrate and (ii) the 90th percentile of the rebuffering ratio across test traces. In this experiment, we also consider an Oracle, which is the best fixed configuration for each throughput trace with respect to the two metrics that HYB tries to optimize.
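The two summary statistics for one configuration can be computed as in the sketch below. The input layout (one `(average_bitrate, rebuffering_ratio)` pair per test trace), the function names, and the nearest-rank percentile definition are assumptions; other percentile conventions would give slightly different values on small samples.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple


def percentile_nearest_rank(values: Sequence[float], q: int) -> float:
    """q-th percentile using the nearest-rank definition (an assumed convention)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def summarize_configuration(sessions: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (median of average bitrate, 90th percentile of rebuffering ratio)."""
    bitrates = [bitrate for bitrate, _ in sessions]
    rebuffering = [ratio for _, ratio in sessions]
    return median(bitrates), percentile_nearest_rank(rebuffering, 90)
```

Each fixed $\beta$ then contributes one point to the HYB frontier, and HYB+Oboe contributes a single point computed the same way.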

Figures 16(a) and 16(b) compare HYB, HYB+Oboe, and Oracle over desktop and mobile traces, respectively. While Oracle and HYB+Oboe are depicted as single dots, since their performance is uniquely determined, we present a frontier for HYB that shows its performance for different fixed configurations. Figure 16(a) shows that HYB+Oboe outperforms HYB, in the sense that there is no fixed configuration for HYB that does better than HYB+Oboe. HYB+Oboe improves the average bitrates of the median session by 32 while achieving a similar rebuffering ratio. Alternately, it reduces the 90th percentile rebuffering ratio from 19 to 0 while maintaining similar bitrates. A similar result holds for mobile traces (Figure 16(b)). Thus, even if publishers were to find the best fixed parameter choice for HYB, Oboe would outperform that choice because it dynamically adapts the parameters.

Comparison under different publisher specifications. Our results so far are for a VoD (video on demand) setting with a maximum buffer size of 2 minutes. Figure 16(c) depicts performance for live video (which uses a maximum buffer size of 20 seconds to mimic live settings). HYB+Oboe outperforms HYB for this setting, though we note that the bitrate of both approaches degrades relative to the VoD setting, since the baseline HYB switches to higher bitrates more conservatively owing to the smaller buffer sizes.

Figure 17: Avg. of avg. bitrate and fraction of sessions with rebuffering for HYB+Oboe and different publisher preferences.

Figure 18: Avg. of avg. bitrate and fraction of sessions with rebuffering for RobustMPC and different publisher preferences.

Finally, Figure 16(d) depicts performance for higher bitrate levels ({1002, 1434, 2738, 3585, 4661, 5886} kbps) and a chunk size of 5 seconds. Even for these choices, HYB+Oboe outperforms HYB, demonstrating its ability to adapt to different publisher specifications.

Accommodating publisher's rebuffering tolerance. Oboe allows the publisher to optionally specify explicit rebuffering preferences (§3). ABR algorithms such as RobustMPC, which use the QoE-lin function, may permit this indirectly by adjusting QoE-lin weights (§2.1). Figure 17 shows the effectiveness of these approaches, showing the average of average bitrates and the fraction of sessions with rebuffering for HYB+Oboe. As the publisher makes its rebuffering preference stricter (from 2% to 0%), HYB+Oboe achieves lower rebuffering ratios, close to the target rebuffering tolerance. In contrast, Figure 18 shows that RobustMPC is less effective at controlling rebuffering by adjusting its rebuffering penalty, when the weight on the rebuffering term is varied between 100 (strictly avoid rebuffering) and 4.3 (footnote 5). We find that even with a very high rebuffering penalty of 100, RobustMPC causes rebuffering in 11 of the sessions. This shows the benefit of Oboe's approach, which gives direct control over the underlying metrics.

4.8 Oboe Overhead

Computing the ConfigMap incurs a one-time cost, since the map can be reused across all clients once it is built. Computing the best parameter configuration for one network state takes about 12 seconds on a single core. This task is perfectly parallelizable, so computing 10K network states (§3) will take approximately 35 hours to explore with two machines of 48 cores each. We have also analyzed the processing overhead incurred by the ChangeDetector module of Oboe. We measure the time taken by ChangeDetector for every decision cycle across our experiments, and the measurement indicates that the median processing time of ChangeDetector is around 14 ms. Since each decision is made at a chunk boundary and chunks are 4 seconds, ChangeDetector accounts for less than 0.35% overhead.

Footnote 5: We used a change penalty of 0 for fair comparison.

Figure 19: Comparing prototype Oboe with commercial client-side ABR implementation in average bitrate and rebuffering ratio.

Figure 20: Time between consecutive bitrate switches for two commercial ABRs.

Figure 21: Variance in bitrate levels across videos from two content publishers.
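The ChangeDetector overhead bound follows from a one-line calculation, assuming one invocation per chunk boundary and taking the median processing time as representative:

$$\frac{14\ \text{ms}}{4\ \text{s}} = \frac{0.014\ \text{s}}{4\ \text{s}} = 0.0035 = 0.35\%.$$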

5 DEPLOYMENT CONSIDERATIONS

The offline stage of Oboe can be run on the cloud, but several choices exist for the online stage, ranging from embedding the online stage entirely in the client player to moving some or all of the online stage to the cloud. In our implementation, Oboe's components run on the server side. This mimics a cloud implementation, which has the benefits of other cloud software: fast update deployment, device independence, etc. [4]. We leave a detailed comparison of these choices to future work, but explore in this section the feasibility of running the online stage on the cloud.

To this end, we have implemented a restricted version of HYB+Oboe on AWS. This limited version of Oboe implements HYB and incorporates tuning based on publisher specifications, but not network state. In our implementation, a client player periodically reports player state (such as buffer length and current bitrate) and throughput samples to an Oboe cloud server, and receives bitrate decisions in return. For 10 player features and two chunk downloads per second, the communication overhead is 64 Kbps, negligibly small compared to the size of video chunks. Figures 19(a) and 19(b) compare the performance of this implementation against a client player running HYB over 20K sessions collected during a two-week pilot deployment. Oboe is comparable in performance to the client-side player and even improves bitrate slightly (because it was tuned to this publisher's specification).


We expected a cloud implementation would perform worse because of the latency induced by client-server communication. However, we found that most of the bitrate switching decisions occur on timescales much longer than the client-server latencies. Figure 20 shows the CDF of the time interval between consecutive bitrate switches for ABR algorithms in two widely used video players (Adobe's Flash [3] and Microsoft Smooth Streaming [1]). The figure shows that over 95% of switching decisions occur at intervals higher than 1 second for both players. This suggests that a cloud-based deployment is viable.

6 DISCUSSION AND FUTURE WORK

Performance improvements for all sessions. As our results (e.g., Figure 6(a)) show, Oboe improves the performance for most, but not all, sessions relative to the ABR algorithm it tunes. For instance, after inspecting the results in §4.3, we have found that MPC+Oboe typically improves performance relative to RobustMPC by reducing rebuffering and/or the magnitude of bitrate changes, but at the expense of slightly lower bitrates. The resulting QoE-lin is improved for most sessions, indicating Oboe does a good job of properly balancing the various factors, but some sessions see lower QoE-lin. More generally, designing an ABR approach that can optimize the performance of all sessions is a hard problem that needs more research.

Sharing ConfigMap across videos. Oboe need not perform offline precomputation for each individual video, as it can use a single ConfigMap for a class of videos that follow a similar bitrate encoding scheme. Figure 21 shows that two popular video publishers use similar encoding schemes across two thousand videos each. Publisher 1 uses 7 distinct bitrate levels, and the coefficient of variation across bitrates within each level is only 0.13, while Publisher 2 uses 10 distinct bitrate levels, and the coefficient of variation across bitrates within each level is only 0.067. This indicates the potential to share a single ConfigMap across videos.
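A per-level dispersion check of this kind can be sketched as follows. The sketch assumes each video is described by a bitrate ladder with the same number of levels, and it uses the population standard deviation divided by the mean. Whether the reported figures use the population or sample form, and how they are aggregated across levels, is not stated, so those choices and the function names are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Standard deviation divided by the mean (population form, assumed)."""
    mu = mean(values)
    if mu == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / mu


def variation_per_level(ladders: List[Sequence[float]]) -> List[float]:
    """ladders[v][k] is the bitrate of level k in video v; returns one value per level."""
    if len({len(ladder) for ladder in ladders}) != 1:
        raise ValueError("all videos must expose the same number of bitrate levels")
    return [coefficient_of_variation(list(level)) for level in zip(*ladders)]
```

Low values at every level suggest that one ConfigMap built for a representative ladder could be reused for the whole class of videos.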

Generality of Oboe. While we have shown that Oboe can tune a variety of configuration parameters across several ABR algorithms, whether Oboe can tune all algorithms and all parameters is an open question. It is unclear if Oboe can directly augment Pensieve, since a model learned by reinforcement learning may not interact well with intermediaries such as Oboe. However, combining the benefits of Oboe and Pensieve in other ways is an interesting avenue for future work.

7 RELATED WORK

Tuning ABR Algorithm Configurations. The BBA2 algorithm [32] tunes its lower reservoir based on buffer occupancy dynamics, while MPC [59] adapts its throughput discount factor based on past prediction errors (§2). In contrast to such ad-hoc heuristics, Oboe selects configuration parameters based on network state and publisher specifications. The approach is generically applicable to many ABR algorithms. Newer congestion control protocols like BBR [19] estimate network throughput, which, if exposed, could benefit Oboe.

Learning ABR Algorithms. Among ABR algorithms that use Reinforcement Learning and other machine learning techniques [20, 21, 39, 40, 53], Pensieve [39] has been shown to perform the best. While Pensieve does not specialize to different throughput regimes, Oboe performs better by specializing parameter values for each network state independently.

Other work in self-tuning. Beyond ABR algorithms, self-tuning approaches have been explored in other contexts. Winstein et al. [56] used simulations to determine TCP parameters for different settings, while Semke et al. [47] proposed tuning TCP socket buffers to ensure high throughput. More generally, Google Vizier [28] performs such black-box tuning as a service. While Vizier can potentially be used to implement the offline phase of Oboe, our work identifies underlying principles (such as the piecewise stationarity of available throughput) that form the basis for the tuning.

Video QoE. Several researchers have pointed out that sub-optimal ABR performance can significantly impact user engagement and hence revenue [16, 37]. Others have looked at quality issues that occur when multiple players start to compete for bandwidth [15, 30, 31, 34]. In contrast, Oboe improves the QoE performance of several ABR algorithms across a range of different network conditions by automatically tuning their parameters.

8 CONCLUSION

Oboe is a system for automatically tuning ABR algorithms by adapting ABR configurations in real time to match the current network state. Picking configurations in a manner informed by network state and publisher preferences distinguishes Oboe's approach from heuristics used today that do not consider these factors. Oboe significantly improves the performance of BOLA, HYB, and RobustMPC. Further, for nearly 80% of the sessions in our dataset, Oboe integrated with RobustMPC improves QoE-lin relative to Pensieve, and the improvements exceed 20% for 25% of the sessions.

Acknowledgments. We thank our shepherd, Mohammad Alizadeh, and the anonymous reviewers for their constructive feedback. We thank Oleg White, Yan Li, and Shubo Liu for their assistance obtaining the throughput data and for helpful discussions. This work was funded in part by the National Science Foundation (NSF) Awards CNS-1618921, CNS-1564242, and CNS-1413978, and the ARO under the U.S. Army Research Laboratory award W911NF-09-2-0053. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF or ARO.


BIBLIOGRAPHY[1] Microsoft Smooth Streaming httpwwwiisnetdownloadsmicrosoftsmooth-

streaming[2] Toward A Practical Perceptual Video Quality Metric httpsmedium

comnetflix-techblogtoward-a-practical-perceptual-video-quality-metric-653f208b9652

[3] Adobe OSMF player httpwwwosmforg[4] Oracle 5 Reasons to Consider SaaS for Your Business Applications httpwww

oraclecomussolutionscloudsaas-business-applications-1945540pdf[5] Chrome Remote Interface httpsgithubcomcyrus-andchrome-remote-

interface[6] Cisco It Came to Me in a Stream httpswwwciscocomwebaboutac79docs

spOnline-Video-Consumption_Consumerspdf[7] DASH Industry Forum httpsgithubcomDash-Industry-Forumdashjs[8] Federal Communications Commission Raw Data - Measuring Broadband Amer-

ica wwwfccgovreports-researchreportsmeasuring-broadband-americaraw-data-measuring-broadband-america-2016

[9] Google-Chrome Chrome DevTools Protocol httpschromedevtoolsgithubiodevtools-protocoltotNetwork

[10] Bayesian Changepoint Detection httpsgithubcomhildensiabayesian_changepoint_detection

[11] Pensieve httpsgithubcomhongzimaopensieve[12] DASH Industry Forum httpsdashakamaizednetenvivioEnvivioDash3[13] Sandvine Global Internet phenomena report httpswwwsandvinecom

downloadsgeneralglobal-internet-phenomena20142h-2014-global-internet-phenomena-reportpdf

[14] Ryan Prescott Adams and David J.C. MacKay. Bayesian Online Changepoint Detection. In arXiv:0710.3742v1, 2007.

[15] Saamer Akhshabi, Lakshmi Anantakrishnan, Ali C. Begen, and Constantine Dovrolis. What Happens when HTTP Adaptive Streaming Players Compete for Bandwidth? In the International Workshop on Network and Operating System Support for Digital Audio and Video, NOSSDAV, 2012.

[16] Athula Balachandran, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and Hui Zhang. Developing a Predictive Model of Quality of Experience for Internet Video. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2013.

[17] Hari Balakrishnan, Mark Stemm, Srinivasan Seshan, and Randy H. Katz. Analyzing Stability in Wide-area Network Performance. ACM SIGMETRICS Performance Evaluation Review, 25:2–12, 1997.

[18] Daniel Barry and John A. Hartigan. A Bayesian Analysis for Change Point Problems. Journal of the American Statistical Association, 88(421):309–319, 1993.

[19] Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Soheil Hassas Yeganeh, and Van Jacobson. BBR: Congestion-Based Congestion Control. ACM Queue, 14:20–53, 2016.

[20] Federico Chiariotti, Stefano D'Aronco, Laura Toni, and Pascal Frossard. Online Learning Adaptation Strategy for DASH Clients. In Proceedings of the International Conference on Multimedia Systems, MMSys, 2016.

[21] Maxim Claeys, Steven Latré, Jeroen Famaey, Tingyao Wu, Werner Van Leekwijck, and Filip De Turck. Design and Optimisation of a (FA)Q-learning-based HTTP Adaptive Streaming Client. Connection Science, 26(1):25–43, 2014.

[22] Frédéric Desobry, Manuel Davy, and Christian Doncarli. An Online Kernel Change Detection Algorithm. IEEE Transactions on Signal Processing, 53(8):2961–2974, 2005.

[23] Florin Dobrian, Vyas Sekar, Asad Awan, Ion Stoica, Dilip Joseph, Aditya Ganjam, Jibin Zhan, and Hui Zhang. Understanding the Impact of Video Quality on User Engagement. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2011.

[24] Paul Fearnhead. Exact and Efficient Bayesian Inference for Multiple Changepoint Problems. Statistics and Computing, 16(2):203–213, 2006.

[25] Tobias Flach, Pavlos Papageorge, Andreas Terzis, Luis Pedrosa, Yuchung Cheng, Tayeb Karim, Ethan Katz-Bassett, and Ramesh Govindan. An Internet-Wide Analysis of Traffic Policing. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2016.

[26] Wayne A. Fuller. Introduction to Statistical Time Series. John Wiley and Sons, 1976.

[27] Mojgan Ghasemi, Partha Kanuparthy, Ahmed Mansy, Theophilus Benson, and Jennifer Rexford. Performance Characterization of a Commercial Video Streaming Service. In Proceedings of the ACM Conference on Internet Measurement Conference, IMC, 2016.

[28] Daniel Golovin, Benjamin Solnik, Subhodeep Moitra, Greg Kochanski, John Karro, and D. Sculley. Google Vizier: A Service for Black-Box Optimization. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, SIGKDD, 2017.

[29] Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, and David Meger. Deep Reinforcement Learning that Matters. In Proceedings of the Association for Advancement of Artificial Intelligence, AAAI, 2018.

[30] Rémi Houdaille and Stéphane Gouache. Shaping HTTP Adaptive Streams for a Better User Experience. In Proceedings of the Multimedia Systems Conference, MMSys, 2012.

[31] Te-Yuan Huang, Nikhil Handigol, Brandon Heller, Nick McKeown, and Ramesh Johari. Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard. In Proceedings of the ACM Conference on Internet Measurement Conference, IMC, 2012.

[32] Te-Yuan Huang, Ramesh Johari, Nick McKeown, Matthew Trunnell, and Mark Watson. A Buffer-based Approach to Rate Adaptation: Evidence from a Large Video Streaming Service. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2014.

[33] Daniel R. Jeske, Veronica Montes De Oca, Wolfgang Bischoff, and Mazda Marvasti. Cusum Techniques for Timeslot Sequences with Applications to Network Surveillance. Computational Statistics and Data Analysis, 53:4332–4344, 2009.

[34] Junchen Jiang, Vyas Sekar, and Hui Zhang. Improving Fairness, Efficiency, and Stability in HTTP-based Adaptive Video Streaming with FESTIVE. In Proceedings of the ACM International Conference on Emerging Networking Experiments and Technologies, CoNEXT, 2012.

[35] James Jobin, Michalis Faloutsos, Satish K. Tripathi, and Srikanth V. Krishnamurthy. Understanding the Effects of Hotspots in Wireless Cellular Networks. In Proceedings of the Conference of the IEEE Computer and Communications Societies, INFOCOM, 2004.

[36] Eamonn J. Keogh, Selina Chu, David Hart, and Michael J. Pazzani. An Online Algorithm for Segmenting Time Series. In Proceedings of the IEEE International Conference on Data Mining, ICDM, 2001.

[37] S. Shunmuga Krishnan and Ramesh K. Sitaraman. Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-experimental Designs. In Proceedings of the ACM Conference on Internet Measurement Conference, IMC, 2012.

[38] Dong Lu, Yi Qiao, Peter A. Dinda, and Fabian E. Bustamante. Characterizing and Predicting TCP Throughput on the Wide Area Network. In IEEE International Conference on Distributed Computing Systems, ICDCS, 2005.

[39] Hongzi Mao, Ravi Netravali, and Mohammad Alizadeh. Neural Adaptive Video Streaming with Pensieve. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2017.

[40] Virginia Martín, Julián Cabrera, and Narciso García. Design, Optimization and Evaluation of a Q-learning HTTP Adaptive Streaming Client. IEEE Transactions on Consumer Electronics, 62(4):380–388, 2016.

[41] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, et al. Human-level Control Through Deep Reinforcement Learning. Nature, 518(7540):529–533, 2015.

[42] Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous Methods for Deep Reinforcement Learning. In Proceedings of the International Conference on Machine Learning, ICML, 2016.

[43] Hossein Pishro-Nik. Introduction to Probability, Statistics, and Random Processes. Kappa Research, 2014.

[44] Thanawin Rakthanmanon, Eamonn J. Keogh, Stefano Lonardi, and Scott Evans. Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data. In Proceedings of the International Conference on Data Mining, ICDM, 2011.

[45] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why Should I Trust You? Explaining the Predictions of Any Classifier. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, SIGKDD, 2016.

[46] Haakon Riiser, Paul Vigmostad, Carsten Griwodz, and Pål Halvorsen. Commute Path Bandwidth Traces from 3G Networks: Analysis and Applications. In Proceedings of the ACM Multimedia Systems Conference, MMSys, 2013.

[47] Jeffrey Semke, Jamshid Mahdavi, and Matthew Mathis. Automatic TCP Buffer Tuning. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 1998.

[48] Kevin Spiteri, Rahul Urgaonkar, and Ramesh K. Sitaraman. BOLA: Near-optimal Bitrate Adaptation for Online Videos. In Proceedings of the IEEE International Conference on Computer Communications, INFOCOM, 2016.

[49] Yi Sun, Xiaoqi Yin, Junchen Jiang, Vyas Sekar, Fuyuan Lin, Nanshu Wang, Tao Liu, and Bruno Sinopoli. CS2P: Improving Video Bitrate Selection and Adaptation with Data-Driven Throughput Prediction. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2016.

[50] Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, 1998.

[51] Guibin Tian and Yong Liu. Towards Agile and Smooth Video Adaptation in Dynamic HTTP Streaming. In the ACM International Conference on Emerging Networking Experiments and Technologies, CoNEXT, 2012.

[52] Guillaume Urvoy-Keller. On the Stationarity of TCP Bulk Data Transfers. In Proceedings of the Passive and Active Measurement Conference, PAM, 2005.

[53] Jeroen van der Hooft, Stefano Petrangeli, Maxim Claeys, Jeroen Famaey, and Filip De Turck. A Learning-based Algorithm for Improved Bandwidth-awareness of Adaptive Streaming Clients. In Symposium on Integrated Network Management, IM, 2015.

[54] Li Wei and Eamonn Keogh. Semi-supervised Time Series Classification. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, SIGKDD, 2006.

[55] Ronald J. Williams and Jing Peng. Function Optimization using Connectionist Reinforcement Learning Algorithms. Connection Science, 3(3):241–268, 1991.

[56] Keith Winstein and Hari Balakrishnan. TCP Ex Machina: Computer-generated Congestion Control. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2013.

[57] Xuan Xiang and Kevin Murphy. Modelling Changing Dependency Structure in Multivariate Time Series. In Proceedings of the International Conference on Machine Learning, ICML, 2007.

[58] Kenji Yamanishi and Jun-ichi Takeuchi. A Unifying Framework for Detecting Outliers and Change Points from Non-stationary Time Series Data. In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, SIGKDD, 2002.

[59] Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. A Control-Theoretic Approach for Dynamic Adaptive Video Streaming over HTTP. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, SIGCOMM, 2015.

[60] Yin Zhang and Nick Duffield. On the Constancy of Internet Path Properties. In Proceedings of the ACM SIGCOMM Workshop on Internet Measurement, 2001.

Abstract

1 Introduction

2 Background and Motivation

2.1 Background on ABR Algorithms

2.2 Ensuring High QoE for All Users

3 Design

3.1 Representing Network State

3.2 Offline Mapping of Network States

3.3 Online ABR Tuning

4 Evaluation

4.1 Metrics

4.2 Methodology

4.3 with

4.4 vs Pensieve

4.5 with other ABR Algorithms

4.6 Sensitivity experiments

4.7 Across Various Settings

4.8 Overhead

5 Deployment Considerations

6 Discussion and Future Work

7 Related Work

8 Conclusion

Bibliography


the current network state of a client connection, specifically to throughput and throughput variability.

Oboe's design is based on the observation, made by prior work [17, 35, 38, 52, 60], that TCP connections are well-modeled as traversing a piecewise-stationary sequence of network states (§3.1): the connection consists of multiple non-overlapping segments, where each segment is in a distinct stationary network state. For each possible network state, Oboe pre-computes, offline, the best parameter configuration for a given ABR algorithm (§3.2). It does this by subjecting the algorithm, for each state, to different parameter values and picking the one that results in the best performance. Then, during video playback, Oboe continuously uses a change-point detection algorithm to detect changes in network state, and selects the parameter identified by the offline analysis as best for the current state. Thus, if a video session encounters varying network state during its lifetime, Oboe automatically specializes the ABR parameter to each state (§3.3).
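The two stages described above can be illustrated with a minimal sketch. It assumes that a network state is summarized by the mean and standard deviation of throughput, that the offline stage has already produced a lookup table, and that a simple sliding-window mean comparison stands in for the change-point detector. The table values, the function names, the window size, and the threshold are all hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

# Hypothetical output of the offline stage:
# (mean throughput in Mbps, std dev in Mbps) -> best ABR parameter value.
CONFIG_MAP = {
    (1.0, 0.2): 0.9,
    (1.0, 0.8): 0.5,
    (4.0, 0.5): 0.95,
    (4.0, 2.0): 0.6,
}

def lookup_parameter(mu, sigma):
    """Return the parameter stored for the closest pre-computed state."""
    nearest = min(CONFIG_MAP, key=lambda s: (s[0] - mu) ** 2 + (s[1] - sigma) ** 2)
    return CONFIG_MAP[nearest]

def state_changed(samples, window=5, rel_threshold=0.5):
    """Naive stand-in for a change-point detector: compare the means of the
    two most recent windows of throughput samples."""
    if len(samples) < 2 * window:
        return False
    old = mean(samples[-2 * window:-window])
    new = mean(samples[-window:])
    return abs(new - old) > rel_threshold * old

def tune_online(throughputs, window=5, default=0.9):
    """Return the parameter in effect after each throughput sample."""
    param = default
    segment = []   # samples believed to belong to the current state
    chosen = []
    for x in throughputs:
        segment.append(x)
        if state_changed(segment, window):
            # New state detected: keep only the recent samples and re-tune.
            segment = segment[-window:]
            param = lookup_parameter(mean(segment), pstdev(segment))
        elif len(segment) == window:
            # First time enough samples exist to characterize the state.
            param = lookup_parameter(mean(segment), pstdev(segment))
        chosen.append(param)
    return chosen
```

For a trace that holds 1 Mbps for ten samples and then jumps to 4 Mbps, `tune_online` starts with the parameter mapped to the low-throughput state and ends with the one mapped to the high-throughput state, passing briefly through an intermediate choice while the window straddles the change.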

We have implemented Oboe and demonstrated several aspects of its performance through testbed experiments and trace-driven simulations. First, Oboe significantly improves performance of QoE metrics for three qualitatively different ABR algorithms: one that makes bitrate switching decisions on buffer occupancy alone (BOLA) [48], another that uses both throughput and buffer occupancy (HYB, a widely deployed algorithm), and a third that also optimizes decisions across a finite lookahead horizon (RobustMPC) [59]. In each of these cases, Oboe results in significant improvement. For instance, Oboe reduces sessions with rebuffering from 33.3% to 5.3% relative to RobustMPC, while also significantly improving a composite QoE metric.

Oboe, when applied to RobustMPC, also performs significantly better than a newly proposed approach called Pensieve that learns from real traces (using reinforcement learning) how to adapt to a variety of network conditions. For nearly 80% of the sessions in our dataset, Oboe improves the same composite metric, with benefits exceeding 20% for 25% of the traces. Compared to Oboe, which can specialize parameters to individual network states, Pensieve is unable to specialize across the entire range of network throughputs. We have tried training specialized Pensieve models for different ranges of network throughputs, and dynamically switching models based on estimated session throughput. This helps, but a significant gap between the two approaches still remains (§4.4).

While a variety of viable pathways exist to deploying Oboe, we focus on an architecture where Oboe and the entire ABR logic are deployed on the cloud, which enables rapid evolution and fine-grain customizability. We show the viability of this architecture with results from a pilot deployment.

2 BACKGROUND AND MOTIVATION

The Internet video delivery ecosystem consists of hundreds of content publishers and hundreds of client-side applications that stream video content to diverse user devices. Publishers, content delivery networks, and users all seek to improve user quality of experience (QoE). There are many factors that affect QoE, including start-up latency, the average bitrate for a video session, as well as the rebuffering ratio (the percentage of time playback is stalled because of drained buffer) [23]. Video players improve QoE using adaptive bitrate (ABR) algorithms, which select bitrates for each chunk while (1) ensuring the bitrate seen by the user is as high as possible, and (2) avoiding rebuffering events at the client. Some ABR algorithms may also try to minimize the number of bitrate switches to make the playback smooth.
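As a concrete illustration of these per-session quantities, the sketch below computes the average bitrate, the rebuffering ratio, and the number of bitrate switches for one session. The function name and its inputs are hypothetical, and the rebuffering ratio is assumed here to be stall time divided by total session time (playing plus stalled); other definitions of the denominator are possible.

```python
def session_metrics(bitrates_kbps, stall_s, play_s):
    """Summarize one video session.

    bitrates_kbps: bitrate chosen for each downloaded chunk, in order.
    stall_s: total seconds playback was stalled on an empty buffer.
    play_s: total seconds of video actually played.
    """
    avg_bitrate = sum(bitrates_kbps) / len(bitrates_kbps)
    # Assumption: ratio is taken over playing time plus stalled time.
    rebuffering_ratio = 100.0 * stall_s / (stall_s + play_s)
    # Count chunk boundaries where the selected bitrate changed.
    switches = sum(1 for a, b in zip(bitrates_kbps, bitrates_kbps[1:]) if a != b)
    return avg_bitrate, rebuffering_ratio, switches
```

For example, a session with chunk bitrates 300, 750, 750, and 1200 kbps, 2 seconds of stalls, and 18 seconds of playback has an average bitrate of 750 kbps, a rebuffering ratio of 10%, and 2 bitrate switches.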

Content publishers serve different types of content, including VoD (Video on Demand) or Live broadcasts. They may also serve streams of different qualities, ranging from HD (high definition) to SD (standard definition). These differences impact how they serve videos. For example, publishers who serve VoD content can use player buffers as large as 4 minutes [32], whereas publishers serving live content may have a time-to-live² requirement between 15 and 45 seconds. Similarly, based on the quality of streams they serve, publishers may use different bitrate levels or chunk sizes. Further, publishers may have different QoE objectives. For example, some may strictly prefer to minimize rebuffering, and others may relax their tolerance for rebuffering to prioritize higher bitrates. We use the term publisher specifications to denote their choice of bitrate levels, chunk sizes, content type, and rebuffering tolerance.

2.1 Background on ABR Algorithms

ABR algorithms fall in two broad categories: (i) those that use both prediction of network throughput and buffer occupancy [34, 51, 59], and (ii) those that are primarily based on buffer occupancy [32, 48]. Within the above two categories, ABR algorithms can be designed using approaches ranging from heuristics to stochastic optimization. In §4, we discuss a recently proposed ABR algorithm based on a qualitatively different approach, reinforcement learning [39].

MPC: Throughput prediction and buffer occupancy with look-ahead. Selects bitrate by solving an optimization problem. MPC [59] predicts throughput of future chunk downloads based on throughput samples of recently downloaded chunks, then uses this predicted throughput to select bitrates to optimize a given QoE function (§4) over a look-ahead window of 5 future chunks. The aggressive version of the algorithm (FastMPC) directly uses a throughput estimate obtained using a harmonic mean predictor. To compensate for

² For live content, the time between the event and its broadcast to users. This bounds the maximum buffer that a player streaming a live event can build.
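The look-ahead selection described for MPC can be sketched as follows. This is an illustrative brute-force version, not the authors' implementation: the QoE function (bitrate minus a rebuffering penalty minus a switching penalty), the penalty weights, the chunk duration, and all function names are assumptions introduced here. Only the harmonic mean predictor and the 5-chunk horizon come from the text.

```python
from itertools import product

def harmonic_mean(samples):
    """Harmonic mean of recent throughput samples (Mbps)."""
    return len(samples) / sum(1.0 / s for s in samples)

def plan_qoe(plan, throughput, buffer_s, last_bitrate,
             chunk_s=4.0, rebuf_penalty=4.3, switch_penalty=1.0):
    """Simulate downloading `plan` (bitrates in Mbps) at a fixed throughput
    and return an assumed QoE: bitrate - rebuffering - switching terms."""
    qoe = 0.0
    for bitrate in plan:
        download_s = bitrate * chunk_s / throughput
        rebuf_s = max(0.0, download_s - buffer_s)
        # Buffer drains while downloading, then gains one chunk of video.
        buffer_s = max(0.0, buffer_s - download_s) + chunk_s
        qoe += (bitrate - rebuf_penalty * rebuf_s
                - switch_penalty * abs(bitrate - last_bitrate))
        last_bitrate = bitrate
    return qoe

def mpc_select(bitrates, past_throughputs, buffer_s, last_bitrate, horizon=5):
    """Pick the next chunk's bitrate by exhaustively scoring every plan
    over the look-ahead window and returning the best plan's first step."""
    predicted = harmonic_mean(past_throughputs)
    best_plan = max(product(bitrates, repeat=horizon),
                    key=lambda p: plan_qoe(p, predicted, buffer_s, last_bitrate))
    return best_plan[0]
```

With a bitrate ladder of 0.3, 0.75, 1.2, and 2.85 Mbps, the sketch picks the top bitrate when recent throughput is 10 Mbps and the buffer holds 20 seconds, and the lowest bitrate when recent throughput is 0.3 Mbps and the buffer is empty.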




















































