
WWW.FURIOUSCORP.COM

The Role of Forecasting in Yield Optimization for Sellers of Television Advertising

Ehud Trainin, PhD Chief Data Scientist | Furious Corp.


1. Introduction

Forecasting is the foundation for yield optimization in many industries. Forecasting a product’s demand informs a variety of key business decisions, such as how many products to produce, how much inventory to keep in stock, whether to change production capacity, or whether to enter a new market. To ensure the right business decisions are made, most companies maintain and use forecasts of Demand (how much of what products the market wants to buy and at what price) to plan and optimize yield.

In the media industry, for sellers of advertising [Sellers], a Demand Forecast

consists of the inventory elements expected to be requested by advertisers, the

price advertisers would be willing to pay, and how that changes over time. A reliable and accurate demand forecast is an essential input for deciding whether to sign a given deal or to hold the inventory requested by an advertiser for better future sale opportunities. The decision depends on which scenario lets the Seller sell the most inventory at the highest price with a sufficient level of confidence, i.e. optimize yield. The demand forecast also helps Sellers plan and optimize promotional investment, which is a decision to forgo the sale of some inventory, and the revenue it would bring, in order to increase audience and thus the supply of inventory.

The dynamics of Supply Forecasting in media are unique. It is difficult and often impossible for media Sellers

to simply increase capacity and produce more products for sale when demand increases. Supply, i.e. the

impressions to be viewed, is not known in advance nor controlled by Sellers, making yield optimization

complex and unique. Effective yield optimization requires a reliable supply forecast, which is the number of

ads delivered or ad views of different shows at different times to unique target audiences. The supply forecast

is used not only to determine the number of impressions guaranteed to advertisers and to plan, price and

allocate ad inventory for sale, but also to help determine which new programs to buy or produce and to

schedule programs.

Given the importance of forecasting, the accuracy of both the supply and demand forecasts has a significant influence on a Seller’s ability to optimize yield and revenue. For example, underestimating the number of views means a lost opportunity to sell inventory, while overestimating it results in a failure to deliver the guaranteed views to advertisers, creating a liability that requires make-goods or risks harming future deals.

This is true for any seller of advertising, and in particular for sellers of traditional TV and online video, given the scarcity of these premium inventory types. Obviously, no forecasting technique provides 100% perfect accuracy. If it did, a lot of Data Scientists would be out of work and bored. Therefore Sellers must invest in forecasting techniques whose results are as close to actual values (on average) as possible, with the likely reward of significantly higher revenue.


In prediction for any source data set, the smaller the sample size the larger the margin of error. This means

that as more and more Sellers introduce audience targeted or Advanced or Addressable ad products to the

market, the challenge of accurate forecasting increases.

How can a Seller improve their supply and demand forecasting accuracy?

In this paper we will explain high level principles of good forecasting, illustrated with examples from ad views

forecasting. Understanding these principles does not require any background in advanced mathematics,

machine learning or programming. Forecasting and optimization are core competencies of Furious’ data

science capabilities and the foundation of the planning, pricing, inventory allocation and overall yield

optimization capabilities of our platform, PROPHET.

The principles described in this paper are inherent to any forecasting process, whether done by an individual,

a team, or by artificial intelligence. Humans, as well as animals, forecast all the time. For example, when we

catch a ball, we unconsciously forecast or predict where it is going to fall in order to get there to catch

it. The birth of forecasting algorithms occurred more than 200 years ago with the introduction of curve

fitting techniques by Boscovich, Laplace and Gauss. Nowadays, computers allow us to apply much more sophisticated algorithms and to use much more data. However, as we continue to develop and apply better forecasting algorithms, these founding principles still apply and remain essential to any accurate forecasting methodology.

2. Principles

2.1 Concepts

A forecasting model is a function that computes forecasted values. A forecasting

model may have parameters and input variables. For example, in the case of a

forecasting model of the sort:

forecast = a∙t + b

the time (t) is an input variable, while the slope (a) and the intercept (b) are

parameters.


Furious leverages Machine Learning (a subfield of Artificial Intelligence) to continually improve and self-correct its models and thereby increase accuracy. Learning is a process in which a computer algorithm sets a forecasting model’s parameters. In the example above, that would mean setting the values of the slope and the

intercept. It is important to note the difference between the learning algorithm and the forecasting algorithm.

The first computes and/or updates the model by which forecasting is done, while the second uses this model

to produce forecasts.
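The split between learning and forecasting can be illustrated with a minimal Python sketch for the linear model forecast = a∙t + b. Ordinary least squares is used here only as one simple way to set the parameters; the paper does not say which learning method Furious uses, and the function names and sample numbers are illustrative assumptions.

```python
from typing import Sequence, Tuple


def learn_linear_model(times: Sequence[float], views: Sequence[float]) -> Tuple[float, float]:
    """Learning step: set the slope (a) and intercept (b) by ordinary least squares."""
    n = len(times)
    if n < 2 or n != len(views):
        raise ValueError("need at least two observations and equal-length inputs")
    mean_t = sum(times) / n
    mean_v = sum(views) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("times must not all be identical")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, views))
    a = cov / var_t
    b = mean_v - a * mean_t
    return a, b


def forecast(a: float, b: float, t: float) -> float:
    """Forecasting step: apply the learned model, forecast = a*t + b."""
    return a * t + b


# Made-up illustrative numbers (not real ratings): views in millions for episodes 1..5.
episodes = [1, 2, 3, 4, 5]
views = [2.0, 2.2, 2.4, 2.6, 2.8]
a, b = learn_linear_model(episodes, views)
next_episode_forecast = forecast(a, b, 6)
```

Here the learning function is the only place where parameters change, while the forecasting function only reads them, which mirrors the distinction drawn above.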

Learning may be limited to an initial training period, e.g. the first 10 episodes of a program. Alternatively, we

may keep learning continually or on an ongoing basis, i.e. to update the forecasting model’s parameters as

updated actuals or new measurement data arrives. When a forecasting algorithm keeps learning all the time

and continually improves, it is referred to as Adaptive. We may also use the term Adaptiveness to denote

how quickly the forecasting algorithm adapts and tracks changes in the metric behavior. For example,

if views of a certain program, which so far had an upward trend, start to decline, an adaptive forecast

would track this change and forecast a decline rather than stick to the historical rising trend. The faster the

algorithm adapts to such changes, the more adaptive it is.

Robustness is the ability of the forecasting algorithm logic to filter out transient phenomena or changes in

a pattern that are one-off rather than a true change in direction. For example, if we had an upward trend, yet the

views declined last episode, we may not want the forecast to switch right away from a rise to

a decline, since very often such a decline turns out to be transient.

2.2 Adaptiveness vs. Robustness

While both adaptiveness and robustness are important, they are mathematically and inherently opposing.

Being adaptive means responding quickly to any sign of change, while robustness means delaying the

response, as much as possible, in order to make sure it is not a transient change. Adaptiveness looks more at

the recent history, while robustness looks at the experience gained through a longer history.

The forecast’s accuracy achieved by extreme adaptiveness would be poor. If we take again the simple

example of a linear forecasting model (forecast = a∙t + b), then extreme adaptiveness means a prediction that

is based only upon the two most recent measurements, since two measurements are sufficient to draw a line

from which we can extrapolate into the future. This would cause the forecasting to be very easily disrupted

by random fluctuations. Real time series usually have such fluctuations and often anomalous fluctuations (i.e.

spikes in the graph), which have especially severe effects on highly adaptive forecasting. Yet, even without

such anomalies, the result of an extremely adaptive forecasting methodology is poor accuracy, as shown in

Figure 1.
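The two-measurement extreme described above can be sketched in a few lines of Python. The sketch assumes equally spaced episodes, and the sample series is made up for illustration: a flat level of about 2.0 million with fluctuations of ±0.1.

```python
from typing import Sequence


def two_point_forecast(views: Sequence[float]) -> float:
    """Extrapolate the line through the last two observations one step ahead."""
    if len(views) < 2:
        raise ValueError("need at least two observations")
    previous, latest = views[-2], views[-1]
    return latest + (latest - previous)


# Made-up illustrative series (millions of views).
views = [2.0, 2.1, 1.9, 2.1, 1.9]

# One-step-ahead forecast made after each episode, starting with the second.
forecasts = [two_point_forecast(views[:i]) for i in range(2, len(views) + 1)]
```

On this series the forecasts swing between roughly 1.7 and 2.3, so fluctuations of ±0.1 in the data become swings of about ±0.3 in the forecast. This is the disruption by random fluctuations that the text describes.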


Figure 1: Program views against the forecast of an extremely adaptive forecasting algorithm. [Plot: “Supernatural”, views (millions) by episode.]

Figure 2: Program views against the forecast of an extremely robust forecasting algorithm. [Plot: “Supernatural”, views (millions) by episode.]


Extreme robustness does not achieve accurate forecasts either. In the case of the linear forecasting model

(forecast = a∙t + b), extremely robust methodology results in a forecast that sticks to a straight line learned

during an initial training period. This doesn’t work since a TV program’s views usually change alongside

viewing trends during its lifetime. Figure 2 shows the resulting accuracy from an extremely robust forecasting

methodology.

Although it is inherently impossible to reach perfect robustness and perfect adaptiveness at the same

time, the use of appropriate techniques allows for simultaneous application of both good adaptiveness and

robustness to achieve superior predictive accuracy.

The following best practices combine both adaptiveness and robustness to achieve more accurate forecasts:

1. Find the optimal balance between robustness and adaptiveness for the metrics you are

forecasting.

2. Use and apply additional data sets to better understand the nature of changes in the data and to

better infer if changes are or are not transient.

3. Leverage anomaly detection to better infer if changes are or are not transient.

Figure 3: Program views against the forecast of a balanced forecasting algorithm. [Plot: “Supernatural”, views (millions) by episode.]


2.3 Finding the Optimal Balance between Robustness and Adaptiveness

The key to finding the optimal balance between robustness and adaptiveness is correctly determining how much importance the algorithm assigns to recent values of the forecasted metric (adaptiveness) compared to older values (robustness). The optimal balance is the one that maximizes the average forecasting accuracy. There is no single answer for what this optimal balance is because, as we say at Furious, “Every client and every set of client data is a snowflake”. For some types of time series, a more robust forecasting methodology would be optimal, while for others a relatively adaptive one would be. This variance is why a one-size-fits-all forecasting logic does not work. Furious has developed a library of algorithms for different cases, such as linear and digital, as a starting point from which to apply machine learning. Machine learning algorithms are capable of learning the behavior of a time series for highest accuracy, based on historical data of the series itself (we recommend as long a period as possible) and/or historical data of similar programs that share the series’ attributes. When we combine robustness and adaptiveness and leverage machine learning to balance them, we see a substantial improvement in forecasting accuracy. See Figure 3 above, which shows the forecasting results for the same series as in Figures 1 and 2, with an optimal balance of robustness and adaptiveness applied.
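One simple way to express "how much importance we assign to recent values" is a decay factor on the weights of a least-squares line fit, chosen by backtesting. The Python sketch below is an illustration under that assumption and is not a description of the algorithms in Furious' library. The names `decay`, `backtest_mae` and `pick_decay` are hypothetical. A decay of 1 weights all history equally (the robust end), while a decay close to 0 lets the last two points dominate (the adaptive end).

```python
from typing import Sequence, Tuple


def weighted_line_fit(views: Sequence[float], decay: float) -> Tuple[float, float]:
    """Fit forecast = a*t + b, giving an observation k steps old the weight decay**k."""
    n = len(views)
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0 < decay <= 1:
        raise ValueError("decay must be in (0, 1]")
    times = range(n)
    weights = [decay ** (n - 1 - t) for t in times]
    total = sum(weights)
    mean_t = sum(w * t for w, t in zip(weights, times)) / total
    mean_v = sum(w * v for w, v in zip(weights, views)) / total
    var_t = sum(w * (t - mean_t) ** 2 for w, t in zip(weights, times))
    cov = sum(w * (t - mean_t) * (v - mean_v) for w, t, v in zip(weights, times, views))
    a = cov / var_t
    b = mean_v - a * mean_t
    return a, b


def next_step_forecast(views: Sequence[float], decay: float) -> float:
    """Forecast the next value (time index len(views)) from the weighted fit."""
    a, b = weighted_line_fit(views, decay)
    return a * len(views) + b


def backtest_mae(views: Sequence[float], decay: float, warmup: int = 3) -> float:
    """Mean absolute one-step-ahead error when replaying the history."""
    if len(views) <= warmup:
        raise ValueError("history is too short for the chosen warmup")
    errors = [abs(next_step_forecast(views[:i], decay) - views[i])
              for i in range(warmup, len(views))]
    return sum(errors) / len(errors)


def pick_decay(views: Sequence[float], candidates: Sequence[float]) -> float:
    """Choose the candidate decay with the lowest backtest error."""
    return min(candidates, key=lambda d: backtest_mae(views, d))
```

For example, `pick_decay(history, [0.3, 0.6, 0.9, 1.0])` returns whichever of the four candidates had the lowest mean absolute error on `history`. Different series will generally select different values, which is the "snowflake" point made above.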

2.4 Leveraging Additional Data about Change to Determine Transience

As discussed above, the source of the robustness/adaptiveness dilemma is that when a change occurs,

we do not know whether the change is transient or rather an indication of a lasting change. The inherent

nature of forecasting does not allow us to wait to see retrospectively whether a change was transient or

not, therefore we must find other ways to make [accurate] inferences about the nature of change in the

data. This enables us to improve the robustness of our forecasting logic, without sacrificing much of the

adaptiveness. An optimal balance between robustness and adaptiveness is obtained by leveraging additional

and relevant data to determine the transient nature of changes in a time series.

There are many examples of how to leverage additional relevant data from outside the time series itself. A good example, when forecasting impressions (i.e. views), is the effect of holidays. For example, we can expect a certain decrease in a program’s views on the Sunday before Memorial Day, because many people are traveling or engaged in family activities during this holiday weekend. This is visible in Figure 4 below. Thus, if we observe the expected decrease on Memorial Day weekend, we have no reason to suspect it indicates a lasting change.


Figure 4: An inferred transient change using additional data, in this case holidays and competing shows. [Plot: “Game of Thrones”, views (millions) by episode.]

Figure 5: 20/20 views demonstrate how the content of a specific show can cause unusual views. [Plot: “20/20”, views (millions) by episode.]


Another cause for change may be a show’s content. For example, in an interview series, a sensational

interview may have much higher viewership than average. This is shown in Figure 5 above, by an example

with the change [spike] in views during the 20/20 interview with Bruce Jenner on April 24, 2015.

Another example of a transient change arises in sports, where the teams participating in a game can

impact the viewership. When looking beyond viewership into advertising audience or impressions, for sports,

the geographical origin of the teams will create geographical concentrations of viewers, which informs the

ability to fill geographically targeted advertising campaigns during the show.

The following are some further examples of change causes:

• In some series, the season finale has higher views.

• The views of the first few episodes of a series often are not indicators for the following episodes, due

to the fact they are heavily influenced by promos.

• A preceding show with unusually high views will raise the views, as demonstrated in Figure 6.

• A competing show with unusually high views will lower the views, as demonstrated in Figure 4.

As you can see by the examples provided, the content type (news, sports, scripted, events, etc.) can impact

how additional data or variables affect viewership, therefore, different forecasting methodologies are

required for different content types. The more we are able to map and quantify the effects of such causes,

the easier it is to distinguish lasting changes from transient fluctuations.
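Once the effect of a cause has been quantified, it can be used in three ways, sketched below in Python for the Memorial Day example: to adjust a forecast, to normalize an observed value before it is used for learning, and to check whether an observed change is already explained by the cause. The multiplicative factor of 0.85, the 5% tolerance and all names are hypothetical placeholders, not measured values.

```python
from typing import Dict, Optional

# Hypothetical effect sizes: expected views as a fraction of a regular week.
EVENT_FACTORS: Dict[str, float] = {"memorial_day_weekend": 0.85}


def event_factor(event: Optional[str]) -> float:
    """Return the multiplicative effect of an event, or 1.0 if none is known."""
    if event is None:
        return 1.0
    return EVENT_FACTORS.get(event, 1.0)


def adjust_forecast(baseline: float, event: Optional[str]) -> float:
    """Forecasting side: scale the regular-week forecast by the known effect."""
    return baseline * event_factor(event)


def normalize_actual(actual: float, event: Optional[str]) -> float:
    """Learning side: remove the known effect so it is not projected onto regular weeks."""
    return actual / event_factor(event)


def is_explained_by_event(actual: float, baseline: float, event: Optional[str],
                          tolerance: float = 0.05) -> bool:
    """True if the actual value is within tolerance (as a share of baseline) of the adjusted forecast."""
    return abs(actual - adjust_forecast(baseline, event)) <= tolerance * baseline
```

With a regular-week baseline of 10.0 million, an observed 8.6 million on the holiday weekend is explained by the event, whereas the same 8.6 million in a regular week is not and would deserve closer attention.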

2.5 Use of Anomaly Detection to Infer if Changes are Transient

It is also possible to infer that a change is transient by assessing the magnitude and/or shape of the change. For example, a value far higher or lower than usual often suggests an unusual transient cause. This technique is known as Anomaly Detection. Anomaly detection is used when we want to be alerted that something wrong or out of the ordinary has occurred, such as in the detection of faults in industrial processes, credit card theft, cyber security attacks, etc. Within media, anomaly detection is useful to infer

if a change is transient. Figure 6 shows an example of an anomaly in viewership. We can leverage additional

data to determine the cause (a preceding Super Bowl game in this specific case), yet, even if we did not have

this additional data, we could know it is a transient change just by its magnitude.

The advantage of anomaly detection is that we don’t need to collect additional data about the wide variety

of external factors that may cause a transient change and quantify their expected impact. In contrast, the cause-based technique can determine the nature of a change even when its magnitude is not unusual. For these reasons, it is best to apply both.
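A magnitude-based check can be sketched with a robust score built from the median and the median absolute deviation (MAD). This is one common generic technique (the modified z-score, with its conventional constant 0.6745 and cutoff 3.5) and is used here as an assumption; the paper does not specify which anomaly detection method Furious applies. For a series with a trend, such a check would normally be applied to the forecast errors rather than to the raw views.

```python
from statistics import median
from typing import Sequence


def is_anomalous(history: Sequence[float], new_value: float, threshold: float = 3.5) -> bool:
    """Flag new_value if its modified z-score relative to history exceeds threshold."""
    if len(history) < 3:
        raise ValueError("need at least three historical values")
    center = median(history)
    mad = median([abs(v - center) for v in history])
    if mad == 0:
        # All typical values coincide, so any deviation is treated as unusual.
        return new_value != center
    robust_z = 0.6745 * (new_value - center) / mad
    return abs(robust_z) > threshold


# Made-up illustrative history (millions of views).
history = [3.0, 3.2, 2.9, 3.1, 3.0, 3.3]
spike_flagged = is_anomalous(history, 9.0)
normal_flagged = is_anomalous(history, 3.2)
```

The median and MAD are used instead of the mean and standard deviation so that earlier spikes in the history do not inflate the sense of what is "usual".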


3. Advanced Issues for Consideration

3.1 Modeling the Causes of Change in Learning and Forecasting Models

As stated previously, the learning algorithm updates the model by which

forecasting is done, while the forecasting algorithm uses this model to produce

forecasts.

The modeling of such causes can be used by both the learning and the

forecasting algorithms. Let’s take, as an example, the lower views on the Sunday

before Memorial Day. The forecasting algorithm may use it to improve the

forecast for this Sunday. The learning algorithm may use it after this Sunday, in order to better update the

forecasting model, rather than projecting these lower views to regular weeks.

Figure 6: Anomalous transient change. Although the cause in this example is known, it is possible to infer it is a transient change just from its magnitude. [Plot: “New Girl”, views (millions) by episode.]


It is important to note that a cause that is known during learning is not necessarily known during

forecasting. This means that an input, which can be used to improve the learning algorithm, is not necessarily

useful to improve the forecasting. If we take the sensational interview example, it would not have been

possible to forecast the high views during the show, had the forecast been done a few months earlier given

that the interview content was not known at that time. After the interview, we know this interview was

unusual and use this for learning, by avoiding wrong expectations for high views in the future. That is an

example of why the forecasting process must remain dynamic, in both looking back and looking forward.

3.2 Forecasting for Longer Periods

In the figures demonstrating robust, adaptive and balanced forecasting, the forecasting shown was for the

next episode, based on all the previous episodes. In practice, we usually provide forecasts for longer periods,

often for the entire next season, both by individual episode and overall. There are numerous considerations as

the period of the forecast increases, which we will not elaborate on here. However, the high-level principles

we demonstrated with these figures are also relevant to the forecasting of longer periods.

3.3 Distribution of Expected Values

Forecasting algorithms do not typically provide a specific value as a result, but rather estimate the

probabilities, or likelihood, of different possible values. Given a certain probability distribution, it is not

obvious which of all the possible values should be selected as the forecast to be used. There are a variety

of commonly used methods to choose a forecast value, such as the value that has the highest probability

(maximum likelihood), the value that minimizes the average squared error, the value that minimizes

the average absolute error, etc. However, it is important to realize, that while each one of such general

approaches optimizes a certain factor (e.g. the mean squared error), none of them is guaranteed to optimize

the yield!

Consider, for example, the number of impressions guaranteed to advertisers. Recall, underestimation of

the number of views means a lost opportunity to sell inventory, while overestimation results in a failure to

deliver advertisers the guaranteed views promised and exposes a Seller to liability risk. Since the outcomes of

overestimation and underestimation have different consequences, it does not necessarily make sense to give

symmetric importance to positive and negative forecasting errors, as many of the generic approaches do.

A Seller may prefer a low liability risk at the expense of more unsold inventory or, conversely, may accept a higher level of liability in exchange for less unsold inventory.
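The asymmetry can be made concrete with a small Python sketch. It assumes a linear cost per impression for each kind of error: `cost_over` for every guaranteed impression that is not delivered, and `cost_under` for every delivered impression that was not sold. These costs, and the use of sample draws to represent the forecast distribution, are illustrative assumptions. Under this cost model the best guarantee is the quantile cost_under / (cost_under + cost_over) of the distribution, rather than its mean or most likely value.

```python
import math
from typing import Sequence


def expected_cost(level: float, samples: Sequence[float],
                  cost_over: float, cost_under: float) -> float:
    """Average cost of guaranteeing `level` across equally likely outcomes."""
    total = 0.0
    for actual in samples:
        if level > actual:
            total += cost_over * (level - actual)    # under-delivery: liability
        else:
            total += cost_under * (actual - level)   # unsold inventory
    return total / len(samples)


def guarantee_level(samples: Sequence[float], cost_over: float, cost_under: float) -> float:
    """Pick the sample quantile that minimizes the expected asymmetric linear cost."""
    if not samples:
        raise ValueError("samples must not be empty")
    if cost_over <= 0 or cost_under <= 0:
        raise ValueError("costs must be positive")
    quantile = cost_under / (cost_under + cost_over)
    ordered = sorted(samples)
    index = max(0, math.ceil(quantile * len(ordered)) - 1)
    return ordered[index]


# Made-up outcomes (millions of impressions), all equally likely.
outcomes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
cautious = guarantee_level(outcomes, cost_over=3.0, cost_under=1.0)    # liability is costly
aggressive = guarantee_level(outcomes, cost_over=1.0, cost_under=3.0)  # unsold inventory is costly
```

A Seller who treats liability as three times as costly as unsold inventory guarantees a level near the lower quarter of the distribution, while the opposite preference pushes the guarantee toward the upper quarter.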

Thus, an advanced forecasting tool would enable a Seller to optimize the yield and reach an optimal


balance between overestimation and underestimation. In order to do this, however, it is not sufficient for the

forecasting logic to learn the behavior of a central value, while assuming the actual values are distributed

around it in some generic fashion. Such rigidity will not deliver sustainable optimization of yield, since

different programs have different distributions. For example, the distribution of “The Simpsons” (Figure 8)

is much wider than the distribution of “Criminal Minds” (Figure 7). Thus, the most accurate forecasting logic

needs to learn not only the behavior of a central value, but also the way the actual values are distributed

around it.

3.4 Forecasting New Programs

A new program has no historical viewership data, from which to learn or develop a forecast. In spite of this,

it is possible to learn from historical data of similar programs that share some attributes and apply it to a

new program. Such attributes are, for example, daypart, series budget, genre, actors, promos budget, social

networks’ activity preceding the premiere and more.

The average prediction error for new programs is high compared to running programs, for which we have actual historical data. Nevertheless, the forecast is better than a random guess, and the use of similar programs can positively impact yield for new programs, despite the number of unknowns in the media planning process.
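The idea of borrowing from similar programs can be sketched very simply in Python: average the historical views of past programs that share chosen attributes with the new one. This is a deliberately naive illustration. The attribute set, the exact-match rule, the class and function names and the numbers are all assumptions; a real model would weigh many more attributes, such as those listed above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PastProgram:
    genre: str
    daypart: str
    avg_views: float  # average views in millions over the program's first season


def forecast_new_program(genre: str, daypart: str,
                         history: Sequence[PastProgram]) -> Optional[float]:
    """Average the views of past programs with the same genre and daypart.

    Returns None when no comparable program exists.
    """
    matches = [p.avg_views for p in history
               if p.genre == genre and p.daypart == daypart]
    if not matches:
        return None
    return sum(matches) / len(matches)


# Made-up illustrative history.
history = [
    PastProgram("drama", "primetime", 4.0),
    PastProgram("drama", "primetime", 6.0),
    PastProgram("comedy", "primetime", 3.0),
]
drama_forecast = forecast_new_program("drama", "primetime", history)
```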

Figure 7: Criminal Minds. The variations around the trend are relatively narrow. [Plot: views (millions) by episode.]


3.5 Forecasting Views of Audience Segments or Targets

Typically, the guaranteed views offered to an advertiser are for a specific targeted audience. In the case of

most linear inventory the targeted audience or group consists of a Nielsen demo, defined by age and gender.

In the case of addressable inventory, the targeting groups are usually unique for each campaign and may

be based on additional information beyond age and gender, such as DMA (geographic location), purchase

history, income, education and additional attributes derived from combined third party data sets.

Forecasting views of a specific audience is quite similar to forecasting the total views. One issue, however,

is that as the audience becomes smaller, the variability increases, resulting in an increase in the forecasting

error. A method to minimize the error amount is to use correlations of each targeted group with other

targeted groups, resulting in a more accurate forecast across all segments.

Figure 8: The Simpsons. The variations around the trend are relatively wide. [Plot: views (millions) by episode.]


4. Supply Forecasting Application and Value

4.1 Number of Impressions Guaranteed to Advertisers

Our experience shows that the use of advanced, more accurate forecasting

algorithms, customized to the media domain, reduces the mean absolute

forecasting error by more than 50%. This is compared to basic methods (usually

average based methods) commonly used in the industry today. It is important to

note that Furious has observed that most television forecasting for the purpose

of informing pricing, planning, inventory allocation and yield management,

is done in Excel today. Sellers use Excel to forecast with what data science considers basic functions, such as sum and average.

In our work with clients and through testing, we have found that a basic method of forecasting using

averages for an entire season, results in, on average, a 25% absolute forecasting error per show. The good

news is that the revenue loss is not one to one, and the actual revenue lost is less than 25% for the following

reasons:

1. In the case of overestimation, the Seller would compensate the advertiser with ADUs. ADUs are

Audience Deficiency Units, or inventory that is allocated and reserved for the purpose of ‘making

good’ on deals that under deliver. Therefore, the lost revenue is mainly due to underestimation.

However, this does not mean it is possible to eliminate or ‘make good’ lost revenue entirely, given that using a high portion of ADUs may often harm future deal budgets. Thus, ADUs enable a Seller

to reduce only a portion of the lost revenue.

2. A Seller may sell some programs in groups (e.g. programs of a certain daypart), such that the

impressions within the group are counted together. Because some programs are underestimated,

while others are overestimated, the overall absolute forecasting error percentage is reduced. In order

to get a figure of this reduction, let’s assume an ideal scenario of programs with similar independent

Gaussian distributions. The error percentage in this ideal case is proportional to the inverse square root of the number of programs, so if, for example, the group contains 4 programs, then the average absolute error would be reduced to 1/2 of its single-program value. In other words, you can reduce loss due to over- or underestimation if you are able to sell as many programs as possible bundled together.


3. Sellers often determine the ADUs owed based on views accumulated over the entire campaign duration.

Due to the same principles outlined in item number 2, the delivery of ADUs at the end of a campaign,

in lieu of intermittently throughout a campaign may reduce the average absolute error. However,

typically the forecasting errors of different shows of the same program are not independent. In fact,

often a program’s forecast is above (or below) the actual views over the entire campaign. In such

cases the error percentage would not be reduced.
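The inverse-square-root reduction in item 2 can be checked with a short simulation. The Python sketch below works under the same ideal assumptions as the text (independent Gaussian errors of equal size for programs of similar size, so the group's percentage error is the average of the individual errors). The function name, the fixed seed and the number of trials are arbitrary choices.

```python
import random


def mean_abs_group_error(n_programs: int, sigma: float = 1.0,
                         trials: int = 20000, seed: int = 0) -> float:
    """Estimate the mean absolute error of a group whose error is the
    average of n_programs independent Gaussian errors."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        group_error = sum(rng.gauss(0.0, sigma) for _ in range(n_programs)) / n_programs
        total += abs(group_error)
    return total / trials


single = mean_abs_group_error(1)
grouped = mean_abs_group_error(4)
ratio = grouped / single  # close to 1 / sqrt(4) = 0.5 under these assumptions
```

As item 3 notes, the reduction disappears when errors are strongly correlated. Replacing the independent draws with a single shared draw per trial would leave the ratio at 1.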

Although the lost revenue is not one-to-one with the forecasting error, for the reasons above, revenue loss remains significant when only a basic forecasting method informs business decisions. In the examples above, with a 25% forecasting error, the resulting revenue loss is likely to be about 25/4 ≈ 6%. For a major broadcaster, 6% of a billion dollars is a lot of money left on the table. In working with Sellers, using actual historical sales and delivery data, Furious has found that the use of advanced forecasting algorithms can reduce the forecasting error by more than 50%, meaning an increase in total revenue of about 3%.
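The arithmetic behind these figures can be laid out explicitly. The divisor of 4 is taken from the text as given, since the paper does not itemize it. The calculation also assumes, as the text implies, that revenue loss scales in proportion to the forecasting error.

```latex
\begin{align*}
\text{revenue loss (basic method)} &\approx \frac{25\%}{4} = 6.25\% \approx 6\% \\
\text{revenue loss (error halved)} &\approx \frac{0.5 \times 25\%}{4} = 3.125\% \approx 3\% \\
\text{revenue recovered} &\approx 6.25\% - 3.125\% = 3.125\% \approx 3\%
\end{align*}
```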

4.2 Using Forecasting to Decide Which New Programs to Buy or Produce

As noted, the average prediction error of new programs is relatively high, but better than a random guess.

This makes the same supply forecasting tools, which we are discussing for use in advertising planning and

sales, equally useful as an input for deciding which new programs to buy or produce. Simply put, choose new

programs that increase advertising revenue.

4.3 An Input for Scheduling

The day of the week and the hour of a program impact the number of views. For example, Figure 9 shows

viewership numbers per season of “The Big Bang Theory” series. The number of viewers increased from

season 1 through season 6 (on average), after which it stabilized around 19 million. However, season 4 is an exception within this period of growth: its viewership numbers decreased.

Interestingly, the season 4 decrease could have been forecasted based on the program’s weekday. The total

number of TV viewers varies among the various weekdays and has an impact on the programs aired each day.

Between season 3 and 4 the program changed its schedule from Monday, a strong weekday for TV programs,

to Thursday, a weaker weekday for TV programs. It is important to note, this does not necessarily mean the

decision to move “The Big Bang Theory” from Monday to Thursday was wrong, since we must also consider

the benefit to the program scheduled instead on Monday and the one that was replaced on Thursday.


Since the schedule influences the views of each program, a forecasting algorithm that models this influence is useful for optimizing a program’s schedule and thus can further contribute to increasing revenue and yield.

5. Demand Forecasting Application and Value

The examples and figures used in this paper to demonstrate forecasting are based on public data about views of primetime programs. It is important to emphasize that the demand forecast, which would be derived from financial and sales data, is no less important in optimizing yield for a Seller.

In other industries, demand forecasting is commonly used for production planning. In a somewhat analogous way, in media it is possible to use demand forecasting to decide how much to invest in the promotion of different programs.

Figure 9: The Big Bang Theory. Viewership numbers per season. [Plot: views (millions) by season.]


A more important application of demand forecasting, in terms of revenue impact, is in making decisions about deals with advertisers and overall ad sales. To understand how, we

must note that in TV advertising, impressions of the same show may be sold with different CPMs across

multiple deals. There is a separate negotiation with each advertiser, who does not know the prices others

pay. The fact that one advertiser paid a high price does not imply other advertisers would be willing to pay

the same price. TV Sellers may accept lower prices in order to increase sell-through or minimize unsold inventory, since the product (impressions) has an expiration date (the airing date of the show) after which it cannot be sold anymore. However, we are not suggesting that simply agreeing to whatever [low] price an advertiser is willing to pay is a good business strategy for optimizing yield.

Therefore, a Seller needs to decide whether the price an advertiser is willing to pay is above or below the price it would be possible to get for the requested inventory in alternative deals before the inventory expires. The most fundamental question, however, is whether this inventory can be sold to someone else, i.e. whether there is demand for it. If the offered price is below the expected future opportunities, and the inventory can be sold to another advertiser, then the deal offer should be rejected.
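This decision rule can be sketched in Python as an expected-value comparison. The formulation is an assumption made for illustration: a demand forecast is taken to supply two hypothetical quantities, the CPM expected from alternative deals and the probability that the inventory is sold at that CPM before it expires. The sketch ignores the other deal considerations that a full decision would include.

```python
def should_accept_deal(offered_cpm: float, expected_future_cpm: float,
                       prob_sold_before_expiry: float) -> bool:
    """Accept when the offer is at least the probability-weighted alternative.

    Unsold inventory is assumed to be worth nothing after the air date.
    """
    if not 0.0 <= prob_sold_before_expiry <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if offered_cpm < 0 or expected_future_cpm < 0:
        raise ValueError("CPMs must be non-negative")
    expected_alternative = prob_sold_before_expiry * expected_future_cpm
    return offered_cpm >= expected_alternative


# Made-up numbers: a $20 CPM offer against a $30 CPM alternative.
weak_demand = should_accept_deal(20.0, 30.0, 0.5)    # alternative worth $15 on average
strong_demand = should_accept_deal(20.0, 30.0, 0.9)  # alternative worth $27 on average
```

The same offer is accepted when future demand is uncertain and rejected when it is likely, which is why the text treats the existence of demand as the most fundamental question.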

Often, deal making requires considering many things, not just demand and price, but also the following:

1. Volume discount. The larger the volume bought by an individual advertiser, the lower the chances

to sell it in the future at a higher price.

2. Loyalty discount. This is due to impact on sale of inventory in future years.

Given the above, demand forecasting is needed for deciding whether or not to sign a deal. The complexity of demand forecasting makes it nearly impossible to manage with confidence using a manual process and Excel alone. However, the number of advertisers, deals, varying expiration dates, etc. makes NOT forecasting demand a potentially significant cause of lost revenue.

We have found that most deal-making decisions are not made systematically on the basis of demand forecasts.

Our experience shows that using demand forecasting for deal making decisions increases revenue by 5-10%.


6. Summary

Media forecasting is complex and requires answering the following questions:

6.1 What do we need to forecast?

The simple answer is: A Seller needs to forecast both demand and supply.

With supply (impressions) forecasting, we are interested in forecasting for

different targeted groups, different periods ahead, different types of programs,

for both running programs as well as new programs, and varied points of

balance between overestimation and underestimation risks, to have confidence

in a Seller’s ability to maximize yield.

With demand forecasting, we are interested in the demand for the requested

inventory, as well as the expected future deals with the advertiser and all

advertisers.

6.2 Which inputs would be useful for the forecast?

Unlike pure mathematics, data science requires expertise in the domain to which it is applied. This includes

knowledge about useful inputs and the way they influence the forecasted metrics, where each forecasted

case has its own relevant inputs. For example, in the case of forecasting views of a running program we may

use the history of the program’s views, the date, the weekday, the hour, competing programs, the previous

program, content at the single show level, as well as third party data like weather.
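As a rough illustration of how such inputs might be assembled for one airing, the sketch below builds a single feature record. Every name in it (`build_features`, the dictionary keys, the four-airing window) is a hypothetical choice made for this example, not a prescribed schema.

```python
from datetime import datetime


def build_features(airing_time, past_views, previous_program_views,
                   competing_programs, temperature_c=None):
    """Assemble one input record for forecasting the views of a running program."""
    recent = past_views[-4:]  # assumed window: the last four airings
    return {
        "weekday": airing_time.weekday(),  # 0 = Monday
        "hour": airing_time.hour,
        "month": airing_time.month,
        "mean_recent_views": sum(recent) / len(recent) if recent else None,
        "lead_in_views": previous_program_views,  # the previous program
        "num_competing_programs": len(competing_programs),
        "temperature_c": temperature_c,  # third party data such as weather
    }


features = build_features(
    airing_time=datetime(2024, 1, 1, 20, 0),
    past_views=[100, 200, 300, 400, 500],
    previous_program_views=250,
    competing_programs=["program_a", "program_b"],
    temperature_c=3.5,
)
print(features)
```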

6.3 How do we develop appropriate algorithms for media forecasting?

Advanced forecasting has a learning algorithm and a forecasting algorithm. The learning algorithm updates a model by which the forecasting is done, while the forecasting algorithm uses this model to produce forecasts.

The learning algorithm should be optimally balanced between robustness and adaptiveness.
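The split between learning and forecasting, and the trade-off between robustness and adaptiveness, can be seen in a deliberately simple model. The sketch below uses simple exponential smoothing as a stand-in, not as the method advocated here. The class name `SmoothedLevelModel` and the parameter `alpha` are assumptions of this example.

```python
class SmoothedLevelModel:
    """Minimal model whose only state is a smoothed level of the metric."""

    def __init__(self, alpha, initial_level):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.level = initial_level

    def learn(self, observed):
        """Learning algorithm: move the level toward the new observation."""
        self.level += self.alpha * (observed - self.level)

    def forecast(self):
        """Forecasting algorithm: use the current model to predict the next value."""
        return self.level


# A small alpha is robust to one-off spikes; a large alpha adapts quickly.
robust = SmoothedLevelModel(alpha=0.1, initial_level=100.0)
adaptive = SmoothedLevelModel(alpha=0.5, initial_level=100.0)
for model in (robust, adaptive):
    model.learn(200.0)  # a single unusually high observation
print(robust.forecast(), adaptive.forecast())
```

After one unusual observation the robust model barely moves while the adaptive model moves halfway. Neither setting is right in general, which is why the balance has to be tuned to the behavior of the metric.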

We should also note, to state the obvious, that algorithms customized for media forecasting have a

significant advantage over generic algorithms. The reason for this is that generic algorithms suffer from one

of two problems. First, many algorithms assume very simple behavior of the forecasted metric, for example that all measurements are independent and drawn from the same normal distribution. Since these assumptions do not hold in reality, the resulting forecasts are poor. Other generic algorithms make fewer prior assumptions about the metric, but then require much more data to learn, because they need to learn more things. Recall that a TV series typically has a history of only a hundred episodes, rather than ten thousand. Customized algorithms avoid these traps by using correct assumptions about the behavior of the forecasted metric, based on an offline analysis.
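One simple check that such an offline analysis might include is whether consecutive measurements really are independent. The sketch below computes the lag-1 autocorrelation of a series: values near zero are consistent with independence, while values far from zero indicate that each measurement carries information about the next. The function name and the sample series are illustrative assumptions, and this is only one of many possible diagnostics.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a series of at least two non-constant values."""
    n = len(series)
    mean = sum(series) / n
    # Co-movement of each value with its successor, around the overall mean.
    numerator = sum(
        (series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)
    )
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator


# Hypothetical episode views that trend upward: successive values are related.
trending = [1, 2, 3, 4, 5, 6]
# Hypothetical values that alternate around their mean.
alternating = [1, -1, 1, -1]
print(lag1_autocorrelation(trending), lag1_autocorrelation(alternating))
```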

6.4 How can we use the forecast?

There are at least three applications of supply forecasting that increase yield:

1. Setting the number of impressions guaranteed, at what price, to advertisers.

2. Deciding which new programs to buy or produce.

3. Planning the program schedule.

The two most important applications of demand forecasting to increase yield are:

1. Deciding whether to accept or reject a deal.

2. Optimizing the investment in a program’s promotion.

6.5 What is the true economic value of media forecasting?

The increase in revenue from an advanced and more accurate forecasting process varies among Sellers and depends on a variety of factors. However, Furious has found the following to be true across the multiple Sellers whose data we have analyzed and studied:

• Advanced methods of supply forecasting can increase revenue by at least 3%

• Using demand forecasting for deal making decisions can increase revenue by 5-10%

Advanced forecasting is more than a tool for Sellers. It is a strategic business process and a pillar in ensuring that revenue and inventory usage are maximized and yield is optimized. Applying demand and supply forecasting helps an organization think more critically, because reliable and actionable information on whether a deal is the best decision is available at the point in time when the decision to make that deal with an advertiser is made. Thriving in the current business climate, and in the shift by advertisers from program-based to audience-based buying, requires that Sellers rethink and transform the way they plan, price and allocate inventory and run their advertising businesses.


About Furious

Furious is a cross-platform, enterprise yield optimization solution for media companies and distributors. Consistently, our experience shows that significantly higher yield can be achieved when combining human expertise with established techniques from AI, data science, machine learning, and operational research. Furious’ platform, PROPHET™, does just that, leveraging the world’s leading data science and machine learning to unify and automate campaign and portfolio reporting, forecasting and planning. A horizontal SaaS platform that sits atop and connects a variety of advertising systems and data sets, PROPHET is custom-configured to help media companies automate the key workflows of running an advertising business, resulting in higher yield, lower operational costs, and increased profitability.

Learn more at www.furiouscorp.com or send us your feedback at [email protected].
