AWS re:Invent 2016: Searching Inside Video at Petabyte Scale Using Spot (WIN307)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Tim Sullivan and Ari Bixhorn, Panopto
December 2, 2016
Searching Inside Video at Petabyte-Scale using Spot
What to Expect from the Session
Primer on inside-video search
Dive into how we use Spot to search video at scale
Overview of our cross-platform architecture
Best practices for scaling Spot Instances elastically
Searching Inside Videos
Video: A Last-mile Problem for Search
30 trillion web pages
Email and documents
File system contents
Video?
3 minutes, 53 seconds
15 - 90 minutes
Title: An Introduction to Network Security
Description: A broad overview of network
security as defined by today’s hybrid
corporate WANs.
Tags: Network security, intrusion detection,
corporate WAN, firewall, authentication!?
125 words per minute
5,625 words spoken
The network is the entry point to your application. It provides the first gatekeepers that
control access to the various servers in your environment. Servers are protected with
their own operating system gatekeepers, but it is important not to allow them to be
deluged with attacks from the network layer. It is equally important to ensure that network
gatekeepers cannot be replaced or reconfigured by imposters. In a nutshell, network
security involves protecting network devices and the data that they forward.
The basic components of a network, which act as the front-line gatekeepers, are the
router, the firewall, and the switch. An attacker looks for poorly configured network
gatekeepers to exploit. Common vulnerabilities include weak default installation settings,
wide-open access controls, and unpatched devices.
5,625 words spoken
50% have no search value
2,813 words with search value
With 10 tags, you’ve only covered 0.3% of valuable content
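The arithmetic behind these figures can be checked with a short sketch. It assumes each tag covers exactly one searchable word and rounds the half word up, as the slide appears to do; the function name `tag_coverage` is illustrative only.

```python
import math

WORDS_PER_MINUTE = 125   # speaking rate from the slide
WORDS_SPOKEN = 5_625     # total words from the slide


def tag_coverage(words_spoken, share_without_search_value, tag_count):
    """Return (searchable_words, percent of searchable words covered by tags).

    Assumption: one tag covers one searchable word.
    """
    searchable = math.ceil(words_spoken * (1 - share_without_search_value))
    return searchable, 100 * tag_count / searchable


talk_minutes = WORDS_SPOKEN / WORDS_PER_MINUTE       # implied talk length
searchable, percent = tag_coverage(WORDS_SPOKEN, 0.5, 10)
print(f"{talk_minutes:.0f} min, {searchable} searchable words, {percent:.2f}% covered")
```

Under these assumptions the coverage comes out at roughly a third of one percent, in line with the figure on the slide.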
Six Types of Video Content Indexing
1. Manually entered metadata
2. Transcription
3. Automatic Speech Recognition (ASR)
4. Optical Character Recognition (OCR)
5. Slide extraction
6. Viewer notes
Demo – Video Search
What Led Us to Spot?
Our Challenge
[Chart: time axis from 2013-01 to 2016-01]
Running on AWS since 2009
Growing exponentially
Need to index every video – quickly & cost-efficiently
15 years' worth of video content (400 TB) uploaded monthly
Need to extract metadata out of 4PB of video
122M unique images have been indexed for OCR
>3TB SOLR index
* Numbers are inclusive of both enterprise and education accounts; numbers do not include on-premises customers
Option 1: On-Demand Amazon EC2 Instances
[Chart: cost ($) vs. hours of content, with markers for Budget, Today, and Enable ASR/OCR]
Cost-prohibitive to offer to all customers
Option 2: Make Search an Upsell Capability
[Diagram: Panopto platform components by area]
Content Ingestion: Windows and Mac Clients, Mobile Apps, Video Capture Appliance, Remote Capture Client, Other Ingestion
Content Management: Transcoding, Editing, Search Indexing, Governance, Analytics, Access Control
Content Discovery: Video CMS, Public Hosting, SmartSearch™, Email and Social Integrations, Search Federation
Content Delivery: Panopto Streaming, CDN Integration, P2P Streaming, Panopto ECDN, WAN Op Solutions
Content Consumption: Interactive Player, Panopto Mobile, Audio Podcast, Embedded Player, Quizzing and Polls
Option 3: Use Reserved Instances (RIs)
Theoretically would save costs
RIs work best for predictable workloads
A 30-second SLA to begin indexing results in a spiky demand curve rather than a flat line
c3.2xlarge Reserved Instance pricing (On-Demand hourly: $0.42)
Upfront   Monthly   Effective Hourly   Savings over On-Demand
$0        $213.16   $0.292             30%
$1304     $75.92    $0.253             40%
$2170     $0.00     $0.248             41%
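The effective hourly rates in the c3.2xlarge pricing figures can be reproduced with a small sketch. It assumes a one-year term (8,760 hours), which is not stated on the slide but is consistent with the numbers; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def effective_hourly(upfront, monthly, term_years=1):
    """Total RI cost over the term divided by the hours in the term."""
    total_cost = upfront + monthly * 12 * term_years
    return total_cost / (HOURS_PER_YEAR * term_years)


def savings_vs_on_demand(effective, on_demand_hourly):
    """Fractional saving relative to the On-Demand hourly price."""
    return 1 - effective / on_demand_hourly


ON_DEMAND = 0.42  # c3.2xlarge On-Demand hourly price from the slide
for upfront, monthly in [(0, 213.16), (1304, 75.92), (2170, 0.00)]:
    rate = effective_hourly(upfront, monthly)
    saving = savings_vs_on_demand(rate, ON_DEMAND)
    print(f"upfront ${upfront}: ${rate:.3f}/hour, {saving:.0%} saving")
```

These savings only materialize if the reserved capacity is actually used every hour, which is why a spiky demand curve undermines the option.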
[Chart: # instances over time (t) against RI capacity, with regions labeled Delayed Start and Waste]
[Chart: # instances over time (t) against RI capacity, with regions labeled Overspend and Waste]
Option 4: Buy Our Own Hardware
Option 5: Spot Instances
Excess EC2 capacity auctioned at steeply discounted prices
Spot Instances can be accessed on demand to meet our variable needs
[Diagram: On-Demand Instances, with Spot Instances added when bid ≥ market]
On-Demand vs. Spot Instances
On-Demand Instances:
Pre-configured or custom machine images
Configure security and network access
Choose from instance types and locations
Use static IP endpoints
Attach persistent block storage to instances
Pay a fixed price by the hour
Spot Instances:
Pre-configured or custom machine images
Configure security and network access
Choose from instance types and locations
Use static IP endpoints
Attach persistent block storage to instances
Pay a variable price by the hour
[Chart: cost ($) vs. hours of content for On-Demand and Spot, with Budget and Today marked]
The Spot Auction
Set a bid price (for example, $0.27)
Instances run while bid ≥ market price
Instances terminate when bid < market price
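The run/terminate rule can be expressed as a minimal sketch. The market prices below are made-up sample values, and `spot_states` is an illustrative name, not an AWS API.

```python
def spot_states(bid, market_prices):
    """For each observed market price, report whether the instance keeps running."""
    return ["run" if bid >= price else "terminate" for price in market_prices]


# Hypothetical price series around the example bid of $0.27.
sample_prices = [0.20, 0.27, 0.30]
print(spot_states(0.27, sample_prices))
```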
Spot Considerations
Is your workload appropriate for potential volatility?
How to deal with a lack of capacity?
Can you run on a wide range of instance types (via Spot Fleet)?
Look at historical bid prices for your instance types and regions to estimate your savings.
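One way to turn a price history into a savings estimate is sketched below. The price list is hypothetical sample data, not real AWS history, and the model assumes you pay the market price whenever it is at or below your bid; `estimate_savings` is an illustrative helper.

```python
from statistics import mean


def estimate_savings(price_history, on_demand_price, bid):
    """Return (availability, savings) for a bid against historical prices.

    availability: share of samples where the bid would have kept instances running.
    savings: fractional saving vs. On-Demand over the samples that ran.
    """
    running = [price for price in price_history if price <= bid]
    availability = len(running) / len(price_history)
    if not running:
        return availability, 0.0
    return availability, 1 - mean(running) / on_demand_price


# Hypothetical history with one spike above the bid.
availability, savings = estimate_savings([0.14, 0.24, 0.34, 0.90], 0.84, bid=0.84)
print(f"available {availability:.0%} of the time, saving {savings:.0%}")
```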
Our Implementation
The Importance of Windows to our Architecture
Single codebase for cloud and on-premises
For on-prem customers, Windows is often a requirement
Windows is therefore critical to our cloud architecture as well
On-Prem Cloud
Panopto Cloud on AWS
Distributed across Availability Zones
Cross-Platform Implementation
Web Servers
App Servers
Database
Speech Recognition
Apache SOLR
Using Auto Scaling Groups
[Chart: Demand vs. Running Instances]
Using AWS CloudFormation
Define ASGs and auto-scale rules
From On-Demand to Spot
"OnDemandLaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "SecurityGroups": { "Ref": "backendSecurityGrpIds" },
    "IamInstanceProfile": { "Ref": "BackendEncoders..." },
    "ImageId": { "Ref": "ami" },
    "InstanceType": { "Ref": "instanceType" },
    "InstanceMonitoring": false,
    "AssociatePublicIpAddress": true,
    "EbsOptimized": { "Ref": "ebsOptimized" },
    "BlockDeviceMappings": [
      {
        "DeviceName": "xvdca"
      }
    ]
  }
},
"SpotLaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Condition": "CreateSpotGroup",
  "Properties": {
    "SecurityGroups": { "Ref": "backendSecurityGrpIds" },
    "IamInstanceProfile": { "Ref": "BackendEncoders..." },
    "ImageId": { "Ref": "ami" },
    "InstanceType": { "Ref": "instanceType" },
    "SpotPrice": { "Ref": "spotPrice" },
    "InstanceMonitoring": false,
    "AssociatePublicIpAddress": true,
    "EbsOptimized": { "Ref": "ebsOptimized" },
    "BlockDeviceMappings": [
      {
        "DeviceName": "xvdca"
      }
    ]
  }
}
Bidding Strategy: Start Simple
Sealed-bid, second-price auction
Set your bid to the market price of an On-Demand Instance
[Chart: price levels of $0.14, $0.24, and $0.34 against an On-Demand Instance price of $0.84]
The Challenge of Long-Running Jobs
The longer the job, the greater the chance of instance revocation
Short window to determine how best to fail over (2 minutes)
[Chart: chance of instance revocation vs. job length]
Managing Jobs in the Face of Instance Revocation
[Flowchart]
1. Spot market price increases
2. “Spotter” service waits until T-30s
3. Is the job done?
Yes: no action
No: 1. Save state 2. Kill job 3. Reallocate
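The decision flow on this slide could look roughly like the sketch below. It is not Panopto's "Spotter" code: the function name, the callback parameters, and the returned strings are all assumptions made for illustration, and the sleep function is injected so the flow can be exercised without waiting.

```python
from typing import Callable


def handle_termination_notice(
    seconds_until_termination: float,
    job_is_done: Callable[[], bool],
    save_state: Callable[[], None],
    kill_job: Callable[[], None],
    reallocate: Callable[[], None],
    sleep: Callable[[float], None],
    grace_seconds: float = 30.0,
) -> str:
    """Give the job until T-30s to finish, then save, kill, and reallocate it."""
    # Wait until grace_seconds before termination (never a negative duration).
    sleep(max(0.0, seconds_until_termination - grace_seconds))
    if job_is_done():
        return "no action"
    save_state()   # 1. Save state
    kill_job()     # 2. Kill job
    reallocate()   # 3. Reallocate to another instance
    return "reallocated"
```

In production the `sleep` argument would be `time.sleep`, and `seconds_until_termination` would start at about 120 for the two-minute window mentioned earlier.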
Scaling Up with Predictive Job Modeling
Inputs:
1. Number of waiting jobs
2. Number of jobs currently processing
3. When current jobs are expected to finish
4. Incoming jobs in the last <interval>
5. Number of jobs expected to arrive
6. Time to spin up new machine
7. SLA by job
[Diagram: Inputs → Data Scientists → More processing capacity required?]
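To show how the seven inputs might combine into a scale-up decision, here is a deliberately simple heuristic. It is an illustrative assumption, not the model Panopto's data scientists built: the `QueueSnapshot` fields, the `jobs_per_instance` parameter, and the rule itself are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class QueueSnapshot:
    waiting_jobs: int                          # input 1
    processing_jobs: int                       # input 2
    seconds_until_current_jobs_finish: float   # input 3
    recent_arrivals: int                       # input 4
    expected_arrivals: int                     # input 5
    spin_up_seconds: float                     # input 6
    sla_seconds: float                         # input 7


def extra_instances_needed(s, running_instances, jobs_per_instance=1):
    """Estimate how many instances to add, using a naive slot-counting rule."""
    idle_slots = running_instances * jobs_per_instance - s.processing_jobs
    # Slots that free up before a new machine could be ready anyway.
    freeing = s.processing_jobs if s.seconds_until_current_jobs_finish <= s.spin_up_seconds else 0
    demand = s.waiting_jobs
    # If machines boot slower than the SLA allows, provision ahead for arrivals.
    if s.spin_up_seconds > s.sla_seconds:
        demand += max(s.expected_arrivals, s.recent_arrivals)
    deficit = demand - (idle_slots + freeing)
    return max(0, math.ceil(deficit / jobs_per_instance))
```

A real model would weigh forecast confidence and per-job SLAs rather than treating every job alike.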
Amazon CloudWatch Dashboards
Scaling Down
If the rate of incoming and in-process jobs is less than current processing capacity, then we’re in a scale-down state.
Identify instances that are not processing jobs. Then identify those within 15 minutes of the end of a billing hour.
[Diagram: instances moving between Active, Hold, and Scale Down states]
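The scale-down selection can be sketched as follows, assuming billing hours are counted from each instance's launch time and that "within 15 minutes" means the last 15 minutes before the hour rolls over. The `Instance` class and function name are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Instance:
    instance_id: str
    launch_time: datetime
    busy: bool  # True while the instance is processing a job


def scale_down_candidates(instances, now, window_minutes=15):
    """Return IDs of idle instances close to the end of their billing hour."""
    candidates = []
    for inst in instances:
        if inst.busy:
            continue  # never terminate an instance that is processing a job
        seconds_into_hour = (now - inst.launch_time).total_seconds() % 3600
        if 3600 - seconds_into_hour <= window_minutes * 60:
            candidates.append(inst.instance_id)
    return candidates
```

Idle instances earlier in their billing hour stay on hold, since the hour is already paid for and new jobs may still arrive.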
But what if there’s a deficit of Spot capacity?
Operate two Auto Scaling groups for each backend worker pool
One Spot ASG and one On-Demand ASG
When actual Spot capacity < desired capacity, offload to On-Demand
[Diagram: Automatic Speech Recognition worker pool backed by Spot and On-Demand groups]
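The offload rule reduces to a one-line calculation, sketched here with an illustrative function name. How the result is applied to the On-Demand Auto Scaling group is left out.

```python
def on_demand_desired(desired_capacity, actual_spot_capacity):
    """Capacity the On-Demand group must cover when Spot falls short."""
    return max(0, desired_capacity - actual_spot_capacity)


# Example: the pool wants 10 workers but Spot has only delivered 6.
print(on_demand_desired(10, 6))
```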
Spot Futures at Panopto
Move to Spot Fleet
Ability to launch the most cost-efficient instance type for any job
Lower prices with diversified resources
Ability to apply custom weighting (create capacity units based on our app needs)
Challenge: no accounting for the cost of EBS
Challenge: lacking ASG’s health checks
Challenge: lacking ASG’s tag propagation
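Custom weighting makes instance types comparable by price per capacity unit. The sketch below illustrates the idea with made-up prices and weights; it does not call Spot Fleet, and the helper names are assumptions.

```python
import math


def cheapest_per_unit(offers):
    """Pick the instance type with the lowest price per capacity unit.

    offers maps instance type -> (hourly spot price, weight in capacity units).
    """
    return min(offers, key=lambda name: offers[name][0] / offers[name][1])


def instances_for_target(target_units, weight):
    """Number of instances of a given weight needed to reach the target capacity."""
    return math.ceil(target_units / weight)


# Hypothetical prices; weights might represent parallel workers per VM.
offers = {
    "c3.2xlarge": (0.10, 2),
    "c3.4xlarge": (0.18, 4),
    "c3.8xlarge": (0.40, 8),
}
best = cheapest_per_unit(offers)
print(best, instances_for_target(20, offers[best][1]))
```

As the first challenge notes, a comparison like this ignores EBS cost, which can shift the result when many small instances each need their own volumes.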
From Immutable to Dynamic Instance Configuration
Need to account for different processing capacity of different instance types
Will need to optimize number of workers being run in parallel on each VM
Substantial cost savings potential
Today: Immutable
Pro: Spin up instances quickly
Con: Could be more cost-efficient
Future: Dynamic
Choose the best Availability Zone and instance type based on market price
Subdividing Jobs
Today: Painful to cancel a 90% complete, 30-minute OCR indexing job
Future: Subdivide job for grid processing
Grid processing minimizes impact of Spot Instance loss
Also allows greater parallelization for faster user-visible time to task completion
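A rough way to see the benefit of subdividing is to compare the work lost when an instance is revoked. The sketch assumes a monolithic job restarts from scratch and that a subdivided job loses only the chunk in flight; the 5-minute chunk size is a hypothetical value.

```python
def work_lost_minutes(job_minutes, fraction_done, chunk_minutes=None):
    """Minutes of processing that must be redone after an instance is revoked."""
    done = job_minutes * fraction_done
    if chunk_minutes is None:
        return done                 # monolithic job: all progress is lost
    return done % chunk_minutes     # subdivided job: only the partial chunk is lost


# The 30-minute OCR job from the slide, revoked at 90% complete.
print(work_lost_minutes(30, 0.9))                   # monolithic
print(work_lost_minutes(30, 0.9, chunk_minutes=5))  # subdivided
```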
In Summary
53% Cost Reduction
Scenarios Spot has Unlocked for Panopto
Scale our inside-video search technology across our entire customer base.
Accelerate business growth. The money saved with Spot is being reinvested in expanding our team.
Remember to complete your evaluations!