EQR Reporting: Rails + Amazon EC2

Platform 3: To Infinity and Beyond January, 2009 Summit XI

Transcript of EQR Reporting: Rails + Amazon EC2

Page 1: EQR Reporting:  Rails + Amazon EC2

Platform 3: To Infinity and Beyond

January, 2009

Summit XI

Page 2: EQR Reporting:  Rails + Amazon EC2

Gartner’s Hype Cycle

Page 3: EQR Reporting:  Rails + Amazon EC2

Overview

• Architecture
• Video
• Reporting

Page 4: EQR Reporting:  Rails + Amazon EC2

Architecture: What’s that?

• The structures of the system
  – The externally visible parts and the relationships between them

Page 5: EQR Reporting:  Rails + Amazon EC2

Architecture: Goals

• Performance
  – Every page needs to yield a response within 5 seconds
• Availability/Reliability
  – Always there!
• Scalability
  – Dynamically add RAM/CPU
  – Dynamically add more servers
• Agile/Flexible
  – Can easily be adapted
  – Follow best practices
• Accuracy
  – No response left behind
  – Quality Assurance

Page 6: EQR Reporting:  Rails + Amazon EC2

Architecture: Performance

• How do we achieve great performance?
  – Using the right software
    • Ruby on Rails
      – Twitter, LinkedIn, Hulu
  – Good application design
    • Reporting has different needs than Authoring/Runtime
  – Testing / Benchmarking / Tuning
    • Rails has lots of good built-in utilities to make these easy
    • We’re writing test code, right?
  – Dedicating time for maintenance / new features
    • As data grows
    • As more complexity is brought into the application environment
    • As we get smarter
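The benchmarking point can be tied back to the 5-second response goal with a small timing check. This is a minimal sketch in plain Ruby, not one of the Rails utilities the slide refers to; the helper name `time_against_budget` and the sample workload are assumptions made for illustration.

```ruby
# Response-time budget taken from the performance goal: 5 seconds per page.
RESPONSE_BUDGET_SECONDS = 5.0

# Times the given block with a monotonic clock and reports whether it
# finished within the budget. Returns elapsed seconds and a pass/fail flag.
def time_against_budget(budget = RESPONSE_BUDGET_SECONDS)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  { elapsed: elapsed, within_budget: elapsed <= budget }
end

# Hypothetical stand-in for the work done to render one report page.
result = time_against_budget { (1..100_000).reduce(:+) }
puts format('%.4f s, within budget: %s', result[:elapsed], result[:within_budget])
```

A check like this could run inside a test suite so that a page drifting past the budget shows up as a failure instead of a user complaint.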

Page 7: EQR Reporting:  Rails + Amazon EC2

Architecture: Performance

• Good Application Design – Separation of Concerns
• Separating databases for Runtime and Reporting is a Good thing!
  – Runtime is OLTP
    • OLTP refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing. It has also been used to refer to processing in which the system responds immediately to user requests. - Wikipedia
  – Reporting is OLAP
    • OLAP is an approach to quickly provide answers to analytical queries that are multi-dimensional in nature. Databases configured for OLAP employ a multidimensional data model, allowing for complex analytical and ad-hoc queries with a rapid execution time. - Wikipedia
  – Analytical processing on Reporting doesn’t impact performance on Runtime (i.e. active surveys in the field) because they are physically different systems.

Page 8: EQR Reporting:  Rails + Amazon EC2

Architecture: Availability/Reliability

• Co-location
  – Uptime
    • eApps
      – 99.98% over past 1000 days
    • Colo4Dallas
      – Guarantees 100%, reality? 99%+
    • Amazon Web Services
      – 99.95%
• Redundancy
  – Servers have different profiles for different services
    • Databases
    • Web / Application servers
    • Proxy / Load balancing
  – Server profiles are duplicated and online for…
    • Hardware failures
    • Load balancing during peak demand
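To make the uptime percentages concrete, they can be converted into hours of downtime. The sketch below is plain arithmetic on the figures quoted above; the function names are hypothetical, and the redundancy formula assumes the duplicated servers fail independently of each other, which real deployments only approximate.

```ruby
HOURS_PER_DAY = 24.0

# Expected downtime in hours for an uptime percentage over a number of days.
def downtime_hours(uptime_percent, days)
  (1.0 - uptime_percent / 100.0) * days * HOURS_PER_DAY
end

# Combined availability (in percent) of several duplicated servers,
# ASSUMING they fail independently: the service is down only if all are down.
def redundant_availability(uptime_percent, copies)
  (1.0 - (1.0 - uptime_percent / 100.0)**copies) * 100.0
end

puts downtime_hours(99.98, 1000).round(2)     # eApps figure over 1000 days
puts downtime_hours(99.95, 365).round(2)      # Amazon figure over one year
puts redundant_availability(99.0, 2).round(4) # two duplicated 99% servers
```

Under these assumptions, 99.98% over 1000 days works out to roughly 4.8 hours of downtime, and 99.95% over a year to roughly 4.4 hours.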

Page 9: EQR Reporting:  Rails + Amazon EC2

Architecture: Scalability

• Reporting
  – www.eqrtools.com hosted at eApps
    • Runs on a $70/month plan (1.2 GB RAM Virtual Private Server)
    • Pre-packaged with Java, Rails, MySQL, mail server, etc.
    • Can upgrade package in minutes and add servers via web interface
    • Cancel anytime
  – Amazon Web Services
    • S3 = Simple Storage Service
    • EC2 = Elastic Compute Cloud
    • CloudFront = Content Delivery Network
• Authoring/Runtime
  – Hosted at Colo4Dallas
    • n Front End Web/Application servers
    • n Database servers
  – Wowza
    • Streaming Video Service via Amazon EC2

Page 10: EQR Reporting:  Rails + Amazon EC2

Architecture: Amazon Web Services

• Simple Storage Service (S3)
  – In use at Equation with JTS for 2+ years
  – Expanding use for storing more stuff
    • Images – plain, rollover, etc.
    • Documents – PDF reports
    • Videos
    • EC2 Machine Images
• Elastic Compute Cloud (EC2)
  – Provides ability to add servers (Linux/Windows flavors) for specific services
    • e.g. Wowza Video Streaming
    • Grabs content from S3
    • Can be expanded to other uses – Rails application hosting/database
• CloudFront
  – Provides Content Delivery Network (CDN) to push to edge
    • Content that we move into S3
    • Moves content closer to clients, reducing network latency

Page 11: EQR Reporting:  Rails + Amazon EC2

Architecture: EC2 Simplified

• Virtual Machines/Servers
  – Scalability in two dimensions
    • Use as many machines as you need
    • Various machine sizes available
• High availability
• High bandwidth

Page 12: EQR Reporting:  Rails + Amazon EC2

Architecture: EC2 Instance Types

• EC2 supports different instance types
  – Small Instance
    • 1.7 GB memory, 32-bit platform, I/O Performance: Moderate
    • 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit)
    • 160 GB instance storage (150 GB plus 10 GB root partition)
    • Price: $0.10 per instance hour
  – Large Instance
    • 7.5 GB memory, 64-bit platform, I/O Performance: High
    • 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
    • 850 GB instance storage (2 x 420 GB plus 10 GB root partition)
    • Price: $0.40 per instance hour
  – Extra Large Instance
    • 15 GB memory, 64-bit platform, I/O Performance: High
    • 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each)
    • 1,690 GB instance storage (4 x 420 GB plus 10 GB root partition)
    • Price: $0.80 per instance hour
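The hourly prices above translate into monthly figures with simple multiplication. The sketch below uses only the prices quoted on this slide; the function name, the 30-day month, and the peak-hours scenario are assumptions for illustration, and storage and bandwidth charges are left out.

```ruby
# Hourly prices quoted on the slide, in US dollars per instance hour.
INSTANCE_PRICES = { small: 0.10, large: 0.40, extra_large: 0.80 }.freeze

# Cost of running `count` instances of one type for `hours` hours.
# Defaults to one instance running around the clock for a 30-day month.
def instance_cost(type, count: 1, hours: 24 * 30)
  INSTANCE_PRICES.fetch(type) * count * hours
end

puts instance_cost(:small).round(2)                     # one Small, always on
puts instance_cost(:large, count: 2, hours: 8).round(2) # two Large for an 8-hour peak
```

On these numbers a single always-on Small instance comes to about $72 for the month, while extra capacity that is switched on only for a peak is billed just for the hours it runs.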

Page 13: EQR Reporting:  Rails + Amazon EC2

CloudFront: Content Delivery Network

• and how it works…

Page 14: EQR Reporting:  Rails + Amazon EC2

Amazon – CloudFront CDN

Copies of files in S3 bucket are accessed/cached from edge servers around the world.

Amazon: CloudFront

Page 15: EQR Reporting:  Rails + Amazon EC2

Architecture: Amazon

• Benefits
  – No upfront investments
    • No contract
    • No hardware to purchase, install/fit, maintain
    • Pay for what we use
  – Offer variety of uses – Content hosting, machine hosting, streaming video
  – Competitors often charge upfront and monthly fees and don’t offer one-stop-service
  – We can dynamically add/remove machines as we need them
• Additional applications built on EC2 are also available…
  – Wowza Video Streaming
  – Jungle Disk (backup/recovery)
  – GigaVox Media (Podcast hosting)
  – Morph (Application hosting)
  – RightScale (Application hosting/monitoring)
  – Scalr (Load Balancing/farm)

Page 16: EQR Reporting:  Rails + Amazon EC2

Architecture: Quality Assurance

• Code Coverage

Page 17: EQR Reporting:  Rails + Amazon EC2

Architecture: Quality Assurance

• Example – Question controller

Page 18: EQR Reporting:  Rails + Amazon EC2

Rich Media: Audio, Images and Video

Page 19: EQR Reporting:  Rails + Amazon EC2

Serving Video is like… TV

• Content
  – (i.e. The Ad)
• Delivery
  – (i.e. Cable, Satellite, Rabbit ears)
• Viewer
  – (i.e. The television box)

Page 20: EQR Reporting:  Rails + Amazon EC2

The Content: Preparation

• There are many source formats for video
  – AVI (early Windows format), QuickTime (.mov), Windows Media, MPEG, Flash
• Files are large and not optimized for web delivery
  – Encoded for other mediums

Page 21: EQR Reporting:  Rails + Amazon EC2

Content conversion

• The Old Way
  – Sorenson Squeeze
    • A desktop tool where we manually took a file and converted it into multiple Flash files of varying bitrates
  – Uploaded file(s) to third party hosted Flash Video service
• The New Way
  – File uploader
  – ffmpeg (under the covers)
    • An open source utility that has been wrapped with Ruby packages to provide compression in the P3 Application
    • Media is compressed for optimal playback experience
    • Media is still formatted to Flash
      – Most commonly served format on Internet (> 92%)
  – Converted file uploaded to Amazon
    • File resides in S3 folder
    • Streamed via Wowza server hosted on EC2 instance
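Wrapping ffmpeg from Ruby can be as simple as building an argument list and handing it to the shell. The sketch below is not the P3 wrapper, which the slides do not show; the function name, file names, and default bitrate, sample rate, and frame size are all assumptions chosen for illustration.

```ruby
require 'shellwords'

# Builds an ffmpeg argument list that converts a source video to Flash Video.
# Every default value here is an illustrative assumption, not a P3 setting.
def ffmpeg_flv_command(input, output, video_bitrate: '400k', audio_rate: 22_050, size: '320x240')
  [
    'ffmpeg',
    '-y',                   # overwrite the output file without prompting
    '-i', input,            # source file in any format ffmpeg can read
    '-b:v', video_bitrate,  # target video bitrate
    '-ar', audio_rate.to_s, # audio sample rate in Hz
    '-s', size,             # output frame size, width x height
    '-f', 'flv',            # force the FLV container
    output
  ]
end

cmd = ffmpeg_flv_command('ad.mov', 'ad.flv')
puts Shellwords.join(cmd)
# To run the conversion (requires ffmpeg on the PATH): system(*cmd)
```

Keeping the command as an array and passing it to `system(*cmd)` avoids shell quoting problems with uploaded file names.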

Page 22: EQR Reporting:  Rails + Amazon EC2

Video: ffmpeg

• Still a bit of magic involved…
  – Reduce this, increase that…

Page 23: EQR Reporting:  Rails + Amazon EC2

Video: ffmpeg conversion

• But at least we’ve built tools!

Page 24: EQR Reporting:  Rails + Amazon EC2

Video: Delivery

• Progressive Download
  – Copy of video is made on your local temp drive and then buffered back through the player as it downloads
    • Lacks IP protection
  – ESPN
  – Video is sent to player over HTTP from file system on host server
  – Some companies will block content
    • by MIME type
    • video over HTTP on port 80 is the easiest way to get past security
• Streaming
  – Video is streamed in real time from streaming video server
    • No local copy made
  – Near instantaneous playback
  – Uses RTMP protocol
  – Important to size/compress correctly for intended audience

Page 25: EQR Reporting:  Rails + Amazon EC2

Video: Delivery

• Factors impacting Client reception
  – Other programs running
    • How much available CPU/RAM does the respondent’s web-enabled device have?
  – Bandwidth
    • DSL, Cable, dialup?
    • Bandwidth varies during a video session (e.g. 30 second Ad)

Page 26: EQR Reporting:  Rails + Amazon EC2

Video: The Player

• The swf file
  – Hosted on server, embedded in page
  – Skinnable
    • Remove controls
  – Plays either progressive or streaming
  – JW Player is the most ubiquitous

Page 27: EQR Reporting:  Rails + Amazon EC2

P3 Reporting

Page 28: EQR Reporting:  Rails + Amazon EC2

Reporting: Online Analytical Processing (OLAP)

Page 29: EQR Reporting:  Rails + Amazon EC2

Reporting: The Update Algorithm

• Scheduled Batch
  – Go update all the surveys every x minutes…
    • Open and recently closed
• On Demand
  – Update this survey now
• Real-time
  – Asynchronously, grab queued responses from a MQ with updates from the Runtime
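The three update modes can be sketched against a toy in-memory store. This is an illustration of the idea only: the class `ReportingStore`, its methods, and the response hashes are hypothetical, and Ruby's built-in `Queue` stands in for whatever message queue the Runtime would publish to.

```ruby
# Hypothetical in-memory stand-in for the reporting database:
# maps a survey id to its number of recorded responses.
class ReportingStore
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # On demand: recompute one survey from the full set of runtime responses.
  def refresh(survey_id, runtime_responses)
    @counts[survey_id] = runtime_responses.count { |r| r[:survey_id] == survey_id }
  end

  # Scheduled batch: refresh every listed survey (e.g. open and recently closed).
  def refresh_all(survey_ids, runtime_responses)
    survey_ids.each { |id| refresh(id, runtime_responses) }
  end

  # Real-time: drain whatever is queued right now, applying each response
  # incrementally. Returns the number of responses applied.
  def drain(queue)
    applied = 0
    until queue.empty?
      response = queue.pop(true) # non-blocking pop
      @counts[response[:survey_id]] += 1
      applied += 1
    end
    applied
  end
end

store = ReportingStore.new
responses = [{ survey_id: 1 }, { survey_id: 1 }, { survey_id: 2 }]
store.refresh_all([1, 2], responses) # scheduled batch

queue = Queue.new
queue << { survey_id: 2 }            # a new response arrives from the Runtime
store.drain(queue)                   # real-time update
p store.counts
```

The batch and on-demand paths recompute from the source data, so they also repair any drift that incremental real-time updates might introduce.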

Page 30: EQR Reporting:  Rails + Amazon EC2

Reporting: On demand

Page 31: EQR Reporting:  Rails + Amazon EC2

Reporting: Key features

• View results by Question
• Filtering
  – By status
  – Compound filters based on question/choice sets
• Crosstabs
  – Question v Question crosstabs
  – Filter by status
• Quotas / Segments
  – View current / total counts
• Monitor survey progress
  – Total, Last day, Last hour…
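A question-versus-question crosstab with a status filter comes down to counting pairs of answers. The sketch below assumes a hypothetical response shape (a `:status` plus an `:answers` hash keyed by question); the real P3 data model is not shown in the slides.

```ruby
# Builds a question-versus-question crosstab: table[row_choice][col_choice] = count.
# Responses missing an answer to either question are skipped.
# When `status` is given, only responses with that status are counted.
def crosstab(responses, row_question, col_question, status: nil)
  table = Hash.new { |rows, row| rows[row] = Hash.new(0) }
  responses.each do |response|
    next if status && response[:status] != status

    row = response[:answers][row_question]
    col = response[:answers][col_question]
    next if row.nil? || col.nil?

    table[row][col] += 1
  end
  table
end

# Hypothetical sample data for illustration.
responses = [
  { status: :complete, answers: { q1: 'Yes', q2: 'Male' } },
  { status: :complete, answers: { q1: 'Yes', q2: 'Female' } },
  { status: :partial,  answers: { q1: 'No',  q2: 'Male' } },
  { status: :complete, answers: { q1: 'Yes', q2: 'Male' } }
]

table = crosstab(responses, :q1, :q2, status: :complete)
table.each { |row, cols| puts "#{row}: #{cols.inspect}" }
```

In a reporting database the same counts would more likely come from a grouped query, but the in-memory version shows the shape of the result.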

Page 32: EQR Reporting:  Rails + Amazon EC2

Reporting: What’s left?

• More testing…
• Report generation
  – PDF
  – Other formats
• Email notification
• More slicing/dicing tools
• Migration to Scalr???
• Beta with select clients
• User feedback
  – Incorporate into future releases