AideRSS - AWS The Start-Up Project


A Company and Technology Overview

Read What Matters

What is RSS?

History
• First standard published in 1997 and very quickly picked up by Netscape
• In 1999, the BBC introduced RSS to millions in the form of a ‘news ticker toolbar’
• Since then, hundreds of millions of blogs, newspapers, magazines, and other content distributors have published their own RSS feeds.

RSS news feeds allow you to see when websites have added new content. You can get the latest headlines in one place, as soon as they are published, without having to visit the websites you have taken the feeds from.

2003 2007

The Problem - Information Overload

Read what matters

PostRank™ Filtering


Read How You Want

AideRSS Launch Stats

Launch day @ 00:00
- 10 EC2 instances
- 5,000 most popular RSS feeds
- ~20,000 new stories a day

Launch day @ 12:00
- 30 EC2 instances
- 12,000 RSS feeds
- 85,000 new stories

Launch day @ 24:00
- 100 EC2 instances
- 35,000 RSS feeds
- 440,000 new stories

Today
- 20-30 EC2 instances
- 70,000 RSS feeds
- 200,000-400,000 new stories / day

Startup math

Amazon AWS Platform

Development: Testing cluster (on-demand)
+ 5 CPUs / 5 months
+ 250 Mb/s uplink (Free!)
+ S3 for messaging
-------------------------------
$180 total
(1,400% savings!)

Production: Dynamic cluster for updates
+ 20-100 CPUs
+ 250 Mb/s uplink (Free!)
+ SQS for messaging
-------------------------------
less than $5,000 / month
(> 300% savings / month!)

Dedicated Route

Development: Testing cluster (on-demand)
+ 5 CPUs / 5 months
+ 250 Mb/s uplink (Scary!)
+ Own cloud service
-------------------------------
(5 servers x $100) x 5
Bandwidth + Cloud
-------------------------------
at least $2,500 / month

Production: Dynamic cluster for updates
+ 20-100 CPUs
+ 250 Mb/s uplink (Free!)
+ SQS-alike for messaging
-------------------------------
(20-100) x $150 / month
Bandwidth + Messaging
-------------------------------
greater than $15,000 / month
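The "savings" percentages quoted above seem to be cost ratios (the dedicated cost divided by the AWS cost) rather than savings in the strict sense. That reading is an assumption, as are the variable and function names below; the sketch simply redoes the arithmetic with the figures from the slide.

```python
# Figures copied from the slide; all names are illustrative.
aws_dev_total = 180                 # USD, AWS testing cluster
dedicated_dev_total = (5 * 100) * 5  # USD, "(5 servers x $100) x 5"
aws_prod_monthly = 5_000            # USD, upper bound quoted for AWS
dedicated_prod_monthly = 15_000     # USD, lower bound quoted for dedicated


def cost_ratio_percent(dedicated: float, aws: float) -> float:
    """Return the dedicated cost as a percentage of the AWS cost."""
    return 100.0 * dedicated / aws


dev_ratio = cost_ratio_percent(dedicated_dev_total, aws_dev_total)
prod_ratio = cost_ratio_percent(dedicated_prod_monthly, aws_prod_monthly)

print(f"development: {dev_ratio:.0f}%")
print(f"production:  {prod_ratio:.0f}%")
```

Under this reading, the development figure comes out just under the 1,400% on the slide, and the production figure matches the 300% lower bound.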


Our Architecture (1)

[Diagram: AideRSS Database, S3, EC2, Scheduler]


Our Architecture (2)

[Diagram: AideRSS Database, SQS, EC2, SQS]

Migrating to SQS (Distributed Queue)
- 256 KB per message (max)
- Serialize & compress
- August: 500,000 messages
- September: 1,000,000+ messages

Budget: $150
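The "Serialize & compress" step for fitting messages under the per-message limit could look roughly like the sketch below. It is not the AideRSS implementation: the JSON payload shape, the function names, and the choice of zlib plus Base64 are all assumptions made for illustration. Only the 256 KB limit comes from the slide.

```python
import base64
import json
import zlib

MAX_MESSAGE_BYTES = 256 * 1024  # per-message limit quoted on the slide


def pack_message(payload: dict) -> str:
    """Serialize to JSON, compress, and Base64-encode into a text body."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    compressed = zlib.compress(raw, 9)
    body = base64.b64encode(compressed).decode("ascii")
    if len(body) > MAX_MESSAGE_BYTES:
        raise ValueError(f"message is {len(body)} bytes, over the limit")
    return body


def unpack_message(body: str) -> dict:
    """Reverse pack_message: Base64-decode, decompress, parse JSON."""
    raw = zlib.decompress(base64.b64decode(body))
    return json.loads(raw.decode("utf-8"))


# Hypothetical payload: one feed and the stories fetched for it.
example = {
    "feed_url": "http://example.com/feed.xml",
    "stories": [{"title": f"Story {i}", "score": i} for i in range(100)],
}
packed = pack_message(example)
restored = unpack_message(packed)
```

Base64 adds about a third to the compressed size, so the size check runs on the final encoded body, not on the compressed bytes.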

Yesterday
- 69,856 RSS feeds
- 245,422 new stories
- 20 EC2 instances (avg.)
- Dynamic cluster scaling
- 2 SQS queues
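The slides do not say how "dynamic cluster scaling" is decided. One plausible rule, shown here purely as a hypothetical sketch, sizes the cluster from the queue backlog and clamps it to the 20-100 range quoted for production. The function name and the per-instance throughput parameter are assumptions.

```python
import math

MIN_INSTANCES = 20   # lower bound of the production range on the slides
MAX_INSTANCES = 100  # upper bound of the production range on the slides


def target_instances(pending_messages: int, messages_per_instance: int) -> int:
    """Pick an instance count for the backlog, clamped to the allowed range.

    messages_per_instance is a hypothetical tuning value: how many queued
    messages one instance is expected to work through per scaling interval.
    """
    if messages_per_instance <= 0:
        raise ValueError("messages_per_instance must be positive")
    needed = math.ceil(pending_messages / messages_per_instance)
    return max(MIN_INSTANCES, min(MAX_INSTANCES, needed))


quiet = target_instances(0, 500)
busy = target_instances(25_000, 500)
spike = target_instances(1_000_000, 500)
```

A scheduler could call such a function periodically with the current queue depth and start or stop instances to match the result.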

Next
- More optimizations!
- Clustering
- Recommendations
- Keyword filtering
- and much more…

Scheduler


Cluster monitoring

Ilya Grigorik

Francis Lau

Kevin Thomason


“Read What Matters”

Questions?

We are looking for partnerships.

http://www.aiderss.com