Big data in the cloud

Big Data in the Cloud @bensullins

Comparison of Google BigQuery and Amazon Redshift

Transcript of Big data in the cloud

Page 1: Big data in the cloud

Big Data in the Cloud @bensullins

Page 2: Big data in the cloud
Page 3: Big data in the cloud

Amazon Redshift Architecture

Columnar DB

MPP Architecture

Speed!
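
To make the two terms on this slide concrete, here is a toy sketch of a column-oriented layout combined with a split-and-combine aggregation in the spirit of MPP. The sample rows, the `mpp_sum` helper, and the slice count are all made up for illustration; this is not how Redshift is implemented internally.

```python
# Toy illustration only: sample data and helper names are hypothetical.
rows = [
    {"region": "east", "units": 3, "price": 2.0},
    {"region": "west", "units": 5, "price": 1.5},
    {"region": "east", "units": 2, "price": 4.0},
    {"region": "west", "units": 7, "price": 1.0},
]

# Columnar layout: one list per column, so a query that only needs
# "units" never has to read "region" or "price".
columns = {name: [row[name] for row in rows] for name in rows[0]}


def mpp_sum(values, num_slices):
    """Split a column into slices, sum each slice, then combine the partial sums."""
    slices = [values[i::num_slices] for i in range(num_slices)]
    # In a real MPP system each slice would be summed on a separate node in parallel.
    partial_sums = [sum(s) for s in slices]
    return sum(partial_sums)


total_units = mpp_sum(columns["units"], num_slices=2)
print(total_units)  # 17
```

The speed claim on the slide comes from both effects together: less data read per query, and the remaining work divided across nodes.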

Page 4: Big data in the cloud

Amazon Redshift Scalability

2TB XL Node

High Storage Extra Large (XL) DW Node:

CPU: 2 virtual cores - Intel Xeon E5
Memory: 15 GiB
Storage: 3 HDD with 2TB of local attached storage
Network: Moderate
Disk I/O: Moderate
API: dw.hs1.xlarge

16TB 8XL Node

High Storage Eight Extra Large (8XL) DW Node:

CPU: 16 virtual cores - Intel Xeon E5
Memory: 120 GiB
Storage: 24 HDD with 16TB of local attached storage
Network: 10 Gigabit Ethernet with support for cluster placement groups
Disk I/O: Very High
API: dw.hs1.8xlarge
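
As a rough sizing aid, the node capacities above can be turned into a node count for a given data volume. This is a minimal sketch that assumes raw per-node capacity only; it ignores compression, replication overhead, and any cluster size limits, and the `nodes_needed` helper is a hypothetical name.

```python
import math

# Per-node storage in TB, taken from the slide (keyed by API name).
NODE_STORAGE_TB = {"dw.hs1.xlarge": 2, "dw.hs1.8xlarge": 16}


def nodes_needed(data_tb, node_type):
    """Smallest number of nodes whose raw storage covers data_tb (assumption: raw capacity only)."""
    return math.ceil(data_tb / NODE_STORAGE_TB[node_type])


print(nodes_needed(100, "dw.hs1.xlarge"))   # 50
print(nodes_needed(100, "dw.hs1.8xlarge"))  # 7
```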

Page 5: Big data in the cloud

Amazon Redshift Cost

On-Demand Pricing

DW Node Class (On-Demand) | Hourly
XL Node - 2TB storage (Per Node) | $0.850 per Hour
8XL Node - 16TB storage (Per Node) | $6.800 per Hour

Reserved Instance 1yr (41% savings)

DW Node Class (Reserved) | Up front | Hourly
XL Node - 2TB storage (Per Node) | $2,500 | $0.215 per Hour
8XL Node - 16TB storage (Per Node) | $20,000 | $1.720 per Hour

Reserved Instance 3yr (73% savings)

DW Node Class (Reserved) | Up front | Hourly
XL Node - 2TB storage (Per Node) | $3,000 | $0.114 per Hour
8XL Node - 16TB storage (Per Node) | $24,000 | $0.912 per Hour
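
The savings percentages can be checked from the listed prices. The sketch below assumes a node runs around the clock (8,760 hours per year) and that total reserved cost is the up-front fee plus the hourly rate over the full term; the function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760, assuming the node runs continuously

# Prices from the slide.
ON_DEMAND_HOURLY = {"XL": 0.850, "8XL": 6.800}
RESERVED = {  # (term in years, node) -> (up-front fee, hourly rate)
    (1, "XL"): (2500, 0.215),
    (1, "8XL"): (20000, 1.720),
    (3, "XL"): (3000, 0.114),
    (3, "8XL"): (24000, 0.912),
}
NODE_TB = {"XL": 2, "8XL": 16}


def on_demand_total(node, years):
    """Total on-demand cost of one node over the given number of years."""
    return ON_DEMAND_HOURLY[node] * HOURS_PER_YEAR * years


def reserved_total(node, years):
    """Up-front fee plus hourly charges over the whole reserved term."""
    upfront, hourly = RESERVED[(years, node)]
    return upfront + hourly * HOURS_PER_YEAR * years


def savings_pct(node, years):
    """Percentage saved by reserving instead of paying on demand."""
    return 100 * (1 - reserved_total(node, years) / on_demand_total(node, years))


def cost_per_tb_year(node, years):
    """Effective reserved cost per TB of node storage per year."""
    return reserved_total(node, years) / years / NODE_TB[node]


print(round(savings_pct("XL", 1)))          # 41
print(round(savings_pct("XL", 3)))          # 73
print(round(cost_per_tb_year("XL", 3), 2))  # 999.32
```

Under these assumptions the 3-year reserved price works out to just under $1,000 per TB per year for either node size, since the 8XL prices are exactly eight times the XL prices.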

Page 6: Big data in the cloud

Amazon Redshift Ease of Use

Fully Managed

Fault Tolerant

Automated Backups

Web Interface

Page 7: Big data in the cloud

Amazon Redshift Security

AES 256-bit Encryption

Amazon VPC

Firewall

Page 8: Big data in the cloud

Amazon Redshift Compatibility

Page 9: Big data in the cloud

BigQuery

Page 10: Big data in the cloud
Page 11: Big data in the cloud

Google BigQuery Architecture

Columnar DB

Tree Architecture

Speed!
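
A tree architecture here means that leaf servers scan their own share of the data and intermediate levels combine partial results on the way up to a root. The following is a toy sketch of that idea only; the partitions, the `fanout` parameter, and both helper names are hypothetical and do not reflect BigQuery's actual internals.

```python
def leaf_scan(partition, predicate):
    """Leaf server: scan one partition and return a partial count of matches."""
    return sum(1 for value in partition if predicate(value))


def tree_count(partitions, predicate, fanout=2):
    """Combine leaf results level by level until a single value reaches the root."""
    level = [leaf_scan(p, predicate) for p in partitions]
    if not level:
        return 0
    while len(level) > 1:
        # Each intermediate node adds up the results of up to `fanout` children.
        level = [sum(level[i:i + fanout]) for i in range(0, len(level), fanout)]
    return level[0]


partitions = [[1, 8, 3], [9, 2], [7, 7, 4], [10]]
print(tree_count(partitions, lambda v: v > 5))  # 5
```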

Page 12: Big data in the cloud

Google BigQuery on Speed

“Dremel can scan 35 billion rows without an index in tens of seconds” – Solutions Architect, Google Cloud Solutions Team

Page 13: Big data in the cloud

Google BigQuery Scalability

?

Page 14: Big data in the cloud

Google BigQuery Cost

On-Demand Pricing

Resource | Pricing
Storage | $80 (per TB/month)
Interactive Queries | $35 (per TB processed)
Batch Queries | $20 (per TB processed)

Packaged Pricing

Data | Cost
100 TB | $3,300 per month ($33 per TB)
400 TB | $12,000 per month ($30 per TB)
1,500 TB | $40,500 per month ($27 per TB)
4,000 TB | $100,000 per month ($25 per TB)

• Packages are billed in full at the end of each month, whether the package is used or not.

• If you use more data than the amount in your chosen package, on-demand rates apply for any additional data.
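
The two pricing models can be compared with a short calculation. This is a sketch under stated assumptions: packages are taken to cover query data processed (with storage billed separately), and overage is charged at the interactive on-demand rate. Neither detail is spelled out on the slide, and the function names are illustrative.

```python
# Prices from the slide, in dollars.
STORAGE_PER_TB_MONTH = 80
INTERACTIVE_PER_TB = 35
BATCH_PER_TB = 20
PACKAGES = {100: 3300, 400: 12000, 1500: 40500, 4000: 100000}  # TB -> monthly price


def on_demand_monthly(stored_tb, interactive_tb, batch_tb=0):
    """Monthly on-demand bill: storage plus interactive and batch query processing."""
    return (stored_tb * STORAGE_PER_TB_MONTH
            + interactive_tb * INTERACTIVE_PER_TB
            + batch_tb * BATCH_PER_TB)


def packaged_query_cost(package_tb, processed_tb, overage_rate=INTERACTIVE_PER_TB):
    """Monthly query cost on a package: billed in full even if unused, plus overage.

    Assumption: overage is charged at the interactive on-demand rate.
    """
    overage_tb = max(0, processed_tb - package_tb)
    return PACKAGES[package_tb] + overage_tb * overage_rate


print(on_demand_monthly(stored_tb=10, interactive_tb=100))  # 4300
print(packaged_query_cost(100, processed_tb=50))            # 3300
print(packaged_query_cost(100, processed_tb=120))           # 4000
```

The second call shows the first bullet in practice: processing only 50 TB on a 100 TB package still costs the full $3,300.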

Page 15: Big data in the cloud

Google BigQuery: Compatibility

Page 16: Big data in the cloud

Cloud Big Data Sources Comparison

Amazon Redshift

Columnar + MPP

Petabytes in Scale

Easy management interface

Straightforward billing ($1K/TB/Yr)

Great connectivity w/ BI Tools

Google BigQuery

Columnar + Tree

Infinite Scalability

No Management Required

Confusing Pricing Model

Fair Connectivity w/ BI Tools