Amazon Redshift for Business Intelligence
introducing
AMAZON REDSHIFT
for
BUSINESS INTELLIGENCE
a presentation at MICROSTRATEGY WORLD 2013
by
DR. MATT WOOD
Hello.
Thank you.
I. Data, data everywhere
II. Collection & storage
III. Data security
IV. Data movement
0. Amazon Web Services
Building blocks.
Compute, storage & databases.
Retail
Merchant services
Web services
Blinding flash of the obvious.
Available.
Low cost.
Flexible.
Every day, AWS adds enough server capacity to power amazon.com in 2003, when it was a $5B enterprise
Data, data everywhere
I
Data for competitive advantage.
Customer segmentation, financial modeling, system analysis, line of sight, business intelligence...
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Cost of data generation is falling.
Kindle Fire HD, Kindle Fire, Kindle Paperwhite and Kindle have held the top four spots on the Amazon worldwide best seller chart since launch.
devices
Amazon Appstore selection tripled in 2012.
apps and games
Amazon customers purchased more than one toy per second on mobile devices.
commerce
most gifted Kindle book
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
lower cost, increased throughput
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
highly constrained
Gap.
The Data Analysis Gap
[Chart: data volume from 1990 to 2020, generated data (enterprise data) versus data available for analysis (data in warehouse).]
Sources: Gartner, User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011. IDC, Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares.
Enter AWS.
Utility.
Remove constraints.
Generation
Collection & storage
Analytics & computation
Collaboration & sharing
Full value.
Close the gap.
Reduced time to market.
Identify and meet new business opportunities.
Lower costs.
Collection & Storage
II
One schema to rule them all.
Lots of data. Lots of users. Lots of uses.
Lots of locations.
Cost.
Multipliers.
Object storage.
99.999999999% durability
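To give a feel for what eleven nines means, here is a rough back-of-the-envelope sketch. It assumes the figure is an annual per-object durability and that objects are lost independently; the function name and the 10,000-object example are illustrative, not from the slides.

```python
from fractions import Fraction


def expected_annual_losses(durability_percent: str, object_count: int) -> Fraction:
    """Expected number of objects lost per year, assuming independent losses.

    durability_percent is passed as a string so the arithmetic stays exact.
    """
    loss_probability = 1 - Fraction(durability_percent) / 100
    return loss_probability * object_count


# Illustrative store of 10,000 objects at the durability quoted on the slide.
losses = expected_annual_losses("99.999999999", 10_000)
years_per_loss = 1 / losses

print(f"expected losses per year: {float(losses):.1e}")
print(f"roughly one object lost every {int(years_per_loss):,} years")
```

Under those assumptions, a store of 10,000 objects would expect to lose a single object about once every 10,000,000 years.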
Relational databases.
NoSQL data stores.
HDFS based stores.
Undifferentiated heavy lifting.
Lower costs. Ease of use.
Lower costs.
no capital investment
pay as you go
no subscriptions
only pay for what you use
Ease of use.
programmable
zero admin
easy to configure
integrate with existing tools
Data warehousing.
Expensive. Complicated.
Enterprises average between 3 and 4 DBAs per data warehouse.
Source: Gartner. Critical factors in calculating the data warehouse TCO, July 2009
Source: Oracle technology global price list 11/1/2012
Expensive. Complicated.
Unobtainable.
Amazon Redshift.
Fast. Powerful. Petabyte scale.
Managed service.
Automated deployment & configuration.
SQL access and BI tool integration.
Parallel execution.
[Diagram: one leader node in front of three compute nodes.]
10GigE full bisection network.
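The leader node and compute node layout can be pictured as a scatter and gather pattern. The sketch below is a toy illustration of that idea only, not Redshift's implementation: the data slices, function names and use of threads are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data slices, one per compute node: (category, sale amount).
NODE_SLICES = [
    [("books", 12.0), ("toys", 5.0)],
    [("books", 8.0), ("apps", 1.5)],
    [("toys", 7.5), ("apps", 2.5)],
]


def compute_node_partial(rows):
    """Each compute node aggregates only the rows it holds locally."""
    partial = {}
    for category, amount in rows:
        partial[category] = partial.get(category, 0.0) + amount
    return partial


def leader_node_query(slices):
    """The leader fans the work out in parallel, then merges the partial sums."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(compute_node_partial, slices))
    total = {}
    for partial in partials:
        for category, amount in partial.items():
            total[category] = total.get(category, 0.0) + amount
    return total


result = leader_node_query(NODE_SLICES)
print(result)
```

The point of the pattern is that each node scans only its own slice, so adding nodes shrinks the amount of data any single node has to read.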
Common BI Tools
JDBC/ODBC
Certified for use with MicroStrategy.
Data compression.
Automated backup to S3.
Data encrypted in transit & at rest.
Streaming recovery.
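On the data compression point: the slides do not say which encodings are used, so the sketch below picks run-length encoding purely as one common way that repetitive column values compress well. The column contents and function names are made up for illustration.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of identical adjacent values into (value, run_length)."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical sorted column of country codes.
column = ["US", "US", "US", "UK", "UK", "DE", "DE", "DE", "DE"]
encoded = rle_encode(column)

print(encoded)
print(f"{len(column)} values stored as {len(encoded)} pairs")
```

Storing values column by column tends to put similar values next to each other, which is what makes simple schemes like this effective.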
Elastic.
[Diagram: common BI tools connect over JDBC/ODBC to the leader node; the cluster grows from three compute nodes to five.]
Data warehouse node types.
High Storage Extra Large (XL)
15 GB RAM, 2 TB local attached storage, 3 drives, 2 virtual cores
High Storage Eight Extra Large (8XL)
120 GB RAM, 16 TB local attached storage, 24 drives, 16 virtual cores
Pay as you go.
Hourly Prices
                      2 TB nodes    16 TB nodes
On-demand             $0.850        $6.80
1 Year Reservation    $0.50         $4.00
3 Year Reservation    $0.228        $1.824
$999 per TB per year (3 year reservation).
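The per-terabyte figure follows from the hourly prices. The sketch below reproduces the arithmetic, assuming a node runs all 8,760 hours of a year and that the quoted hourly price is the full effective price; the names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming the node runs all year

# Hourly prices from the slide, keyed by node storage size in TB.
HOURLY_PRICES = {
    "On-demand": {2: 0.850, 16: 6.80},
    "1 Year Reservation": {2: 0.50, 16: 4.00},
    "3 Year Reservation": {2: 0.228, 16: 1.824},
}


def annual_cost_per_tb(hourly_price, node_tb):
    """Yearly cost of one node divided by its storage capacity."""
    return hourly_price * HOURS_PER_YEAR / node_tb


for plan, prices in HOURLY_PRICES.items():
    for node_tb, hourly in prices.items():
        cost = annual_cost_per_tb(hourly, node_tb)
        print(f"{plan}, {node_tb} TB node: ${cost:,.0f} per TB per year")
```

Both node sizes work out to the same price per terabyte within each plan, and the 3 year reservation comes to about $999 per TB per year.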
Don’t pay for the leader node.
No additional storage charge for backups of active clusters.
VPC ready.
Low cost. Easy to use.
Focus on analysis.
Private beta today.
Available early this year.
aws.amazon.com/redshift
2 billion row dataset. 6 representative queries.
Compared to: 32 nodes. 128 CPUs. 4.2 TB RAM. 1.6 PB storage. 2 billion row data set.
Amazon Redshift: 2 instance cluster.
12x to 150x faster
29 minutes 58 seconds
down to
12 seconds
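As a quick check, the two timings above line up with the top end of the quoted speedup range. The helper name below is illustrative.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes-and-seconds duration to whole seconds."""
    return minutes * 60 + seconds


before = to_seconds(29, 58)  # original query time
after = 12                   # query time quoted for Amazon Redshift

speedup = before / after
print(f"{before} s down to {after} s is about {speedup:.0f}x faster")
```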
Data security.
III
Security is our number one priority.
Shared responsibility.
Choose your region.
Availability zones.
ITAR
FIPS 140-2
MPAA
ISO 27001
SOC 2
ISAE 3402
PCI DSS
HIPAA
FISMA Moderate
“You basically turn yourself into a polymorphic surface to which the attack guy has a much tougher time getting at. That, ultimately, is the real key advantage to drive security and make things much better for us across the board.”
Gus Hunt, CTO, Central Intelligence Agency
Virtual Private Cloud.
Network isolated environment.
Public and private subnets.
Redshift, relational databases, Hadoop can run inside the VPC.
Extend your VPN.
Identity and access federation.
Identity and access management.
Data movement.
IV
“How do I get my data into the cloud?”
Generated and stored in the AWS cloud.
Inbound transfer is free.
Multipart upload.
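Multipart upload splits a large object into parts that can be sent independently and in parallel. The sketch below only plans the byte ranges and makes no service calls; the function name and the example sizes are assumptions. The 5 MiB minimum part size (for every part except the last) and the 10,000 part ceiling are the Amazon S3 limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(object_size, part_size):
    """Return (part_number, offset, length) tuples covering the whole object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the 5 MiB minimum")
    parts = []
    offset = 0
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return parts


# Hypothetical 100 MiB object uploaded in 16 MiB parts.
parts = plan_parts(100 * MIB, 16 * MIB)
for number, offset, length in parts:
    print(f"part {number}: bytes {offset}..{offset + length - 1}")
```

Each planned range can then be uploaded and retried on its own, which is what makes large transfers over unreliable links practical.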
Aspera, iRODS.
Physical media.
AWS Direct Connect.
1Gbps or 10Gbps
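To put the two link speeds in perspective, here is a rough transfer-time estimate. It assumes decimal units (1 TB = 10^12 bytes), a fully utilized link by default, and a 16 TB data set chosen only as an example; real throughput will be lower.

```python
def transfer_hours(data_terabytes, link_gbps, efficiency=1.0):
    """Hours needed to move the data over a link at the given utilization."""
    bits = data_terabytes * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# Illustrative 16 TB data set over each Direct Connect port speed.
for gbps in (1, 10):
    print(f"16 TB over {gbps} Gbps: {transfer_hours(16, gbps):.1f} hours")
```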
Built-in AZ replication.
Regional replication.
“How do I integrate my data?”
Amazon DynamoDB
HDFS (Amazon EMR)
Amazon S3
Amazon Redshift
On premises
Amazon RDS
AWS Data Pipeline
Data-intensive orchestration & automation.
Reliable, scheduled data movement and analytics.
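Orchestration of this kind boils down to running activities in dependency order. The sketch below is a toy illustration using Python's standard library, not the AWS Data Pipeline API; the activity names and the dependency graph are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical activities, each mapped to the activities it depends on.
PIPELINE = {
    "export_dynamodb_to_s3": set(),
    "copy_rds_to_s3": set(),
    "transform_on_emr": {"export_dynamodb_to_s3", "copy_rds_to_s3"},
    "load_into_redshift": {"transform_on_emr"},
}


def run_pipeline(pipeline):
    """Visit activities so that every dependency runs before its dependents."""
    executed = []
    for activity in TopologicalSorter(pipeline).static_order():
        # A real scheduler would start the activity here and wait for it.
        executed.append(activity)
    return executed


order = run_pipeline(PIPELINE)
print(" -> ".join(order))
```

A managed service adds scheduling, retries and monitoring on top of this ordering, which is the part the slides describe as reliable, scheduled data movement.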
aws.amazon.com/datapipeline
aws.amazon.com
Thank you.
get in touch: [email protected]
or
@MZA
AWS.AMAZON.COM