Introducing Amazon RDS for PostgreSQL (DAT210) | AWS re:Invent 2013
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
DAT210 – Introducing Amazon RDS for PostgreSQL
Srikanth Deshpande - Senior Product Manager, AWS
Nick Hertl – Software Development Manager, AWS
Gabe Arnett – Senior Director, Moody’s Analytics
November 14, 2013
Amazon Relational Database Service (RDS) is a managed relational database service that is simple to deploy,
easy to scale, reliable, and cost-effective.
Managed Service
Easy to Scale and Operate
Choice of Database Engines
High Availability
High Performance
Amazon Relational Database Service (RDS)
Backups and Disaster Recovery
Push-Button Scaling
Multi-AZ Deployments
Security
Internet
IAM
VPC
DB Parameter Groups
{DBInstanceClassMemory/12582880} Filter=“connection”
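The expression on this slide reads like a DB parameter group formula: a setting matched by the filter "connection" is derived from the instance's memory. As a rough sketch, assuming the setting is `max_connections`, that `DBInstanceClassMemory` is a byte count, and that the result is rounded down (the slide states none of these), the arithmetic could look like this. The function name and the sample memory figure are illustrative only.

```python
# Divisor taken from the slide: {DBInstanceClassMemory/12582880}
BYTES_PER_CONNECTION = 12_582_880


def connections_from_memory(db_instance_class_memory: int) -> int:
    """Evaluate DBInstanceClassMemory/12582880, rounding down (assumed)."""
    if db_instance_class_memory < 0:
        raise ValueError("memory must be non-negative")
    return db_instance_class_memory // BYTES_PER_CONNECTION


if __name__ == "__main__":
    # Hypothetical instance with 15 GiB available to the database engine.
    print(connections_from_memory(15 * 1024**3))
```

Because the value scales with memory, one parameter group can plausibly serve several instance classes without hard-coding a connection limit for each.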
Amazon RDS for PostgreSQL
• Database version: PostgreSQL 9.3.1
• Includes valuable Amazon RDS functionality
– Fast deployment
– Backups and point-in-time recovery
– Snapshots and restore
– Compute and storage scaling
– Multi-AZ
– Provisioned IOPS
Launching a Postgres DB Instance
Select Production Use (or not)
Instance Details
Additional Configuration
Management Options
Running Instance
Connecting
Permissions
superuser role (Postgres)
rds_superuser role (RDS provided)
Load and use extensions
View and kill sessions
Create tablespace
Assign replication role
…
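A minimal sketch of what working with the `rds_superuser` role might look like in SQL, wrapped in Python helpers so the statements can be checked without a database. The helper names and the `app_admin` role are hypothetical; `pg_stat_activity` and `pg_terminate_backend` are standard PostgreSQL, and the column names assume PostgreSQL 9.2 or later.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def grant_rds_superuser(role: str) -> str:
    """Statement giving an existing role the RDS-provided rds_superuser role."""
    return f"GRANT rds_superuser TO {quote_ident(role)};"


# Lists current sessions; run it as a role that holds rds_superuser.
VIEW_SESSIONS = "SELECT pid, usename, state, query FROM pg_stat_activity;"


def kill_session(pid: int) -> str:
    """Statement terminating the backend with the given process id."""
    return f"SELECT pg_terminate_backend({int(pid)});"


if __name__ == "__main__":
    print(grant_rds_superuser("app_admin"))  # hypothetical role name
    print(VIEW_SESSIONS)
    print(kill_session(4242))  # hypothetical pid taken from the listing
```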
Extensions
• PostGIS available
• rds.extensions parameter:
– btree_gin
– btree_gist
– chkpass
– citext
– cube
– dblink
– dict_int
– dict_xsyn
– earthdistance
– fuzzystrmatch
– hstore
– intagg
– intarray
– isn
– ltree
– pgcrypto
– pgrowlocks
– pg_trgm
– plperl
– plpgsql
– pltcl
– postgis
– postgis_tiger_geocoder
– postgis_topology
– sslinfo
– tablefunc
– tsearch2
– unaccent
– uuid-ossp
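Loading one of the extensions above comes down to a `CREATE EXTENSION` statement. The sketch below checks a name against the list on this slide before building the statement; the helper name is hypothetical, and the list is a snapshot of the slide rather than a live read of the `rds.extensions` parameter.

```python
# Extension names exactly as listed on the slide.
RDS_EXTENSIONS = frozenset({
    "btree_gin", "btree_gist", "chkpass", "citext", "cube", "dblink",
    "dict_int", "dict_xsyn", "earthdistance", "fuzzystrmatch", "hstore",
    "intagg", "intarray", "isn", "ltree", "pgcrypto", "pgrowlocks",
    "pg_trgm", "plperl", "plpgsql", "pltcl", "postgis",
    "postgis_tiger_geocoder", "postgis_topology", "sslinfo", "tablefunc",
    "tsearch2", "unaccent", "uuid-ossp",
})


def create_extension_sql(name: str) -> str:
    """Build a CREATE EXTENSION statement for a listed extension."""
    if name not in RDS_EXTENSIONS:
        raise ValueError(f"{name!r} is not in the slide's extension list")
    # Double quotes are required for names such as uuid-ossp.
    return f'CREATE EXTENSION IF NOT EXISTS "{name}";'


if __name__ == "__main__":
    for ext in ("hstore", "uuid-ossp"):
        print(create_extension_sql(ext))
```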
High Performance
16,500+ Read and 8,500+ Write = 25,000+ IOPS
Getting Started
• Launch an instance from AWS Management Console
• Configure network
• Load extensions
• Export from existing database using pg_dump
• Import to RDS using pg_restore
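The export and import steps above can be sketched as two command lines. The helpers below only build the argument lists; host names, user names, and file paths are placeholders. The custom dump format and the `--no-owner` flag are choices made for this illustration (the slides do not say which options were used), on the assumption that role names on the source server may not exist on the RDS instance.

```python
from typing import List


def pg_dump_command(host: str, user: str, dbname: str, dump_file: str) -> List[str]:
    """Argument list that exports one database in pg_dump's custom format."""
    return [
        "pg_dump", "--format=custom",
        "--host", host, "--username", user,
        "--file", dump_file, dbname,
    ]


def pg_restore_command(host: str, user: str, dbname: str, dump_file: str) -> List[str]:
    """Argument list that loads a custom-format dump into an existing database."""
    return [
        "pg_restore", "--no-owner",
        "--host", host, "--username", user,
        "--dbname", dbname, dump_file,
    ]


if __name__ == "__main__":
    # Placeholder hosts; pass each list to subprocess.run(cmd, check=True) to execute.
    print(" ".join(pg_dump_command("old-db.example.com", "postgres", "appdb", "appdb.dump")))
    print(" ".join(pg_restore_command("mydb.example.rds.amazonaws.com", "master", "appdb", "appdb.dump")))
```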
RDS PostgreSQL
Gabe Arnett, Senior Director, Moody’s Analytics
November 14, 2013
• Moody’s Analytics offers unique tools and best practices for measuring and
managing risk through expertise and experience in credit analysis,
economic research, and financial risk management.
• Product offerings include leading-edge software, advisory services, and
credit and economic research.
• A subsidiary of Moody's Corporation (NYSE: MCO), which reported revenue
of $2.7 billion in 2012, employs approximately 7,200 people worldwide and
maintains a presence in 29 countries.
Legacy Platform
[Diagram: the legacy data platform. Labels include external Sources 1 through 8, an FTP server, three Sybase databases, an Oracle database, a legacy app database, a DataPortal, a calculation engine, a standalone C++ app, a vendor data engine/calculator, an EDF data transfer app, 11 MS SQL Server front-end database servers, Routines 1 through 4 (one marked "excl EJV"), and three app servers running Jobs 1 through 4, linked by feeds that run 4x daily, 1x daily, or 1x monthly.]
Pasta, anyone?
Overhaul
Write Master (AZ 1)
Read Replica 1 (Warm Standby, AZ 2)
Read Replica 2 (AZ 1)
WAL
Cascading Replication
ETL Cluster
Calculation Engine (reads from RR2 and writes results to WM)
External Data Sources
Amazon Simple Storage Service
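The diagram's note that the calculation engine reads from a replica and writes to the master implies some routing on the client side. A minimal sketch of one way to do that follows; the slides do not show how Moody's Analytics implemented it, the host names are placeholders, and the first-keyword test is a deliberately simple heuristic that sends anything other than a plain `SELECT` to the write master.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoints:
    write_master: str
    read_replica: str


def endpoint_for(sql: str, endpoints: Endpoints) -> str:
    """Pick the read replica for plain SELECTs and the write master otherwise."""
    words = sql.lstrip().split(None, 1)
    keyword = words[0].upper() if words else ""
    if keyword == "SELECT":
        return endpoints.read_replica
    return endpoints.write_master


if __name__ == "__main__":
    eps = Endpoints(write_master="wm.internal", read_replica="rr2.internal")
    print(endpoint_for("SELECT * FROM inputs", eps))
    print(endpoint_for("INSERT INTO results VALUES (1)", eps))
```

Falling back to the write master for anything ambiguous keeps writes away from a read-only replica, at the cost of sending a few read-only statements to the master.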
Summer Fun
Pros
• We learned a tremendous amount and could probably write a solid
blog post or whitepaper
• No cost other than infrastructure
• Support/maintenance tasks now very reasonable and can be done
with existing resources without incurring additional costs
Cons
• Lots of time spent finding the right configurations, trial and error,
testing and more testing
• I have to convince really talented Java, Python, and .NET
developers that they have to be PostgreSQL system admins
• We are an enterprise, and as such I have to have an enterprise level
of support
Write Master (Multi-AZ)
Read Replica
ETL Cluster
Calculation Engine (reads from RR and writes results to WM)
External Data Sources
Amazon Simple Storage Service
RDS PostgreSQL
Warm Standby (Region 2)
Snapshot Copy
future
Why Amazon RDS PostgreSQL?
• Achieve the same performance as the existing setup on Amazon EC2, if
not better, in a matter of minutes
• We get built-in backup/recovery/replication/fault tolerance/Multi-AZ
• More robust operational support built in, and my developers can get
back to the business of development
Up and Running
Self-Managed (hours)
• Launch Amazon EC2 w/EBS
• Mount Amazon EBS volumes and stripe them as RAID 0
• Install PostgreSQL
• Move data and logs
• Edit .conf files
• Create users
• Load/Use DB
• Create snapshot, and then…
RDS PostgreSQL (minutes)
• Add/Edit CIDR/IP block to security group (pg_hba.conf)
• Edit DB parameter group to apply configuration settings (postgresql.conf)
• Launch RDS instance
• Load/Use DB
• Sit back and monitor or let Amazon CloudWatch do it for us…
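The "Launch RDS instance" step can also be scripted. The sketch below builds an AWS CLI `aws rds create-db-instance` argument list; every identifier, the instance class, and the storage size are placeholders, and the engine version is simply the one named earlier in this deck, which may no longer be accepted. Passing the password on the command line is done here only to keep the example short.

```python
from typing import List


def create_instance_command(
    identifier: str,
    instance_class: str,
    storage_gib: int,
    master_user: str,
    master_password: str,
    parameter_group: str,
    security_group_id: str,
    multi_az: bool = True,
    engine_version: str = "9.3.1",  # version quoted in the deck
) -> List[str]:
    """Argument list for launching a PostgreSQL DB instance with the AWS CLI."""
    cmd = [
        "aws", "rds", "create-db-instance",
        "--db-instance-identifier", identifier,
        "--engine", "postgres",
        "--engine-version", engine_version,
        "--db-instance-class", instance_class,
        "--allocated-storage", str(storage_gib),
        "--master-username", master_user,
        "--master-user-password", master_password,
        "--db-parameter-group-name", parameter_group,
        "--vpc-security-group-ids", security_group_id,
    ]
    cmd.append("--multi-az" if multi_az else "--no-multi-az")
    return cmd


if __name__ == "__main__":
    # All values below are hypothetical.
    print(" ".join(create_instance_command(
        "calc-db", "db.m2.4xlarge", 500, "master", "change-me",
        "calc-params", "sg-00000000")))
```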
Backup/Retention
• Single-click backup policy upon creation
• No schedule to implement or forget
• Snapshots are easy to find
– All easily found in the AWS Management Console and searchable
• One-click restore to point in time = AWESOME!!!
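The point-in-time restore has a command-line equivalent as well. This sketch builds an `aws rds restore-db-instance-to-point-in-time` argument list; the instance identifiers are placeholders, and the helper name is hypothetical. The restore creates a new instance under the target identifier rather than overwriting the source.

```python
from datetime import datetime, timezone
from typing import List, Optional


def restore_command(
    source_id: str,
    target_id: str,
    restore_time: Optional[datetime] = None,
) -> List[str]:
    """Argument list restoring source_id into a new instance named target_id."""
    cmd = [
        "aws", "rds", "restore-db-instance-to-point-in-time",
        "--source-db-instance-identifier", source_id,
        "--target-db-instance-identifier", target_id,
    ]
    if restore_time is None:
        # No timestamp given: restore to the most recent restorable time.
        cmd.append("--use-latest-restorable-time")
    else:
        if restore_time.tzinfo is None:
            raise ValueError("restore_time must be timezone-aware")
        utc = restore_time.astimezone(timezone.utc)
        cmd += ["--restore-time", utc.strftime("%Y-%m-%dT%H:%M:%SZ")]
    return cmd


if __name__ == "__main__":
    when = datetime(2013, 11, 14, 12, 0, tzinfo=timezone.utc)
    print(" ".join(restore_command("calc-db", "calc-db-restored", when)))
```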
Monitoring
• Amazon CloudWatch metrics alongside instance details
– Self-managed, it was a challenge to find and consolidate all the EBS volumes + EC2 instances
• Logs are in the console
– Self-managed, it was not fun to dig through the logs, assuming we actually had that kind of time
• Event subscriptions for faults
– Extra proactive protection
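Letting CloudWatch watch the instance usually means defining alarms on its metrics. As an illustration, the sketch below builds an `aws cloudwatch put-metric-alarm` argument list for the `CPUUtilization` metric in the `AWS/RDS` namespace. The alarm name, threshold, evaluation window, and SNS topic ARN are all assumptions made for the example, not values from the talk.

```python
from typing import List


def cpu_alarm_command(
    instance_id: str,
    sns_topic_arn: str,
    threshold_percent: float = 80.0,  # illustrative threshold
) -> List[str]:
    """Argument list for an alarm on sustained high CPU on one DB instance."""
    return [
        "aws", "cloudwatch", "put-metric-alarm",
        "--alarm-name", f"{instance_id}-high-cpu",
        "--namespace", "AWS/RDS",
        "--metric-name", "CPUUtilization",
        "--dimensions", f"Name=DBInstanceIdentifier,Value={instance_id}",
        "--statistic", "Average",
        "--period", "300",             # seconds per datapoint
        "--evaluation-periods", "2",   # two consecutive periods over threshold
        "--threshold", str(threshold_percent),
        "--comparison-operator", "GreaterThanThreshold",
        "--alarm-actions", sns_topic_arn,
    ]


if __name__ == "__main__":
    # Placeholder instance id and topic ARN.
    print(" ".join(cpu_alarm_command(
        "calc-db", "arn:aws:sns:us-east-1:123456789012:db-alerts")))
```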
Scale and Redundancy
• At launch, RDS PostgreSQL is Multi-AZ enabled with a click
– Self-managed, we had to spin up a second instance, configure WAL replication, and then hope and pray
– It took a bit of configuration and tuning to get near real-time reads without impacting write performance
– Data loss is a risk if the write master fails
Next
• Additional legacy data platforms
• Extending PostgreSQL
– Developing key/value store for near real-time data ingestion
– Integrating with Solr
– Front end datamart
• Redshift for BI use cases
Please give us your feedback on this presentation
As a thank you, we will select prize winners daily for completed surveys!
DAT210