AmazonRedshift

Amazon Redshift ~ Ahasan Habib Technical Project Manager, Ixora Solution Dhaka, Bangladesh

Transcript of AmazonRedshift

Page 1: AmazonRedshift

Amazon Redshift ~ Ahasan Habib

Technical Project Manager, Ixora Solution, Dhaka, Bangladesh

Page 2: AmazonRedshift

Data warehouse concept

What is a data warehouse?

●Relational database

●Query & analysis

●Transaction processing data => Historical data

●Transaction workload => Analysis work load

●Extract, Transform, Load
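The Extract, Transform, Load flow listed above can be sketched roughly as follows. This is a minimal illustration and not Redshift-specific: it uses Python's built-in sqlite3 as a stand-in for both the transactional source and the warehouse, and the table names (orders, daily_sales) and sample rows are made up for the example.

```python
import sqlite3

# Stand-in transactional source (hypothetical table and rows).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT, amount REAL)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "2016-01-01", 10.0), (2, "2016-01-01", 15.5), (3, "2016-01-02", 7.25)],
)

# Stand-in warehouse holding historical, analysis-ready data.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE daily_sales (sale_date TEXT, order_count INTEGER, total_amount REAL)"
)


def extract(conn):
    """Extract: read raw transactional rows from the source."""
    return conn.execute("SELECT order_date, amount FROM orders").fetchall()


def transform(rows):
    """Transform: aggregate individual orders into one row per day."""
    totals = {}
    for order_date, amount in rows:
        count, total = totals.get(order_date, (0, 0.0))
        totals[order_date] = (count + 1, total + amount)
    return [(day, count, total) for day, (count, total) in sorted(totals.items())]


def load(conn, rows):
    """Load: write the aggregated rows into the warehouse table."""
    conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)
    conn.commit()


load(warehouse, transform(extract(source)))
daily_sales = warehouse.execute("SELECT * FROM daily_sales ORDER BY sale_date").fetchall()
print(daily_sales)
```

The transactional table keeps one row per order, while the warehouse table keeps one summarized row per day, which mirrors the shift from a transaction workload to an analysis workload.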

Page 3: AmazonRedshift

Data warehouse architecture

Page 4: AmazonRedshift

Big Data

Data sets so large and complex that traditional data processing applications are not adequate.

Characteristics:

●Volume

The amount or quantity of data.

●Velocity

The rate at which data is created.

●Variety

The different types of data.

Page 5: AmazonRedshift

Big data Architecture

Page 6: AmazonRedshift

Operational Data

●Transactional data.

●Event data.

●Real-time data.

●Helps run day-to-day system/business operations.

Page 7: AmazonRedshift

Analytical Data

●Historical data

●Numerical values, measures, metrics (numerical measurements)

●Business intelligence & decision making

Page 8: AmazonRedshift

Row vs. Columnar Database
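As a rough illustration of the difference, the sketch below stores the same three records in a row-oriented layout and in a column-oriented layout using plain Python structures. The records and field names are invented for the example, and real engines add compression and on-disk block formats that are not shown here.

```python
# Hypothetical sample records, used only for illustration.
records = [
    {"customer": "A", "product": "pen", "amount": 3},
    {"customer": "B", "product": "book", "amount": 12},
    {"customer": "A", "product": "bag", "amount": 30},
]

# Row-oriented layout: all fields of one record are stored together.
row_store = [(r["customer"], r["product"], r["amount"]) for r in records]

# Column-oriented layout: all values of one field are stored together.
column_store = {
    "customer": [r["customer"] for r in records],
    "product": [r["product"] for r in records],
    "amount": [r["amount"] for r in records],
}

# An analytical query such as SUM(amount) has to touch every full row here...
total_from_rows = sum(row[2] for row in row_store)

# ...but only needs to read a single column here.
total_from_columns = sum(column_store["amount"])

# Fetching one whole record is a single read in the row layout, while the
# column layout has to gather one value from each column.
second_record = tuple(column_store[name][1] for name in ("customer", "product", "amount"))

print(total_from_rows, total_from_columns, second_record)
```

Both layouts return the same answers; what differs is how much data has to be read for an aggregate over one column versus a lookup of one full record.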

Page 9: AmazonRedshift

What is Redshift?

●A data warehouse management tool.

●Developed and managed by Amazon.

●Cloud hosted large data management system.

●Distributed data management system.

●Columnar data storage.

Page 10: AmazonRedshift

Redshift Speciality

1. Extremely fast.

2. Web service API based communication.

3. Massively parallel processing (MPP).

4. Full ANSI SQL support.

5. Columnar database.

6. Easy to learn.

Page 11: AmazonRedshift

Redshift Product History

●November 2012: beta (preview) release

●February 2013: initial release (general availability)

●Based on PostgreSQL 8.0.2

Page 12: AmazonRedshift

Redshift Architecture

Page 13: AmazonRedshift

Advantages of using Redshift

●Extremely fast for analytical data processing.

●Supports ANSI SQL syntax.

●Cloud-based solution.

●Highly secure (in terms of data and system access).

Page 14: AmazonRedshift

Redshift data warehouse design

1. Star schema

Page 15: AmazonRedshift

2. Snowflake schema

Page 16: AmazonRedshift

3. Denormalized Fact Table

Customer Id

Customer Name

Customer Address

State

City

Country

Product Id

Product Name

Product Category

Gross Sales Amount

Net Sales Amount
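To relate the three designs, here is a minimal sketch that builds a small star schema (one fact table, two dimension tables) and joins it into a denormalized table with columns like those listed above. It runs on Python's built-in sqlite3 rather than Redshift, all table names and rows are assumptions for illustration, and only a subset of the columns is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables describe the "who" and the "what".
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT, country TEXT);
    CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, product_name TEXT, product_category TEXT);

    -- The fact table holds measures plus keys pointing at the dimensions.
    CREATE TABLE fact_sales (customer_id INTEGER, product_id INTEGER,
                             gross_sales_amount REAL, net_sales_amount REAL);

    INSERT INTO dim_customer VALUES (1, 'Alice', 'Bangladesh'), (2, 'Bob', 'India');
    INSERT INTO dim_product  VALUES (10, 'Laptop', 'Electronics'), (11, 'Desk', 'Furniture');
    INSERT INTO fact_sales   VALUES (1, 10, 1000.0, 900.0), (2, 11, 200.0, 180.0), (1, 11, 200.0, 190.0);

    -- Denormalized table: dimension attributes are copied next to the measures,
    -- so analytical queries need no joins.
    CREATE TABLE sales_denormalized AS
    SELECT c.customer_id, c.customer_name, c.country,
           p.product_id, p.product_name, p.product_category,
           f.gross_sales_amount, f.net_sales_amount
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_id = f.customer_id
    JOIN dim_product  p ON p.product_id  = f.product_id;
""")

# The same question answered with a join (star schema) and without (denormalized).
star_query = """
    SELECT p.product_category, SUM(f.net_sales_amount)
    FROM fact_sales f JOIN dim_product p ON p.product_id = f.product_id
    GROUP BY p.product_category ORDER BY p.product_category
"""
flat_query = """
    SELECT product_category, SUM(net_sales_amount)
    FROM sales_denormalized
    GROUP BY product_category ORDER BY product_category
"""
star_result = conn.execute(star_query).fetchall()
flat_result = conn.execute(flat_query).fetchall()
print(star_result)
print(flat_result)
```

A snowflake schema would go one step further than the star schema here, for example by moving product_category into its own table referenced from dim_product.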

Page 17: AmazonRedshift

Index and Constraints

1. Sort key

2. Distribution key

3. Primary key / foreign key

4. Triggers
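For reference, sort and distribution keys are declared as table attributes in Redshift's CREATE TABLE statement. The sketch below only assembles such a statement as a string. The helper name build_create_table and the sales table with its columns are hypothetical, and the statement is not executed against a cluster here.

```python
def build_create_table(table, columns, dist_key, sort_keys):
    """Assemble a Redshift-style CREATE TABLE with DISTKEY and SORTKEY.

    Hypothetical helper for illustration; identifiers are not escaped.
    """
    column_sql = ",\n    ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} (\n    {column_sql}\n)\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


ddl = build_create_table(
    table="sales",
    columns=[
        ("customer_id", "INTEGER"),
        ("sale_date", "DATE"),
        ("net_sales_amount", "DECIMAL(12,2)"),
    ],
    dist_key="customer_id",   # rows with the same customer land on the same slice
    sort_keys=["sale_date"],  # rows are stored on disk ordered by date
)
print(ddl)
```

Choosing the distribution key from a column that is frequently joined on, and the sort key from a column that is frequently filtered on, is a common starting point. Note that primary and foreign key constraints in Redshift are informational and are not enforced.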

Page 18: AmazonRedshift

Data Types

Data Type          Alias             Description
SMALLINT           INT2              Signed 2-byte integer
INTEGER            INT4              Signed 4-byte integer
BIGINT             INT8              Signed 8-byte integer
DECIMAL            NUMERIC           Selectable precision
REAL               FLOAT4            Single precision (32-bit)
DOUBLE PRECISION   FLOAT8            Double precision (64-bit)
CHAR               CHARACTER, NCHAR  Fixed length (up to 4096 bytes)
VARCHAR            NVARCHAR, TEXT    Variable length (up to 65535 bytes)
DATE               -                 Calendar date
TIMESTAMP          -                 Date and time (UTC)
BOOLEAN            BOOL              True/False

Page 19: AmazonRedshift

Data Loading

●S3

●COPY command

●Data Pipeline
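Loading from S3 is typically done with the COPY command. The sketch below only builds the SQL text. The helper name build_copy_command, the table name, the bucket path, and the IAM role ARN are all placeholders, and running the statement would require a real cluster connection that is not shown.

```python
def build_copy_command(table, s3_path, iam_role, ignore_header_rows=1):
    """Build a Redshift COPY statement for CSV files stored in S3.

    Hypothetical helper for illustration; values are not escaped.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        f"FORMAT AS CSV\n"
        f"IGNOREHEADER {ignore_header_rows};"
    )


copy_sql = build_copy_command(
    table="sales",
    s3_path="s3://example-bucket/sales/",                   # placeholder path
    iam_role="arn:aws:iam::123456789012:role/ExampleRole",  # placeholder ARN
)
print(copy_sql)
```

COPY reads the files under the given S3 prefix in parallel across the cluster, which is why it is generally preferred over row-by-row INSERT statements for bulk loads.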

Page 20: AmazonRedshift

Query

●CRUD

●Dynamic query

●Metadata query

●Query execution plan

Page 21: AmazonRedshift

Other database objects

●Built-in functions

●User-defined functions

●Stored procedures

●Transactions

Page 22: AmazonRedshift

Security

●User Management

●Role Management

●Schema Management

Page 23: AmazonRedshift

Client Development Tools

●Navicat

●SQL Server Management Studio

●Various Drivers:

Linux

Visual Studio

Scala

Python

Page 24: AmazonRedshift

Q & A