Building a Data Anonymization Platform on AWS
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT

Page 1:

Building a Data Anonymization Platform on AWS

László Török, Engineering Lead, Telefónica NEXT

BIG DATA / ANALYTICS / STREAMING

Özkan Can, Senior Solutions Architect, Amazon Web Services

Page 2:

Cloud-native application storage

Özkan Can, Senior Solutions Architect, Amazon Web Services

STG5

Page 3:

Agenda

1. Why cloud-native?

2. What is cloud-native?

3. What does this mean for Storage?

Page 4:

Page 5:

Why cloud-native?

1. Reliability

2. Frugality

3. Security

4. Elasticity & scale

5. Performance

6. Better design

Page 6:

Page 7:

Page 8:

Paths to cloud-native

Assess and prioritize, app by app

Pick path to modernization

Lift & shift: data center → EC2

Re-platform: VMs → containers

Refactor: monolith → microservices

Re-invent: host fleets → serverless

Page 9:

Traditional app example, in-cloud

Amazon Aurora MySQL

Amazon EC2 Auto Scaling

Application Load Balancer

Page 10:

Traits of modern applications

Security in every layer.

Built to scale on demand.

Continuously integrated and deployed.

Microservice-oriented and API-backed.

Leveraging purpose-built databases and cloud-native storage options.

Page 11:

Page 12:

File vs Block vs Object

Block Storage

• Raw storage

• Data organized as an array of unrelated blocks

• Host file system places data on disk

• e.g.: Microsoft NTFS, Unix ZFS

File Storage

• Unrelated data blocks managed by a file (serving) system

• Native file system places data on disk

Object Storage

• Stores virtual containers that encapsulate the data, data attributes, and metadata

• API access to data

• Metadata driven, policy-based, etc.

Page 13:

File vs Block vs Object

Block

• Amazon Elastic Block Store (EBS)

File

• Amazon Elastic File System

• Amazon FSx for Lustre (NEW)

• Amazon FSx for Windows File Server (NEW)

Object

• Amazon Simple Storage Service (S3)

• Amazon S3 Glacier

Page 14:

Storage options for cloud-native applications

• Amazon EFS

• AWS Storage Gateway Family

• Amazon S3

• Amazon FSx for Lustre (NEW!)

• Amazon FSx for Windows File Server (NEW!)

• Amazon EBS

• Amazon EC2

Page 15:

Storage Services on AWS

• Amazon Elastic Block Store (EBS)

• Amazon Elastic File System

• Amazon FSx for Lustre (NEW)

• Amazon FSx for Windows File Server (NEW)

• Amazon Simple Storage Service (S3)

• Amazon S3 Glacier

• AWS Snowball

• AWS Snowball Edge

• AWS Snowmobile

• AWS Storage Gateway

• AWS Backup (NEW)

Page 16:

Magic Quadrant

Magic Quadrant for Public Cloud Storage Services, Worldwide – 2018

Positioned furthest for completeness of vision and highest for ability to execute in each report since inception in 2014

Magic Quadrant for Public Cloud Storage Services, July 2018 – Raj Bala, Julia Palmer

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Amazon Web Services. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Page 17:

Benefits of Amazon S3

Enterprise applications

Website hosting

Media Master files

Big Data

File Sharing

Content Distribution

Archive

Data Analytics

Backup & Restore

Dynamic Websites

Mobile sync & backup

Disaster Recovery

Re-creatable data

Page 18:

Amazon Simple Storage Service (S3)

Collect, Store, Analyze

• Designed for 99.999999999% durability

• Unmatched security and compliance capabilities

• Replication options across regions

• On-demand analytics

• Built-in support for SQL expressions with S3 Select

• Detailed data on usage patterns and access

• The most ways to move data in/out

• Security that helps the CISO

• Automated cost reduction tools
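To put the eleven-nines durability figure into perspective, here is a minimal arithmetic sketch. It assumes the design target can be read as an independent annual loss probability per object, which is a simplification and not a statement of how S3 works internally. The function names are illustrative.

```python
from fractions import Fraction

# Design target quoted on the slide: 99.999999999% (eleven nines).
# Assumption: interpreted as an annual, per-object survival probability.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count: int) -> Fraction:
    """Expected number of objects lost per year under the assumption above."""
    return object_count * (1 - DURABILITY)


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_losses(object_count)


# With ten million stored objects, one loss is expected roughly every 10,000 years.
print(years_per_lost_object(10_000_000))
```

Exact fractions are used instead of floats so that the tiny loss probability is not distorted by rounding.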

Page 19:

Your choice of Amazon S3 storage classes

Access frequency: from frequent to infrequent

S3 Standard

• Active, frequently accessed data

• Milliseconds access

• ≥ 3 AZ

• From: $0.0210/GB

S3 Intelligent-Tiering (NEW!)

• Data with changing access pattern

• Milliseconds access

• ≥ 3 AZ

• From: $0.0210 to $0.0125/GB

• Monitoring fee per object

• Min storage duration

S3 Standard-IA

• Infrequently accessed data

• Milliseconds access

• ≥ 3 AZ

• From: $0.0125/GB

• Retrieval fee per GB

• Min storage duration

• Min object size

S3 One Zone-IA

• Re-creatable, less accessed data

• Milliseconds access

• 1 AZ

• From: $0.0100/GB

• Retrieval fee per GB

• Min storage duration

• Min object size

S3 Glacier

• Archive data

• Minutes to hours access

• ≥ 3 AZ

• From: $0.0040/GB

• Retrieval fee per GB

• Min storage duration

• Min object size

S3 Glacier Deep Archive (NEW!)

• Archive data

• Hours access

• ≥ 3 AZ

• From: $0.00099/GB

• Retrieval fee per GB

• Min storage duration

• Min object size
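The price differences between the classes are easier to judge with a concrete volume. The following sketch multiplies a data size by the "from" prices printed on this slide. It assumes the prices are per GB per month in USD, and it deliberately ignores retrieval fees, monitoring fees, minimum storage durations, and minimum object sizes, so it is a rough comparison only. The function name is illustrative.

```python
# "From" prices per GB as printed on the slide (2019 figures).
# Assumption: USD per GB per month; all additional fees are ignored.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.0210,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.0100,
    "S3 Glacier": 0.0040,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(size_gb: float) -> dict:
    """Return the storage-only monthly cost per class, rounded to cents."""
    return {
        name: round(size_gb * price, 2)
        for name, price in PRICE_PER_GB_MONTH.items()
    }


# Example: 10 TB (10,000 GB) of data.
for name, cost in monthly_storage_cost(10_000).items():
    print(f"{name:26s} ${cost:>8.2f}")
```

S3 Intelligent-Tiering is left out because its storage price moves between the Standard and Standard-IA figures depending on access patterns.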

Page 20:

Data Lakes from AWS

Data Lake on AWS

Lowest cost

Scalable and durable

Secure

Open and comprehensive

Analytics

Machine Learning

Real-time Data Movement

On-premises Data Movement

Page 21:

Lowest Cost: Tiered Storage to Optimize Price/Performance

• Tiered storage to optimize price/performance

    • S3 Standard

    • S3 Standard-Infrequent Access

    • S3 One Zone-Infrequent Access

    • Amazon Glacier

• Migrate between tiers based on lifecycle policies

• Store data at $0.023/GB/month with S3

• Store data at $0.004/GB/month with Glacier
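Migration between tiers via lifecycle policies can be sketched as a configuration document. The example below only builds the rule as a Python dictionary in the shape S3 expects for a lifecycle configuration. The rule ID, the `raw/` prefix, and the 30- and 90-day thresholds are illustrative assumptions, not values from the talk.

```python
import json


def build_lifecycle_configuration(prefix: str) -> dict:
    """Build a lifecycle configuration that moves ageing objects to cheaper tiers.

    All concrete values (rule ID, day thresholds) are hypothetical examples.
    """
    return {
        "Rules": [
            {
                "ID": "tier-down-with-age",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # After 30 days: S3 Standard-Infrequent Access
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    # After 90 days: Glacier
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_lifecycle_configuration("raw/")
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call. That step is omitted here so the sketch runs without an AWS account.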

Page 22:

Building a Data Anonymization Platform on AWS

László Török, Engineering Lead, Telefónica NEXT

BIG DATA / ANALYTICS / STREAMING

Page 23:

Page 24:

We are part of the global telco group

Telefónica Germany’s network has the most customer lines

Page 25:

An average day of the Telefónica Germany Network

45M customer lines

5B+ network events

Page 26:

Very valuable data

Help predict capacities in regional transport in Berlin-Brandenburg in the ProTrain project, supported by the German Ministry of Transport

Model city traffic for special events like football matches with partner Intraplan

Retail applications: store location planning

Page 27:

https://next.telefonica.de/so-bewegt-sich-deutschland

Page 28:

Page 29:

Very sensitive data

Protected by German privacy laws (TMG, BDSG)

We are committed to protecting privacy and giving users control over their data

Page 30:

Valuable insights vs. individual privacy?

Most value comes from aggregate analysis of groups, not individuals.

Page 31:

Telefónica’s Data Anonymization Platform

Cellular signal data produced by regular mobile network use

Anonymization and aggregation of data – opt-out possible

Analysis of anonymized data

Solutions for society and economy
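The slides do not describe how the platform anonymizes and aggregates data. Purely as an illustration of the general idea of group-level aggregation with an opt-out, the sketch below counts distinct subscribers per cell and hour, skips anyone who has opted out, and suppresses groups below a minimum size. The event layout, the function name, and the threshold of 5 are all assumptions, and small-count suppression on its own is not claimed to be sufficient for legal anonymization.

```python
from collections import defaultdict


def aggregate_with_suppression(events, opted_out=frozenset(), min_group_size=5):
    """Count distinct subscribers per (cell_id, hour).

    events: iterable of (subscriber_id, cell_id, hour) tuples (hypothetical layout).
    Opted-out subscribers are ignored; groups smaller than min_group_size are dropped.
    """
    subscribers_per_group = defaultdict(set)
    for subscriber_id, cell_id, hour in events:
        if subscriber_id in opted_out:
            continue  # respect the opt-out before any aggregation happens
        subscribers_per_group[(cell_id, hour)].add(subscriber_id)

    # Only group sizes leave this function; individual identifiers do not.
    return {
        group: len(subscribers)
        for group, subscribers in subscribers_per_group.items()
        if len(subscribers) >= min_group_size
    }


events = [(f"sub-{i}", "cell-A", 9) for i in range(6)]
events += [("sub-0", "cell-B", 9), ("sub-1", "cell-B", 9)]

result = aggregate_with_suppression(events, opted_out={"sub-5"}, min_group_size=5)
print(result)
```

In this example the group for `cell-A` keeps five subscribers after the opt-out is applied, while the two-subscriber group for `cell-B` is suppressed.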

Page 32:

Page 33:

Context

Page 34:

Requirements for our Data Hub

Page 35:

Page 36:

Page 37:

Data Hub v0.9

Page 38:

Fully programmable provisioning

Page 39:

Can we do better?

Page 40:

Optimizing S3 storage costs

Page 41:

First month of production

Top 5 items on our end of month AWS bills

1. Lambda

2. Kinesis

3. CloudWatch (?!?)

4. S3

5. KMS (~ on par with S3)

Page 42:

Use compression – fewer Kinesis shards needed
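Why compression reduces the shard count can be shown with a small calculation. The sketch below gzip-compresses a batch of JSON events into a single payload and estimates the number of shards from the byte rate, assuming a write limit of about 1 MB per second per shard. The event fields, the function names, and the sample throughput figures are hypothetical, and the per-shard limit on records per second is ignored here.

```python
import gzip
import json
import math

# Assumption: roughly 1 MB/s of writes per Kinesis shard.
SHARD_WRITE_LIMIT_BYTES_PER_SEC = 1_000_000


def compress_batch(records) -> bytes:
    """Serialize records as newline-delimited JSON and gzip the result."""
    payload = "\n".join(json.dumps(r, separators=(",", ":")) for r in records)
    return gzip.compress(payload.encode("utf-8"))


def shards_needed(bytes_per_sec: float) -> int:
    """Shards required to sustain a byte rate, considering only the size limit."""
    return max(1, math.ceil(bytes_per_sec / SHARD_WRITE_LIMIT_BYTES_PER_SEC))


# Hypothetical network events with repetitive structure.
records = [
    {"cell_id": f"cell-{i % 50}", "event": "handover", "ts": 1_556_000_000 + i}
    for i in range(1000)
]
raw_size = sum(len(json.dumps(r).encode("utf-8")) for r in records)
compressed = compress_batch(records)
ratio = len(compressed) / raw_size

# Apply the measured ratio to an assumed uncompressed rate of 8 MB/s.
print(f"compression ratio: {ratio:.2f}")
print("shards without compression:", shards_needed(8_000_000))
print("shards with compression:", shards_needed(8_000_000 * ratio))
```

Packing many events into one compressed record also lowers the number of records written per second, which is the other limit a shard enforces.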

Page 43:

What went wrong?

Page 44:

Data Hub v2 – Fargate to the rescue

Page 45:

Access Management via IAM

Page 46:

Monitoring via CloudWatch

Page 47:

Page 48:

Lessons learnt

Superpower Less DIY ops

not always significant €€€ savings

Page 49:

Thank you!

László Török

Engineering Lead, Big Data Privacy Services

https://next.telefonica.de

@telefonicaNEXT

We are hiring!

linkedin.com/company/telefónicanext

Page 50: