© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Building a Data Anonymization Platform on AWS
László Török, Engineering Lead, Telefónica NEXT
BIG DATA / ANALYTICS / STREAMING
Özkan Can, Senior Solutions Architect, Amazon Web Services
Cloud-native application storage
Özkan Can, Senior Solutions Architect, Amazon Web Services
STG5
Agenda
1. Why cloud-native?
2. What is cloud-native?
3. What does this mean for Storage?
Why cloud-native?
1. Reliability
2. Frugality
3. Security
4. Elasticity & scale
5. Performance
6. Better design
Paths to cloud-native
Assess and prioritize, app by app
Pick path to modernization
• Lift & shift: data center → EC2
• Re-platform: VMs → containers
• Refactor: monolith → microservices
• Re-invent: host fleets → serverless
Traditional app example, in-cloud
• Application Load Balancer
• Amazon EC2 Auto Scaling
• Amazon Aurora MySQL
Traits of modern applications
Security in every layer.
Built to scale on demand.
Continuously integrated and deployed.
Microservice-oriented and API-backed.
Leveraging purpose-built databases and cloud-native storage options.
File vs Block vs Object
Block Storage
• Raw storage
• Data organized as an array of unrelated blocks
• Host file system places data on disk
• e.g.: Microsoft NTFS, Unix ZFS
File Storage
• Unrelated data blocks managed by a file (serving) system
• Native file system places data on disk
Object Storage
• Stores virtual containers that encapsulate the data, data attributes, and metadata
• API access to data
• Metadata driven, policy-based, etc.
File vs Block vs Object
File
• Amazon Elastic File System
• Amazon FSx for Lustre (NEW)
• Amazon FSx for Windows File Server (NEW)
Block
• Amazon Elastic Block Store (EBS)
Object
• Amazon Simple Storage Service (S3)
• Amazon S3 Glacier
Storage options for cloud-native applications
• Amazon EFS
• AWS Storage Gateway Family
• Amazon S3
• Amazon FSx for Lustre (NEW!)
• Amazon FSx for Windows File Server (NEW!)
• Amazon EBS
• Amazon EC2
Storage Services on AWS
• Amazon Elastic Block Store (EBS)
• Amazon Elastic File System
• Amazon FSx for Lustre (NEW)
• Amazon FSx for Windows File Server (NEW)
• Amazon Simple Storage Service (S3)
• Amazon S3 Glacier
• AWS Snowball
• AWS Snowball Edge
• AWS Snowmobile
• AWS Storage Gateway
• AWS Backup
Magic Quadrant
Magic Quadrant for Public Cloud Storage Services, Worldwide – 2018
Positioned furthest for completeness of vision and highest for ability to execute in each report since inception in 2014
Magic Quadrant for Public Cloud Storage Services, July 2018 – Raj Bala, Julia Palmer
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Amazon Web Services. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
Benefits of Amazon S3
• Enterprise applications
• Website hosting
• Media master files
• Big data
• File sharing
• Content distribution
• Archive
• Data analytics
• Backup & restore
• Dynamic websites
• Mobile sync & backup
• Disaster recovery
• Re-creatable data
Amazon Simple Storage Service (S3)
Collect · Store · Analyze
• Designed for 99.999999999% durability
• Unmatched security and compliance capabilities
• Replication options across regions
• On-demand analytics
• Built-in support for SQL expressions with S3 Select
• Detailed data on usage patterns and access
• The most ways to move data in/out
• Security that helps the CISO
• Automated cost reduction tools
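To put the 99.999999999% durability figure into perspective, here is a small worked calculation. It is a sketch that assumes the figure can be read as an independent, per-object probability of surviving one year; the function names are made up for illustration.

```python
from fractions import Fraction

# 99.999999999% expressed as an exact fraction (eleven nines).
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(object_count):
    """Expected number of objects lost per year, assuming independent losses."""
    return object_count * (1 - ANNUAL_DURABILITY)


def years_per_lost_object(object_count):
    """Average number of years until one object is lost."""
    return 1 / expected_annual_losses(object_count)


if __name__ == "__main__":
    stored_objects = 10_000_000
    print("expected losses per year:", float(expected_annual_losses(stored_objects)))
    print("years per lost object:", int(years_per_lost_object(stored_objects)))
```

Under that reading, storing ten million objects corresponds to an expected loss of one object roughly every 10,000 years.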
Your choice of Amazon S3 storage classes
Access frequency: frequent → infrequent

S3 Standard
• Active, frequently accessed data
• Milliseconds access
• ≥ 3 AZ
• From: $0.0210/GB

S3 Intelligent-Tiering (NEW!)
• Data with changing access pattern
• Milliseconds access
• ≥ 3 AZ
• From: $0.0210 to $0.0125/GB
• Monitoring fee per object
• Min storage duration

S3 Standard-IA
• Infrequently accessed data
• Milliseconds access
• ≥ 3 AZ
• From: $0.0125/GB
• Retrieval fee per GB
• Min storage duration
• Min object size

S3 One Zone-IA
• Re-creatable, less accessed data
• Milliseconds access
• 1 AZ
• From: $0.0100/GB
• Retrieval fee per GB
• Min storage duration
• Min object size

S3 Glacier
• Archive data
• Minutes to hours access
• ≥ 3 AZ
• From: $0.0040/GB
• Retrieval fee per GB
• Min storage duration
• Min object size

S3 Glacier Deep Archive (NEW!)
• Archive data
• Hours access
• ≥ 3 AZ
• From: $0.00099/GB
• Retrieval fee per GB
• Min storage duration
• Min object size
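The "From" prices above can be turned into a quick comparison. The sketch below assumes the prices are per GB per month and covers storage only: retrieval fees, monitoring fees, minimum storage durations and minimum object sizes are deliberately ignored, and S3 Intelligent-Tiering is left out because its price depends on the tier an object currently sits in. The function name is illustrative.

```python
# "From" prices in USD per GB-month, as listed on the slide.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.0210,
    "S3 Standard-IA": 0.0125,
    "S3 One Zone-IA": 0.0100,
    "S3 Glacier": 0.0040,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_storage_cost(gigabytes, storage_class):
    """Storage-only cost in USD for one month (no retrieval or request fees)."""
    return gigabytes * PRICE_PER_GB_MONTH[storage_class]


if __name__ == "__main__":
    stored_gb = 100_000  # hypothetical data set of 100 TB
    for storage_class in PRICE_PER_GB_MONTH:
        cost = monthly_storage_cost(stored_gb, storage_class)
        print(f"{storage_class:<24} ${cost:>10,.2f} per month")
```

For a hypothetical 100 TB data set, the storage-only bill ranges from about $2,100 per month in S3 Standard down to about $99 in S3 Glacier Deep Archive, which is why the access pattern matters when choosing a class.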
Data Lakes from AWS
Data Lake on AWS
Lowest cost
Scalable and durable
Secure
Open and comprehensive
Analytics
Machine Learning
Real-time data movement
On-premises data movement
Lowest Cost: Tiered Storage to Optimize Price/Performance
• Tiered storage to optimize price/performance:
  • S3 Standard
  • S3 Standard-Infrequent Access
  • S3 One Zone-Infrequent Access
  • Amazon Glacier
• Migrate between tiers based on lifecycle policies
• Store data at $0.023/GB/month with S3
• Store data at $0.004/GB/month with Glacier
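A lifecycle policy of the kind mentioned above can be sketched as follows. The dictionary follows the shape S3 expects for a bucket lifecycle configuration; the rule ID, the `raw/` prefix and the 30-day and 90-day thresholds are illustrative assumptions, not values from the talk. The helper `storage_class_at_age` is a hypothetical function that only simulates which tier an object would be in.

```python
import json

# Illustrative rule: prefix and day thresholds are assumptions.
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "tier-down-raw-data",
            "Filter": {"Prefix": "raw/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}


def storage_class_at_age(age_days, rule):
    """Return the storage class an object of the given age would have."""
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


if __name__ == "__main__":
    # With boto3 installed and credentials configured, this configuration could
    # be passed to put_bucket_lifecycle_configuration; that call is not made here.
    print(json.dumps(LIFECYCLE_CONFIGURATION, indent=2))
    rule = LIFECYCLE_CONFIGURATION["Rules"][0]
    for age in (0, 30, 90, 365):
        print(age, "days ->", storage_class_at_age(age, rule))
```

Once such a rule is in place, S3 moves matching objects between tiers on its own, so no application code has to copy data around.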
Building a Data Anonymization Platform on AWS
László Török, Engineering Lead, Telefónica NEXT
BIG DATA / ANALYTICS / STREAMING
We are part of the global telco group
Telefónica Germany’s network has the most customer lines
An average day of the Telefónica Germany Network
• 45M customer lines
• 5B+ network events
Very valuable data
Help predict capacities in regional transport in Berlin-Brandenburg in the ProTrain project, supported by the German Ministry of Transport
Model city traffic for special events like football matches with partner Intraplan
Retail applications: store location planning
https://next.telefonica.de/so-bewegt-sich-deutschland
Very sensitive data
Protected by German privacy laws (TMG, BDSG)
We are committed to protecting privacy and giving users control over their data
Valuable insights vs. individual privacy?
Most value comes from aggregate analysis of groups, not individuals.
Telefónica’s Data Anonymization Platform
1. Cellular signal data produced by regular mobile network use
2. Anonymization and aggregation of data, opt-out possible
3. Analysis of anonymized data
4. Solutions for society and economy
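The idea of deriving value from groups rather than individuals, with opt-out respected, can be illustrated with a small aggregation sketch. This is not Telefónica's actual anonymization procedure, which the slides do not describe. The event fields, the `min_group_size` threshold of 5 and the function name are assumptions made for illustration, and small-group suppression on its own is not sufficient to guarantee anonymity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkEvent:
    subscriber_id: str  # hypothetical identifier, never emitted in the output
    cell_id: str
    hour: int


def aggregate_events(events, opted_out, min_group_size=5):
    """Count distinct subscribers per (cell, hour), dropping small groups.

    Events of opted-out subscribers are skipped, and groups with fewer than
    min_group_size distinct subscribers are suppressed entirely.
    """
    subscribers_per_group = defaultdict(set)
    for event in events:
        if event.subscriber_id in opted_out:
            continue
        subscribers_per_group[(event.cell_id, event.hour)].add(event.subscriber_id)
    return {
        group: len(subscribers)
        for group, subscribers in subscribers_per_group.items()
        if len(subscribers) >= min_group_size
    }


if __name__ == "__main__":
    events = [NetworkEvent(f"s{i}", "cell-A", 9) for i in range(6)]
    events += [NetworkEvent("x1", "cell-B", 9), NetworkEvent("x2", "cell-B", 9)]
    print(aggregate_events(events, opted_out={"s0"}))
```

The output contains only group counts. The two-subscriber group at `cell-B` is dropped because a count that small could point back to individuals.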
Context
Requirements for our Data Hub
Data Hub v0.9
Fully programmable provisioning
Can we do better?
Optimizing S3 storage costs
First month of production
Top 5 items on our end-of-month AWS bill
1. Lambda
2. Kinesis
3. CloudWatch (?!?)
4. S3
5. KMS (~ on par with S3)
Use compression – fewer Kinesis shards needed
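Why compression reduces the shard count can be shown with a rough estimate. A Kinesis Data Streams shard accepts writes of up to about 1 MB per second and 1,000 records per second, so fewer bytes per event means fewer shards. The sketch below uses the 5B+ daily events quoted earlier as an average rate. The event schema, the batch size of 500 and the helper names are assumptions, and real traffic peaks would need headroom on top.

```python
import gzip
import json
import math

# Per-shard write limits of Kinesis Data Streams.
SHARD_WRITE_BYTES_PER_SEC = 1024 * 1024
SHARD_WRITE_RECORDS_PER_SEC = 1000


def sample_events(count):
    """Generate synthetic events; the schema is purely illustrative."""
    return [
        {"cell_id": f"cell-{i % 50:04d}", "ts": 1_560_000_000 + i, "event_type": "handover"}
        for i in range(count)
    ]


def pack_batch(events):
    """Serialize a batch as newline-delimited JSON; return (raw, gzipped) bytes."""
    raw = "\n".join(json.dumps(e, separators=(",", ":")) for e in events).encode("utf-8")
    return raw, gzip.compress(raw)


def shards_needed(bytes_per_sec, records_per_sec):
    """Minimum shard count that satisfies both the byte and the record limit."""
    by_bytes = math.ceil(bytes_per_sec / SHARD_WRITE_BYTES_PER_SEC)
    by_records = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    return max(1, by_bytes, by_records)


if __name__ == "__main__":
    events_per_sec = 5_000_000_000 / 86_400  # average rate for 5B events per day
    batch_size = 500  # events packed into one Kinesis record (assumption)
    raw, packed = pack_batch(sample_events(batch_size))
    records_per_sec = events_per_sec / batch_size
    for label, payload in (("uncompressed", raw), ("gzip", packed)):
        bytes_per_sec = len(payload) / batch_size * events_per_sec
        print(label, "->", shards_needed(bytes_per_sec, records_per_sec), "shard(s)")
```

Batching many events into one record keeps the stream below the per-shard record limit, and gzip then shrinks the byte rate. The actual savings depend on how compressible the real events are.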
What went wrong?
Data Hub v2 – Fargate to the rescue
Access Management via IAM
Monitoring via CloudWatch
Lessons learnt
Superpower Less DIY ops
not always significant €€€ savings
Thank you!
László Török
Engineering Lead, Big Data Privacy Services
https://next.telefonica.de
@telefonicaNEXT
We are hiring!
linkedin.com/company/telefónicanext