Taking Hadoop to Enterprise Security Standards
![Page 1: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/1.jpg)
©2014 LinkedIn Corporation. All Rights Reserved.
Taking Hadoop to Enterprise Security Standards
Karthik Ramasamy
Harsh Singhal
Arvind Mani
![Page 2: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/2.jpg)
Access Control
![Page 3: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/3.jpg)
How many of you need or have access control in Hadoop?
![Page 4: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/4.jpg)
Keeping Data Secure
Users First
Internal Threat
External Threat
![Page 5: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/5.jpg)
The more granular the access controls are, the more people can have access to
the data
![Page 6: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/6.jpg)
Hadoop – Status Quo
Multiple Query Execution Engines
Custom Code Execution
Auditing
![Page 7: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/7.jpg)
HDFS File Permissions
User ID, Email Address, IP Address, Billing Address
Security, Customer Service, Data Scientist
Adding and removing group membership can take up to a few hours
HDFS file permissions are very coarse (at file level)
![Page 8: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/8.jpg)
Other Access Control Solutions
![Page 9: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/9.jpg)
Challenges
Mixed Data
Multiple Data Processing Systems
Data for Everyone
![Page 10: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/10.jpg)
What do we need?
Extensible Authorization
Fine Grain Control
Fast Changes to Authorization Rules
![Page 11: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/11.jpg)
Our Solution: Access Control via Encryption
Apache Kafka
HDFS
Event name
Symmetric Encryption Key
Key Server
Parquet
ETL
Encrypted Events
![Page 12: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/12.jpg)
Access Control via Encryption
User A’s Job
User B’s Job
User C’s Job
Producer Job
ETL User
Parquet File
Key Server
User | Columns
A | 5
B | 2, 5
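The idea on this slide can be sketched as a key server that holds one symmetric key per column and releases a key only to users granted that column. This is a minimal sketch, not the presenters' implementation: the `KeyServer` class, its method names, and the in-memory storage are all hypothetical. Only the grants (A: column 5; B: columns 2 and 5) come from the slide.

```python
import secrets


class KeyServer:
    """Hypothetical in-memory key server: one symmetric key per column."""

    def __init__(self, num_columns):
        # Assumption: one random 256-bit key per column index.
        self._column_keys = {c: secrets.token_bytes(32) for c in range(num_columns)}
        self._grants = {}  # user -> set of column indices

    def grant(self, user, columns):
        self._grants.setdefault(user, set()).update(columns)

    def revoke(self, user, columns):
        self._grants.get(user, set()).difference_update(columns)

    def get_keys(self, user, columns):
        """Return keys only for the requested columns the user has been granted."""
        allowed = self._grants.get(user, set())
        return {c: self._column_keys[c] for c in columns if c in allowed}


server = KeyServer(num_columns=6)
server.grant("A", [5])
server.grant("B", [2, 5])

keys_a = server.get_keys("A", [2, 5])
keys_b = server.get_keys("B", [2, 5])
keys_c = server.get_keys("C", [2, 5])  # user C has no grants
print(sorted(keys_a), sorted(keys_b), sorted(keys_c))  # prints: [5] [2, 5] []
```

In this sketch a grant or revocation is a single update on the key server, so it applies to the next key request rather than waiting on a group-membership change to propagate.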
![Page 13: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/13.jpg)
Brief Overview of Parquet
Columnar Storage
Parquet Format
Row group
Column a, Column b
Page 0
Page 1
Page 2
![Page 14: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/14.jpg)
Encryption Support in Parquet*
Field Mode | Page Mode | Hybrid Mode
Page
Column
*Yet to be integrated into open source Parquet
![Page 15: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/15.jpg)
Field Mode
N Values (Page)
Encrypt one value at a time
xxxxxxx
yyyyyyy
yyyyyyy
zzzzzzz
Example: emails. Analysts need them to join with other tables but may not require
access to individual emails
![Page 16: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/16.jpg)
Field Mode
Joins, Counts, Distribution Analysis
No/Low compression
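Why field mode keeps joins and counts working, and why it compresses poorly, can be illustrated with a deterministic keyed transform: equal inputs always map to equal outputs. The sketch below swaps in HMAC-SHA256 tokens as a stand-in for the per-value encryption on the slides (HMAC is not reversible, so it is not the actual scheme). The key, the email addresses, and the lookup table are all made up for illustration.

```python
import hashlib
import hmac
import zlib
from collections import Counter

FIELD_KEY = b"hypothetical-field-key"  # assumption: would come from the key server


def tokenize(value: str) -> str:
    """Deterministic keyed token: equal inputs give equal tokens."""
    return hmac.new(FIELD_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


emails = ["x@example.com", "y@example.com", "y@example.com", "z@example.com"]
tokens = [tokenize(e) for e in emails]

# Counts and distributions can be computed on tokens alone.
counts = Counter(tokens)

# Joins work as well: a second table tokenized with the same key matches on equality.
other_table = {tokenize("y@example.com"): "premium"}
joined = [other_table.get(t) for t in tokens]

# Tokens look random, so a column of them compresses far worse than the plaintext.
column = [f"user{i}@example.com" for i in range(400)]
plain_bytes = "\n".join(column).encode("utf-8")
token_bytes = "\n".join(tokenize(v) for v in column).encode("utf-8")
plain_ratio = len(zlib.compress(plain_bytes)) / len(plain_bytes)
token_ratio = len(zlib.compress(token_bytes)) / len(token_bytes)
print(f"plaintext ratio {plain_ratio:.2f}, token ratio {token_ratio:.2f}")
```

The same property that enables joins also reveals which rows share a value, which is the trade-off field mode accepts.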
![Page 17: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/17.jpg)
Page Mode
No information is leaked except entropy of the data
Better performance than other modes
N Values (Page)
Encode, then Compress, then Encrypt
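The encode, compress, encrypt order can be sketched as follows. Everything here is an illustrative assumption: JSON stands in for Parquet's page encoding, `zlib` for the compression codec, and `keystream_xor` is a toy stream cipher built from SHA-256 that should not be used for real data (a real system would use a vetted cipher such as AES).

```python
import hashlib
import json
import os
import zlib


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key | nonce | counter) blocks."""
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)


def write_page(values, key):
    encoded = json.dumps(values).encode("utf-8")           # encode
    compressed = zlib.compress(encoded)                    # compress
    nonce = os.urandom(16)
    return nonce + keystream_xor(key, nonce, compressed)   # encrypt


def read_page(page, key):
    nonce, body = page[:16], page[16:]
    return json.loads(zlib.decompress(keystream_xor(key, nonce, body)))


key = os.urandom(32)  # hypothetical page key obtained from the key server
values = ["y@example.com"] * 500 + ["x@example.com", "z@example.com"]
page = write_page(values, key)

# Reversing the order (encrypt, then compress) loses the size benefit,
# because ciphertext looks random to the compressor.
encoded = json.dumps(values).encode("utf-8")
encrypt_then_compress = zlib.compress(keystream_xor(key, os.urandom(16), encoded))
print(len(encoded), len(page), len(encrypt_then_compress))
```

Because the whole page is compressed before it is encrypted, the stored page stays small, and a reader without the key learns little beyond the page's size.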
![Page 18: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/18.jpg)
Hybrid Mode
Finer-grained control of information
Increase in overhead due to double encryption/decryption
N Values (Page)
Encrypt each value, then encrypt the page
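The two layers of hybrid mode can be sketched with two keys: a field key applied to each value and a page key applied to the whole page. This is a rough sketch under stated assumptions, not the presenters' design. The toy cipher `_xor_stream`, the value-derived nonce that makes per-value encryption deterministic, and all function names are hypothetical.

```python
import hashlib
import hmac
import json
import os


def _xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode); illustration only."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + (i // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)


def encrypt_value(field_key: bytes, value: str) -> str:
    """Deterministic per-value encryption: the nonce is derived from the value."""
    raw = value.encode("utf-8")
    nonce = hmac.new(field_key, raw, hashlib.sha256).digest()[:16]
    return (nonce + _xor_stream(field_key, nonce, raw)).hex()


def decrypt_value(field_key: bytes, token: str) -> str:
    blob = bytes.fromhex(token)
    return _xor_stream(field_key, blob[:16], blob[16:]).decode("utf-8")


def write_page(values, field_key, page_key):
    inner = [encrypt_value(field_key, v) for v in values]  # first layer: each value
    nonce = os.urandom(16)
    body = json.dumps(inner).encode("utf-8")
    return nonce + _xor_stream(page_key, nonce, body)       # second layer: whole page


def read_page(page, page_key, field_key=None):
    """Page key alone yields encrypted values; both keys yield plain text."""
    inner = json.loads(_xor_stream(page_key, page[:16], page[16:]))
    if field_key is None:
        return inner
    return [decrypt_value(field_key, t) for t in inner]


field_key, page_key = os.urandom(32), os.urandom(32)  # hypothetical keys
emails = ["x@example.com", "y@example.com", "y@example.com"]
page = write_page(emails, field_key, page_key)

tokens_only = read_page(page, page_key)            # sees encrypted values
plaintext = read_page(page, page_key, field_key)   # sees plain text
```

A user holding neither key gets no access, a user with only the page key can still join and count on the encrypted values, and a user with both keys reads plain text. Every read pays for two decryption passes, which is the overhead the slide mentions.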
![Page 19: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/19.jpg)
Plain Text | Encrypted Value | No Access
Field Mode
Page Mode
Hybrid Mode
![Page 20: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/20.jpg)
Key Versioning
Each key is versioned and specific to a source (file/event name)
Reduces the exposure in case of key leakage
Time-based access control
– All users by default can access only the last 30 days of data
– Give users access to data in a specific time period
Authentication of producers can be done separately
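Versioned, per-source keys with time-based access can be sketched as below. The slide does not say how often a key version rolls over, so this sketch assumes one version per day. The `VersionedKeyStore` class, its method names, and the `PageViewEvent` source name are hypothetical. Only the 30-day default window and the per-period grants come from the slide.

```python
import secrets
from datetime import date, timedelta


class VersionedKeyStore:
    """Hypothetical store: one key per (source, day), where the day is the key version."""

    DEFAULT_WINDOW = timedelta(days=30)

    def __init__(self):
        self._keys = {}    # (source, day) -> key
        self._extra = {}   # (user, source) -> list of (start, end) date ranges

    def key_for_write(self, source, day):
        # Producers get the key for the current version, created on first use.
        return self._keys.setdefault((source, day), secrets.token_bytes(32))

    def grant_period(self, user, source, start, end):
        self._extra.setdefault((user, source), []).append((start, end))

    def key_for_read(self, user, source, day, today):
        in_default = today - self.DEFAULT_WINDOW <= day <= today
        in_extra = any(s <= day <= e for s, e in self._extra.get((user, source), []))
        if not (in_default or in_extra):
            raise PermissionError(f"{user} may not read {source} for {day}")
        return self._keys[(source, day)]


store = VersionedKeyStore()
today = date(2014, 6, 3)
recent_day = today - timedelta(days=5)
old_day = today - timedelta(days=90)
recent_key = store.key_for_write("PageViewEvent", recent_day)
old_key = store.key_for_write("PageViewEvent", old_day)

# Inside the default 30-day window: allowed without any explicit grant.
assert store.key_for_read("A", "PageViewEvent", recent_day, today) == recent_key
```

Since each version has its own key, a leaked key in this sketch exposes only one source for one day, and data older than the window stays unreadable without an explicit grant for that period.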
![Page 21: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/21.jpg)
Key Server Features
Better Auditing Coverage
Retention Enforcement
Multifactor Authentication
![Page 22: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/22.jpg)
PIG Usage
![Page 23: Taking Hadoop to Enterprise Security Standards](https://reader035.fdocuments.in/reader035/viewer/2022062617/54c653064a7959ad7b8b464e/html5/thumbnails/23.jpg)
Thank you!