Sentry - An Introduction
Sentry: Open Source Authorization for Hive & Impala
Alexander Alten-Lorenz | Senior Field Engineer, Cloudera
Wednesday, 7th November 2013
Defining Security Functions
Perimeter: Guarding access to the cluster itself
§ Technical Concepts: Authentication, Network isolation
Data: Protecting data in the cluster from unauthorized visibility
§ Technical Concepts: Encryption, Data masking
Access: Defining what users and applications can do with data
§ Technical Concepts: Permissions, Authorization
Visibility: Reporting on where data came from and how it's being used
§ Technical Concepts: Auditing, Lineage
Enabling Enterprise Security
The same four functions (Perimeter, Data, Access, Visibility) and their technical concepts, shown alongside the products that address them:
§ Sentry
§ Kerberos | Oozie | Knox
§ Cloudera Navigator
§ Certified Partners
Available 7/23
Hive Overview
SQL Access to Hadoop
§ MapReduce: great massively scalable batch processing framework; required development for each new job
§ Hive opened up Hadoop for more users with standard SQL
Key Challenges
§ Batch MapReduce too slow for interactive BI/analytics
§ No concurrency, no security
Options Today
§ Impala designed for low-latency queries
§ HiveServer2 delivers concurrency, authentication
Our Open Source Activity
CDH 4.1 (HiveServer2)
§ Concurrency and Kerberos authentication for Hive
§ JDBC and Beeline clients
CDH 4.2
§ HDFS impersonation authorization as stop-gap
§ Pluggable authentication API
§ JDBC LDAP username/password
ODBC
§ Supports Kerberos authentication and LDAP
§ Extended partner certification
Current State of Authorization
Two Sub-Optimal Choices for SQL on Hadoop
Insecure Advisory Authorization
§ Users can grant themselves permissions
§ Intended to prevent accidental deletion of data
§ Problem: Doesn't guard against malicious users
HDFS Impersonation
§ Data is protected at the file level by HDFS permissions
§ Problem: File-level not granular enough
§ Problem: Not role-based
Authorization Requirements
Secure Authorization
§ Ability to control access to data and/or privileges on data for authenticated users
Fine-Grained Authorization
§ Ability to give users access to a subset of data (e.g. column) in a database
Role-Based Authorization
§ Ability to create/apply templatized privileges based on functional roles
Multi-Tenant Administration
§ Ability for central admin group to empower lower-level admins to manage security for each database/schema
The Next Step: Introducing Sentry
Authorization module for Hive & Impala
Unlocks Key RBAC Requirements
§ Secure, fine-grained, role-based authorization
§ Multi-tenant administration
Open Source
§ Intent to donate to ASF
Available and Fully Supported
§ HiveServer2 & Impala 1.1 initially
Key Benefits of Sentry
Store Sensitive Data in Hadoop
Extend Hadoop to More Users
Enable New Use Cases
Enable Multi-User Applications
Comply with Regulations
Key Capabilities of Sentry
Fine-Grained Authorization
§ Specify security for SERVERS, DATABASES, TABLES & VIEWS
Role-Based Authorization
§ SELECT privilege on views & tables
§ INSERT privilege on tables
§ TRANSFORM privilege on servers
§ ALL privilege on the server, databases, tables & views
§ ALL privilege is needed to create/modify schema
Multi-Tenant Administration
§ Separate policies for each database/schema
§ Can be maintained by separate admins
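
The scope and privilege levels listed above can be illustrated with a minimal Python sketch. This is not Sentry's implementation: the Privilege class, the implies function and the matching rules are hypothetical names and assumptions made for illustration, and the TRANSFORM privilege and URIs are left out. It only shows the idea that a grant on a wider scope (server, database) or with the ALL action covers narrower requests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Privilege:
    """Hypothetical model of one grant or one request (not a Sentry class)."""
    server: str
    db: Optional[str] = None     # None: applies at server level
    table: Optional[str] = None  # None: applies at database level; "*": any table/view
    action: str = "all"          # "select", "insert" or "all"


def _covers(granted: Optional[str], requested: Optional[str]) -> bool:
    # An unset scope on the grant covers everything below it.
    if granted is None:
        return True
    # A set scope only covers a concrete object of the same level.
    return requested is not None and granted in ("*", requested)


def implies(granted: Privilege, requested: Privilege) -> bool:
    """Return True if the granted privilege covers the requested one."""
    return (
        granted.server == requested.server
        and _covers(granted.db, requested.db)
        and _covers(granted.table, requested.table)
        and granted.action in ("all", requested.action)
    )


admin = Privilege("server1")                                  # ALL on the server
read_only = Privilege("server1", "jranalyst1", "*", "select")  # SELECT on every table of one DB

print(implies(admin, Privilege("server1", "customers", "orders", "insert")))
print(implies(read_only, Privilege("server1", "jranalyst1", "t1", "insert")))
```

Under these assumptions the server-wide grant covers the INSERT request, while the SELECT-only grant does not.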
Apache Ecosystem and Sentry
Shared Hive Metastore (with HCatalog)
Extensibility plug-in for HiveServer2
Inline support in Impala 1.1
Potential extension to Pig, MapReduce, REST
[Diagram: HCatalog, Sentry, Hive Metastore. Legend: possible future development]
Sentry Architecture
[Architecture diagram. Labels shown: Impala; HiveServer2; Binding Layer; Impala; Hive; Future; Authorization Provider; Policy Engine; Policy Provider; Interface; Evaluation, Validation; Parsing; File; Database; Local FS/HDFS; Query; SQL; MR]
Query Execution Flow
§ Parse: Validate SQL grammar
§ Build: Construct statement tree
§ Check (Sentry): Validate statement objects; first check: Authorization
§ Plan: Forward to execution planner
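
The four stages above can be sketched as a small Python pipeline. This is a toy under stated assumptions, not HiveServer2 or Impala code: the two-statement grammar, the Statement type, the is_authorized callback and every function name are hypothetical. The only point it makes is the ordering: authorization is the first check on the built statement, and nothing reaches the planner if it fails.

```python
import re
from typing import Callable, NamedTuple, Tuple


class Statement(NamedTuple):
    action: str  # "select" or "insert"
    table: str


class AuthorizationError(Exception):
    """Raised when the authorization check rejects a statement."""


# Toy grammar (assumption): only these two statement shapes are accepted.
_SELECT = re.compile(r"^\s*select\s+.+?\s+from\s+(\w+)", re.IGNORECASE)
_INSERT = re.compile(r"^\s*insert\s+into\s+(\w+)", re.IGNORECASE)


def parse(sql: str) -> Tuple[str, str]:
    """Parse: validate the SQL grammar and return (action, table) tokens."""
    for action, pattern in (("select", _SELECT), ("insert", _INSERT)):
        match = pattern.match(sql)
        if match:
            return action, match.group(1)
    raise SyntaxError(f"unsupported statement: {sql!r}")


def build(tokens: Tuple[str, str]) -> Statement:
    """Build: construct the statement object from the parsed tokens."""
    action, table = tokens
    return Statement(action, table.lower())


def check(stmt: Statement, user: str,
          is_authorized: Callable[[str, Statement], bool]) -> None:
    """Check: authorization comes first; other object validation would follow."""
    if not is_authorized(user, stmt):
        raise AuthorizationError(f"{user} may not {stmt.action} on {stmt.table}")


def plan(stmt: Statement) -> str:
    """Plan: stand-in for handing the statement to the execution planner."""
    return f"PLAN {stmt.action} on {stmt.table}"


def run_query(sql: str, user: str,
              is_authorized: Callable[[str, Statement], bool]) -> str:
    stmt = build(parse(sql))
    check(stmt, user, is_authorized)
    return plan(stmt)


def select_only(user: str, stmt: Statement) -> bool:
    # Hypothetical policy: every user may read, nobody may write.
    return stmt.action == "select"


print(run_query("SELECT * FROM orders", "alice", select_only))
```

With the select_only policy, an INSERT statement raises AuthorizationError before plan is ever called.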
Example Security Policy
[databases]
# Defines the location of the per-DB policy file for the
# 'customers' DB (schema)
customers = hdfs://ha-nn-uri/etc/access/customers.ini
[groups]
# Assigns Hadoop groups to their respective set of roles
manager = analyst_role, junior_analyst_role
analyst = analyst_role
jranalyst = junior_analyst_role
customers_admin = customers_admin_role
admin = admin_role
[roles]
# Roles that can import or export data to the URIs defined,
# i.e. a landing zone. Since the server runs as the user "hive",
# files in this directory must either have the "hive" group set
# with read/write or be set world read/write.
analyst_role = server=server1->db=analyst1, \
    server=server1->db=jranalyst1->table=*->action=select, \
    server=server1->uri=hdfs://ha-nn-uri/landing/analyst1
junior_analyst_role = server=server1->db=jranalyst1, \
    server=server1->uri=hdfs://ha-nn-uri/landing/jranalyst1
# Privileges for 'customers' can be defined in the global policy
# file even though 'customers' has its own policy file.
# Note that the privileges from both the global policy file and
# the per-db policy file are merged. There is no overriding.
# Role controls everything for the 'customers' DB on server1.
customers_admin_role = server=server1->db=customers
# Role controls everything on server1.
admin_role = server=server1
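
To make the group, role and privilege chain in the policy above concrete, here is a minimal Python sketch that reads an excerpt of the file and lists the privileges a Hadoop group ends up with. It is a hand-rolled reader written for illustration, not Sentry's own parser: parse_policy and privileges_for_group are hypothetical names, and the reader assumes that privileges never contain commas and that a trailing backslash continues a line.

```python
from collections import defaultdict
from typing import Dict, List

# Excerpt of the policy file from the slide (raw string keeps the backslashes).
POLICY = r"""
[groups]
# Assigns Hadoop groups to their respective set of roles
manager = analyst_role, junior_analyst_role
analyst = analyst_role

[roles]
analyst_role = server=server1->db=analyst1, \
    server=server1->db=jranalyst1->table=*->action=select, \
    server=server1->uri=hdfs://ha-nn-uri/landing/analyst1
junior_analyst_role = server=server1->db=jranalyst1, \
    server=server1->uri=hdfs://ha-nn-uri/landing/jranalyst1
"""


def parse_policy(text: str) -> Dict[str, Dict[str, List[str]]]:
    """Return {section: {key: [comma-separated values]}} for a policy text."""
    sections: Dict[str, Dict[str, List[str]]] = defaultdict(dict)
    current = ""
    pending = ""  # accumulates lines that end with a continuation backslash
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending += line[:-1].strip() + " "
            continue
        line, pending = pending + line, ""
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            continue
        # Split on the first "=" only; privileges contain further "=" signs.
        key, _, value = line.partition("=")
        sections[current][key.strip()] = [
            item.strip() for item in value.split(",") if item.strip()
        ]
    return sections


def privileges_for_group(policy: Dict[str, Dict[str, List[str]]],
                         group: str) -> List[str]:
    """Collect the privileges of every role assigned to the group."""
    privileges: List[str] = []
    for role in policy["groups"].get(group, []):
        privileges.extend(policy["roles"].get(role, []))
    return privileges


policy = parse_policy(POLICY)
for privilege in privileges_for_group(policy, "manager"):
    print(privilege)
```

In this excerpt the manager group holds both roles, so it receives the union of their privileges. That mirrors the slide's note that privileges are merged rather than overridden.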
Live Demo & Give Aways
Closes gap between HDFS and Metastore
Easy to implement
RFC 2307 compliant (Kerberos)
Enables Multi-User Applications in one Hive warehouse
Enables Multi-Tenancy per Row and Column
About
[email protected] [email protected]
@mapredit mapredit.blogspot.com
Web: http://wiki.apache.org/incubator/SentryProposal