Secure Hadoop clusters on Windows platform
remus-rusanu
about:me
• SQL Server engine developer since 2001
• Worked on HDInsight service (Azure Hadoop offering) and on PDW appliance Hadoop region
• Hive contributor: vectorized execution engine HIVE-4160
• Hadoop contributor: Windows secure YARN containers YARN-2190
• @rusanu
• Stack Overflow user 105929
Integrate Hadoop with Windows Security
• Integrate your cluster with the existing Active Directory domain
• Integrated security
  • Use Windows domain users
  • No need for local users or local passwords
• Single sign-on
  • Only provide a password when opening the OS session
• Group membership provided from AD groups
Benefits
• Group-membership-based access control
  • Domain\HadoopUsers: granted access to the Hadoop cluster
  • Domain\NewHire is added to HadoopUsers
  • Domain\NewHire has access to the Hadoop cluster
• Centralized password control
  • Only administer the Active Directory domain
• Integrate with the rest of the enterprise that uses AD
What can leverage AD-based access control
• HDFS
• M/R queues
• HTTP interfaces (Web UI)
• Hadoop ecosystem stack
  • Oozie proxy (Hadoop superuser)
Secure Hadoop clusters
• “Kerberized” cluster
  • Users are authenticated using Kerberos
  • Services authenticate each other using Kerberos
• Data encryption in transit
  • RPC, block transfer, HTTP
  • No data encryption at rest
• Permission control for containers (tasks)
  • Containers cannot access service (NM) files
• Process isolation
  • Containers cannot access each other's files
• SecureMode: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SecureMode.html
Windows and Secure Hadoop Clusters
• Windows does support integrated authentication and single sign-on with a Kerberized cluster
  • Can be a Linux Kerberized cluster too, with proper KDC configuration
• Requires the allowtgtsessionkey registry value
• See KB308339: Registry Key to Allow Session Keys to Be Sent in Kerberos Ticket-Granting-Ticket: http://support.microsoft.com/kb/308339
• Not allowed for LUA, see KB2627903 Access to Session Keys not possible using a restricted Token: http://support.microsoft.com/kb/2627903
• This solves the problem of authenticating the user at cluster periphery (job submit)
Securing Hadoop Services
• Same as the Linux configuration
  • hadoop.security.authentication: kerberos
  • hadoop.security.authorization: true
  • hadoop.http.filter.initializers: org.apache.hadoop.security.AuthenticationFilterInitializer
  • hadoop.http.authentication.type: kerberos
  • Etc. Refer to the Secure Mode guide for your installation.
• Use ktpass.exe to obtain keytab files for NT domain users
  • https://technet.microsoft.com/en-us/library/cc753771.aspx
• Configure the KDC in krb5.ini for the realm (domain)
• Enable AES128 and AES256 for the user accounts in AD
  • msDS-SupportedEncryptionTypes
• Hadoop services are not required to run as the service accounts; Hadoop will use principal names and keytab files anyway
  • I recommend it nonetheless; it is confusing otherwise
• The Java runtime must contain the Unlimited Strength JCE policy files
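As a rough illustration of the service settings listed on this slide, the sketch below renders them as a Hadoop-style configuration document. The helper name `render_site_xml` and the choice to generate the file from Python are assumptions made for illustration only; a real cluster needs many more properties, per the Secure Mode guide.

```python
import xml.etree.ElementTree as ET

# The four properties named on the slide, with lowercase keys and values.
# This is only a fragment of a secure configuration, not a complete one.
SECURE_PROPERTIES = {
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
    "hadoop.http.filter.initializers":
        "org.apache.hadoop.security.AuthenticationFilterInitializer",
    "hadoop.http.authentication.type": "kerberos",
}


def render_site_xml(properties):
    """Render a dict as a <configuration> document (hypothetical helper)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


core_site_xml = render_site_xml(SECURE_PROPERTIES)
print(core_site_xml)
```

The output is a single unindented XML string; it would need to be merged into the cluster's existing core-site.xml rather than replace it.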
Group Membership Provider
• LDAP provider already works
  • hadoop.security.group.mapping: org.apache.hadoop.security.LdapGroupsMapping
• HDFS access control
• M/R queues access control
• Add LDAP_MATCHING_RULE_IN_CHAIN
  • hadoop.security.group.mapping.ldap.search.attr.member: member:1.2.840.113556.1.4.1941:
  • This rule is limited to filters that apply to the DN. It is a special "extended" match operator that walks the chain of ancestry in objects all the way to the root until it finds a match
• See https://msdn.microsoft.com/en-us/library/aa746475%28v=vs.85%29.aspx
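To make the effect of the matching rule concrete, here is a minimal sketch of the kind of LDAP search filter that results when the member attribute carries the LDAP_MATCHING_RULE_IN_CHAIN OID. The function names, the `(objectClass=group)` base filter, and the example DN are assumptions for illustration; this is not Hadoop's actual implementation.

```python
# OID of LDAP_MATCHING_RULE_IN_CHAIN, as given on the slide.
LDAP_MATCHING_RULE_IN_CHAIN = "1.2.840.113556.1.4.1941"


def escape_filter_value(value):
    """Escape special characters in an LDAP filter value (RFC 4515)."""
    # The backslash must be replaced first so later escapes are not re-escaped.
    replacements = [
        ("\\", r"\5c"),
        ("*", r"\2a"),
        ("(", r"\28"),
        (")", r"\29"),
        ("\0", r"\00"),
    ]
    for char, escaped in replacements:
        value = value.replace(char, escaped)
    return value


def nested_group_filter(user_dn, member_attr="member",
                        group_filter="(objectClass=group)"):
    """Build a filter matching groups that contain user_dn, directly or nested."""
    chained_attr = f"{member_attr}:{LDAP_MATCHING_RULE_IN_CHAIN}:"
    return f"(&{group_filter}({chained_attr}={escape_filter_value(user_dn)}))"


# Hypothetical DN, used only to show the shape of the filter.
example_filter = nested_group_filter("CN=NewHire,CN=Users,DC=example,DC=com")
print(example_filter)
```

Without the OID, the same filter would only match groups that list the user as a direct member, so users who belong to HadoopUsers through a nested group would be missed.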
Windows Secure Container Executor
• Windows platform equivalent of LinuxContainerExecutor
  • http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/SecureContainer.html
  • YARN-2190
• Leverages the S4U (Service for User) Kerberos extension
  • A process that has the SE_TCB (Trusted Computing Base) privilege can impersonate an arbitrary user without providing a password for that user
• Creates an isolated environment for a container and then launches the container impersonating the user
Configuring WSCE
• Requires a privileged NT service:
  • %HADOOP_HOME%\bin\winutils /service
  • Must run as LocalSystem
  • Equivalent of LinuxContainerExecutor’s container-executor binary with the setuid bit set and owned by root
• Requires %HADOOP_HOME%\etc\hadoop\wsce_site.xml
  • impersonate.allowed: users allowed to be impersonated
  • impersonate.denied: users explicitly forbidden from being impersonated
• Very powerful: launch a process as an arbitrary user
  • Validates that wsce_site.xml is writable only by Administrators
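The interplay of the two impersonation lists can be sketched as a simple check. This is an illustration under stated assumptions, not the winutils service logic: it assumes the denied list takes precedence over the allowed list, that entries may name either users or groups, and that names compare case-insensitively. The function name and all account names are hypothetical.

```python
def may_impersonate(user, groups, allowed, denied):
    """Return True if `user` may be impersonated (assumed semantics).

    Assumptions: a match on `denied` always wins, list entries may be
    user or group names, and comparison ignores case.
    """
    principals = {user.lower()} | {g.lower() for g in groups}
    denied_set = {d.lower() for d in denied}
    allowed_set = {a.lower() for a in allowed}
    if principals & denied_set:
        return False
    return bool(principals & allowed_set)


# Hypothetical list contents in the spirit of impersonate.allowed / impersonate.denied.
ALLOWED = ["HadoopUsers"]
DENIED = ["Administrators", "HadoopServices"]

print(may_impersonate("NewHire", ["HadoopUsers"], ALLOWED, DENIED))
print(may_impersonate("Admin1", ["HadoopUsers", "Administrators"], ALLOWED, DENIED))
```

Putting privileged accounts on the denied list keeps a container from ever being launched as an administrator, even if that account also belongs to an allowed group.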
//TODO
• Forests, domain trust, etc.
  • Currently works with only a single domain
  • Hadoop infrastructure is modeled after the Linux security model and does not support “domain\user”
• Delegation
  • S4U extension does not support delegation
  • Containers cannot access resources outside the node host
  • E.g. Sqoop accessing SQL Server under Integrated Security won’t work
• Deployment/configuration support
  • Ambari (Hortonworks)