Hortonworks technical workshop operations with ambari
Transcript of Hortonworks technical workshop operations with ambari
Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Operations With Apache Ambari We Do Hadoop.
Apache Ambari
Apache Ambari is the open source operational platform to provision, manage and monitor Hadoop clusters
How Do People Use Ambari?
• Cluster Provisioning: Install Wizard (UI), Blueprints (API)
• Config Management: Host Groups, Versioning, Compare, Revert, Recommendations, Security Setup
• Service Management: Lifecycle controls, Rolling Restarts, Decommission/Re-commission
• Monitoring: Health Checks, Alerts
• Extensibility: Stacks, Views
Recent Ambari Releases
• Ambari 1.6.0 (May 2014): Introduced Ambari Blueprints
• Ambari 1.7.0 (Dec 2014): Introduced Ambari Views; HDP 2.2 GA
• Ambari 2.0.0 (Apr 2015)
What’s New in Ambari 2.0
Core Platform
• Simplified Kerberos Setup (AMBARI-7204)
• Ambari Alerts (AMBARI-6354)
• Ambari Metrics (AMBARI-5707)
• Automated (Rolling) Upgrade (AMBARI-7804)
Stack Support
• HDP 2.2: Ranger, Spark, Phoenix
• Hive Metastore HA (AMBARI-6684)
• HiveServer2 HA (AMBARI-8906)
• Oozie HA (AMBARI-6683)
Ambari Platform
• Handle umask 027 setting (AMBARI-7796)
• Ambari Agent non-root (AMBARI-1596)
Blueprints API
• Add Host (AMBARI-8458)
For a complete list of changes: https://issues.apache.org/jira/browse/AMBARI/fixforversion/12327486
Lab Setup
Ambari 2.0 Security Lab Steps – 4 node cluster
• Detailed steps available at: http://bit.ly/1J4IbIs
• Install Ambari server and agents
• Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
• Use blueprint/API to provision a minimal Hadoop cluster with custom services
• Use Add Service wizard to also install Hive
• Configure Ambari to sync/recognize business users in OpenLDAP
• Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
• Install Ranger as Ambari service and configure it to recognize LDAP users
• Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption
Core Platform
Stack Components Support
HDP 2.2 HDP 2.1 HDP 2.0
• HDFS, YARN, MapReduce, Hive, HBase, Pig, ZooKeeper, Oozie, Sqoop
• Tez, Storm, Falcon, Flume
• Knox, Slider, Kafka
• Ranger, Spark, Phoenix (NEW in Ambari 2.0: install/manage/monitor)
Admin > Stack and Versions
List of Stack Services Installed or Add Service
Ambari 2.0.0 High Availability Support
Ambari versions compared: Ambari 1.6.1, Ambari 1.7.0, Ambari 2.0.0
• HDFS: NameNode (HDP 2.0+): Active/Standby
• YARN: ResourceManager (HDP 2.1+): Active/Standby
• HBase: HBaseMaster (HDP 2.1+): Multi-master
• Hive: HiveServer2 (HDP 2.1+): Multi-instance
• Hive: Hive Metastore (HDP 2.1+): Multi-instance
• Oozie: Oozie Server* (HDP 2.1+): Multi-instance
* Oozie Server needs an external load balancer to complete the HA solution
Hive HA
Services > Hive > Service Actions
+ Add Hive Metastore
+ Add HiveServer2
Ambari Agent Non-Root
Non-Root Ambari Agent
Agent Runs Commands From the Ambari Server
• Configuration Change
• Service Start
• Service Stop
Some Commands Require Root-Level Access
• /bin/su hdfs -l -s /bin/bash -c /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode
Sudo Leveraged
• Configuration for:
– Customizable Users (su hdfs, yarn, etc.)
– Non-Customizable Users (su mysql)
– Commands (yum, mkdir, touch, test, etc.)
[Diagram: multiple Ambari Agents (python)]
Configuring Agent for Non-Root
1. Create and configure a sudoer account
2. Manually bootstrap Ambari Agents
3. Set run_as_user in ambari-agent.ini for the sudoer account
Details: http://docs.hortonworks.com/HDPDocuments/Ambari-2.0.0.0/bk_ambari_reference_guide/content/ch_amb_ref_configuring_ambari_for_non-root.html
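Step 3 could be scripted along these lines. This is a minimal sketch, not the documented procedure: it assumes run_as_user lives in the [agent] section of ambari-agent.ini, and the account name "ambari" and the sample file contents are hypothetical. Note that rewriting a file through configparser drops any comments it contains.

```python
import configparser
import io


def set_run_as_user(ini_text: str, user: str) -> str:
    """Return ini_text with run_as_user set in the [agent] section.

    Assumption: the agent reads run_as_user from the [agent] section.
    """
    # interpolation=None so that '%' characters in other values are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section("agent"):
        parser.add_section("agent")
    parser.set("agent", "run_as_user", user)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical file contents, for illustration only
sample = "[server]\nhostname = ambari.example.com\n\n[agent]\nrun_as_user = root\n"
updated = set_run_as_user(sample, "ambari")
print(updated)
```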
Updated umask Handling
What about umask?
Unix Permissions Basics: (user, group, other)
• 4 – read
• 2 – write
• 1 – execute
• rwxr-xr-x == 755
Previous Behavior:
• If (umask > 022): warning during agent pre-req check
• Installations would fail if the warning was ignored
New Behavior:
• If (umask > 027): warning during agent pre-req check
• Installation will fail if the warning is ignored
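The arithmetic behind these umask values can be checked with a short sketch. It only illustrates how a umask clears permission bits from the requested mode (777 for directories, 666 for files by convention); it is not Ambari's pre-req check.

```python
def effective_mode(requested: int, umask: int) -> int:
    """Permission bits left after the umask clears its bits from the request."""
    return requested & ~umask


for umask in (0o022, 0o027):
    dir_mode = effective_mode(0o777, umask)   # directories are requested as 777
    file_mode = effective_mode(0o666, umask)  # regular files are requested as 666
    print(f"umask {umask:03o}: directories {dir_mode:03o}, files {file_mode:03o}")
```

With umask 027, "other" loses all access and "group" loses write, which is why services that rely on world-readable files need explicit handling.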
Ambari Alerts
Ambari Alerts
Summary
• Migrated away from Nagios as the Ambari alerting system
• No longer offers the option to install or manage a Nagios service
• Replaced with built-in alerting system
Motivation
• Avoids Nagios package conflicts in customer environments
• More flexibility with alerts in Ambari Stacks
• Platform independence
Ambari Alerts
• Ambari Alerts are installed and configured by default
• Ambari Web provides centralized management of Health Alerts
Modifying Alerts
• Control thresholds, check intervals and response text
Alert Groups
• Create and manage groups of alerts
• Grouping alerts further controls which alerts are dispatched to which notifications
• Assign groups to notifications
• Only dispatch to interested parties
Alert Notifications
• What: Create and manage multiple notification targets
  • Control who gets notified when
• Why: Filter by severity
  • Send only certain notifications to certain targets based on severity
• How: Control dispatch method
  • Support for EMAIL + SNMP
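The interplay of alert groups, severity filters and dispatch methods can be pictured with a small model. This is a hypothetical sketch: the Target class, its fields and the sample targets are invented for illustration and are not Ambari's internal data model.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Hypothetical notification target."""
    name: str
    method: str            # "EMAIL" or "SNMP"
    severities: frozenset  # severities this target wants to hear about
    groups: frozenset      # alert groups assigned to this target


def targets_for(alert_group: str, severity: str, targets: list) -> list:
    """Names of targets that should be notified for this group and severity."""
    return [
        t.name
        for t in targets
        if alert_group in t.groups and severity in t.severities
    ]


targets = [
    Target("hdfs-oncall", "EMAIL", frozenset({"CRITICAL"}), frozenset({"HDFS"})),
    Target("noc", "SNMP", frozenset({"WARNING", "CRITICAL"}), frozenset({"HDFS", "YARN"})),
]
print(targets_for("HDFS", "WARNING", targets))
print(targets_for("HDFS", "CRITICAL", targets))
```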
Ambari Alerting System
1. User creates or modifies cluster
2. Ambari reads alert definitions from Stack
3. Ambari sends alert definitions to Agents, and each Agent schedules instance checks
4. Agents report alert instance status in the heartbeat
5. Ambari responds to alert instance status changes and dispatches notifications (if applicable)
[Diagram: Ambari Server, Stack definition (alerts.json), Ambari Agent(s), and email/SNMP notifications; arrows numbered 1-5 match the steps above]
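Steps 4 and 5 amount to reacting only when a reported state differs from the last known one. The sketch below illustrates that idea under stated assumptions: the function, the dictionary layout and the alert name used in the example are hypothetical, not the server's actual implementation.

```python
def process_heartbeat(last_states: dict, reported: dict) -> list:
    """Compare reported alert states with the last known ones.

    Returns a list of (alert_name, previous_state, new_state) tuples for
    every alert whose state changed, and updates last_states in place.
    """
    changes = []
    for name, state in reported.items():
        previous = last_states.get(name)  # None on the first report
        if previous != state:
            changes.append((name, previous, state))
            last_states[name] = state
    return changes


known = {}
print(process_heartbeat(known, {"datanode_process": "OK"}))        # first report
print(process_heartbeat(known, {"datanode_process": "OK"}))        # unchanged
print(process_heartbeat(known, {"datanode_process": "CRITICAL"}))  # state change
```

Only the non-empty results would be candidates for notification dispatch.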
Notable Alert REST APIs
REST Endpoint, followed by Description
/api/v1/clusters/:clusterName/alert_definitions
  The list of alert definitions for the cluster.
/api/v1/clusters/:clusterName/alerts
  The list of alert instances for the cluster.
  Example: find all alert instances that are CRITICAL
    /api/v1/clusters/c1/alerts?Alert/state.in(CRITICAL)
  Example: find all alert instances for the “ZooKeeper Process” alert definition
    /api/v1/clusters/c1/alerts?Alert/definition_name=zookeeper_server_process
/api/v1/clusters/:clusterName/alert_groups
  The list of alert groups.
/api/v1/clusters/:clusterName/alert_history
  The list of alert instance status changes.
/api/v1/alert_targets/
  The list of configured alert notification targets for Ambari.
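The two example queries for alert instances can be generated with a small helper. This sketch only builds the request path; the function name is hypothetical, and passing several comma-separated states to .in(...) is an assumption that goes beyond the single-state example shown here.

```python
BASE = "/api/v1"


def alerts_url(cluster: str, states=None, definition_name=None) -> str:
    """Build the path for querying alert instances of a cluster."""
    url = f"{BASE}/clusters/{cluster}/alerts"
    if states:
        # e.g. ?Alert/state.in(CRITICAL)
        return f"{url}?Alert/state.in({','.join(states)})"
    if definition_name:
        # e.g. ?Alert/definition_name=zookeeper_server_process
        return f"{url}?Alert/definition_name={definition_name}"
    return url


print(alerts_url("c1", states=["CRITICAL"]))
print(alerts_url("c1", definition_name="zookeeper_server_process"))
```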
Ambari Metrics
Ambari Metrics
Summary
• Migrated away from Ganglia as the Ambari metrics collection system
• No longer offers the option to install or manage a Ganglia service
• Replaced with built-in metrics system “Ambari Metrics”
Motivation
• Avoids Ganglia package conflicts in customer environments
• More flexibility to retain metrics in Hadoop
• Platform independence
Terminology
Term: Definition
• Ambari Metrics (“AMS”): The built-in metrics collection system for Ambari.
• Metrics Collector: The standalone server that collects, aggregates and serves metrics from the Hadoop service sinks and the Metrics Monitor. Analogous to gmetad.
• Metrics Monitor: Installed on each host in the cluster to collect system-level metrics and forward them to the Collector. Analogous to gmond.
• Metrics Hadoop Sinks: Plug into the Service sinks to send Hadoop metrics to the Collector.
Ambari Metrics Collection System
1. Metric Monitors send system-level metrics to Collector
2. Sinks send Hadoop-level metrics to Collector
3. Metrics Collector service stores and aggregates metrics
4. Ambari exposes REST API for metrics retrieval
[Diagram: Ambari Server, Metrics Collector, and hosts each running a Metrics Monitor and Sink(s); arrows numbered 1-4 match the steps above]
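The "stores and aggregates" step can be pictured as bucketing incoming samples by time and averaging each bucket. This is an illustrative sketch only: the sample tuple layout, the 60-second bucket and the metric name are assumptions, and the real Collector's aggregation scheme is not described here.

```python
from collections import defaultdict


def aggregate(samples, bucket_seconds=60):
    """Average metric values per (metric name, bucket start time).

    samples: iterable of (metric_name, unix_timestamp, value) tuples,
    as they might arrive from Monitors and Sinks.
    """
    buckets = defaultdict(list)
    for metric, timestamp, value in samples:
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[(metric, bucket_start)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


samples = [("cpu_user", 0, 10.0), ("cpu_user", 30, 20.0), ("cpu_user", 60, 40.0)]
print(aggregate(samples))
```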
Metrics Collector
• Built using Hadoop technologies
• By default, uses the local filesystem for metrics storage (“embedded”) **
[Diagram: Phoenix, HBase and ATS on top of Local Filesystem **]
** Tech Preview: “distributed” storage option to use existing HDFS
Ambari Automated Rolling Upgrade For HDP Stack
Rolling vs. In Place Upgrades
• In Place Upgrades: Upgrade the Stack with one or more service disruptions. Requires an explicit stop of all services.
• Rolling Upgrades (Ambari 2.0): Upgrade the Stack with minimized service disruption and degradation.
Upgrading the Stack with Ambari 2.0
Source HDP Version; Target HDP Versions; Method
• HDP 2.0.x; HDP 2.0.x, HDP 2.1.x, HDP 2.2.x; In Place
• HDP 2.1.x; HDP 2.1.x, HDP 2.2.x; In Place
• HDP 2.2.x; HDP 2.2.x; Rolling (NEW)
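The source/target table reads naturally as a lookup. A minimal sketch, using only the version pairs listed on this slide; the function names are hypothetical and any pair not listed is reported as unsupported (None).

```python
# Upgrade method per (source minor version, target minor version), from the table
UPGRADE_PATHS = {
    "2.0": {"2.0": "In Place", "2.1": "In Place", "2.2": "In Place"},
    "2.1": {"2.1": "In Place", "2.2": "In Place"},
    "2.2": {"2.2": "Rolling"},
}


def minor(version: str) -> str:
    """Reduce '2.2.4' to '2.2'."""
    return ".".join(version.split(".")[:2])


def upgrade_method(source: str, target: str):
    """Return 'In Place', 'Rolling', or None if the path is not in the table."""
    return UPGRADE_PATHS.get(minor(source), {}).get(minor(target))


print(upgrade_method("2.1.3", "2.2.0"))
print(upgrade_method("2.2.0", "2.2.4"))
```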
Rolling Upgrade Process
• Main path: Pre-requisites, Prepare, Rolling Upgrade, Finalize
• Rolling Downgrade
• Rollback: NOT Rolling. Shutdown all services.
Process: Rolling Upgrade
HDP has a certified process for Rolling Upgrades
Services are switched over to new version in rolling fashion
ZooKeeper
Ranger
Core Masters
Core Slaves
Hive
Oozie
Falcon
Clients
Kafka
Knox
Storm
Slider
Flume
Finalize
Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive
Core Masters and Core Slaves: HDFS, YARN, HBase
Process: Rolling Downgrade
ZooKeeper
Ranger
Core Masters
Core Slaves
Hive
Oozie
Falcon
Clients
Kafka
Knox
Storm
Slider
Flume
Downgrade
Finalize
Wizard Driven Experience
Steps: Register, Install, Perform Upgrade, Finalize
With verification and validation
Process: Service Disruption by Component
Component, followed by Service Disruption
ZooKeeper No Service Disruption
Ranger No Service Disruption
HDFS No Service Disruption
YARN No Service Disruption
HBase No Service Disruption
Hive No Service Disruption
Oozie No Service Disruption
Falcon Yes – Requires Stop/Start
Kafka Yes – Requires Stop/Start
Knox Yes – Requires Stop/Start
Storm Yes – Requires Stop/Start
Flume No Service Disruption
Slider applications Yes – Requires Stop/Start
Hue Yes – Requires Stop/Start
Accumulo Yes – Requires Stop/Start
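For planning a maintenance window, the table above can be turned into a quick check of which installed components will need a stop/start. A minimal sketch: the dictionary mirrors the table, while the function name and the example service list are hypothetical.

```python
# True means the component requires a stop/start during rolling upgrade
REQUIRES_STOP_START = {
    "ZooKeeper": False, "Ranger": False, "HDFS": False, "YARN": False,
    "HBase": False, "Hive": False, "Oozie": False, "Flume": False,
    "Falcon": True, "Kafka": True, "Knox": True, "Storm": True,
    "Slider applications": True, "Hue": True, "Accumulo": True,
}


def disrupted_components(installed):
    """Installed components that will see a stop/start, in the given order."""
    return [name for name in installed if REQUIRES_STOP_START.get(name, False)]


print(disrupted_components(["HDFS", "Kafka", "Hive", "Storm"]))
```

Components missing from the table are treated as not disrupted here, which is a simplification; a stricter version would raise on unknown names.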
Ambari Extensibility Stacks, Blueprints and Views
Extensibility Features
Stacks
• To add new Services (ISV or otherwise) beyond HDP Stack
• To customize a Stack for customer-specific environments
Blueprints
• To use Ambari for automating cluster installations
• To share best practices on layout and cluster configuration
Views
• To extend and customize the Ambari Web UI
• Add new capabilities, customize existing capabilities
Goal: Extend Ambari without hard-coding in Ambari
New in Ambari 2.0 - Blueprints Add Host
• Add hosts to a cluster based on a host group from a Blueprint
• Add one or more hosts with a single call

Single host:
POST /api/v1/clusters/MyCluster/hosts
{
  "blueprint" : "myblueprint",
  "host_group" : "workers",
  "host_name" : "c6403.ambari.apache.org"
}

Multiple hosts:
POST /api/v1/clusters/MyCluster/hosts
[
  { "blueprint" : "myblueprint", "host_group" : "workers", "host_name" : "c6403.ambari.apache.org" },
  { "blueprint" : "myblueprint", "host_group" : "workers", "host_name" : "c6404.ambari.apache.org" }
]

https://issues.apache.org/jira/browse/AMBARI-8458
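The request bodies for this call can be generated rather than hand-written. The sketch below only assembles the method, path and JSON body in the shape shown on this slide; it does not send anything, the helper name is hypothetical, and the second host name in the example is made up for illustration.

```python
import json


def add_hosts_request(cluster, blueprint, host_group, host_names):
    """Build (method, path, body) for adding hosts from a Blueprint host group.

    One host yields a single JSON object, several hosts a JSON array.
    """
    path = f"/api/v1/clusters/{cluster}/hosts"
    entries = [
        {"blueprint": blueprint, "host_group": host_group, "host_name": name}
        for name in host_names
    ]
    body = entries[0] if len(entries) == 1 else entries
    return "POST", path, json.dumps(body, indent=2)


method, path, body = add_hosts_request(
    "MyCluster", "myblueprint", "workers",
    ["c6403.ambari.apache.org", "c6404.ambari.apache.org"],  # second host is hypothetical
)
print(method, path)
print(body)
```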
More on Ambari Extensibility
http://hortonworks.com/partners/learn/#ambari
Simplified Kerberos Setup
Quick Kerberos Overview
REALM
• EXAMPLE.COM
Principals (Humans)
Service Principals (Services)
• hbase/[email protected]
Tickets
• “[email protected] is authenticated and can access the HBASE service”
KDC – Key Distribution Center
• Grants tickets to authenticated users
Client
• r1u2m1.example.com (.example.com maps to realm EXAMPLE.COM)
• EXAMPLE.COM’s KDC is hosted on r1u2m3.example.com
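The naming conventions above (principal names of the form primary/instance@REALM, and a domain suffix mapping to a realm) can be illustrated with a small parser. This is a sketch for illustration only: the principal strings used in the example are hypothetical, and the suffix lookup is a simplified stand-in for the domain-to-realm mapping a Kerberos client configuration provides.

```python
from typing import NamedTuple, Optional


class Principal(NamedTuple):
    primary: str             # user or service name, e.g. "hbase"
    instance: Optional[str]  # usually the host name for service principals
    realm: str


def parse_principal(text: str) -> Principal:
    """Split 'primary/instance@REALM' or 'primary@REALM' into its parts."""
    name, _, realm = text.rpartition("@")
    if not name or not realm:
        raise ValueError(f"not a principal name: {text!r}")
    primary, _, instance = name.partition("/")
    return Principal(primary, instance or None, realm)


def realm_for_host(hostname: str, domain_realm: dict) -> Optional[str]:
    """Return the realm of the first domain suffix that matches the host."""
    for suffix, realm in domain_realm.items():
        if hostname.endswith(suffix):
            return realm
    return None


print(parse_principal("hbase/r1u2m1.example.com@EXAMPLE.COM"))  # hypothetical principal
print(realm_for_host("r1u2m1.example.com", {".example.com": "EXAMPLE.COM"}))
```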
KDC Implementation Options
• Microsoft Active Directory
  • Users
  • Service Principals
• MIT Kerberos
  • Users
  • Service Principals
• MIT Kerberos + Microsoft Active Directory (Trust Relationship)
  • Users in Active Directory
  • Service Principals in MIT
What Ambari 2.0 will do
• Step-by-step wizard to set up Kerberos
• Supports existing MIT KDC and Active Directory (AD) infrastructure
• Deploys and manages Kerberos clients and configuration
• First-time setup as well as new Service/Host/Component:
  • Automated creation of principals
  • Automated generation of keytabs
  • Automated distribution of keytabs
• Support for regeneration and distribution of keytabs
Prerequisites
Category, followed by Requirements
General
• Ambari Server must be part of cluster
• Ambari Server and all hosts must have JCE installed
• Ambari Server and all hosts must have network access to the KDC
KDC Admin
• KDC admin account credentials are on-hand
• !!! Ambari does not retain KDC admin credentials !!!
Active Directory
• Secure LDAP (LDAPS) connectivity has been configured
• User container for principals has been created and is on-hand
• Admin account has delegated control of “Create, delete and manage user accounts”
Terminology
Term: Definition
• Service Principals: Principals required for HDP Service Components.
• Ambari Principals: Headless principals used by Ambari to perform “smoke tests” and “health alert checks”.
• KDC Admin Account: An administrative account that will be used by Ambari to create principals and generate keytabs in the KDC.
Principal and Keytab Generation and Distribution
1. User provides KDC Admin Account credentials to Ambari
2. Ambari connects to KDC, creates principals (Service and Ambari) needed for cluster
3. Ambari generates keytabs for the principals
4. Ambari distributes keytabs to Ambari Server and cluster hosts
5. Ambari discards the KDC Admin Account credentials
[Diagram: Ambari Server, KDC, and HDP Cluster; arrows numbered 1-5 match the steps above]
Ambari + Service Keytab Files
• Ambari Server: Keytabs for Ambari Principals
• HDP Cluster Hosts: Keytabs for Service + Ambari Principals
• KDC: Service Principals, Ambari Principals
Wizard Driven and Automated
KDC and KDC Admin Information
Customizable Principal Attributes
Kerberos Clients
• Ambari installs Kerberos clients on cluster hosts
• Optionally, Ambari can be told not to manage the krb5.conf client config
OS: Client
• RHEL/CentOS/OEL: krb5-workstation
• SLES 11: krb5-client
• Ubuntu 12: krb5-user, krb5-config
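When Kerberos clients are installed outside of Ambari, the same package table can drive a manual install. A rough sketch: the package names come from the table above, while the choice of package-manager invocation per OS family and the function name are assumptions made for illustration.

```python
# Kerberos client packages per OS family, as listed above
KERBEROS_CLIENT_PACKAGES = {
    "RHEL/CentOS/OEL": ["krb5-workstation"],
    "SLES 11": ["krb5-client"],
    "Ubuntu 12": ["krb5-user", "krb5-config"],
}

# Assumed package-manager invocations for each OS family
INSTALL_PREFIX = {
    "RHEL/CentOS/OEL": ["yum", "install", "-y"],
    "SLES 11": ["zypper", "--non-interactive", "install"],
    "Ubuntu 12": ["apt-get", "install", "-y"],
}


def install_command(os_family: str) -> list:
    """Return the argv list that would install the Kerberos client packages."""
    return INSTALL_PREFIX[os_family] + KERBEROS_CLIENT_PACKAGES[os_family]


for family in KERBEROS_CLIENT_PACKAGES:
    print(family, "->", " ".join(install_command(family)))
```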
Configure Ambari and Service Identities
Post-Kerberos Scenarios
Ambari does not retain KDC admin credentials
User is prompted for KDC Admin credentials on:
• Add/Delete Host
• Add Service
• Add/Delete Component
• Regenerate Keytabs
• Disable Kerberos
Security Lab
Security today in Hadoop with HDP
Authentication: Who am I/prove it?
• Kerberos in native Apache Hadoop
• HTTP/REST API Secured with Apache Knox Gateway
Authorization: Restrict access to explicit data
• HDFS, Hive and HBase, Storm and Knox
• Fine grain access control
Audit: Understand who did what
• Centralized audit reporting
• Policy and access history
Data Protection: Encrypt data at rest & in motion
• Wire encryption in Hadoop
• Orchestrated encryption with partner tools
Centralized Security Administration (HDP 2.1, Ranger)
More on Security: http://hortonworks.com/partners/learn/#secure
Ambari 2.0 Security Lab Steps
• Detailed steps available at: http://bit.ly/1J4IbIs
• Install Ambari server and agents
• Deploy custom Ambari service folders for OpenLDAP, KDC/kadmin, NSLCD
• Use blueprint/API to provision a minimal Hadoop cluster with custom services
• Use Add Service wizard to also install Hive
• Configure Ambari to sync/recognize business users in OpenLDAP
• Authentication: Run Ambari security wizard to enable Kerberos using KDC/kadmin service
• Install Ranger as Ambari service and configure it to recognize LDAP users
• Authorization/Audit: Set up HDFS/Hive plugins to allow users to set policies to control access and audit consumption