April 2014 HUG : Integrating HUE with Multi-tenant cluster


  • INTEGRATE HUE WITH YOUR HADOOP CLUSTER Romain Rigaux Y! HUG Apr 16, 2014
  • WHAT IS HUE? WEB INTERFACE FOR MAKING HADOOP EASIER TO USE Suite of apps for each Hadoop component, like Hive, Pig, Impala, Oozie, Solr, Sqoop2, HBase...
  • VIEW FROM 30K FEET Hadoop Web Server You and even that friend that uses IE9 ;)
  • YARN JobTracker Oozie Pig HDFS HiveServer2 Hive Metastore Cloudera Impala Solr HBase Sqoop2 Zookeeper LDAP SAML Hue Plugins ECOSYSTEM AND APPS
  • TARGET OF HUE: GETTING STARTED WITH HADOOP, BEING PRODUCTIVE, EXPLORING DIFFERENT ANGLES OF THE PLATFORM. LET ANY USER FOCUS ON BIG DATA PROCESSING. BEING COMPATIBLE WITH ANY HADOOP VERSION (0.20/1.2.0/2.3.0)
  • OPEN SOURCE: ~3000 COMMITS, 33 CONTRIBUTORS, 648 STARS, 212 FORKS. github.com/cloudera/hue
  • THE CORE TEAM PLAYERS team.gethue.com ABRAHAM ELMAHREK ROMAIN RIGAUX ENRICO BERTI CHANG BEER
  • TALKS AROUND THE WORLD: Meetups and events in NYC, Paris, LA, Tokyo, SF, Stockholm, Vienna, San Jose, Singapore. Coming up in London, West coast. RETREATS: Nov 13 Koh Chang, Thailand; May 14 Curaçao, Netherlands Antilles
  • FAST PACE LAST 30 DAYS 41 issues created and 38 resolved. Core team + Community
  • TREND: GROWTH gethue.com
  • HISTORY HUE 1 Desktop-like in a browser, did its job but pretty slow, memory leaks and not very IE friendly but definitely advanced for its time (2009-2010).
  • HISTORY HUE 2 The first flat structure port, with Twitter Bootstrap all over the place.
  • HISTORY HUE 2.5 New apps, improved the UX adding new nice functionalities like autocomplete and drag & drop.
  • HISTORY HUE 3 ALPHA Proposed design, didn't make it.
  • HISTORY HUE 3.5+ Where we are now, new UI, several new apps, the most user friendly features to date.
  • WHICH VERSION TO USE? 6 months, 1k commits later; 1-2 years old; HUE 2.X, HUE 3.X, HUE 3.5 + 1/2 3.6
  • WHICH DISTRIBUTION? GITHUB: very latest (HACKER). TARBALL: advanced preview (ADVANCED USER). CDH / CM: the most stable and cross-component checked (NORMAL USER).
  • WHERE TO PUT HUE? IN ONE MACHINE
  • WHERE TO PUT HUE? INSIDE THE CLUSTER
  • WHERE TO PUT HUE? OUTSIDE THE CLUSTER
  • WHAT DO YOU NEED? SERVER: Python 2.4 2.6. That's it if using a packaged version. If building from the source, here are the extra packages. CLIENT: Web browser (IE 9+, FF 10+, Chrome, Safari)
  • WHAT DOES THE HUE SERVICE LOOK LIKE? 1 SERVER: process serving pages and also static content. 1 DB: for cookies, saved queries, workflows, ...
  • HOW TO CONFIGURE HUE: HUE.INI Similar to core-site.xml but with .INI syntax. Where? /etc/hue/conf/hue.ini or $HUE_HOME/desktop/conf/pseudo-distributed.ini
        [desktop]
          [[database]]
            # Database engine is typically one of:
            # postgresql_psycopg2, mysql, or sqlite3
            engine=sqlite3
            ## host=
            ## port=
            ## user=
            ## password=
            name=desktop/desktop.db
  • AUTHENTICATE / LOGIN
        [desktop]
          [[auth]]
            # - django.contrib.auth.backends.ModelBackend (entirely Django backend)
            # - desktop.auth.backend.AllowAllBackend (allows everyone)
            # - desktop.auth.backend.AllowFirstUserDjangoBackend
            # - desktop.auth.backend.LdapBackend
            # - desktop.auth.backend.OAuthBackend
            # ...
            ## backend=desktop.auth.backend.AllowFirstUserDjangoBackend
  • USERS Can give and revoke permissions to single users or groups of users. ADMIN USER. Regular user + permissions
  • DB BACKEND
  • LDAP BACKEND Integrate your employees: LDAP How to guide
  • CONFIGURE APPS AND PERMISSIONS: LIST OF GROUPS AND PERMISSIONS A permission can: - allow access to one app (e.g. Hive Editor) - modify data from the app (e.g. drop Hive tables or edit cells in HBase Browser). A list of permissions
  • CONFIGURE APPS AND PERMISSIONS: PERMISSIONS IN ACTION User "test" belongs to the group "hiveonly", which has just the Hive permissions
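
The group-based check described on the two slides above can be pictured with a small model. This is only an illustrative sketch, not Hue's actual implementation: the `Group` and `User` classes, the permission strings such as `hive.access`, and the rule that an admin passes every check are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A named group carrying a set of permission strings (hypothetical names)."""
    name: str
    permissions: set = field(default_factory=set)


@dataclass
class User:
    """A user who inherits permissions from the groups they belong to."""
    name: str
    groups: list = field(default_factory=list)
    is_admin: bool = False

    def can(self, permission):
        # Assumption for this sketch: an admin passes every check.
        if self.is_admin:
            return True
        # A regular user needs at least one group granting the permission.
        return any(permission in group.permissions for group in self.groups)


# Mirrors the slide: user "test" is in group "hiveonly" with only Hive access.
hiveonly = Group("hiveonly", {"hive.access"})
test = User("test", [hiveonly])
admin = User("admin", is_admin=True)

print(test.can("hive.access"))   # allowed through the hiveonly group
print(test.can("hbase.write"))   # no group grants this permission
```
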
  • HOW HUE INTERACTS WITH HADOOP YARN JobTracker Oozie Hue Plugins LDAP SAML Pig HDFS HiveServer2 Hive Metastore Cloudera Impala Solr HBase Sqoop2 Zookeeper
  • RPC CALLS TO ALL THE HADOOP COMPONENTS: HDFS EXAMPLE WebHDFS REST (NN, DN, DN, DN, DN) http://localhost:50070/webhdfs/v1/?op=LISTSTATUS
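
As a rough illustration of the WebHDFS call on the slide above, the sketch below builds the same `LISTSTATUS` URL and pulls entry names out of a response body. The helper names `webhdfs_url` and `list_names` are made up for this example, and `SAMPLE` is a hand-written body in the `FileStatuses`/`FileStatus` shape that WebHDFS returns for `LISTSTATUS`, not output from a real cluster.

```python
import json
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


def list_names(response_body):
    """Extract the entry names from a LISTSTATUS JSON response body."""
    statuses = json.loads(response_body)["FileStatuses"]["FileStatus"]
    return [entry["pathSuffix"] for entry in statuses]


# Hand-written sample in the LISTSTATUS response shape (illustrative only).
SAMPLE = json.dumps({
    "FileStatuses": {
        "FileStatus": [
            {"pathSuffix": "tmp", "type": "DIRECTORY"},
            {"pathSuffix": "user", "type": "DIRECTORY"},
        ]
    }
})

print(webhdfs_url("localhost", 50070, "/", "LISTSTATUS"))
print(list_names(SAMPLE))
```

Against a live NameNode, the URL would be fetched with any HTTP client and the body passed to `list_names`.
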
  • RPC CALLS TO ALL THE HADOOP COMPONENTS: HOW Host/port of all services like Oozie, YARN, HDFS, HBase APIs are specified in hue.ini, in sections named by major service (e.g. [hbase]), Hue core ([desktop]) or Hue lib ([liboozie]). Full list
        [hbase]
          # Comma-separated list of HBase Thrift servers for
          # clusters in the format of '(name|host:port)'.
          hbase_clusters=(Cluster|localhost:9090)
        [liboozie]
          # The URL where the Oozie service runs on.
          # oozie_url=http://hue.ent.cloudera.com:11000/oozie
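
To make the `'(name|host:port)'` format of `hbase_clusters` concrete, here is a minimal parsing sketch. It is not Hue's own parser: the regular expression and the `parse_hbase_clusters` helper are assumptions written only to illustrate the comma-separated format described in the comment above.

```python
import re

# One "(name|host:port)" entry; entries are separated by commas.
CLUSTER_RE = re.compile(r"\((?P<name>[^|()]+)\|(?P<host>[^:()]+):(?P<port>\d+)\)")


def parse_hbase_clusters(value):
    """Map each cluster name to a (host, port) tuple."""
    clusters = {}
    for match in CLUSTER_RE.finditer(value):
        host = match["host"].strip()
        clusters[match["name"].strip()] = (host, int(match["port"]))
    return clusters


print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```
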
  • KERBEROS: 1 Hue ticket/principal, no user ticket. Hue uses its ticket for authenticating to every other service (HDFS, Oozie, ...). Read more in the Hue Security Guide
  • HUE KERBEROS TICKET
      Add Hue user principal to Kerberos:
        kadmin: addprinc -randkey hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
      Test:
        $ kinit -k -t /etc/hue/hue.keytab hue/hue.server.fully.qualified.domain.name@YOUR-REALM.COM
      Ticket should be renewable (krb5.conf and kdc.conf)
      hue.ini:
        [desktop]
          [[kerberos]]
            # Path to Hue's Kerberos keytab file
            hue_keytab=/etc/hue/hue.keytab
            # Kerberos principal name for Hue
            hue_principal=hue/FQDN@REALM
            # add kinit path for non root users
            kinit_path=/usr/kerberos/bin/kinit
  • IMPERSONATION: HOW Hue is a super proxy. The client could be on a Windows machine or a phone and still interact with all the Hadoop services. Call for getting the information about an HDFS file: http://localhost:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue&doas=bob WebHDFS, add to core-site.xml: hadoop.proxyuser.hue.hosts = * and hadoop.proxyuser.hue.groups = *
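
The two pieces of the impersonation setup above can be sketched as follows: rendering the `hadoop.proxyuser.*` entries as `<property>` elements for core-site.xml, and building a WebHDFS URL that carries both the proxy user (`user.name`) and the end user (`doas`). The helper names `proxyuser_properties` and `impersonated_url` are hypothetical, and the generated XML is meant to be pasted inside the existing `<configuration>` element rather than used as a complete file.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET


def proxyuser_properties(proxy_user, hosts="*", groups="*"):
    """Render the hadoop.proxyuser.* settings as core-site.xml <property> elements."""
    settings = {
        f"hadoop.proxyuser.{proxy_user}.hosts": hosts,
        f"hadoop.proxyuser.{proxy_user}.groups": groups,
    }
    chunks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        chunks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(chunks)


def impersonated_url(base, path, op, proxy_user, end_user):
    """Build a WebHDFS URL where proxy_user acts on behalf of end_user."""
    query = urlencode({"op": op, "user.name": proxy_user, "doas": end_user})
    return f"{base}/webhdfs/v1{path}?{query}"


print(proxyuser_properties("hue"))
print(impersonated_url("http://localhost:50070", "/tmp", "GETFILESTATUS", "hue", "bob"))
```

The wildcard `*` matches the slide; a production cluster would more likely restrict hosts and groups to the machine running Hue and the groups it should be able to impersonate.
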
  • OTHER SECURITY FEATURES: HTTPS, SSL DB, SSL WITH HIVESERVER2, AUDITING. READ MORE
  • HIGH AVAILABILITY: HOW 2 Hue instances, HA proxy, Multi DB. Performances: like a website, mostly RPC calls
  • DEMO TIME
  • SUM-UP INSTALL: Install Hue on one machine + Hue Kerberos ticket. ENABLE: Enable Hadoop Service APIs for Hue as a proxy user. CONFIGURE: Configure hue.ini to point to each Service API. LDAP: Use an LDAP backend. HELP: Get help on @gethue or hue-user
  • CONFIGURATIONS ARE HARD GIVE CLOUDERA MANAGER A TRY! vimeo.com/91805055
  • MISSED SOMETHING? learn.gethue.com
  • LINKS TWITTER @gethue USER GROUP hue-user@ WEBSITE http://gethue.com LEARN http://learn.gethue.com
  • GET HUE TARBALL: Try in advance the latest and greatest but you'll have to configure everything on your own. CLOUDERA'S DEMO VM: Get to play with Hue and various Hadoop components in 5 minutes. It's a self-contained CDH environment ready to use. CLOUDERA'S CDH: Stable and highly tested releases perfectly integrated with the Hadoop ecosystem, automagically configured by Cloudera Manager. HORTONWORKS*: In HDP there's an old forked version of Hue 2.3. MAPR*: Newer version than HDP, close to the original 2.5 minus apps like HBase, Impala, Sqoop, Search. HP CLOUD*: The newest addition, ships Hue 3.0 through the GreenButton products. BIGTOP. EMBEDDED/DEMO IN IND. COMPANIES. * YOUR MILEAGE MAY VARY.
  • WHAT ARE YOUR USE CASES? WHICH COMPONENTS DO YOU USE? WHAT WOULD YOU LIKE TO SEE IN HUE? INTERESTED IN CONTRIBUTING? WANNA SAY HELLO? DO YOU WANT A TAILOR MADE TEAM RETREAT? QUESTIONS? TEAM@GETHUE.COM