HBase Client APIs (for webapps?)
Nick Dimiduk
Seattle Scalability Meetup
2013-03-27
1
2
3
What are my choices?
switch (technology) {
  case ‘ ’: ...
  case ‘ ’: ...
  case ‘ ’: ...
}
4
Apache HBase
5
Java client Interfaces
• Configuration holds details on where to find the cluster, plus tunable settings. Roughly equivalent to a JDBC connection string.
• HConnection represents a connection to the cluster.
• HBaseAdmin handles DDL operations (create, list, drop, alter, &c.).
• HTablePool is a connection pool for table handles.
• HTable (HTableInterface) is a handle on a single HBase table. Send "commands" to the table (Put, Get, Scan, Delete, Increment).
6
Java client Example
  public static final byte[] TABLE_NAME = Bytes.toBytes("twits");
  public static final byte[] TWITS_FAM = Bytes.toBytes("twits");
  public static final byte[] USER_COL = Bytes.toBytes("user");
  public static final byte[] TWIT_COL = Bytes.toBytes("twit");

  private HTablePool pool = new HTablePool();
https://github.com/hbaseinaction/twitbase/blob/master/src/main/java/HBaseIA/TwitBase/hbase/TwitsDAO.java#L23-L30
7
Java client Example
  private static class Twit {
    private Twit(Result r) {
      this(
        r.getColumnLatest(TWITS_FAM, USER_COL).getValue(),
        // The row key is an MD5 hash followed by a long; slice out the long.
        Arrays.copyOfRange(r.getRow(), Md5Utils.MD5_LENGTH, Md5Utils.MD5_LENGTH + longLength),
        r.getColumnLatest(TWITS_FAM, TWIT_COL).getValue());
    }

    private Twit(byte[] user, byte[] dt, byte[] text) {
      this(
        Bytes.toString(user),
        // The stored long is the negated timestamp.
        new DateTime(-1 * Bytes.toLong(dt)),
        Bytes.toString(text));
    }
    // ... (excerpt; remaining members not shown on the slide)
  }
https://github.com/hbaseinaction/twitbase/blob/master/src/main/java/HBaseIA/TwitBase/hbase/TwitsDAO.java#L129-L143
8
Java client Example
  private static Get mkGet(String user, DateTime dt) {
    Get g = new Get(mkRowKey(user, dt));
    g.addColumn(TWITS_FAM, USER_COL);
    g.addColumn(TWITS_FAM, TWIT_COL);
    return g;
  }
https://github.com/hbaseinaction/twitbase/blob/master/src/main/java/HBaseIA/TwitBase/hbase/TwitsDAO.java#L60-L65
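The Twit constructor and mkGet above imply a row key made of an MD5-sized prefix followed by a negated timestamp stored as a long. A minimal Python sketch of that layout follows. It assumes the prefix is the MD5 of the UTF-8 user name (the slides do not show mkRowKey itself), and the names mk_row_key and parse_millis are hypothetical. HBase's Bytes.toBytes(long) writes 8 big-endian bytes, which struct's ">q" format mirrors.

```python
import hashlib
import struct

MD5_LENGTH = 16   # an MD5 digest is always 16 bytes
LONG_LENGTH = 8   # a Java long serialized by Bytes.toBytes is 8 big-endian bytes


def mk_row_key(user, millis):
    """Assumed layout: md5(user) followed by the negated timestamp."""
    user_hash = hashlib.md5(user.encode("utf-8")).digest()
    return user_hash + struct.pack(">q", -millis)


def parse_millis(row_key):
    """Recover the timestamp, as the Twit(Result) constructor does."""
    raw = row_key[MD5_LENGTH:MD5_LENGTH + LONG_LENGTH]
    (negated,) = struct.unpack(">q", raw)
    return -negated


key = mk_row_key("TheRealMT", 1338701491422)
print(len(key), parse_millis(key))
```

Negating the timestamp means that, for a given user, newer twits sort before older ones in HBase's byte-ordered key space.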
9
Ruby, Python client Interface
10
Ruby, Python client Interface
JRuby, Jython
: '(
11
Thrift client Interface
1. Generate bindings
2. Run a “Gateway” between clients and cluster
3. ... profit? write code!
12
Sidebar: Architecture Recap
HBase Cluster
HBase Clients
13
Thrift Architecture
HBase Cluster
Thrift Clients
Thrift Gateway
14
Thrift client Interface
• The Thrift gateway exposes an HBase client (which talks to the RegionServers) over Thrift
• stateless :D
• ... except for scanners :'(
15
Thrift client Example
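from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol
from hbase import Hbase  # generated bindings; module path depends on how you generated them

transport = TSocket.TSocket(host, port)
transport = TTransport.TBufferedTransport(transport)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)
transport.open()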
https://github.com/hbaseinaction/twitbase.py/blob/master/TwitBase.py#L17-L21
16
Thrift client Example
columns = ['info:user','info:name','info:email']
scanner = client.scannerOpen('users', '', columns)
row = client.scannerGet(scanner)
while row:
    yield user_from_row(row[0])
    row = client.scannerGet(scanner)
client.scannerClose(scanner)
https://github.com/hbaseinaction/twitbase.py/blob/master/TwitBase.py#L33-L39
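Scanners are the one stateful part of the gateway, so a scanner that is never closed keeps holding resources there. One way to guard against that is sketched below: wrap the scan loop in try/finally so scannerClose runs even when the caller stops iterating early. FakeClient is a hypothetical in-memory stand-in for the generated Hbase.Client, used only so the sketch runs without a gateway. scan_rows is also a made-up name.

```python
class FakeClient:
    """Hypothetical stand-in for the Thrift-generated Hbase.Client."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.open_scanners = set()
        self._next_id = 0

    def scannerOpen(self, table, start_row, columns):
        self._next_id += 1
        self.open_scanners.add(self._next_id)
        return self._next_id

    def scannerGet(self, scanner_id):
        # Like the real call, return a list that is empty once exhausted.
        return [self._rows.pop(0)] if self._rows else []

    def scannerClose(self, scanner_id):
        self.open_scanners.discard(scanner_id)


def scan_rows(client, table, columns):
    """Yield rows one at a time and always release the scanner."""
    scanner = client.scannerOpen(table, '', columns)
    try:
        row = client.scannerGet(scanner)
        while row:
            yield row[0]
            row = client.scannerGet(scanner)
    finally:
        client.scannerClose(scanner)


client = FakeClient(["row-a", "row-b", "row-c"])
rows = scan_rows(client, 'users', ['info:user'])
first = next(rows)   # scanner is open while we iterate
rows.close()         # stopping early still triggers the finally block
```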
17
Thrift client Example
def user_from_row(row):
    user = {}
    for col,cell in row.columns.items():
        user[col[5:]] = cell.value
    return "<User: {user}, {name}, {email}>".format(**user)
https://github.com/hbaseinaction/twitbase.py/blob/master/TwitBase.py#L26-L30
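To see what user_from_row does without a running gateway, the sketch below feeds it a hand-built row. FakeRow and FakeCell are hypothetical stand-ins for the Thrift-generated row and cell types, and the sample values are illustrative. The slice col[5:] is written here as col[len("info:"):] to make clear that it drops the "info:" family prefix.

```python
from collections import namedtuple

# Hypothetical stand-ins for the Thrift-generated row and cell types.
FakeCell = namedtuple("FakeCell", ["value"])
FakeRow = namedtuple("FakeRow", ["row", "columns"])


def user_from_row(row):
    user = {}
    for col, cell in row.columns.items():
        # "info:email" -> "email": strip the column family prefix.
        user[col[len("info:"):]] = cell.value
    return "<User: {user}, {name}, {email}>".format(**user)


sample = FakeRow(
    row="TheRealMT",
    columns={
        "info:user": FakeCell("TheRealMT"),
        "info:name": FakeCell("Mark Twain"),
        "info:email": FakeCell("samuel@clemens.org"),
    },
)
print(user_from_row(sample))
# <User: TheRealMT, Mark Twain, samuel@clemens.org>
```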
18
REST client Interface
1. Stand up a "REST Gateway" between your application and the cluster
2. HTTP verbs translate (roughly) into table commands
3. decent support for basic DDL, HTable operations
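As a rough illustration of verbs mapping onto table commands, the sketch below builds (but never sends) HTTP requests using the /table/row[/family:qualifier] layout that appears in this deck's REST examples. The helper name row_request, the base URL, and the port are assumptions for illustration. The verb mapping in the comments (GET to read, DELETE to delete) is the rough correspondence the slide describes, not a full API reference.

```python
from urllib.parse import quote
from urllib.request import Request


def row_request(base_url, table, row, column=None, method="GET"):
    """Build (but do not send) a request for /table/row[/family:qualifier]."""
    parts = [table, row] + ([column] if column else [])
    # Keep ':' unescaped so family:qualifier stays readable in the path.
    path = "/".join(quote(part, safe=":") for part in parts)
    return Request(
        f"{base_url}/{path}",
        headers={"Accept": "application/json"},
        method=method,
    )


# GET roughly corresponds to a Get on the table.
read = row_request("http://localhost:8080", "users", "TheRealMT", "info:email")
# DELETE roughly corresponds to a Delete of the whole row.
drop = row_request("http://localhost:8080", "users", "TheRealMT", method="DELETE")
print(read.get_method(), read.get_full_url())
```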
19
REST Architecture
HBase Cluster
REST Gateway
REST Clients
20
REST client Interface
• The REST gateway exposes an HBase client (which talks to the RegionServers) over HTTP
• stateless :D
• ... except for scanners :'(
21
REST client Example
$ curl -H "Accept: application/json" http://host:port/
{
  "table": [
    { "name": "followers" },
    { "name": "twits" },
    { "name": "users" }
  ]
}
22
REST client Example
$ curl -H ... http://host:port/table/row[/family:qualifier]
{
  "Row": [
    {
      "key": "VGhlUmVhbE1U",
      "Cell": [
        {
          "$": "c2FtdWVsQGNsZW1lbnMub3Jn",
          "column": "aW5mbzplbWFpbA==",
          "timestamp": 1338701491422
        },
        {
          "$": "TWFyayBUd2Fpbg==",
          "column": "aW5mbzpuYW1l",
          "timestamp": 1338701491422
        }
      ]
    }
  ]
}
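The row key, column names, and cell values in this response are base64-encoded, so a client has to decode each one before use. A minimal sketch of that step, using only the standard library and assuming the values are UTF-8 text (real cells may hold arbitrary bytes). The helper names b64 and decode_rows are made up for this example.

```python
import base64
import json

RESPONSE = """
{ "Row": [ { "key": "VGhlUmVhbE1U", "Cell": [
    { "$": "c2FtdWVsQGNsZW1lbnMub3Jn", "column": "aW5mbzplbWFpbA==",
      "timestamp": 1338701491422 },
    { "$": "TWFyayBUd2Fpbg==", "column": "aW5mbzpuYW1l",
      "timestamp": 1338701491422 }
] } ] }
"""


def b64(text):
    """Decode one base64 field, assuming it holds UTF-8 text."""
    return base64.b64decode(text).decode("utf-8")


def decode_rows(payload):
    """Map decoded row key -> {decoded column: decoded value}."""
    rows = {}
    for row in json.loads(payload)["Row"]:
        rows[b64(row["key"])] = {
            b64(cell["column"]): b64(cell["$"]) for cell in row["Cell"]
        }
    return rows


print(decode_rows(RESPONSE))
# {'TheRealMT': {'info:email': 'samuel@clemens.org', 'info:name': 'Mark Twain'}}
```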
23
REST client Example
<Rows>
  <Row key="VGhlUmVhbE1U">
    <Cells>
      <Cell column="aW5mbzplbWFpbA==" timestamp="1338701491422">
        c2FtdWVsQGNsZW1lbnMub3Jn
      </Cell>
      <Cell ...> ...
    </Cells>
  </Row>
</Rows>
24
Beyond Apache
25
asynchbase
• Asynchronous non-blocking interface.
• Inspired by Twisted Python.
• Partial implementation of HTableInterface.
• HBaseClient provides entry-point to data.
https://github.com/OpenTSDB/asynchbase
http://tsunanet.net/~tsuna/asynchbase/api/org/hbase/async/HBaseClient.html
26
asynchbase
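[Diagram: a callback chain drawn as a state machine. Legend: input => [this state] => output to [next state], or Exception => [error state]. Example: the Boolean Put response feeds "Interpret response", which produces either an UpdateResult object or an UpdateFailedException.]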
27
asynchbase Example
final Scanner scanner = client.newScanner(TABLE_NAME);
scanner.setFamily(INFO_FAM);
scanner.setQualifier(PASSWORD_COL);

ArrayList<ArrayList<KeyValue>> rows = null;
ArrayList<Deferred<Boolean>> workers = new ArrayList<Deferred<Boolean>>();
while ((rows = scanner.nextRows(1).joinUninterruptibly()) != null) {
  for (ArrayList<KeyValue> row : rows) {
    KeyValue kv = row.get(0);
    byte[] expected = kv.value();
    String userId = new String(kv.key());
    PutRequest put = new PutRequest(
        TABLE_NAME, kv.key(), kv.family(),
        kv.qualifier(), mkNewPassword(expected));
    Deferred<Boolean> d = client.compareAndSet(put, expected)
        .addCallback(new InterpretResponse(userId))
        .addCallbacks(new ResultToMessage(), new FailureToMessage())
        .addCallback(new SendMessage());
    workers.add(d);
  }
}
https://github.com/hbaseinaction/twitbase-async/blob/master/src/main/java/HBaseIA/TwitBase/AsyncUsersTool.java#L151-L173
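The callback chain attached to compareAndSet above can be read as a small state machine: each step receives either a result or an exception and hands its output to the next step. The Python sketch below is a synchronous toy model of that flow, not the asynchbase or Twisted API. The step names echo the Java classes above, but their bodies are assumptions, since the slides do not show them.

```python
class UpdateFailedException(Exception):
    pass


def run_chain(value, steps):
    """Run (callback, errback) pairs; an errback of None passes errors through."""
    for callback, errback in steps:
        handler = errback if isinstance(value, Exception) else callback
        if handler is None:
            continue  # no handler for this state; pass the value along unchanged
        try:
            value = handler(value)
        except Exception as exc:  # a raising handler switches to the error path
            value = exc
    return value


def interpret_response(user_id):
    def interpret(succeeded):
        if not succeeded:
            raise UpdateFailedException(user_id)
        return user_id
    return interpret


def result_to_message(user_id):
    return f"password updated for {user_id}"


def failure_to_message(exc):
    return f"update failed for {exc}"


sent = []


def send_message(message):
    sent.append(message)
    return message


def update_chain(user_id):
    return [
        (interpret_response(user_id), None),      # addCallback
        (result_to_message, failure_to_message),  # addCallbacks
        (send_message, None),                     # addCallback
    ]


ok = run_chain(True, update_chain("TheRealMT"))
failed = run_chain(False, update_chain("TheRealMT"))
```

Here failure_to_message returns a normal value, so the chain recovers from the error path and send_message runs in both cases.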
28
Others
[Figure: projects arranged between two goals, "Full-blown schema management" and "Reduce day-to-day developer pain"]
• Spring-Data Hadoop
• [Orderly]
• Phoenix
• Kiji.org
https://github.com/ndimiduk/orderly
http://www.springsource.org/spring-data/
https://github.com/forcedotcom/phoenix
http://www.kiji.org/
29
Apache Futures
• Protobuf wire messages (0.96)
• C client (TBD, HBASE-1015)
• HBase Types (TBD, HBASE-8089)
30
So, Webapps?
http://www.amazon.com/Back-Point-Rapiers/dp/B0000271GC
31
Software Architecture
• Isolate DAO from app logic, separation of concerns, &c.
• Separate environment configs from code.
• Watch out for resource contention.
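A minimal sketch of the first two points, under stated assumptions: the environment variable names, the defaults, and the fetch_row callable are all hypothetical. The app talks only to UsersDAO, and the gateway location comes from the environment rather than from code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayConfig:
    host: str
    port: int

    @classmethod
    def from_env(cls, env=None):
        """Read settings from the environment (variable names are made up)."""
        env = os.environ if env is None else env
        return cls(
            host=env.get("HBASE_GATEWAY_HOST", "localhost"),
            port=int(env.get("HBASE_GATEWAY_PORT", "9090")),
        )


class UsersDAO:
    """Only this class knows table and column names."""

    TABLE = "users"

    def __init__(self, fetch_row):
        # fetch_row(table, row_key) -> dict of column -> value.
        # Hypothetical seam: a Thrift, REST, or fake backend can sit behind it.
        self._fetch_row = fetch_row

    def email_for(self, user):
        return self._fetch_row(self.TABLE, user).get("info:email")


def fake_fetch_row(table, row_key):
    """In-memory backend for tests; no cluster needed."""
    data = {("users", "TheRealMT"): {"info:email": "samuel@clemens.org"}}
    return data.get((table, row_key), {})


config = GatewayConfig.from_env({"HBASE_GATEWAY_HOST": "gw1", "HBASE_GATEWAY_PORT": "9191"})
dao = UsersDAO(fake_fetch_row)
```

Because the DAO takes its backend as a parameter, swapping one client interface for another does not touch app logic.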
32
Deployment Architecture
• Cache everywhere.
• Know your component layers.
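One small instance of "cache everywhere" is a read-through cache in front of a lookup, sketched below with a time-to-live. This is an illustration only: the class name, the TTL policy, and the loader callable are assumptions, and a real deployment would more likely lean on HTTP caches or a shared cache tier.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory until the entry's TTL expires."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader          # called on a miss, e.g. a DAO lookup
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can control time
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def slow_lookup(key):
    calls.append(key)  # record each trip to the backing store
    return key.upper()


fake_now = [0.0]
cache = ReadThroughCache(slow_lookup, ttl_seconds=30, clock=lambda: fake_now[0])
cache.get("therealmt")
cache.get("therealmt")   # served from memory; slow_lookup is not called again
```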
33
HBase Warts
• Know thy (HBase) version 0.{92,94,96} !
• long-running client bug (HBASE-4805).
• Gateway APIs are only as up to date as the people before you needed them to be.
• REST API particularly unpleasant for “Web2.0” folk.
34
Thanks!
Nick Dimiduk github.com/ndimiduk @xefyr n10k.com
HBase in Action (Manning), by Nick Dimiduk and Amandeep Khurana, foreword by Michael Stack
hbaseinaction.com
35