MyLife with HBase or HBase three flavors
Uploaded by: responseteam
Category: Technology
MyLife with HBase OR
HBase three flavors
HBase: In brief
I could talk about…
Operational HBase
HBase: In brief
I could talk about…
ZooKeeper quorums
Source: aazk.org
HBase: In brief
I could talk about…
Compaction
Source: www.wasteprousa.com
HBase: In brief
I could talk about…
How HBase is implemented
HDFS
Blocks
Regions
META table
Etc…
HBase: In brief
I could talk about…
HBase vs.
Cassandra
Redis
MySQL
Etc…
HBase: In brief
However, none of those is my primary concern as a developer.
As a developer, I want to talk about what HBase can do for me, and how it can make MyLife (pun intended) easier.
HBase: In brief
“I choose a lazy person to do a hard job. Because a lazy person will find
an easy way to do it.” –Bill Gates
HBase: In brief
So what does HBase do for me, the developer?
TL;DR: IT STORES DATA!
HBase: In brief
How does HBase store data?
HBase: In brief
As a Map
HBase: In brief
As a Map of Maps
HBase: In brief
As a Map of Maps of Maps
HBase: In brief
As a Map of Maps of Maps of Maps
A Data Structures Interlude
Key == Last Name, First Name, Middle Initial
Value == Extension
I.e.
Example,Dude,X    x555
A Data Structures Interlude
So now that we know what a map is, what would a map of maps look like?
An HBase-like analogy.
A Data Structures Interlude
An analogy to HBase (a dated one; if someone can think of a current one, please let me know) is a library index file keyed by ISBN. You look up a book by ISBN. The ISBN is your key. The value in this case is a book that contains a list of books!
Key == ISBN
Value == Book that lists other books!
0786704810    Author, Title, Publisher, Year
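A minimal sketch of the map-of-maps idea, using plain Python dicts. The field names and the `lookup` helper are made up for illustration, and the values are the placeholders from the slide, not real catalogue data.

```python
# Outer map: ISBN -> inner map. Inner map: field name -> value.
# The values are the slide's placeholders, not real catalogue data.
card_index = {
    "0786704810": {
        "author": "Author",
        "title": "Title",
        "publisher": "Publisher",
        "year": "Year",
    },
}


def lookup(isbn, field):
    """Two lookups: first by ISBN (outer key), then by field (inner key)."""
    return card_index[isbn][field]


print(lookup("0786704810", "title"))  # prints "Title"
```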
HBase: In brief
SortedMap[RowKey,
  SortedMap[ColumnFamilyName,
    SortedMap[Qualifier,
      SortedMap[Timestamp, Value]]]]
HBase: In brief
Some quick facts:
Column families are defined ahead of time and require the table to be disabled to be altered.
Only column families are fixed. Everything under that level of maps is flexible.
Qualifiers can be added or removed on the fly, along with their versions.
“The Map” itself is also defined ahead of time.
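The nested maps and the quick facts above can be modelled in a few lines. This is a toy in-memory sketch, not the HBase client API: `ToyTable` and its methods are hypothetical names, and it uses plain dicts that are sorted only when read, whereas HBase keeps its keys sorted at all times.

```python
class ToyTable:
    """In-memory stand-in for the nested-map model; not the HBase client API."""

    def __init__(self, families):
        # Column families are fixed when the table is created.
        self.families = frozenset(families)
        # row key -> family -> qualifier -> timestamp -> value
        self.rows = {}

    def put(self, row, family, qualifier, value, timestamp):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        cells = self.rows.setdefault(row, {}).setdefault(family, {})
        # Qualifiers and versions are created on the fly.
        cells.setdefault(qualifier, {})[timestamp] = value

    def get(self, row, family, qualifier):
        """Return the value with the newest timestamp, or None if absent."""
        versions = self.rows.get(row, {}).get(family, {}).get(qualifier)
        if not versions:
            return None
        return versions[max(versions)]

    def row_keys(self):
        # Sorted at read time here; HBase stores them sorted.
        return sorted(self.rows)


table = ToyTable(["info"])
table.put("row1", "info", "email", "old@example.com", timestamp=1)
table.put("row1", "info", "email", "new@example.com", timestamp=2)
table.put("row1", "info", "nickname", "dude", timestamp=2)  # new qualifier, no schema change
print(table.get("row1", "info", "email"))
```

Writing to a family that was not declared up front raises an error, while a new qualifier under an existing family is accepted without any table change.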
HBase: In brief
What does this look like?
DEMO TIME!
HBase: Implementations
The Test Case
The Ideal Case
The Awesome Case
HBase: The Test Case
One of the services we provide to our users is a message stream. This stream can include email, and it works like an email client (e.g. Outlook, Mail.app, or the one on your phone), storing your email messages so you can get to them quickly.
We found ourselves storing hundreds of gigabytes of email content in our Oracle RAC database.
HBase: The Test Case
Since this data is only accessed by key, it made sense to move it out of Oracle and into HBase.
HBase: The Test Case
Key == accountId_providerAccountId_messageId_bodyId
This is a nice key because all the messages for a particular user are grouped together by prefix.
Since HBase keeps the keys sorted, we can use a Scan to grab them all quickly in one pass.
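To see why sorted keys make the prefix lookup cheap, here is a rough Python model of a prefix scan. `SortedKeyStore` and the sample keys are invented for illustration and this is not the HBase Scan API; it only shows the idea of seeking to the first matching key and reading until the prefix stops matching.

```python
from bisect import bisect_left, insort


class SortedKeyStore:
    """Toy key-value store that keeps its keys sorted, like HBase row keys."""

    def __init__(self):
        self._keys = []    # always kept in sorted order
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs for every key that starts with prefix."""
        start = bisect_left(self._keys, prefix)  # seek to the first candidate
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break                            # sorted, so nothing later can match
            yield key, self._values[key]


store = SortedKeyStore()
store.put("acct1_provA_msg1_body1", "hello")
store.put("acct2_provA_msg1_body1", "other user")
store.put("acct1_provA_msg2_body1", "world")
store.put("acct10_provA_msg1_body1", "a different account")

# Include the trailing "_" so that account 1 does not also match account 10.
for key, body in store.scan_prefix("acct1_"):
    print(key, body)
```

One design point worth noting: because keys compare as strings, the separator belongs in the scan prefix, otherwise `acct1` would also pick up `acct10`.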
HBase: The Test Case
That’s it!
HBase: The Test Case
Advantages vs. the previous solution:
Faster
Cheaper
Less DB load
HBase: The ideal case
Another service we offer our users is the ability to import their social and email connections so they can have one unified view of all their connections across providers, allowing users to manage data by person rather than by account.
HBase: The ideal case
This has two main pieces of data:
1. The social profile information
2. The relationship between that profile and an Identity
HBase: The ideal case
What makes this ideal for HBase?
1. The profile is sparse data that is only accessed by key!
HBase: The ideal case
What makes this ideal for HBase?
2. The relationship between a profile and its identity is only a key-value pair and its reverse!
A Data Structures Interlude
Key == Last Name, First Name, Middle Initial
Value == Extension
I.e.
Example,Dude,X    x555
A Data Structures Interlude
Key == Extension
Value == Last Name, First Name, Middle Initial
I.e.
x555    Example,Dude,X
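The two interlude slides describe a map and its reverse. A small sketch of building the reverse lookup, assuming every extension belongs to exactly one person (`invert` is an illustrative helper, not something from the talk):

```python
directory = {"Example,Dude,X": "x555"}  # name -> extension


def invert(mapping):
    """Build the reverse map (value -> key), refusing ambiguous values."""
    inverse = {}
    for key, value in mapping.items():
        if value in inverse:
            raise ValueError(f"duplicate value: {value!r}")
        inverse[value] = key
    return inverse


by_extension = invert(directory)        # extension -> name
print(by_extension["x555"])             # prints "Example,Dude,X"
```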
HBase: The ideal case
Dataflow
1. Get profile from provider
2. Check if the profile maps to an existing Identity in HBase
   1. If it doesn’t exist, store a version of the profile in HBase with providerId as key and profile information as values
3. Associate profile with identity
   1. Create row in HBase with identityId_providerId as key
4. Update profile with the identity it is associated with
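The dataflow steps could be sketched roughly as follows. Two dicts stand in for the HBase tables, and several details are assumptions because the slides do not give them: the caller supplies `identity_id` (how an identity is chosen is not described), the link is stored back on the profile under an `identityId` field, and the relationship row simply holds the provider id.

```python
profiles = {}        # providerId -> profile fields (stand-in for an HBase table)
relationships = {}   # "identityId_providerId" -> providerId (stand-in, assumed value)


def import_profile(provider_id, profile, identity_id):
    """Run steps 2 to 4 for a profile already fetched from the provider (step 1)."""
    existing = profiles.get(provider_id)
    # Step 2: does this profile already map to an identity?
    if existing is not None and existing.get("identityId"):
        return existing["identityId"]
    # Step 2.1: store the profile, keyed by providerId.
    profiles[provider_id] = dict(profile)
    # Step 3: associate profile and identity with an identityId_providerId row.
    relationships[f"{identity_id}_{provider_id}"] = provider_id
    # Step 4: record the identity on the profile itself.
    profiles[provider_id]["identityId"] = identity_id
    return identity_id


print(import_profile("prov1", {"name": "Dude"}, "id1"))
print(relationships)
```

Importing the same profile a second time returns the identity it is already linked to and creates no new relationship row.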
HBase: The ideal case
Coprocessors!
What are Coprocessors?
Another feature of HBase, which works like triggers.
A coprocessor is a piece of logic attached to an HBase put that is executed on the HBase cluster.
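The trigger idea can be illustrated with a toy table that runs registered hooks after each put. Real coprocessors are Java code loaded into the HBase cluster, so this Python model only shows the shape of the mechanism. `HookedTable` and `on_post_put` are invented names, and using the hook to maintain the reverse mapping is an assumption about how it might be applied here, not something the slides state.

```python
class HookedTable:
    """Toy table that runs registered hooks after each put, like a trigger."""

    def __init__(self):
        self.rows = {}
        self._post_put_hooks = []

    def on_post_put(self, hook):
        """Register a callable(key, value) to run after every put."""
        self._post_put_hooks.append(hook)

    def put(self, key, value):
        self.rows[key] = value
        for hook in self._post_put_hooks:
            hook(key, value)   # runs next to the data, not in the caller's code


forward = HookedTable()   # key -> value
reverse = HookedTable()   # value -> key, maintained by the hook

forward.on_post_put(lambda key, value: reverse.put(value, key))
forward.put("identity1_provider9", "provider9")
print(reverse.rows)
```

The caller writes once and the reverse entry appears as a side effect, which is the appeal of a trigger-style hook for keeping a key-value pair and its reverse in step.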
HBase: The Awesome Case
User stream availability
HBase: The Awesome Case
Originally this system used local caching to store user stream data, but as the streams grew this became impractical.
The solution here was a distributed cache. Great!
HBase: The Awesome Case
A distributed cache allows us to scale, but unless we have a huge grid, some user streams will still get evicted from the cache. That means when the user visits again we have to fetch their streams from the source, which is slow…
HBase: The Awesome Case
Enter HBase: from great to awesome!
To fix the latency associated with eviction, we added HBase as a backing store for our distributed cache. This means that records in our cache are periodically written to HBase, and are written to HBase before being evicted from the cache.
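A simplified model of a cache with a backing store, assuming least-recently-used eviction (the slides do not say which eviction policy the real cache uses). A plain dict plays the role of HBase, and `BackedCache` and its methods are hypothetical names. `flush` stands in for the periodic write.

```python
from collections import OrderedDict


class BackedCache:
    """Toy LRU cache that writes entries to a backing store before evicting them."""

    def __init__(self, capacity, store):
        self.capacity = capacity
        self.store = store              # stand-in for HBase: any dict-like object
        self._entries = OrderedDict()   # least recently used entry first
        self._dirty = set()             # keys not yet written to the store

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        self._dirty.add(key)
        self._evict_if_needed()

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)
            return self._entries[key]
        value = self.store[key]         # cache miss: read through to the store
        self._entries[key] = value
        self._evict_if_needed()
        return value

    def flush(self):
        """The periodic write: persist everything that changed since last time."""
        for key in self._dirty:
            self.store[key] = self._entries[key]
        self._dirty.clear()

    def _evict_if_needed(self):
        while len(self._entries) > self.capacity:
            key, value = self._entries.popitem(last=False)
            if key in self._dirty:      # write before eviction, never lose data
                self.store[key] = value
                self._dirty.discard(key)


hbase = {}
cache = BackedCache(capacity=2, store=hbase)
cache.put("user1", "stream1")
cache.put("user2", "stream2")
cache.put("user3", "stream3")           # evicts user1, which is saved first
print(cache.get("user1"))               # served via the backing store
```

From the application's side every read goes through `get`, so it cannot tell whether a stream came from the cache or from the store.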
HBase: The Awesome Case
Distributed cache + HBase == Awesome!
Why?
Persistence – user streams now live in HBase for as long as we want them to.
Speed – read-throughs from HBase are fast.
Transparency – as far as the application is concerned, everything is just in the cache.
HBase: The Awesome Case
Distributed cache + HBase == Awesome!
Why?
Reliability – HBase has been solid and all the data is stored redundantly.
That’s all, folks!
Questions?