How NOSQL Paid off for Telenor

How NoSQL Paid Off for Telenor JavaZone 13 September 2012 - Oslo

description

This presentation describes how NoSQL solutions such as the Neo4j graph database and the Lucene/Solr index were used in a classic middleware stack at Telenor to solve performance and scalability issues.

Transcript of How NOSQL Paid off for Telenor

Page 1: How NOSQL Paid off for Telenor

How NoSQL Paid Off for Telenor

JavaZone

13 September 2012 - Oslo

Page 2: How NOSQL Paid off for Telenor

Sebastian Verheughe

Architect and developer

Telenor - mobile middleware services (COS)

Katrina Sponheim

Architect and developer

Telenor – business self service solutions

Page 3: How NOSQL Paid off for Telenor

Telenor NoSQL Experience

o The problem

o The business case

o The solution

o The challenges

o The results

o My 5 cents

Page 4: How NOSQL Paid off for Telenor

The Problem

Page 5: How NOSQL Paid off for Telenor

Min Bedrift

Self service portal where Telenor's corporate customers can manage their entire portfolio of products.

From small businesses to large corporations

Page 6: How NOSQL Paid off for Telenor

Telenor's Corporate Customer Structure

Customer Acme Corporation

Customer Acme Consulting

Customer Acme Development

Account Construction

Account Demolition

Subscription Huey

Subscription Dewey

Subscription Louie

Page 7: How NOSQL Paid off for Telenor

The Challenge With Large Corporate Customers

Customers with large portfolios presented a couple of challenges for the self service solution Min Bedrift:

1. Middleware Services - Not Designed for Search

The middleware services were not designed for managing large data volumes, resulting in a lot of processing in the client, and the need for extensive caching there.

2. Resource Authorization – Long Calculation Time

User access to resources required the middleware to calculate and cache all accesses at logon, which could take several minutes.

Page 8: How NOSQL Paid off for Telenor

The Nightly Logon & Pre-fetch Solution

To achieve acceptable response times in Min Bedrift, administrators were logged on and customer data was pre-fetched and put in a cache each night.

However, as the usage of the solution grew, it became obvious that the time window available for pre-fetching each night was closing fast.

[Figure: clock dials (0, 3, 6, 9) for 2012, 2013 and 2014, illustrating the nightly pre-fetch time window]

Page 9: How NOSQL Paid off for Telenor

The Future - Unhandled

Telenor calculated that the pre-fetch time window would soon be filled, and that an increasing percentage of customers would experience logon response times above the acceptable x sec.

[Figure: login time versus customer portfolio size, showing pre-fetched and not pre-fetched customers and the acceptable threshold x]

Page 10: How NOSQL Paid off for Telenor

The Business Impact

In the end, Telenor would risk losing corporate customers due to a deteriorated customer experience.

Page 11: How NOSQL Paid off for Telenor

Other Caching Drawbacks

o Stale data up to 24 hours old

o Refresh/login for new users still takes a lot of time

o Memory challenges in Min Bedrift

o Unwanted network/middleware/database load

Page 12: How NOSQL Paid off for Telenor

The Business Case

Page 13: How NOSQL Paid off for Telenor

Business Case

The business case is built on the negative consequence of NOT addressing the problem.

Loss of customers (revenue)

Reduced sales transactions (revenue)

Other

Increased manual support (expenses)

Page 14: How NOSQL Paid off for Telenor

The Solution

Page 15: How NOSQL Paid off for Telenor

Solution Requirements - High Level

The middleware search services should be designed to support large data sets in a better way for all clients.

Resource authorization must be fast enough to deliver real time calculations on demand.

Page 16: How NOSQL Paid off for Telenor

The Previous Architecture

Master Data

RDBMS

Multiple Sources

Client Client Client Client Client Client

Middleware Services

Page 17: How NOSQL Paid off for Telenor

The New Architecture

One Master (r/w) – Several Replicas (r)

Middleware Services

Multiple Sources

Client Client Client Client Client Client

Solr / Lucene Neo4j

Master Data

Sybase RDBMS Search Res Auth

Page 18: How NOSQL Paid off for Telenor

Domain Event Messaging

Solr / Lucene Neo4j

Master Data

RDBMS

Domain Event

Domain Event

Raw

DB

Event

Apache Camel

Res Auth Search Messaging

Page 19: How NOSQL Paid off for Telenor

Putting it All Together

Min Bedrift MW Search MW Auth

search

getAuthorizedResources

filteredSearch

authorized match
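The call sequence on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: the types AuthService, SearchService and FilteredSearchDemo, the in-memory index and the sample data are made up for this sketch and are not Telenor's middleware API. Only the operation names search, getAuthorizedResources and filteredSearch come from the slide. The sketch assumes that the search module first asks the authorization module for the user's resources and then restricts the search to them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class FilteredSearchDemo {
    // Wires both hypothetical modules together with made-up sample data.
    static List<String> run(String userId, String term) {
        Map<String, Set<String>> grants = new HashMap<>();
        grants.put("anna", new HashSet<>(Arrays.asList("sub-1", "sub-3")));
        AuthService auth = user -> grants.getOrDefault(user, Collections.<String>emptySet());

        Map<String, String> index = new TreeMap<>(); // sorted for stable output
        index.put("sub-1", "Huey Youth");
        index.put("sub-2", "Dewey Youth");
        index.put("sub-3", "Louie Pro");
        return new SearchService(auth, index).search(userId, term);
    }

    public static void main(String[] args) {
        // Only hits that are both matching and authorized are returned.
        System.out.println(run("anna", "youth"));
    }
}

// Hypothetical stand-in for the "MW Auth" module (graph-backed in the talk).
interface AuthService {
    Set<String> getAuthorizedResources(String userId);
}

// Hypothetical stand-in for the "MW Search" module (Solr-backed in the talk).
final class SearchService {
    private final AuthService auth;
    private final Map<String, String> index; // resource id -> searchable text

    SearchService(AuthService auth, Map<String, String> index) {
        this.auth = auth;
        this.index = index;
    }

    // Entry point for the client: resolve authorization first, then search.
    List<String> search(String userId, String term) {
        Set<String> allowed = auth.getAuthorizedResources(userId);
        return filteredSearch(term, allowed);
    }

    // Returns the ids of resources that match the term and are authorized.
    List<String> filteredSearch(String term, Set<String> allowed) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase();
        for (Map.Entry<String, String> entry : index.entrySet()) {
            boolean matches = entry.getValue().toLowerCase().contains(needle);
            if (matches && allowed.contains(entry.getKey())) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

Keeping authorization behind its own interface means the search module never needs to know how access is calculated, which fits the split into separate search and authorization services shown in the architecture slides.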

Page 20: How NOSQL Paid off for Telenor

Lucene / Solr Solution

Page 21: How NOSQL Paid off for Telenor

Search Service

Currently implemented in Min Bedrift

o Cached nightly

o Simple, and iterates over the nodes when searching

o With memory/GC challenges

Page 22: How NOSQL Paid off for Telenor

New Search Service

Data stored in Solr/Lucene search engine

o New middleware module exposing web services (WS) using Tomcat

o Everything is indexed, which makes search extremely fast

o De-normalized data does not require joins

o Search by relevance, paging, sorting and much more

Page 23: How NOSQL Paid off for Telenor

Solr Cores

[Figure: Client -> Search Service -> Solr cores (Subscription, Customer, Account)]

Page 24: How NOSQL Paid off for Telenor

Entity Denormalization

Subscription

Customer

Account

also contains account name & id

also contains customer name & id

An entity may include data from several tables
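A minimal Java sketch of the denormalization idea follows. It assumes that a subscription document carries the name and id of both its account and its customer; the class names, field names and sample values are illustrative only and are not the actual Solr schema. The point it shows is that the joins happen once, at indexing time, so a query reads a single flat document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DenormalizeDemo {
    // Minimal relational-style records; all names here are assumptions.
    static final class Customer {
        final String id, name;
        Customer(String id, String name) { this.id = id; this.name = name; }
    }

    static final class Account {
        final String id, name;
        final Customer customer;
        Account(String id, String name, Customer customer) {
            this.id = id; this.name = name; this.customer = customer;
        }
    }

    static final class Subscription {
        final String id, userName;
        final Account account;
        Subscription(String id, String userName, Account account) {
            this.id = id; this.userName = userName; this.account = account;
        }
    }

    // Flattens a subscription and its parents into one document,
    // so no join is needed when searching.
    static Map<String, String> toDocument(Subscription s) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", s.id);
        doc.put("userName", s.userName);
        doc.put("accountId", s.account.id);
        doc.put("accountName", s.account.name);
        doc.put("customerId", s.account.customer.id);
        doc.put("customerName", s.account.customer.name);
        return doc;
    }

    // Made-up sample data reusing names from the customer structure slide.
    static Map<String, String> sample() {
        Customer customer = new Customer("c-1", "Acme Consulting");
        Account account = new Account("a-1", "Construction", customer);
        return toDocument(new Subscription("s-1", "Huey", account));
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

The trade-off is that a change to a customer or account name has to be propagated to every document that copies it, which is one reason updates are listed among the Lucene/Solr challenges later in the talk.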

Page 25: How NOSQL Paid off for Telenor

Solr/Lucene - Denormalized List View

Arthur | Jackson Total 2341 555 21 1234

Customer

Subscription

Account

Lisa | Simpson Youth 3435 555 64 3634

John | Brown Pro 5352 555 25 5433

user

has

Page 26: How NOSQL Paid off for Telenor

Searching by Relevance

Search some or all rows, and return hits by relevance (or sorted)

User Name   Subscription   Phone Number   Account Ref.   Score   Rank
Jane        Youth          555 21 3253    3253           15
Paul        Premium        555 23 4365    5262
John        Standard       555 95 1436    7346
Nina        Standard       555 15 3263    3734
Lydia       Youth          555 92 3253    7334           5
Tom         Standard       555 02 6394    3212
Neil        Premium        555 03 2583    3523

[Slide annotations near the Subscription column: 10, 5, 5; rank markers: 1, 2]

Page 27: How NOSQL Paid off for Telenor

Neo4j Solution

Page 28: How NOSQL Paid off for Telenor

Resource Authorization Service

Stored procedure in RDBMS calculating all accesses

o Uses several minutes to calculate for large customers

o Cached for up to 24 hours

o Extremely complex to understand (1500 lines of SQL)

o Tightly coupled with other services querying the database

Page 29: How NOSQL Paid off for Telenor

New Resource Authorization Service

Customer structure stored in Neo4j graph database

o New middleware module exposing web services (WS) using Tomcat

o Designed to focus on the relationships between objects

o Very fast – independent of the total number of objects stored

Page 30: How NOSQL Paid off for Telenor

Nodes and Relationships

[Figure: example graph of User (U), Customer (C), Account (A) and Subscription (S) nodes]

o Nodes (with type as property): User, Customer, Account, Subscription

o Relationships with type and direction: USER_ACCESS (with property inherit: true/false), PART_OF, CONTROLLED_BY, SUBSCRIBED_BY

Page 31: How NOSQL Paid off for Telenor

Traversal (query)

[Figure: graph of User (U), Customer (C), Account (A) and Subscription (S) nodes]

All traversals start from a single node

The start node is often the user node in our case

Page 32: How NOSQL Paid off for Telenor

Following the Relationships

One custom PathExpander class

o Only follow valid relationships and direction

o Only follow necessary relationships

o Check inheritance rules for current path

Just override the expand method

Iterable<Relationship> expand(Path, BranchState)
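The three rules above can be illustrated without the Neo4j library. The sketch below is not the presenters' PathExpander: the Rel class, the leadsDownward flag and the exact inheritance rule (descend below a granted node only when the USER_ACCESS grant has inherit set to true) are assumptions made for illustration. In a real implementation the same decisions would sit inside the expand method shown on the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ExpanderSketch {
    enum Type { USER_ACCESS, PART_OF, CONTROLLED_BY, SUBSCRIBED_BY }

    // A relationship as seen from the node at the end of the current path.
    static final class Rel {
        final Type type;
        final boolean leadsDownward; // assumed: true when it moves towards child resources
        final boolean inherit;       // only meaningful for USER_ACCESS
        Rel(Type type, boolean leadsDownward, boolean inherit) {
            this.type = type;
            this.leadsDownward = leadsDownward;
            this.inherit = inherit;
        }
    }

    // Given the relationships already on the path and the candidates at its
    // end node, returns only the candidates worth following.
    static List<Rel> expand(List<Rel> path, List<Rel> candidates) {
        List<Rel> result = new ArrayList<>();
        if (path.isEmpty()) {
            // At the user node: only access grants are valid first steps.
            for (Rel r : candidates) {
                if (r.type == Type.USER_ACCESS) {
                    result.add(r);
                }
            }
            return result;
        }
        // Assumed inheritance rule: a non-inheriting grant stops at the granted node.
        if (!path.get(0).inherit) {
            return result;
        }
        // Valid direction only: follow structural relationships downwards.
        for (Rel r : candidates) {
            if (r.type != Type.USER_ACCESS && r.leadsDownward) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Rel directGrant = new Rel(Type.USER_ACCESS, true, false);
        Rel toAccount = new Rel(Type.CONTROLLED_BY, true, false);
        // A non-inheriting grant must not expand any further.
        System.out.println(expand(Arrays.asList(directGrant), Arrays.asList(toAccount)).size());
    }
}
```

Pruning at expansion time, rather than filtering results afterwards, keeps the traversal from visiting parts of the graph the user can never reach.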

Page 33: How NOSQL Paid off for Telenor

Picking the Nodes

Custom Evaluator

o Decide to include or exclude

o Delegate to filter that fits your search

o Filter may further evaluate neighbor nodes

Just override the evaluate method

public Evaluation evaluate(Path path) {
    // Include the node at the end of the path only if the resource filter accepts it
    if (resourceFilter.filter(path)) {
        return Evaluation.INCLUDE_AND_CONTINUE;
    }
    return Evaluation.EXCLUDE_AND_CONTINUE;
}

Page 34: How NOSQL Paid off for Telenor

Example Access Authorization

[Figure: graph of User (U), Customer (C), Account (A) and Subscription (S) nodes]

Retrieve all subscriptions using a fan out search
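A fan-out search of this kind can be sketched as a breadth-first traversal in plain Java. The adjacency map, the type prefixes in the node ids ("U:", "C:", "A:", "S:") and the wiring of the sample graph are assumptions for illustration; the slide does not say which account belongs to which customer, and the real service traverses Neo4j relationships rather than a map.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FanOutDemo {
    // Breadth-first fan-out from the start node, collecting subscription nodes.
    static List<String> findSubscriptions(Map<String, List<String>> children, String start) {
        List<String> found = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        visited.add(start);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.startsWith("S:")) {
                found.add(node); // node type is encoded in the id prefix (assumption)
            }
            for (String next : children.getOrDefault(node, Collections.<String>emptyList())) {
                if (visited.add(next)) { // add() returns false for already visited nodes
                    queue.add(next);
                }
            }
        }
        return found;
    }

    // Made-up wiring of the names from the customer structure slide.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("U:admin", Arrays.asList("C:Acme Corporation"));
        g.put("C:Acme Corporation", Arrays.asList("C:Acme Consulting", "C:Acme Development"));
        g.put("C:Acme Consulting", Arrays.asList("A:Construction"));
        g.put("C:Acme Development", Arrays.asList("A:Demolition"));
        g.put("A:Construction", Arrays.asList("S:Huey", "S:Dewey"));
        g.put("A:Demolition", Arrays.asList("S:Louie"));
        return g;
    }

    public static void main(String[] args) {
        System.out.println(findSubscriptions(sampleGraph(), "U:admin"));
    }
}
```

The cost of the traversal depends on the size of the user's own subtree, not on the total number of nodes stored, which is the property the talk highlights for the graph-based service.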

Page 35: How NOSQL Paid off for Telenor

Example Access Authorization

[Figure: graph of User (U), Customer (C), Account (A) and Subscription (S) nodes]

Has access to resource: use a reverse search to limit the number of nodes to evaluate. Find all paths, and validate one of them.
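The reverse search can be sketched in plain Java as a walk from the resource up towards the root, checking at each step whether the user holds a grant on that node. Several simplifications are assumed here: each node has a single parent (so there is only one upward path to validate), grants are a map from node id to the inherit flag for one user, and a grant on an ancestor counts only when inherit is true. None of these names or rules are confirmed by the slides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class ReverseSearchDemo {
    // parentOf: node id -> parent node id (single-parent tree, an assumption).
    // grants:   node id -> inherit flag of this user's USER_ACCESS grant on that node.
    static boolean hasAccess(Map<String, String> parentOf,
                             Map<String, Boolean> grants,
                             String resource) {
        Set<String> seen = new HashSet<>(); // guards against cycles in bad data
        String node = resource;
        boolean direct = true;              // true only while looking at the resource itself
        while (node != null && seen.add(node)) {
            Boolean inherit = grants.get(node);
            // A direct grant always counts; a grant on an ancestor counts only if it inherits.
            if (inherit != null && (direct || inherit)) {
                return true;
            }
            node = parentOf.get(node);
            direct = false;
        }
        return false;
    }

    // Made-up chain: subscription -> account -> customer -> parent customer.
    static Map<String, String> sampleParents() {
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("S:Huey", "A:Construction");
        parentOf.put("A:Construction", "C:Acme Consulting");
        parentOf.put("C:Acme Consulting", "C:Acme Corporation");
        return parentOf;
    }

    public static void main(String[] args) {
        Map<String, Boolean> grants = new HashMap<>();
        grants.put("C:Acme Corporation", true); // inheriting grant at the top of the structure
        System.out.println(hasAccess(sampleParents(), grants, "S:Huey"));
    }
}
```

Starting from the resource means the number of nodes visited is bounded by the depth of the structure rather than by the size of the user's portfolio, which is consistent with the constant response times reported for access checks later in the talk.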

Page 36: How NOSQL Paid off for Telenor

The Challenges

Page 37: How NOSQL Paid off for Telenor

Lucene/Solr

o Using a document store in a relational world – updates

o Change mindset to search by relevance, not sorting

o The time is in the small stuff – not difficult but needs learning

o Which types of queries to run on this platform, and which NOT

o Scaling & Distribution – Actually, not a challenge…

Page 38: How NOSQL Paid off for Telenor

Neo4j

o Competence

o New way of thinking

o Making queries really fast (profile & understand the graph implementation)

o Getting the classic middleware to make use of the new service

o Which types of queries to run on this platform, and which NOT

o Scaling, not easy across servers (not needed for now)

Page 39: How NOSQL Paid off for Telenor

Messaging

o Mapping from a relational model (need cache)

Page 40: How NOSQL Paid off for Telenor

The Results

Page 41: How NOSQL Paid off for Telenor

Project State

o Phase 1 in production (subscription only, nightly populated)

o Phase 2 in system test (the rest + live population)

The following results are from the current test environment

Page 42: How NOSQL Paid off for Telenor

Neo4j Runtime Environment

Initial State: Prewarmed at startup, all data in heap

Population: ~20 M nodes (all indexed)
            ~20 M node properties (only 1 per node)
            ~50 M relationships
            Batchwise (50 K nodes) in 35 minutes

Base heap usage: 10 GB (of 16 GB)

Load: Minimal (not measured with heavy load)

Page 43: How NOSQL Paid off for Telenor

Neo4j Measured Performance

Customers measured for performance:

Corporation   Customers   Accounts   Subscriptions
X             160         1 300      147 000
Y             32 000      23 000     52 000
Z             7           18         95 000

Page 44: How NOSQL Paid off for Telenor

Neo4j Measured Performance

Find accounts

Corporation   Accounts   Time
X             1 300      10 ms
Y             23 000     260 ms
Z             18         2 ms

Page 45: How NOSQL Paid off for Telenor

Neo4j Measured Performance

Find subscriptions

Corporation   Subscriptions   Time
X             147 000         1 700 ms
Y             52 000          750 ms
Z             95 000          1 300 ms

Page 46: How NOSQL Paid off for Telenor

Neo4j Measured Performance

Has access to subscription

Corporation   Time
X             2 ms
Y             2 ms
Z             2 ms

Page 47: How NOSQL Paid off for Telenor

Solr/Lucene Measured Performance

Find subscriptions

Corporation   Time
X             7 ms
Y             58 ms
Z             4 ms

Page 48: How NOSQL Paid off for Telenor

Service Performance from Min Bedrift

120 ms

55 ms

“Google” search for corporation x: 120 ms

Auth Graph

Min Bedrift

Search Solr

searchAllResources

findAuthorizedResources

Page 49: How NOSQL Paid off for Telenor

Old vs. New Resource Authorization Service

Calculate All Resources   RDBMS (cold)    Graph (warm heap)
X                         12 min 18 sec   < 2 sec
Y                         22 min 58 sec   < 2 sec
Z                         3 min 15 sec    < 2 sec

Page 50: How NOSQL Paid off for Telenor

Min Bedrift

Demo

Page 51: How NOSQL Paid off for Telenor

Summary

Scalable: It allows customer growth

Fast logon: On demand resource authorization

Fast search: Server side search engine much faster

Reusable: All clients may use new services

Fresh data: Not up to 24 hours old – almost live

Page 52: How NOSQL Paid off for Telenor

Alternatives

In-Memory Database (Sybase)

This option was discussed, but the license cost and the uncertainty about whether it would be enough made us go for the NoSQL option.

Other NoSQL Solutions

We chose to prototype Neo4j and Lucene/Solr because they were popular and seemed to fit us well, and since they worked we stuck with them.

Page 53: How NOSQL Paid off for Telenor

How We Started Using NoSQL Technology

o Downloaded and prototyped technology very early

o Got training on site to accelerate the development startup

o At the end of development, did a review/QA of the solution

For Lucene/Solr, we got training and support from local Solr/Lucene expert consultant Jan Høydahl

For Neo4j, we got training and excellent support directly from Neo Technology

Page 54: How NOSQL Paid off for Telenor

My 5 Cents

Page 55: How NOSQL Paid off for Telenor

Think About…

No Language Standard for Graph Databases

How simple (or possible) is it to change the NoSQL provider?

Working With Relationships

Graph databases are intuitive and fast to work with when interested in how objects are related to each other.

Complexity

You introduce complexity, so make sure it is worth it!

New Technology

Do you have enough in-house competence, or can you easily buy the necessary competence? This also applies to maintaining the code.

Gentle NoSQL Introduction

It is easier to start using NoSQL when it supports a specific and limited service

Page 56: How NOSQL Paid off for Telenor

The End. Questions?