GraphConnect 2014 SF: From Zero to Graph in 120: Scale
Scaling Neo4j Applications
SAN FRANCISCO | 10.22.2014
@iansrobinson
The Burden of Success
• More users
• Larger datasets
• More concurrent requests
• More complex queries
Scaling is a Feature
• It doesn’t come for free
• Conditions of success:
  – Understand current needs
    • Design for an order of magnitude growth
  – Iterative and incremental development
  – Unit tests
    • Bedrock of asserted behaviour
  – Performance tests
Overview
• Scaling Reads
  – Latency
  – Throughput
• Scaling Writes
• Hardware
Scaling Reads - Latency
Query Latency
latency = f(search_area)
Search Area
search_area = f(domain_invariants)
Absolute: Every user has 50 friends
Relative: Every user is friends with 10% of the user base
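To make the difference between the two kinds of invariant concrete, here is a minimal Python sketch (not from the talk) that counts the paths a friends-of-friends query might walk. It assumes, for illustration only, that every user has exactly the stated number of friends and that query work is proportional to the paths explored; the function names are hypothetical.

```python
def paths_explored(friends_per_user: int, depth: int) -> int:
    """Upper bound on paths walked when expanding `depth` hops from one user."""
    return sum(friends_per_user ** hop for hop in range(1, depth + 1))


def absolute_search_area(user_base: int, depth: int = 2) -> int:
    # Absolute invariant: 50 friends per user, whatever the size of the user base.
    return paths_explored(50, depth)


def relative_search_area(user_base: int, depth: int = 2) -> int:
    # Relative invariant: each user is friends with 10% of the user base.
    return paths_explored(user_base // 10, depth)


for users in (1_000, 10_000, 100_000):
    print(users, absolute_search_area(users), relative_search_area(users))
```

Under the absolute invariant the search area stays at 2,550 paths however large the user base becomes, so latency stays flat. Under the relative invariant it grows with the user base, and latency grows with it.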
Reducing Read Latency
• The Blackadder solution
• Improve the Cypher query
• Change the model
• Use an Unmanaged Extension
Improve Cypher Query
• Small queries, separated by WITH
• Start from low-cardinality nodes
http://thought-bytes.blogspot.co.uk/2013/01/optimizing-neo4j-cypher-queries.html
http://wes.skeweredrook.com/pragmatic-cypher-optimization-2-0-m06/
Change the Model
Goal: Do less work (in the query)
  – By exploring less of the graph
How? Identify inferred relationships
  – Replace with use-case specific shortcuts
Change the Model - From
MATCH (:Person{username:'ben'})-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(colleague:Person)
Change the Model - To
MATCH (:Person{username:'ben'})-[:WORKED_WITH]-(colleague:Person)
Tradeoff
More expensive writes
More data
Cheaper reads
When to add the new relationship?
• With tx
• Queue for subsequent tx
• Periodic/batch
Refactor Existing Data
MATCH (p1:Person)-[:WORKED_ON]->(:Project)<-[:WORKED_ON]-(p2:Person)
WHERE NOT ((p1)-[:WORKED_WITH]-(p2))
WITH DISTINCT p1, p2 LIMIT 10
MERGE (p1)-[r:WORKED_WITH]-(p2)
RETURN count(r)
• Select batch: MATCH ... WHERE NOT ...
• Batch size: LIMIT 10
• Add new relationship: MERGE
• Continue while count(r) > 0
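The "continue while count(r) > 0" step implies a driver loop outside the database. A minimal Python sketch of that loop follows. `run_batch` is a hypothetical callable, assumed to execute the batched statement in its own transaction and return count(r); no particular Neo4j client API is implied, and `fake_batch` is only a stand-in so the sketch runs on its own.

```python
from typing import Callable, Optional


def refactor_in_batches(run_batch: Callable[[], int],
                        max_batches: Optional[int] = None) -> int:
    """Call `run_batch` until it reports zero new relationships.

    Returns the total number of relationships reported across all batches.
    """
    total = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        created = run_batch()
        if created == 0:
            break  # nothing left to refactor
        total += created
        batches += 1
    return total


# Stand-in for the database: 25 missing pairs, processed 10 at a time.
remaining = [25]


def fake_batch(batch_size: int = 10) -> int:
    done = min(batch_size, remaining[0])
    remaining[0] -= done
    return done


print(refactor_in_batches(fake_batch))  # prints 25
```

Keeping each batch in its own transaction bounds the transactional state held at any one time, at the cost of more round trips.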
Use Unmanaged Extensions
REST API Extensions
/db/data/cypher /my-extension/service
RESTful Resource
import java.io.IOException;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
// ObjectMapper (Jackson), CypherExecutor (Neo4j server) and ColleagueFinder
// (application class) also need imports; their packages depend on the versions in use.

@Path("/similar-skills")
public class ColleagueFinderExtension {
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private final ColleagueFinder colleagueFinder;

    // The server injects its Cypher executor into the extension.
    public ColleagueFinderExtension( @Context CypherExecutor cypherExecutor ) {
        this.colleagueFinder = new ColleagueFinder( cypherExecutor.getExecutionEngine() );
    }

    // Handles GET /similar-skills/{name} and returns the result as JSON.
    @GET
    @Produces(MediaType.APPLICATION_JSON)
    @Path("/{name}")
    public Response getColleagues( @PathParam("name") String name ) throws IOException {
        String json = MAPPER.writeValueAsString( colleagueFinder.findColleaguesFor( name ) );
        return Response.ok().entity( json ).build();
    }
}
• JAX-RS annotations: @Path, @GET, @Produces, @PathParam
• Inject database/Cypher execution engine: @Context CypherExecutor
1. Get Close to the Data
Application
MATCH MATCH CREATE DELETE MERGE MATCH
Single request, many operations
  – Reduce network latencies
2. Multiple Implementation Options
REST API Extensions
• Cypher
• Traversal Framework
• Graph Algo Package
• Core API
3. Control Request/Response Format
{ users: [ { id: 1234}, { id: 9876} ] }
JSON, CSV, protobuf, etc
1a 03 08 96 01
Domain-specific representations
  – Compact
  – Conserve bandwidth
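To illustrate why a domain-specific representation can conserve bandwidth, the sketch below compares a JSON payload with a compact binary encoding of the same user ids. It uses the base-128 varint scheme that Protocol Buffers uses for integers (the bytes `96 01` in the slide are the varint for 150). The wire format in `encode_user_ids` is a made-up example, not a real protobuf message and not something the talk specifies.

```python
import json
from typing import Iterable


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_user_ids(user_ids: Iterable[int]) -> bytes:
    # Hypothetical domain-specific format: just the ids, back to back.
    return b"".join(encode_varint(uid) for uid in user_ids)


ids = [1234, 9876]
as_json = json.dumps({"users": [{"id": uid} for uid in ids]}).encode()
as_compact = encode_user_ids(ids)
print(len(as_json), len(as_compact))
```

The compact form needs two bytes per id here, whereas the JSON form repeats the field names for every user.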
4. Control HTTP Headers
GET /my-extension/service/top-10
Reverse Proxy
Application
HTTP/1.1 200 OK Cache-Control: max-age=60
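The effect of `Cache-Control: max-age=60` on a reverse proxy can be sketched with a toy cache. This is an illustration under simplifying assumptions (one header directive, no validation, no eviction), not a description of any real proxy; `TinyProxyCache` and the `fetch` callable are hypothetical names.

```python
import re
import time
from typing import Callable, Dict, Optional, Tuple


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age from a Cache-Control header value, if present."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


class TinyProxyCache:
    """Toy stand-in for a reverse proxy cache; not a real HTTP cache."""

    def __init__(self, fetch: Callable[[str], Tuple[str, str]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch  # returns (body, Cache-Control header value)
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, path: str) -> str:
        entry = self._entries.get(path)
        if entry and self._clock() < entry[0]:
            return entry[1]  # still fresh: the extension is not called
        body, cache_control = self._fetch(path)
        ttl = max_age_seconds(cache_control)
        if ttl:
            self._entries[path] = (self._clock() + ttl, body)
        return body
```

With a 60-second max-age, repeated requests for the same path within a minute are answered by the proxy, so the database sees at most one such request per minute per path.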
5. Integrate with Backend Systems
REST API Extensions
Application
RDBMS LDAP
Migrating to Extensions
• Re-implement original query inside extension
• Modify request/response formats and headers
• Refactor implementation to use lower parts of the stack where necessary
• Measure, measure, measure
Scaling Reads - Throughput
Scale Horizontally For High Read Throughput
Application
Master Slave Slave
Read Load Balancer
Write Load Balancer
Configure HAProxy as Read Load Balancer
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    # Health check: only instances currently acting as slaves pass
    option httpchk GET /db/manage/server/ha/slave
    server s1 10.0.1.10:7474 maxconn 32 check
    server s2 10.0.1.11:7474 maxconn 32 check
    server s3 10.0.1.12:7474 maxconn 32 check

listen admin
    bind *:8080
    stats enable
Instance state   Response to GET /db/manage/server/ha/slave
Master           404 Not Found   false
Slave            200 OK          true
Unknown          404 Not Found   UNKNOWN
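HAProxy's `option httpchk` treats 2xx and 3xx responses as healthy, so with the slave endpoint as the check URL only slaves stay in rotation. A small Python sketch of that decision, using the status codes listed above (the dictionary and function names are illustrative):

```python
def passes_http_check(status_code: int) -> bool:
    """Mimic the health check rule: 2xx and 3xx count as healthy."""
    return 200 <= status_code < 400


# Responses of each instance state to GET /db/manage/server/ha/slave.
responses = {
    "master": (404, "false"),
    "slave": (200, "true"),
    "unknown": (404, "UNKNOWN"),
}

in_rotation = [state for state, (status, _body) in responses.items()
               if passes_http_check(status)]
print(in_rotation)  # prints ['slave']
```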
This Isn’t The Throughput You Were Looking For
Application
1 2 3
Load Balancer
MATCH (c:Country{name:'Australia'})...
MATCH (c:Country{name:'Zambia'})...
MATCH (c:Country{name:'Norway'})...
Cache Sharding Using Consistent Routing
Application
Load Balancer
A-I: 1
J-R: 2
S-Z: 3
MATCH (c:Country{name:'Australia'})...
MATCH (c:Country{name:'Zambia'})...
MATCH (c:Country{name:'Norway'})...
MATCH (c:Country{name:'Zimbabwe'})...
MATCH (c:Country{name:'Japan'})...
MATCH (c:Country{name:'Brazil'})...
Configure HAProxy for Cache Sharding
global
    daemon
    maxconn 256

defaults
    mode http
    timeout connect 5000ms
    timeout client 50000ms
    timeout server 50000ms

frontend http-in
    bind *:80
    default_backend neo4j-slaves

backend neo4j-slaves
    # Route on the country_code URL parameter so the same key hits the same instance
    balance url_param country_code
    server s1 10.0.1.10:7474 maxconn 32
    server s2 10.0.1.11:7474 maxconn 32
    server s3 10.0.1.12:7474 maxconn 32

listen admin
    bind *:8080
    stats enable
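The routing idea can be sketched in Python in two forms. `route_by_initial` reproduces the A-I / J-R / S-Z split from the slide. `route_by_hash` is a rough analogue of what `balance url_param` does, which is to hash the parameter value and map it onto the available servers; it does not reproduce HAProxy's actual hash function. Both function names are illustrative.

```python
import hashlib

# Alphabetical ranges from the slide: (first letter, last letter, instance).
RANGES = (("A", "I", 1), ("J", "R", 2), ("S", "Z", 3))


def route_by_initial(country: str) -> int:
    """Pick an instance from the first letter of the country name."""
    initial = country[:1].upper()
    for first, last, instance in RANGES:
        if first <= initial <= last:
            return instance
    raise ValueError(f"no range covers {country!r}")


def route_by_hash(key: str, instances: int = 3) -> int:
    """Pick an instance by hashing the routing key (stable across calls)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % instances + 1


for name in ("Australia", "Zambia", "Norway", "Zimbabwe", "Japan", "Brazil"):
    print(name, route_by_initial(name))
```

Either way, requests for the same country always reach the same instance, so each instance's cache warms up for its own slice of the graph instead of all three caching everything.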
Scaling Writes - Throughput
Factors Impacting Write Performance
• Managing transactional state
  – Creating and committing are expensive operations
• Contending for locks
  – Nodes and relationships
Improving Write Throughput
• Delay taking expensive locks
• Batch/queue writes
Delay Expensive Locks
• Identify contended nodes
• Involve them as late as possible in a transaction
Add Linked List Item + Update Pointers
Locked
Add Linked List Item
Add Pointers
Locked
Batch Writes
• Multiple CREATE/MERGE statements per request
  – Good for integration with backend systems
• Queue
  – Good for small, online transactions
Single-Threaded Queue
Write Write Write
Queue
Single Thread
Batch
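A single-threaded queue of this kind might look as follows in Python. This is a minimal in-memory sketch, not the speaker's implementation: `commit_batch` is a hypothetical callable standing in for "apply these writes in one transaction", and `STOP` is a sentinel introduced here to shut the writer down.

```python
import queue
import threading
from typing import Callable, List

STOP = object()  # sentinel telling the writer thread to finish


def drain_batch(q: "queue.Queue", max_size: int) -> List[object]:
    """Block for one item, then take whatever else is queued, up to max_size."""
    batch = [q.get()]
    while len(batch) < max_size:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            break
    return batch


def writer_loop(q: "queue.Queue",
                commit_batch: Callable[[List[object]], None],
                max_size: int = 100) -> None:
    """Single consumer: one commit per batch instead of one per write."""
    while True:
        batch = drain_batch(q, max_size)
        stop = any(item is STOP for item in batch)
        writes = [item for item in batch if item is not STOP]
        if writes:
            commit_batch(writes)
        if stop:
            return


committed: List[List[object]] = []
q: "queue.Queue" = queue.Queue()
for n in range(5):
    q.put(f"write-{n}")
q.put(STOP)

writer = threading.Thread(target=writer_loop, args=(q, committed.append, 3))
writer.start()
writer.join()
print(committed)
```

Because only one thread applies writes, the writes never contend with each other for locks. The price is that each write waits in the queue until its batch is committed.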
Queue Location Options
Application Application
Benefits of Batched Writes
• Less transactional state management
  – Create/commit per batch rather than per write
• No contention for locks
  – No deadlocks
• Query consolidation
  – Reduce the amount of work inside the database
Query Consolidation
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sam
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
MATCH sam
CREATE sam-[:LIVES_AT]-address2
Eliminate Duplicate Lookups
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address1
CREATE address2
DELETE address1
CREATE sam-[:LIVES_AT]-address2
Eliminate Unnecessary Writes
MATCH sam
MATCH jenny
CREATE sam-[:KNOWS]-jenny
MATCH sarah
CREATE sam-[:KNOWS]-sarah
CREATE address2
CREATE sam-[:LIVES_AT]-address2
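The two consolidation steps can be sketched as a pass over a batch of queued operations. The representation below, a list of `(verb, target)` tuples, is an assumption made for illustration. The sketch also assumes that a target created and deleted in the same batch is never recreated or referenced by another operation in that batch.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]  # (verb, target), e.g. ("MATCH", "sam")


def consolidate(ops: List[Op]) -> List[Op]:
    """Drop repeated lookups and create-then-delete pairs within one batch."""
    created = {target for verb, target in ops if verb == "CREATE"}
    deleted = {target for verb, target in ops if verb == "DELETE"}
    transient = created & deleted  # created and deleted in the same batch

    result: List[Op] = []
    looked_up: Set[str] = set()
    for verb, target in ops:
        if target in transient:
            continue  # unnecessary write: no lasting effect
        if verb == "MATCH":
            if target in looked_up:
                continue  # duplicate lookup: already bound earlier
            looked_up.add(target)
        result.append((verb, target))
    return result


batch = [
    ("MATCH", "sam"), ("MATCH", "jenny"), ("CREATE", "sam-[:KNOWS]-jenny"),
    ("MATCH", "sam"), ("MATCH", "sarah"), ("CREATE", "sam-[:KNOWS]-sarah"),
    ("CREATE", "address1"), ("CREATE", "address2"), ("DELETE", "address1"),
    ("MATCH", "sam"), ("CREATE", "sam-[:LIVES_AT]-address2"),
]
for verb, target in consolidate(batch):
    print(verb, target)
```

Run on the batch from the slides, this yields the same seven operations as the final consolidated query.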
Tradeoff
Latency
Higher throughput
In-memory or durable queues?
• Lost writes in event of crash
• Transactional dequeue?
Further Reading
http://maxdemarzi.com/2013/09/05/scaling-writes/
http://maxdemarzi.com/2014/07/01/scaling-concurrent-writes-in-neo4j/
Hardware
Memory
• SLC (single-level cell) SSD w/SATA
• Lots of RAM
  – 8-12G heap
  – Explicitly memory-map store files
Object Cache
• 2G for 12G heap
• No object cache
  – consistent throughput at expense of latency
AWS
• HVM (hardware virtual machine) over PV (paravirtual)
• EBS-optimized instances
• Provisioned IOPS