Nuxeo World Session: Scaling Nuxeo Applications


Presentation from Nuxeo World.

Transcript of Nuxeo World Session: Scaling Nuxeo Applications

Page 1: Nuxeo World Session: Scaling Nuxeo Applications

Scaling Nuxeo Applications

Building scalable content apps with Nuxeo

Benoit Delbosc / Thierry Delprat

Page 2: Nuxeo World Session: Scaling Nuxeo Applications

Performance questions ...

What CPU should I use to host 1 TB of data?

Will 2 servers be enough for 1000 users?

Can I run Nuxeo DM with 512 MB of heap?

Is a DELL PExyz OK to host Nuxeo?

How can I ensure a 0.1 s response time?

Page 3: Nuxeo World Session: Scaling Nuxeo Applications

Why is this not that simple?

● Nuxeo EP is an ECM platform
● there are several ways to use it
  – hundreds of possible user actions
  – lots of screens
● there are several distributions
● there are several possible configurations (security, filing plan, doc types ...)
● you cannot size without knowing
  ● the hypotheses
  ● the constraints

Page 4: Nuxeo World Session: Scaling Nuxeo Applications

Agenda

● Approach for managing performance
● Nuxeo architecture and performance
● Performance testing
● Performance tuning
● Benchmark figures

Page 5: Nuxeo World Session: Scaling Nuxeo Applications

Ensuring good performance

● Carefully define your hypotheses
  ➔ what processing will be needed?
    ➔ batch processing vs interactive processing
  ➔ what data will be processed?
    ➔ big image transformation vs simple page rendering
  ➔ what are the expectations?
    ➔ how many concurrent users?
    ➔ are users hyper-active?

➔ be able to define a usage scenario

Page 6: Nuxeo World Session: Scaling Nuxeo Applications

Ensuring good performance

● Define your architecture according to
  ● constraints
  ● initial benchmark
  ➔ mono server, multi-servers, cluster …
● Organize periodic benchmarks
  ➔ ideally integrated in the CI chain
● Plan some tuning
  ➔ DB, memory, connections …
  ➔ and code too

➔ this is not a one-shot task

Page 7: Nuxeo World Session: Scaling Nuxeo Applications

Impacting factors

● Security policies
  ● ACL inheritance and custom security policies
● Web layer
  ● Stateless vs stateful
  ● JSF traps and screen design
● Document types
  ● complex document types impact the DB schema
● Number of active documents
  ● number of rows
  ● size of indexes vs memory

Page 8: Nuxeo World Session: Scaling Nuxeo Applications

Impacting factors

● Simultaneous access
  ● how many concurrent requests/s
  ● include batch processing
● Application server
  ● Tomcat 6 is significantly faster than JBoss 5
● DB choice
● OS
  ● a 32-bit OS is too limiting for JVM memory
  ● the JVM seems to run faster under 64-bit Linux

Page 9: Nuxeo World Session: Scaling Nuxeo Applications

No (or low) impact factors

● Total volume of binary files
  ● only network and low-level storage are impacted
● Average number of documents per folder
  ● VCS does not have the same limitations as Jackrabbit
● Number of parallel sessions
  ● can only have an impact in JSF
● Documents that are almost never accessed
  ● DB caches should do their job

Page 10: Nuxeo World Session: Scaling Nuxeo Applications

Scaling Nuxeo Applications

Architecture considerations

Page 11: Nuxeo World Session: Scaling Nuxeo Applications

Architecture solutions

● 3 possible axes
  ● Simple clustering
  ● Spreading services on multiple JVMs
  ● Multiple repositories

Page 12: Nuxeo World Session: Scaling Nuxeo Applications

VCS Cluster

● VCS Cluster is simple
  ● only one config parameter to turn on
  ● does not rely on app-server-level clustering
  ● “Nuxeo Boxes” are swappable
● VCS Cluster scales well
  ● as long as the backend DB scales!

➔ VCS Cluster is a good solution for both
  ➔ scaling out
  ➔ providing redundancy

Page 13: Nuxeo World Session: Scaling Nuxeo Applications

1 node deployment

[Diagram labels: Nuxeo instance, DB, FS]

Page 14: Nuxeo World Session: Scaling Nuxeo Applications

2 nodes deployment

[Diagram labels: NLB, 2 Nuxeo instances, DB, NAS]

Page 15: Nuxeo World Session: Scaling Nuxeo Applications

3 nodes deployment

[Diagram labels: NLB, 3 Nuxeo instances, DB, NAS]

Page 16: Nuxeo World Session: Scaling Nuxeo Applications

Multi VM deployment

● Nuxeo services can be spread across JVMs
  ● externalize batch processing (mass I/O)
  ● externalize heavy transformations
  ● externalize slow interactions with external apps
  ● …
  ➔ build dedicated processing servers
● Nuxeo services can be coupled with a GRID
  ● integration with GridGain

Page 17: Nuxeo World Session: Scaling Nuxeo Applications

Multi VM: mono node

[Diagram labels: Nuxeo instance, batch, DB, FS]

Page 18: Nuxeo World Session: Scaling Nuxeo Applications

Multi VM: 2 nodes

[Diagram labels: Nuxeo instance, batch Nuxeo instance, DB, FS]

Page 19: Nuxeo World Session: Scaling Nuxeo Applications

Multi VM: GRID

[Diagram labels: Nuxeo instance, 3 batch Nuxeo instances, GRID, DB, FS]

Page 20: Nuxeo World Session: Scaling Nuxeo Applications

Multi VM: perspectives

● Technically each service could be accessed remotely via RMI (on a JEE server)
  ● rendering layer
  ● workflow engine
  ● …
● But it is a little more complex
  ● transaction management
  ● binding configuration

➔ In most cases VCS Cluster is much simpler

Page 21: Nuxeo World Session: Scaling Nuxeo Applications

Multi-Repositories

● A single Nuxeo application can be bound to several repositories
● Each repository
  ● has its own database and cache
  ➔ scale-out solution if the DB is the bottleneck

➔ Useful for data partitioning
  ➔ manage documents with different constraints (e.g. live documents and archives)
  ➔ manage documents for different user groups (e.g. multi-tenant)

Page 22: Nuxeo World Session: Scaling Nuxeo Applications

Multi-Repositories

[Diagram labels: Nuxeo server, Repo1, Repo2, Database1, FS1, Database2, FS2, live documents, archived documents]

Page 23: Nuxeo World Session: Scaling Nuxeo Applications

Scaling Nuxeo Applications

Performance testing

Page 24: Nuxeo World Session: Scaling Nuxeo Applications


Performance testing overview

Page 25: Nuxeo World Session: Scaling Nuxeo Applications

Set up a realistic environment

● Replicate the production environment as closely as possible
● Use historical data when possible
● Populate the database
  ● Custom mass importer tools (nuxeo-platform-importer)
  ● Load generating tools
  ● Feed the database at the SQL level

Page 26: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring

● Performance testing without monitoring: don't bother
● Collect general information:
  ● Hardware metrics: type and number of CPUs, memory, disk usage
  ● Software versions: OS, JVM, middleware, application
  ● Application configuration: nuxeo.conf
  ● Database configuration and database statistics

Page 27: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring CPU

● Is the CPU a bottleneck?
● Is the CPU waiting for I/O?
● Does the system use all the available CPU?
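The three questions above can be answered from two samples of the aggregate `cpu` line in `/proc/stat` on Linux. The sketch below is only an illustration: the function name and the sample values are assumptions, and in practice a tool such as sysstat reports the same shares.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def cpu_shares(before, after):
    """Share of CPU time per state between two 'cpu' lines of /proc/stat."""
    first = [int(v) for v in before.split()[1:1 + len(FIELDS)]]
    second = [int(v) for v in after.split()[1:1 + len(FIELDS)]]
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at two different times")
    return dict(zip(FIELDS, (d / total for d in deltas)))


# Illustrative samples (assumed values, not measurements from the benchmark).
BEFORE = "cpu 1000 0 500 8000 500 0 0 0 0 0"
AFTER = "cpu 1600 0 700 8100 600 0 0 0 0 0"

shares = cpu_shares(BEFORE, AFTER)
print({name: round(value, 2) for name, value in shares.items()})
```

A high `iowait` share points at storage rather than CPU, while a high `idle` share under load suggests the system is not using all the available CPU.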

Page 28: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring Disk

● Is there device saturation?
● Writing or reading operations?

Page 29: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring GC

● Does the JVM spend too much time in the garbage collector?
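One way to answer this question is to sum the pause times recorded in the GC log. The sketch below assumes the classic HotSpot `-verbose:gc` line format with `-Xloggc` timestamps; the exact format varies with JVM version and flags, and the sample lines are made up for illustration.

```python
import re

# Assumed line shape: "<uptime>: [GC <before>K-><after>K(<heap>K), <pause> secs]"
GC_LINE = re.compile(
    r"^(?P<stamp>\d+\.\d+): \[(?P<kind>Full GC|GC) .*?(?P<secs>\d+\.\d+) secs\]"
)


def gc_overhead(lines):
    """Return (total pause in s, number of full GCs, share of uptime spent in GC)."""
    total_pause = 0.0
    full_gcs = 0
    last_stamp = 0.0
    for line in lines:
        match = GC_LINE.match(line.strip())
        if not match:
            continue  # ignore lines that do not look like a GC event
        total_pause += float(match["secs"])
        if match["kind"] == "Full GC":
            full_gcs += 1
        last_stamp = float(match["stamp"])
    share = total_pause / last_stamp if last_stamp else 0.0
    return total_pause, full_gcs, share


# Illustrative log excerpt (assumed values).
SAMPLE = [
    "1.250: [GC 65536K->8192K(251392K), 0.0500000 secs]",
    "10.000: [Full GC 120000K->40000K(251392K), 0.4500000 secs]",
    "20.000: [GC 105536K->48192K(251392K), 0.0500000 secs]",
]

pause, full_gcs, share = gc_overhead(SAMPLE)
print(f"{pause:.2f} s in GC, {full_gcs} full GC, {share:.1%} of uptime")
```

Tools such as gcviewer produce the same kind of summary with charts.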

Page 30: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring web request processor

● How many requests?
● How much input/output?

Page 31: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring web thread pool

● Is the thread pool a bottleneck?

Page 32: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring datasources

● Are there enough connections in the pool?

Page 33: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring JVM

● How many JVM threads?
● After longevity testing, is there a memory leak?

Page 34: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring database

● How much time is spent processing SQL?
● Which query took the most time?

Page 35: Nuxeo World Session: Scaling Nuxeo Applications

Set up the monitoring

● Use your production monitoring (nagios, hyperic, sysstat …)

● Set up the GC log in the nuxeo.conf file:

JAVA_OPTS=$JAVA_OPTS -Xloggc:$DIRNAME/../log/gc.log -verbose:gc -XX:+PrintGC

● Set up the monitor JBoss template in the nuxeo.conf file:

nuxeo.templates=postgresql,monitor

● Use monitorctl.sh:

./jboss/bin/monitorctl.sh

Usage: monitorctl.sh (start|stop|status|heapdump [TAG]|info|vacuumdb|help)

Page 36: Nuxeo World Session: Scaling Nuxeo Applications

Load generating tools

● The application has to cooperate a bit to ease test writing
● Tools
  ● In-house tools
  ● Vendor tools
  ● Open source tools
    – JMeter (GUI script, Java)
    – FunkLoad (Python)

Page 37: Nuxeo World Session: Scaling Nuxeo Applications

Writing test scripts

Try to simulate the expected user actions
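One common way to approximate expected user behaviour is a weighted mix of actions separated by think time. The sketch below only builds such a scenario; it does not send any request. The action names, weights and think-time range are assumptions to be replaced with real usage data, and a tool such as FunkLoad or JMeter would then replay each step.

```python
import random

# Hypothetical action mix: the weights are assumptions, not measured usage.
ACTION_WEIGHTS = {
    "view_document": 70,
    "navigate": 20,
    "create_file": 10,
}


def build_scenario(steps, seed=0, think_time=(5.0, 15.0)):
    """Return a reproducible list of (action, think time in seconds) pairs."""
    rng = random.Random(seed)
    actions = rng.choices(
        list(ACTION_WEIGHTS), weights=list(ACTION_WEIGHTS.values()), k=steps
    )
    return [(action, round(rng.uniform(*think_time), 1)) for action in actions]


scenario = build_scenario(steps=5)
for action, pause in scenario:
    print(f"{action}, then wait {pause} s")
```

Fixing the seed keeps runs comparable from one benchmark to the next.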

Page 38: Nuxeo World Session: Scaling Nuxeo Applications

Performance report

● Speed (response time)
● Throughput (requests/s)
● User satisfaction (Apdex): http://www.apdex.org/
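The Apdex score counts responses at or below a target time T as satisfied, responses between T and 4T as tolerating (weighted one half), and the rest as frustrated. A minimal sketch, with made-up response times and an assumed target of 0.5 s:

```python
def apdex(response_times, threshold):
    """Apdex score: (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Illustrative response times in seconds (assumed values).
TIMES = [0.2, 0.4, 0.3, 0.9, 1.5, 1.9, 2.5, 6.5]

score = apdex(TIMES, threshold=0.5)
print(f"Apdex(T=0.5s) = {score:.2f}")
```

Here 3 samples are satisfied, 3 are tolerating and 2 are frustrated, so the score is (3 + 1.5) / 8.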

Page 39: Nuxeo World Session: Scaling Nuxeo Applications

Monitoring report

● All in one: logchart.py

http://svn.nuxeo.org/nuxeo/tools/qa/logchart/trunk

● Other tools: kSar, pgfouine, gcviewer ...

Page 40: Nuxeo World Session: Scaling Nuxeo Applications

Scaling Nuxeo Applications

Tuning

Page 41: Nuxeo World Session: Scaling Nuxeo Applications


Tuning process

Page 42: Nuxeo World Session: Scaling Nuxeo Applications

JBoss/Tomcat tuning

● JVM heap size (nuxeo.conf):

JAVA_OPTS=$JAVA_OPTS -Xms2g -Xmx2g

● In special cases you may disable soft reference retention (SoftRef):

JAVA_OPTS=$JAVA_OPTS -XX:SoftRefLRUPolicyMSPerMB=0

● Datasource connection pool (nuxeo.conf):

nuxeo.db.max-pool-size=40
nuxeo.vcs.max-pool-size=40

● HTTP or AJP thread pooling:

<Connector port="8080" … maxThreads="32" ... acceptCount="256" .. />
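As a rough sanity check on such pool sizes, Little's law estimates the number of threads busy at once as throughput multiplied by response time. The sketch below is an illustration only: the function name is an assumption, and the limits (32 threads, 40 connections) are simply the example values from the slide, not a recommendation.

```python
def busy_threads(throughput_per_s, response_time_s):
    """Little's law: average number of requests in progress at the same time."""
    return throughput_per_s * response_time_s


MAX_THREADS = 32    # maxThreads from the Connector example
MAX_POOL_SIZE = 40  # nuxeo.db.max-pool-size from the example

# Assumed workload: 25 requests/s with a 0.6 s average response time.
in_flight = busy_threads(25, 0.6)

print(f"about {in_flight:.0f} requests in progress")
print("thread pool large enough:", in_flight <= MAX_THREADS)
print("one DB connection per busy thread fits:", in_flight <= MAX_POOL_SIZE)
```

If the estimate approaches `maxThreads`, requests start queuing (up to `acceptCount`) and response times grow, which is what the thread pool monitoring is meant to reveal.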

Page 43: Nuxeo World Session: Scaling Nuxeo Applications

Database tuning

● Read the Nuxeo KB: https://doc.nuxeo.com/display/KB/Configuring+PostgreSQL
● EXPLAIN ANALYZE helper: http://explain.depesz.com
● Check for missing indexes on custom schemas
● PostgreSQL performance mailing list: mailto:[email protected]

Page 44: Nuxeo World Session: Scaling Nuxeo Applications

Scaling Nuxeo Applications

Benchmarking results

Page 45: Nuxeo World Session: Scaling Nuxeo Applications

Mass injection

● Using load generating tools: from 3 to 10 doc/s
  works fine up to 100k docs (otherwise it takes too much time)
● Using nuxeo-platform-importer: from 30 to 100 doc/s
  works fine up to 1M docs (otherwise it takes too much time)
● Using SQL injection: from 1000 to 3000 doc/s
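The "takes too much time" limits become concrete once the rates are turned into durations. A small worked calculation, using only the rates quoted above (the 10M-document volume for SQL injection is taken from the benchmark slide and is an assumption here):

```python
def import_hours(doc_count, docs_per_second):
    """Hours needed to inject doc_count documents at a steady rate."""
    return doc_count / docs_per_second / 3600


# (method, document count, slowest rate, fastest rate) in doc/s
CASES = [
    ("load generating tools", 100_000, 3, 10),
    ("nuxeo-platform-importer", 1_000_000, 30, 100),
    ("SQL injection", 10_000_000, 1000, 3000),
]

for method, count, slow, fast in CASES:
    best = import_hours(count, fast)
    worst = import_hours(count, slow)
    print(f"{method}: {count} docs in {best:.1f} to {worst:.1f} h")
```

With these figures, each method reaches its quoted limit after roughly 3 to 9 hours of injection.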

Page 46: Nuxeo World Session: Scaling Nuxeo Applications

Document retrieval and insertion operations

● Nuxeo DM 5.4 / Tomcat / Sun JVM 6 (3 GB heap)
● 10M documents, 1 TB of data
● Dell PE 2900, 2 x quad-core, 20 GB RAM / Linux
● http://public.dev.nuxeo.com/~ben/bench-10m-tomcat/

Operation                       Speed (s)  Throughput (req/s)  Extrapolation (VU)
JSF view of a random document   0.6        25                  250
JSF view of a cached document   0.2        30                  300
Web Engine navigation           0.1        100                 1000
JSF creating a new file         0.8        16                  160
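In every row the extrapolated number of virtual users (VU) equals the throughput multiplied by 10. One reading, which is an inference and not stated on the slide, is that each virtual user is assumed to send one request every 10 seconds. The sketch below checks that relation and shows how the figure changes under a different pacing assumption.

```python
# (operation, speed in s, throughput in req/s, extrapolated VU) from the table.
ROWS = [
    ("JSF view of a random document", 0.6, 25, 250),
    ("JSF view of a cached document", 0.2, 30, 300),
    ("Web Engine navigation", 0.1, 100, 1000),
    ("JSF creating a new file", 0.8, 16, 160),
]


def virtual_users(throughput_per_s, seconds_between_requests):
    """Users sustainable if each one sends a request every given interval."""
    return throughput_per_s * seconds_between_requests


# Assumed pacing inferred from the table: one request per user every 10 s.
matches = [virtual_users(tput, 10) == vu for _, _, tput, vu in ROWS]
print("throughput x 10 matches every row:", all(matches))

# More active users (one request every 5 s) halve the supported population.
print("random document view, 5 s pacing:", virtual_users(25, 5), "VU")
```

This is why the earlier question "are users hyper-active?" matters: the same throughput supports very different user counts depending on pacing.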

Page 47: Nuxeo World Session: Scaling Nuxeo Applications


Thank you!