Post on 27-Jun-2020

Building a high performance directory server in Java

Lessons learned and tips from the OpenDS project.

Ludovic Poitou

Matthew Swift

Sun Microsystems

#118

AGENDA

> Introduction to the OpenDS project

> Architecture, Design Patterns and Tips

> Experiences with Sun JVM

> Conclusion

The OpenDS Project

> Released as open source in July 2006

– Licensed under the CDDL

– Source code at https://opends.dev.java.net/

> Sponsored by Sun Microsystems

> Written in Java by LDAP experts

What is it?

> OpenDS is a Java-based server supporting the LDAPv3 protocol and services

– Object-oriented, hierarchical data model

– CRUD operations

> It comes with its own embedded database

– Based on Berkeley DB Java Edition

– Not accessible from outside

> It has all the security, access control and password management features needed to safely store information about users

What is it for?

> Generic object oriented data store

> White pages and Email Address Book

> Mostly the data store for Identities

– For Authentication and Authorization

– For profiles and personalization

> The underlying infrastructure in all Enterprises

– Leveraged by Web and Mail infrastructure products

– Cornerstone of Identity Management products: Access Management and Federation, Provisioning and De-provisioning tools

Who is it for?

> Telecom service providers and financial institutions use LDAP directories for customer-related services (portals)

– Storing customers' identities, phone numbers and associated services

– Building highly available services for 10 million, up to 200 million users

> But OpenDS can also be used as an OS naming service, or for SMB

– OpenSolaris, Solaris, Linux...

– Coupled with Samba, as a domain controller

– Integrated with Kerberos

– White Pages...

> And being 100% pure Java, OpenDS can be embedded in other Java applications or Web applications

– OpenSSO

OpenDS 2.2

> Released in December 2009

> Fully standards-compliant LDAPv3 directory server

– Supports many LDAP standard and experimental extensions

– Supports Multi Master Replication with 3 different levels of data consistency

– Extensive security features

> Improved performance and reliability over OpenDS 1.0

> Installs in 6 clicks and less than 3 minutes

> Several GUI and CLI tools to manage and monitor the OpenDS server

> Extensive documentation

> Localized in 6 different languages

Performance characteristics

> As for most servers, scalability is extremely important

– Up to hundreds of millions of entries

– Up to thousands of connections

– Maximize use of CPUs

> What is the operations throughput?

> What is the average response time? The maximum response time?

> Our basic test

– 10 million entries, with an average size of 2.6 KB

– 2 servers, with Multi-Master replication between them

Searchrate on Sun x4170 box

Modrate on a Sun x4170 box

AGENDA

> Introduction to the OpenDS project

> Architecture, Design Patterns and Tips

> Experiences with Sun JVM

> Conclusion

How to reach those results?

> 2 main aspects

– Architecture and code

– Runtime: JVM and Garbage Collector optimization

> There is a strong relationship between code design and memory optimization

Architecture Overview

Patterns

> Use of Asynchronous I/O

– Except for disk write transactions

> Use of Immutable Objects

– Intrinsic thread safety

– Avoid need for defensive copies

> Use of “Factories” over Constructors

– Can avoid creating a new object

– Ease optimization for common cases

– Example: most AttributeDescriptions have 0 options

– Example: attributes generally have 1 value

– Well suited to immutable objects
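
The factory and immutability points above can be sketched together. This is a minimal illustration, not the actual OpenDS code: the class name `AttrDescription`, its `valueOf` method and the unbounded cache are all assumptions made for the example. The factory returns a shared instance for the common case of zero options, which a constructor could not do.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class for illustration; not the OpenDS AttributeDescription.
final class AttrDescription {
    // Shared instances for the common case: a description with no options.
    // Unbounded here for brevity; a real server would bound or pre-populate it.
    private static final Map<String, AttrDescription> NO_OPTIONS = new ConcurrentHashMap<>();

    private final String attributeType;
    private final Set<String> options; // always unmodifiable

    private AttrDescription(String attributeType, Set<String> options) {
        this.attributeType = attributeType;
        this.options = options;
    }

    // Static factory: may return an existing instance instead of allocating.
    static AttrDescription valueOf(String attributeType, String... options) {
        String type = attributeType.toLowerCase(Locale.ROOT);
        if (options.length == 0) {
            return NO_OPTIONS.computeIfAbsent(type,
                    t -> new AttrDescription(t, Collections.<String>emptySet()));
        }
        Set<String> copy =
                Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList(options)));
        return new AttrDescription(type, copy);
    }

    String attributeType() { return attributeType; }

    // Safe to hand out without a defensive copy: the set cannot be modified.
    Set<String> options() { return options; }
}
```

Because instances are immutable, the shared objects can be used from any thread without locking, and `valueOf("cn")` and `valueOf("CN")` return the same instance.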

Patterns

> Producers / Consumers

– Queues

– Thread Pool

– Monitors

> Strategies

– Queue strategies: ConcurrentLinkedQueue vs LinkedBlockingQueue
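
A minimal sketch of the producer/consumer pattern with a queue and a thread pool, assuming a bounded `LinkedBlockingQueue` and a "poison pill" object to stop the workers. The class `WorkQueueSketch` and its sizes are illustrative and do not reflect the OpenDS work queue implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; names and sizes are assumptions.
final class WorkQueueSketch {
    // Sentinel telling a worker to stop; compared by reference.
    private static final Runnable POISON = () -> { };

    /** Pushes the given number of operations through the queue; returns how many ran. */
    static int run(int operations, int workers) throws InterruptedException {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>(1024);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Consumers: block on take() until work (or the sentinel) arrives.
        for (int i = 0; i < workers; i++) {
            pool.execute(() -> {
                try {
                    for (Runnable op = queue.take(); op != POISON; op = queue.take()) {
                        op.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Producer: put() blocks when the queue is full, which applies back-pressure.
        for (int i = 0; i < operations; i++) {
            queue.put(completed::incrementAndGet);
        }
        for (int i = 0; i < workers; i++) {
            queue.put(POISON);
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return completed.get();
    }
}
```

Swapping in a `ConcurrentLinkedQueue` would remove the blocking: it is unbounded and lock-free, so consumers have to poll and decide for themselves what to do when it is empty.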

Anti-Patterns

> String concatenation

– Make sure to use a StringBuilder

– The compiler now optimizes simple “Aaa” + “Bbb” concatenation

> Avoid very long methods (thousands of lines of code)

> Avoid exposing the concrete representation of an object

– Expose Set rather than LinkedHashSet

– Not a performance issue, but will require more work when optimizing code for performance later

> Try to define only the methods you need.
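
As an illustration of the string concatenation point, the sketch below joins name components in a loop both ways. The class `DnJoiner` and its methods are made up for the example; the two methods return the same result, but the first allocates a new String on every `+=`.

```java
import java.util.List;

// Hypothetical helper used only to contrast the two approaches.
final class DnJoiner {
    // Anti-pattern: every += builds a new String and copies the previous content.
    static String joinSlow(List<String> rdns) {
        String dn = "";
        for (String rdn : rdns) {
            if (!dn.isEmpty()) {
                dn += ",";
            }
            dn += rdn;
        }
        return dn;
    }

    // Preferred: a single StringBuilder grows in place.
    static String join(List<String> rdns) {
        StringBuilder sb = new StringBuilder();
        for (String rdn : rdns) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(rdn);
        }
        return sb.toString();
    }
}
```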

Java Collections

> Vector and Hashtable are synchronized for all methods

– Pay the price even if not necessary

> Some Java collection classes are not synchronized by default

– ArrayList, LinkedList replace Vector

– HashSet, HashMap replace Hashtable

> To synchronize, wrap the collection with a “static factory”

– Collections.synchronizedList(new ArrayList())

> ConcurrentHashMap, for concurrency

– But take care when using the iterator: it is weakly consistent and may not reflect concurrent updates
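
A short sketch of the two iterator caveats, using only standard `java.util` classes; the wrapper class `CollectionsSketch` and its method names are illustrative. Iterating a synchronized wrapper still needs an explicit lock on the wrapper, while a `ConcurrentHashMap` iterator tolerates modification during traversal and never throws `ConcurrentModificationException`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical example class.
final class CollectionsSketch {
    // The static factory wraps an unsynchronized list.
    static List<Integer> newSynchronizedList() {
        return Collections.synchronizedList(new ArrayList<Integer>());
    }

    // Individual calls are synchronized, but iteration is a sequence of calls:
    // the caller must hold the wrapper's lock for the whole traversal.
    static int sum(List<Integer> synchronizedList) {
        int total = 0;
        synchronized (synchronizedList) {
            for (int value : synchronizedList) {
                total += value;
            }
        }
        return total;
    }

    // Removing during iteration is allowed on a ConcurrentHashMap. The iterator
    // is weakly consistent: it may or may not see updates made by other threads.
    static void removeAllWhileIterating(ConcurrentHashMap<String, Integer> map) {
        for (String key : map.keySet()) {
            map.remove(key);
        }
    }
}
```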

Critical Sections

> Try to minimise the code and time spent in the critical sections

> But the throughput is limited by the time spent in the longest critical section

– Example: LinkedBlockingQueue: 200 000 operations on an x64 processor, 20 000 operations on the T2000 processor

– We use it for the WorkQueue and Access Logs
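
One way to keep a critical section short is to do all allocation and formatting before taking the lock. The sketch below applies that to a simplified in-memory access log; the class `AccessLogSketch`, its fields and the message format are assumptions for illustration, not the OpenDS logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified access log buffer.
final class AccessLogSketch {
    private final Object lock = new Object();
    private final List<String> buffer = new ArrayList<>();

    void log(String operation, String dn, long etimeMs) {
        // Formatting and allocation happen outside the lock.
        String line = new StringBuilder(64)
                .append(operation)
                .append(" dn=\"").append(dn).append('"')
                .append(" etime=").append(etimeMs)
                .toString();
        // The critical section is reduced to a single list append.
        synchronized (lock) {
            buffer.add(line);
        }
    }

    // Called by a single writer thread to flush what has accumulated.
    List<String> drain() {
        synchronized (lock) {
            List<String> out = new ArrayList<>(buffer);
            buffer.clear();
            return out;
        }
    }
}
```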

Caching data

> Using caches reduces disk access and thus should provide better performance

> But cache eviction adds pressure on the GC

– When modifying entries

– When the cache is too small to hold all data

> A cache is also a contention point

> If you want to cache objects, make sure you cache those that will be reused.

> Alternatively, use a “thread local” cache.

– But watch out for the cost (with 1000 threads?)
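
A minimal sketch of a thread-local cache, assuming a small LRU map per thread built on `LinkedHashMap`. The class `ThreadLocalLru` is hypothetical. It removes the contention point, since each thread only touches its own map, but memory use is multiplied by the number of threads, which is the cost the slide warns about.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-thread LRU cache; no locking is needed because each
// thread reads and writes only its own map.
final class ThreadLocalLru<K, V> {
    private final ThreadLocal<Map<K, V>> perThread;

    ThreadLocalLru(final int maxEntriesPerThread) {
        this.perThread = ThreadLocal.withInitial(() ->
                // accessOrder = true makes the map iterate least-recently-used first.
                new LinkedHashMap<K, V>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                        return size() > maxEntriesPerThread;
                    }
                });
    }

    // Returns the cached value, loading and caching it on a miss.
    V get(K key, Function<? super K, ? extends V> loader) {
        return perThread.get().computeIfAbsent(key, loader);
    }

    int sizeForCurrentThread() {
        return perThread.get().size();
    }
}
```

With 1000 threads and 100 entries per thread, up to 100 000 entries can be held at once, many of them duplicates, so this only pays off for small, frequently reused objects.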

Server monitoring

> Getting statistics for a server is mandatory

> Beware of contention

– Stats are updated frequently

– But seldom read

> A strategy could be to keep per-thread statistics and collect them on demand

– Not yet implemented in OpenDS!
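
The per-thread statistics idea could look like the sketch below. It is an illustration only (the slide notes this was not implemented in OpenDS), and the class `PerThreadCounter` is hypothetical. Each thread updates its own cell, so updates never contend, and the rare reader sums all cells.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter: frequent uncontended writes, infrequent aggregated reads.
final class PerThreadCounter {
    // All cells ever created, so that a reader can aggregate them.
    private final Queue<AtomicLong> cells = new ConcurrentLinkedQueue<>();

    // Each thread lazily registers its own cell on first use.
    private final ThreadLocal<AtomicLong> local = ThreadLocal.withInitial(() -> {
        AtomicLong cell = new AtomicLong();
        cells.add(cell);
        return cell;
    });

    // Hot path: touches only the calling thread's cell.
    void increment() {
        local.get().incrementAndGet();
    }

    // Cold path: the result is a moving snapshot, not an atomic total.
    long sum() {
        long total = 0;
        for (AtomicLong cell : cells) {
            total += cell.get();
        }
        return total;
    }
}
```

On Java 8 and later, `java.util.concurrent.atomic.LongAdder` provides a similar striped counter out of the box.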

AGENDA

> Introduction to the OpenDS project

> Architecture, Design Patterns and Tips

> Experiences with tuning Sun JVM

> Conclusion

Performance Tuning

> When dealing with performance, you should consider the whole system

– Java VM

– OS

– Hardware : CPU, Memory, Disks, Network...

> In our case, we try to avoid disk I/Os

– And try to cache as much of the database as possible

> We also want deterministic response times

– Avoid any Full GC (Stop The World)

– Make sure minor GC pauses are as small as possible

JVM Tuning for OpenDS

> Super Size The Heap!

– We use 32 GB heaps, sometimes up to 96 GB

– 2 GB for the New Generation (or ¼ of the heap if < 8 GB)

– -Xms32768M -Xmx32768M -Xmn2048M

> Use CMS

– -XX:+UseConcMarkSweepGC

– -XX:+UseParNewGC

> -XX:MaxTenuringThreshold=1

– Avoids repeated copying of objects within the New Gen
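
The sizing rule above is simple arithmetic, sketched here as a small helper that derives the option string from a heap size. The class `HeapFlags` and its methods are hypothetical; only the flags themselves come from the slide.

```java
// Hypothetical helper that applies the slide's sizing rule.
final class HeapFlags {
    // New generation: 2 GB, or a quarter of the heap when the heap is under 8 GB.
    static long newGenMb(long heapMb) {
        return heapMb < 8 * 1024 ? heapMb / 4 : 2048;
    }

    // Fixed-size heap (Xms == Xmx), CMS with ParNew, and early tenuring.
    static String forHeapMb(long heapMb) {
        return String.format(
                "-Xms%dM -Xmx%dM -Xmn%dM -XX:+UseConcMarkSweepGC -XX:+UseParNewGC"
                        + " -XX:MaxTenuringThreshold=1",
                heapMb, heapMb, newGenMb(heapMb));
    }

    public static void main(String[] args) {
        System.out.println(forHeapMb(32768));
    }
}
```

For a 32 GB heap this reproduces the `-Xms32768M -Xmx32768M -Xmn2048M` line from the slide; for a 4 GB heap it yields a 1 GB new generation.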

Some interesting JVM Options

> -XX:CMSInitiatingOccupancyFraction=70

– Define the amount of occupancy in Old Gen before starting to collect

– Larger = better throughput but higher full GC risk

> -XX:+UseCompressedOops

– For 64-bit JVMs with less than 32 GB of heap

– Will be the default in coming Java 6 updates

> If running on a processor with a NUMA architecture

– -XX:+UseNUMA

> -XX:+AggressiveOpts

– Enables aggressive JIT optimizations, not related to GC
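
To check what a running JVM was actually given, the standard `java.lang.management` API can report the heap limit, the startup flags and the active collectors. The sketch below also works out where an occupancy fraction puts the CMS trigger, assuming the old generation is the heap minus the new generation. The class `JvmReport` and its method names are illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical startup report.
final class JvmReport {
    // Old generation occupancy (in MB) at which CMS would start a cycle,
    // assuming old gen = heap - new gen.
    static long cmsTriggerMb(long heapMb, long newGenMb, int occupancyPercent) {
        return (heapMb - newGenMb) * occupancyPercent / 100;
    }

    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        lines.add("max heap MB: " + Runtime.getRuntime().maxMemory() / (1024 * 1024));
        lines.add("flags: " + ManagementFactory.getRuntimeMXBean().getInputArguments());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add("collector: " + gc.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
        // 32 GB heap, 2 GB new gen, fraction of 70
        System.out.println("CMS trigger MB: " + cmsTriggerMb(32768, 2048, 70));
    }
}
```

With the 32 GB heap and 2 GB new generation used earlier, a fraction of 70 means a CMS cycle starts once about 21 GB (21504 MB) of the 30 GB old generation is occupied.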

The Garbage First (G1) GC

> Introduced in the Java HotSpot VM in JDK 7.

> An experimental version of G1 has also been available since Java SE 6 Update 14.

> G1 is the long-term replacement for HotSpot's low-latency Concurrent Mark-Sweep (CMS) GC

> Should be officially supported with Java SE 6 Update 21

G1 Characteristics

> Future CMS Replacement

– Server “Style” Garbage Collector

– Parallel, Concurrent

– Generational

– Good Throughput

– Compacting

– Improved ease-of-use

– Predictable (though not hard real-time)

> The main stages consist of remembered set (RS) maintenance, concurrent marking, and evacuation pauses.

JVM Options With G1

> -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC

> Pause time (hints only: a goal with no promise; otherwise use Java Real-Time)

– -XX:MaxGCPauseMillis=50 (target of 50 milliseconds)

– -XX:GCPauseIntervalMillis=1000 (target of 1000 milliseconds)

> Generation Size

– -XX:G1YoungGenSize=512m (for a 512 MB young gen)

> Parallelism

– -XX:+G1ParallelRSetUpdatingEnabled

– -XX:+G1ParallelRSetScanningEnabled
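
Since the pause-time options are only goals, it helps to compare them with what the JVM reports. The sketch below uses the standard `GarbageCollectorMXBean` counters to compute an average collection time per collection. The class `GcPauseCheck` is hypothetical, and the figure is a rough indicator only: the reported time is approximate and, depending on the collector, may include work that did not stop the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical check to run periodically or at shutdown.
final class GcPauseCheck {
    /** Average collection time in milliseconds across all collectors, or 0 if none ran. */
    static double averageCollectionMillis() {
        long count = 0;
        long timeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount(); // -1 when undefined for this collector
            if (c > 0) {
                count += c;
                timeMs += Math.max(0, gc.getCollectionTime());
            }
        }
        return count == 0 ? 0.0 : (double) timeMs / count;
    }

    public static void main(String[] args) {
        double targetMs = 50.0; // same value as the MaxGCPauseMillis example
        double avg = averageCollectionMillis();
        System.out.println("average collection ms: " + avg
                + (avg > targetMs ? " (above target)" : " (within target)"));
    }
}
```

An average hides outliers, so GC logs remain the better source when the maximum response time matters.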

OpenDS and G1

> Goal: Avoid any Full GC, best control of pauses

> Collaboration between the HotSpot and the OpenDS teams

– OpenDS is used as a “Large” reference application

– Between 10 and 20 enhancements integrated in G1 following the tests

– Performance with large heaps improved by a factor of 10

> We're still discovering it

– When doing read operations, we see pauses between 10 and 20 ms with a 32 GB JVM

– But we're still seeing Full GCs when doing write operations (more garbage, more stress on the Old Gen)

– Hopefully this will be resolved in upcoming builds

OpenDS G1 and Searches

AGENDA

> Introduction to the OpenDS project

> Architecture, Design Patterns and Tips

> Experiences with Sun JVM

> Conclusion

Summary

> OpenDS

– An open source LDAP directory server, 100% pure Java

– Easy to install and use

– Designed for high performance and high scalability

> We saw some patterns and tips used in the OpenDS project

> Knowing and understanding the JVM and GC is required to build a high performance server

– Tuning JVM and GC is an art

– Performance engineering is a profession

> Who said that Java is slow?!

The Art of GC Tuning

http://developers.sun.com/learning/javaoneonline/j1sessn.jsp?sessn=TS-4887&yr=2009&track=javase

JavaOne Presentation: GC Tuning in HotSpot JVM

Now...

> Give OpenDS a try

– http://www.opends.org

> Join our community:

– Join/Log in on Java.net

– http://opends.dev.java.net

– Request a Role

– Subscribe to the mailing lists

– IRC: #opends on freenode.net

> OpenDS is localized in several languages. Localization is community-based and done through online tools: an easy way to participate.

Ludovic Poitou blogs.sun.com/Ludo

Sun Microsystems Ludovic.Poitou@sun.com

Matthew Swift

Sun Microsystems Matthew.Swift@sun.com