Impact2014 session # 1523 performance optimization using ibm java on z and wa sz v8.5.5 final 04...

© 2014 IBM Corporation Performance optimization using IBM Java on z/OS and WebSphere Application Server on z/OS V8.5.5 Marcel Mitran IBM Senior Technical Staff Member Chief Architect Java on System z Email: [email protected] Elena Nanos Health Care Service Corporation Lead Systems Architect Email: [email protected] Session: #1523

description

IMPACT 2014 ACU-1523: Performance Optimization Using IBM Java on z/OS & IBM WebSphere Application Server on z/OS V8.5.5 I was a guest speaker at IBM IMPACT 2014 conference. This session outlines how to optimize the performance of IBM WebSphere Application Server on z/OS applications, reduce CPU utilization, and take advantage of the latest zEC12 enhancements. IBM continues its efforts and investments in its Java Virtual Machine on IBM System z. zEC12 hardware packs an awesome performance punch with second-generation, out-of-order pipeline design, large caches, and 5.5 GHz hex-core processor. With the exploitation of new features, IBM Java Runtime Environment continues a long history of aggressive vertical integration on IBM System z. Come hear how HCSC is taking advantage of the latest IBM WebSphere Application Server and Java releases and enhancements. This presentation covers installation of Java V6.1, V7.0, and V7.1 with IBM WebSphere Application Server on z/OS V8.5.5 and exploitation of 1 Meg large pages with zEC12 Flash Express and IBM zEnterprise Data Compression with z/OS V2.1. Benchmark performance data is presented

Transcript of Impact2014 session # 1523 performance optimization using ibm java on z and wa sz v8.5.5 final 04...

Page 1: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

© 2014 IBM Corporation

Performance optimization using IBM Java on z/OS and WebSphere

Application Server on z/OS V8.5.5

Marcel Mitran, IBM Senior Technical Staff Member, Chief Architect, Java on System z. Email: [email protected]

Elena Nanos, Health Care Service Corporation, Lead Systems Architect. Email: [email protected]

Session: #1523

Page 2: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Please Note

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.

Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.

The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.


Page 3: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Java Road Map

Language Updates

Java 5.0
• New language features:
- Autoboxing
- Enumerated types
- Generics
- Metadata

Java 6.0
• Performance improvements
• Client WebServices support

Java 7.0
• Support for dynamic languages
• Improve ease of use for SWING
• New IO APIs (NIO2)
• Java persistence API
• JMX 2.x and WS connection for JMX agents
• Language changes

Java 8.0**
• Language improvements
• Closures for simplified fork/join

IBM Java Runtimes

IBM Java 5.0 (J9 R23)
• Improved performance
- Generational Garbage Collector
- Shared classes support
• New J9 Virtual Machine
• New Testarossa JIT technology
• First Failure Data Capture
• Full Speed Debug
• Hot Code Replace
• Common runtime technology
- ME, SE, EE

IBM Java 6.0 (J9 R24)
• Improvements in
- Performance
- Serviceability tooling
- Class Sharing
• XML parser improvements
• z10™ Exploitation
- DFP exploitation for BigDecimal
- Large Pages
- New ISA features

IBM Java 6.0.1/Java 7 (J9 R26)
• Improvements in
- Performance
- GC Technology
• z196™ Exploitation
- OOO Pipeline
- 70+ New Instructions
• JZOS/Security Enhancements

IBM Java 7 (J9 R26 SR3+)
• Improvements in
- Performance
• zEC12™ Exploitation
- Transactional Execution
- Flash 1Meg pageable LPs
- 2G large pages
- Hints/traps

IBM Java 7R1 (J9 R27)
• Improvements in
- Performance
- RAS
- Monitoring
• zEC12™ Exploitation
- zEDC for zip acceleration
- SMC-R integration
- Transactional Execution
- Runtime instrumentation
- Hints/traps
• Data Access Accelerator

[Timeline, 2005 to 2014: Java SE 5.0 (18 platforms), SE 6.0 (20 platforms), SE 6.0.1/7.x (>= 20 platforms); Java EE 5, EE 6.x; WAS 6.0, WAS 6.1, WAS 7.0, WAS 8.5]

**Timelines and deliveries are subject to change.

Session #1523 - Performance optimization using IBM Java on z/OS and WAS on z/OS V8.5.5


Page 4: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

zEC12 – More Hardware for Java

Continued aggressive investment in Java on Z

Significant set of new hardware features tailored and co-designed with Java

• Hardware Transaction Memory (HTM)
- Better concurrency for multi-threaded applications
- e.g., ~2x improvement to juc.ConcurrentLinkedQueue
• Run-time Instrumentation (RI)
- Innovative new hardware facility designed for managed runtimes
- Enables new expanse of JRE optimizations
• 2GB page frames
- Improved performance targeting 64-bit heaps
• Pageable 1M large pages with Flash Express
- Better versatility of managing memory
• Shared-Memory-Communication
- RDMA over Converged Ethernet
• zEnterprise Data Compression accelerator
- gzip accelerator
• New software hints/directives/traps
- Branch preload improves branch prediction
- Reduce overhead of implicit bounds/null checks
• New 5.5 GHz 6-core processor chip
- Large caches to optimize data serving
- Second-generation OOO design

Up to 60% improvement in throughput among Java workloads measured with zEC12 and IBM Java 7

Engineered Together: IBM Java and zEC12 Boost Workload Performance

http://www.ibmsystemsmag.com/mainframe/trends/whatsnew/java_compiler/

Page 5: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

z/OS IBM Java 7: 16-Way Performance

Aggregate HW and SDK Improvement: z9 IBM Java 5 to zEC12 IBM Java 7

(Controlled measurement environment, results may vary)

~12x aggregate hardware and software improvement comparing IBM Java 5 on z9 to IBM Java 7 on zEC12

LP = Large Pages for Java heap; CR = Java compressed references

Java7SR3 using -Xaggressive + Flash Express pageable 1Meg large pages

[Chart: z/OS Multi-Threaded 64 bit Java Workload 16-Way, ~12x Improvement in Hardware and Software. X axis: Threads (1 to 32); Y axis: Normalized Throughput (0 to 160). Series: zEC12 SDK 7 SR3 Aggressive + LP Code Cache; zEC12 SDK 7 SR1; z196 SDK 7 SR1; z196 SDK 6 SR8; z10 SDK 6 SR4; z10 SDK 6 GM NO (CR or Heap LP); z9 Java 5 SR5 NO (CR or Heap LP)]

Page 6: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

WAS on z/OS – DayTrader Aggregate HW, SDK and WAS Improvement: WAS 6.1 (IBM Java 5) on z9 to WAS 8.5 (IBM Java 7) on zEC12

(Controlled measurement environment, results may vary)

6x aggregate hardware and software improvement comparing WAS 6.1 IBM Java5 on z9 to WAS 8.5 IBM Java7 on zEC12


Page 7: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

IBM SDK for z/OS, Java Tech. Edition, Version 7 Release 1

Expand zEC12/zBC12 exploitation

• More TX, instruction scheduler, traps, branch preload

• Runtime instrumentation exploitation

• zEDC exploitation through java/util/zip

• Integration of SMC-R

Improved native data binding - Data Access Accelerator

• Integrated with JZOS native record binding framework

Improved general performance/throughput

• Up-to 19% improvement to throughput (ODM)

• Up-to 2.4x savings in CPU-time for record parsing batch application

Improved WLM capabilities

Improved SAF and cryptography support

Additional reliability, availability, and serviceability (RAS) enhancements

Enhanced monitoring and diagnostics

http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?infotype=AN&subtype=CA&htmlfid=897/ENUS213-498&appname=USN


Page 8: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Java-based Store, Inventory and Point-of-Sale App and IBM Java 7R1

• 10% improvement to Java-based Inventory and Point-of-Sale application with IBM Java 7R1 compared to IBM Java 7

(Controlled measurement environment, results may vary)

[Chart: Java Store, Inventory and Point-Of-Sale Application, zEC12 16-way. Y axis: Normalized Maximum Operations per second (0 to 1.2); bars: IBM Java 7, IBM Java 7R1]

Page 9: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

IBM Operational Decision Manager and IBM Java 7R1

• 19% improvement to ODM with IBM Java 7R1 compared to IBM Java 7

(Controlled measurement environment, results may vary)

[Chart: IBM Operational Decision Management, zEC12 16-way. Y axis: Throughput (Normalized to IBM Java 7 SR4) (0 to 1.4); bars: IBM Java 7, IBM Java 7R1]

Page 10: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Store your Data - zEnterprise Data Compression and IBM Java 7R1

** IDC: The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East

With IBM Java 7R1 :

Java application to compress files using java.util.zip.GZIPOutputStream class

Up to 91% reduction in CPU time using zEDC hardware versus zlib software

Up to 74% reduction in Elapsed time (not shown)

Compression ratio up-to ~5x

• Every day over 2000 petabytes of data are created
• Between 2005 and 2020, the digital universe will grow by 300x, going from 130 to 40,000 exabytes**
• 80% of the world's data was created in the last two years alone

(Controlled measurement environment, results may vary)
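
The measurement above compresses data through the java.util.zip.GZIPOutputStream class. A minimal sketch of that kind of code follows, assuming in-memory data rather than files. The class and method names (GzipSketch, gzip, gunzip) are illustrative and are not taken from the benchmark. Nothing in the code selects zEDC: whether a request is served by zlib software or by the accelerator is decided by the runtime and the system configuration, not by the application.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative only: names are hypothetical, not from the benchmark on the slide.
class GzipSketch {

    // Compress a byte array with the standard gzip stream class.
    static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(sink)) {
            gz.write(input);
        } // closing the stream writes the gzip trailer
        return sink.toByteArray();
    }

    // Decompress, used here only to check the round trip.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Build repetitive, record-like sample data (an assumption for the demo).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("RECORD,").append(i % 100).append(",OK\n");
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.US_ASCII);
        byte[] compressed = gzip(input);
        System.out.println("input bytes:      " + input.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    "
                + Arrays.equals(input, gunzip(compressed)));
    }
}
```

Because the application only touches the standard API, the same code can be measured on both the software and the hardware path without source changes.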

What is it?

• zEDC Express is an IO adapter that does high performance industry standard compression
• Used by z/OS Operating System components, IBM Middleware and ISV products
• Applications can use zEDC via industry standard APIs (zlib and Java)
• Each zEDC Express sharable across 15 LPARs, up to 8 devices per CEC.
• Raw throughput up to 1 GB/s per zEDC Express Hardware Adapter

[Chart: CPU Time for Software versus zEDC Hardware Compression, using the java.util.zip.GZIPOutputStream class. X axis: compressed data files (Public Domain Books, SVC Dump, SMF Data); Y axis: CPU Time (0 to 8,000); series: zlib Software, zEDC Hardware]

[Chart: Size of Compressed Data, Software versus zEDC Hardware, using the java.util.zip.GZIPOutputStream class. X axis: compressed data files (Public Domain Books, SVC Dump, SMF Data); Y axis: File Size in MegaBytes (0 to 120); series: Input File Size, zlib Software, zEDC Hardware]

Page 11: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Move your Data - Shared Memory Communications

RDMA enables a host to read or write directly from/to a remote host’s memory without involving the remote host’s CPU

SMC-R automatically/transparently exploits RDMA/RoCE for sockets based TCP applications

WebSphere to DB2 communications using SMC-R

• 40% reduction in overall transaction response time, as seen from the client’s perspective
• Small data sizes ~ 100 bytes

(Controlled measurement environment, results may vary)

[Diagram labels: z/OS SYSA, z/OS SYSB, SMC-R, RoCE; WAS Liberty, TradeLite, DB2; JDBC/DRDA, 3 per HTTP connection; Linux on x, Workload Client Simulator (JIBE); HTTP/REST, 40 concurrent TCP/IP connections; TCP/IP]

Page 12: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Transform your Data - Data Access Accelerator in IBM Java 7R1

A Java library for bare-bones data conversion and arithmetic

Operates directly on byte array

No Java object tree created

Orchestrated with JIT for deep platform opt.

Avoids expensive Java object instantiation

Library is platform and JVM-neutral

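Current Approach:

byte[] addPacked(byte[] a, byte[] b) {
    // Unpack both operands into BigDecimal objects (helper names as on the slide)
    BigDecimal aBd = convertPackedToBd(a);
    BigDecimal bBd = convertPackedToBd(b);
    // BigDecimal is immutable, so the result of add() must be kept
    BigDecimal sum = aBd.add(bBd);
    return convertBDtoPacked(sum);
}

Proposed Solution:

byte[] addPacked(byte[] a, byte[] b) {
    // Slide shorthand for the DAA packed decimal add; the sum is written into a
    DAA.addPacked(a, b);
    return a;
}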
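
To make the cost of the "current approach" concrete, here is a self-contained sketch of what the conversion helpers have to do in plain Java. It is not the DAA library and not the JZOS code; the class and method names are hypothetical. It assumes the common packed decimal layout (two BCD digits per byte, with the last nibble holding the sign: 0xC positive, 0xD negative, 0xF unsigned) and integer values only, so it uses BigInteger and ignores scale.

```java
import java.math.BigInteger;

// Hypothetical illustration of the object-based approach; not the DAA API.
class PackedDecimalSketch {

    // Decode packed decimal bytes into a BigInteger (allocates objects).
    static BigInteger unpack(byte[] packed) {
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < packed.length; i++) {
            int hi = (packed[i] >> 4) & 0x0F;
            int lo = packed[i] & 0x0F;
            boolean last = (i == packed.length - 1);
            // Well-formedness check: digit nibbles must be 0..9.
            if (hi > 9 || (!last && lo > 9)) {
                throw new IllegalArgumentException("malformed packed decimal");
            }
            digits.append(hi);
            if (!last) {
                digits.append(lo);
            }
        }
        int sign = packed[packed.length - 1] & 0x0F;
        BigInteger value = new BigInteger(digits.toString());
        return (sign == 0x0D || sign == 0x0B) ? value.negate() : value;
    }

    // Encode a BigInteger into a packed decimal field of the given byte length.
    static byte[] pack(BigInteger value, int length) {
        String digits = value.abs().toString();
        int capacity = 2 * length - 1; // digit nibbles available
        if (digits.length() > capacity) {
            throw new ArithmeticException("packed decimal overflow");
        }
        StringBuilder padded = new StringBuilder();
        for (int i = digits.length(); i < capacity; i++) {
            padded.append('0');
        }
        padded.append(digits);
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            int hi = padded.charAt(2 * i) - '0';
            int lo = (i < length - 1)
                    ? padded.charAt(2 * i + 1) - '0'
                    : (value.signum() < 0 ? 0x0D : 0x0C); // sign nibble
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Add two packed decimal operands by going through BigInteger objects.
    static byte[] addPacked(byte[] a, byte[] b) {
        return pack(unpack(a).add(unpack(b)), a.length);
    }

    public static void main(String[] args) {
        byte[] a = {0x12, 0x3C};       // +123
        byte[] b = {0x45, 0x6C};       // +456
        byte[] sum = addPacked(a, b);  // result field has the same length as a
        System.out.printf("%02X %02X%n", sum[0], sum[1]);
    }
}
```

Every call builds strings and big-number objects just to add two small fields, which is the overhead the slide says DAA avoids by operating directly on the byte arrays.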

Marshalling and Un-marshalling
• Transform primitive type (short, int, long, float, double) ↔ byte array
• Support both big/little endian byte arrays

Packed Decimal (PD) Operations
• Arithmetic: +, -, *, /, % on 2 PD operands
• Relation: >, <, >=, <=, ==, != on 2 PD operands
• Error checking: checks if PD operand is well-formed
• Other: shifting, and moving ops on PD operand

Decimal Data Type Conversions
• Decimal ↔ Primitive: Convert Packed Decimal (PD), External Decimal (ED), Unicode Decimal (UD) ↔ primitive types (int, long)
• Decimal ↔ Decimal: Convert between dec. types (PD, ED, UD)
• Decimal ↔ Java: Convert dec. types (PD, ED, UD) ↔ BigDecimal, BigInteger
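
The marshalling operations listed above (primitive to byte array and back, in either byte order) can be illustrated with standard Java alone. The sketch below uses java.nio.ByteBuffer, not the DAA library, and the class and method names are hypothetical. Note that wrapping the array allocates a small buffer object on each call, whereas the slide describes DAA as working directly on the byte array.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Plain-Java illustration of int <-> byte[] marshalling; not the DAA API.
class MarshalSketch {

    private static ByteOrder order(boolean bigEndian) {
        return bigEndian ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
    }

    // Write a 4-byte int into target at offset using the requested byte order.
    static void writeInt(byte[] target, int offset, int value, boolean bigEndian) {
        ByteBuffer.wrap(target).order(order(bigEndian)).putInt(offset, value);
    }

    // Read a 4-byte int from source at offset using the requested byte order.
    static int readInt(byte[] source, int offset, boolean bigEndian) {
        return ByteBuffer.wrap(source).order(order(bigEndian)).getInt(offset);
    }

    public static void main(String[] args) {
        byte[] record = new byte[8];
        writeInt(record, 0, 0x01020304, true);   // big endian in bytes 0..3
        writeInt(record, 4, 0x01020304, false);  // little endian in bytes 4..7
        for (byte b : record) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
        System.out.println(Integer.toHexString(readInt(record, 0, true)));
    }
}
```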


Page 13: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

DAA – JZOS Medicare Record Benchmark and IBM Java 7R1

31-bit IBM Java 7R1 with DAA versus IBM Java 7 CPU Time improved by 2.4x

64-bit IBM Java 7R1 with DAA versus IBM Java 7 CPU Time improved by 1.9x

http://www.ibm.com/developerworks/java/zos/javadoc/jzos/index.html?com/ibm/jzos/sample/fields/MedicareRecord.html

(Controlled measurement environment, results may vary)

[Chart: JZOS Medicare Record Parsing Benchmark. X axis: 31-bit, 64-bit; Y axis: CPU-time to Parse 5M Records (Normalized to 31-bit IBM Java 7 SR4 w/o DAA) (0 to 1.2); series: IBM Java 7 SR4, IBM Java 7R1]

Page 14: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

HCSC Java on z/OS and WebSphere Application Server on z/OS V8.5.5 User Experience

Page 15: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

About Health Care Service Corporation

Health Care Service Corporation (HCSC) is the fourth largest health insurance company in the nation.

HCSC is the largest customer-owned health insurer in the U.S. Founded in 1936, it now has more than 14 million members and operates health insurance plans in Illinois, Montana, New Mexico, Oklahoma, and Texas, as well as Dearborn National.

We are more than 21,000 employees strong, with 60 local offices and state-of-the-art technology, including two Tier IV data centers (the industry's highest reliability level) that provide the speed and data security to meet our customers' current and future business needs.

Page 16: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Why WebSphere on z/OS?

WebSphere on z/OS has been selected at HCSC as a preferred platform to support development and deployment of the Core processing for new Java mission-critical Applications for the following reasons:

• z/OS Hardware, Software, Storage, and Network are all designed for maximum application availability
• WebSphere on z/OS is designed to support very high transactional volume
• WebSphere on z/OS provides highest Quality of Service:
- Performance
- Scalability
- Recovery/failover capability
- High Availability
- Stability
- Manageability
- Maintainability
- Security/Integrity
• By using WebSphere on z/OS you can minimize the number of physical tiers to get to backend data
• Use of a single tier removes the Network layer and the additional overhead associated with it
• Tight integration with DB2, MQ, and CICS

Page 17: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Features and Technology Unique to z/OS

• Server Architecture
- Control/Servant Region Split
- Multiple Servant Region
• Workload Management
- Leverages Workload Manager (WLM)
- WLM/RMF integration
- Work classified according to importance & performance goals
- Work is selected from WLM queue and managed to goal
- Provides Failover to available Servers
- Automatic servant restart after an outage
- Automatic startup of additional servants, as needed, based on Policies
• WebSphere on z/OS Network Deployment Clustering across z/OS LPARs
- Horizontal scaling for increased throughput
- Continuous availability & fail-over
• MQ Queue Sharing using Shared Queues across LPARs and XM memory communication for optimum performance
• DB2 Data Sharing across LPARs, with JDBC Type 4 driver
• Sysplex Distributor - workload management and distribution across multiple systems
• Coupling Facility - high-speed inter-system communication, used with MQ Queue Sharing & DB2 Data sharing
• Resource Recovery Services - required for 2-phase commits
• zSeries Application Assist Processor (zAAP) - specialty assist processor dedicated exclusively to execution of Java workloads under z/OS
• Mainframe security

Page 18: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

HCSC WebSphere on z/OS Java Exploitation

• Currently using WASz V8.5.5.1; the JVM build is JRE 1.6.0
• IBM J9 2.6 z/OS s390x-64 Compressed References 20130823_162690
• J9VM - R26_Java626_SR6_Ifix_20130823_2006_B162690
• JIT - r11.b04_20130528_38954ifx6
• GC - R26_Java626_SR6_Ifix_20130823_2006_B162690_CMPRSS
• J9CL - 20130823_162690
• Next: plan on upgrading to WASz V8.5.5.2 (GA 4/28/14), which adds IBM Java 7R1 (JRE 1.7) support to WASz

IBM SDK Version 7.1 has been released, but it cannot be installed as an optional feature of IBM WebSphere Application Server version 8.5.5 until V8.5.5.2.

• Enable -Xaggressive with IBM Java7 SR4
• Explore exploitation of zEnterprise Data Compression
• Make use of WASz Health Center to look for Java tuning opportunities at the Application level

Page 19: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

HCSC WebSphere on z/OS Cell Architecture


Page 20: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Applications #1-3 - WASz V7.0.27 vs V8.5.5.1

Application #1 – MDB Application in 64 bit mode. Compressed references are set.

• Max Heap size increased from 640M to 1792M, Min Heap size increased from 512M to 1344M.
• Encountered Java OOM after the upgrade and had to increase max/min heap size.
• Memory leak found in WASz JMS code, which uses ThreadLocals with AlarmManagerThread and does not release storage. The fixing APAR is PI14746. See the IBM Flash alert “Memory leak in WAS 8.5.x J2C PoolManager” at http://www-01.ibm.com/support/docview.wss?uid=swg21670448 .
• MDB response times went down 25-30% and throughput increased, using a little more CPU.

Application #2 – Activation Specification Application in 31 bit mode. Writes a message to MQ when a DB2 on z/OS update triggers.

• Max Heap size 512M, Min Heap 256M.
• Performs better in 31 bit mode than in 64 bit mode. Performance did not change after the WASz V8.5.5.1 upgrade.

Application #3 – MDB Application in 64 bit mode. Compressed references are set.

• Max Heap size is 2100M, Min Heap size is 1024M.
• 15-20% CPU reduction.
• Being tuned to lower the heap size under 2048M and exploit Pageable Large Pages with Flash Express.

Page 21: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V7.0.27 vs V8.5.5.1


MDB Response times – 25-30% reduction under WASz V8.5.5.1

WASz V7.0.27 31 bit - Max Heap 640M

WASz V8.5.5.1 64 bit - Max Heap 1240M

Page 22: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V7.0.27 vs V8.5.5.1


Backend DB2 calls Average Response Times –much lower spikes

Page 23: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V7.0.27 31 bit mode

• Max Heap size set at 640M, Min Heap size set at 512M.

This was our configuration prior to the WASz V8.5.5.1 upgrade.

Heap storage usage per servant CPU usage (zAAP & GP)

Page 24: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size increased from 640M to 832M, Min Heap size increased from 512M to 640M.

• 20% throughput increase with WASz V8.5.5.1 compared to V7.0.27.

• Java Out Of Memory conditions under heavier load, and a major increase in zAAP usage due to heavy GC activity. APAR PI14746 resolves this issue.

Heap storage usage per servant zAAP CPU increase due to Java OOM

Page 25: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Memory Leak In Application #1

ISSUE

A memory leak in the J2C PoolManager. Observed a major increase in Java Heap usage, 30% GC overhead, and a 5-10X increase in CPU usage.

• It specifically manifests itself when using the MQ JMS Resource adapter, where a Session pool and a Connection pool are created for each JMS Managed Connection.

• When the Managed Connection is destroyed, because of unused or aged timeouts or because the connection is stale, the associated JMS Session pool should be stopped/destroyed, and the reaper alarms associated with the PoolManager should also be cancelled when the pool is stopped.

• The Session pool is being destroyed; however, the PoolManager instance registered in the alarms never gets destroyed, and the alarms are repeatedly created and cancelled for every reap cycle. As a result, the PoolManager objects and their associated reaper alarm objects stay on the heap forever, potentially leading to OOM conditions.
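
The general shape of this kind of leak can be shown with a deliberately simplified sketch. This is not WebSphere code: AlarmRegistry and Pool are hypothetical stand-ins, and the real defect involves alarms being re-created on each reap cycle. The point illustrated is only that an object registered with a long-lived alarm facility stays reachable until its alarm is cancelled, even after the owner considers it destroyed.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of a reaper-alarm leak; not WebSphere code.
class ReaperLeakSketch {

    // Stand-in for a long-lived alarm manager holding strong references.
    static final class AlarmRegistry {
        private final List<Runnable> alarms = new ArrayList<>();
        void register(Runnable alarm) { alarms.add(alarm); }
        void cancel(Runnable alarm) { alarms.remove(alarm); }
        int size() { return alarms.size(); }
    }

    // Stand-in for a pool whose reaper alarm references the pool itself.
    static final class Pool {
        private final AlarmRegistry registry;
        private final Runnable reaperAlarm;
        private final byte[] state = new byte[1024]; // heap retained with the pool

        Pool(AlarmRegistry registry) {
            this.registry = registry;
            this.reaperAlarm = this::reap; // the alarm keeps the pool reachable
            registry.register(reaperAlarm);
        }

        void reap() { /* would discard idle connections */ }

        // Leaky variant: the pool is "destroyed" but its alarm stays registered.
        void destroyWithoutCancel() { }

        // Fixed variant: cancel the alarm so the pool can be collected.
        void destroyAndCancel() { registry.cancel(reaperAlarm); }
    }

    // Create and destroy pools; return how many are still reachable via alarms.
    static int retainedAfter(int cycles, boolean cancelOnDestroy) {
        AlarmRegistry registry = new AlarmRegistry();
        for (int i = 0; i < cycles; i++) {
            Pool pool = new Pool(registry);
            if (cancelOnDestroy) {
                pool.destroyAndCancel();
            } else {
                pool.destroyWithoutCancel();
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("retained without cancel: " + retainedAfter(1000, false));
        System.out.println("retained with cancel:    " + retainedAfter(1000, true));
    }
}
```

In a heap analysis this pattern typically shows up as a steadily growing collection owned by the alarm or timer facility, which matches the steady heap growth described on these slides.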

SOLUTION

Install APAR PI14746. The APAR is in OPEN status; IBM Level 3 can provide the ifix, which is targeted for inclusion in WAS fixpack 8.5.5.3.

Page 26: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Memory Leak in the J2C PoolManager

• The PMAT data below shows the GC analysis from the server where the memory leak was observed. The system is almost entirely out of free space.

Page 27: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size set at 832M

Response time & CPU usage

Page 28: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size increased from 832M to 1280M

Response time & CPU usage

Page 29: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size increased from 832M to 1280M

Heap storage usage per servant CPU usage back to normal, NO Java OOM

Page 30: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size increased from 832M to 1280M

• Again Java OOM, high GC and high CPU usage after a few days

Page 31: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #1 - WASz V8.5.5.1 64 bit mode

• Max Heap size increased from 1280M to 1792M

• Steady increase in Heap allocation prior to APAR PI14746 installation.

Page 32: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #2 - WASz V8.5.5.1 31 Bit vs 64 Bit Mode


Heap storage usage per servant zAAP CPU increase due to Java OOM

WASz V8.5.5.1 CPU usage comparison, Heap size Max=512M, Min=256M

31 bit mode 64 bit mode

Page 33: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #2 - WASz V7.0.27 vs WASz V8.5.5.1 CPU Usage

Activation Specification 31 bit mode Application CPU usage in WASz V7.0.27 vs WASz V8.5.5.1 - slight increase in CPU usage

WASz V7.0.27 CPU Usage by Report Class (3/11/14); WASz V8.5.5.1 CPU Usage by Report Class (3/25/14)

Page 34: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #3 - WASz V7.0.27 vs WASz V8.5.5.1 CPU Usage

MDB Application CPU usage in WASz V7.0.27 vs WASz V8.5.5.1

15-20% CPU usage reduction under WASz V8.5.5.1

WASz V7.0.27 CPU Usage by Report Class (3/11/14); WASz V8.5.5.1 CPU Usage by Report Class (3/25/14)

Page 35: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #4 - WASz V8.5.5.1 64 Bit Mode

APPL4 – Java Application that is being converted from COBOL, running in CICS.

Compressed reference is set with 64 bit mode.

The application uses the Spring Framework and does not currently use JPA.

The application has an ASYNC batch process and online HTTP work, using 2 different WASz Clusters. Most of the work is done by the lower priority ASYNC Cluster.

After the WASz V8.5.5.1 upgrade we saw around 28% CPU reduction, and we turned off GP crossover, using zAAP capacity instead. We were spilling around 1 GP engine under load with WASz V7.0.27.

We are currently using Health Center to see if there are tuning opportunities at the Java level.

Page 36: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #4 - WASz V8.5.5.1 64 bit mode

Load & Performance Testing

~30% overall CPU reduction comparing WASz V7.0.27 vs tuned WASz V8.5.5.1 with Pageable 1M Large Pages, using Flash Express.

WASz V8.5.5.1 upgrade – 11.6% improvement

Exploiting Pageable 1M Large Pages with Flash Express – 4.4 % improvement

Reduced Max Heap size from 2100M to 2047M – 3% improvement, due to more efficient compressed reference with Heap size < 2G

Increased Min Heap size to 1532M and Java Nursery size to 1023M –10.5% improvement, less GC overhead – global GC scans were taking 1-2 seconds, now only around 300ms.
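
When tuning heap sizes as described above, it helps to confirm what a JVM is actually running with. The sketch below uses only the standard java.lang.management API to report the JVM arguments, heap limits, and per-collector GC counts and times; the class name is hypothetical. As an assumption, the values on this slide would correspond to the standard IBM JVM options -Xmx2047m, -Xms1532m, and -Xmn1023m; the slide itself does not show the command line that was used.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical helper: reports the heap settings and GC activity of this JVM.
class HeapSettingsReport {

    private static long toMb(long bytes) {
        return bytes / (1024L * 1024L);
    }

    static String describe() {
        StringBuilder sb = new StringBuilder();

        // Options the JVM was started with (for example -Xmx / -Xms / -Xmn).
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        sb.append("JVM arguments: ").append(jvmArgs).append('\n');

        // Current heap figures; getMax() may be -1 if no limit is defined.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        sb.append("heap used MB: ").append(toMb(heap.getUsed())).append('\n');
        sb.append("heap committed MB: ").append(toMb(heap.getCommitted())).append('\n');
        sb.append("heap max MB: ")
          .append(heap.getMax() < 0 ? "undefined" : String.valueOf(toMb(heap.getMax())))
          .append('\n');

        // Cumulative collection counts and times, one line per collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("GC ").append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", time ms=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Comparing the GC time figures before and after a change gives a rough check on whether a larger nursery or minimum heap reduced GC overhead; verbose GC logs analyzed in a tool such as PMAT remain the more detailed source.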


Page 37: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Application #4 - WASz V8.5.5.1 64 bit mode

• Prod Configuration - decided to run with the larger Max Heap size to accommodate expected growth in workload. We are expecting a 60% increase in volume this year.

Observed ~28% CPU reduction.

• Prod CPU usage in seconds, before and after the WASz upgrade, doubling the ASYNC JVMs, increasing volume, and converting more code from COBOL to Java.

Page 38: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

PMAT and Application #4 Testing


Page 39: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

WASz V8.5.x Packaging

• The Liberty profile is both a feature of the WebSphere Application Server install and an independently installable Installation Manager offering for all WebSphere Application Server editions.

• WebSphere Extreme Scale for z/OS is now included with WebSphere Application Server for z/OS.

• SMP/E FMIDs and files included for:

Page 40: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

WebSphere on z/OS V8.5.5 and Java Installation


Beginning with WASz V8.5.5, WebSphere Application Server provides separate Installation Manager offerings for the Liberty profile, and for Java 7 for use with Liberty.

For more details see: http://publib.boulder.ibm.com/infocenter/ieduasst/v1r1m0/topic/com.ibm.iea.was_v8/was/8.5.5.0/content/WAS855_Install-zOS.pdf?dmuid=20130820122411195350

Page 41: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

WASz V8.5.5 and Java Installation

40

Session #1523 - Performance optimization using IBM Java on z/OS and WAS on z/OS V8.5.5

In the WASz Admin Console, check which version of Java you are running and in which mode. 64-bit mode is now the default.

Application Servers > server name > Java SDKs

See the instructions in the Information Center to install Java 7: http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/index.jsp?topic=/com.ibm.websphere.installation.zseries.doc/ae/tins_installation_zos_installing_jdk7.html

Page 42: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Why Flash Express?

• Flash Express is a PCIe I/O adapter with NAND Flash SSDs that can help you improve availability and performance, especially during paging spikes.

• Paging with small (4K) pages is less efficient than paging with fewer, larger 1 MB pages.

• The hardware caches virtual-to-real address translations in translation buffers. With 1 MB pages, each cached entry covers far more storage than a 4K entry, so more translations are satisfied from the cache.

• As a result of improved cache hits, exploiters of Pageable Large Pages and Flash Express see performance improvements in both elapsed time and CPU.


Page 43: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Large Page Support and Flash Express

• Size the LFAREA and PAGESCM parameters when exploiting Large Page support with WASz V8.5.x.

• When the JVM is allocating large pages, if a particular Large Page size cannot be allocated, the following sizes are attempted, in order, where applicable:

• 2G nonpageable, 1M nonpageable, 1M pageable, 4K pageable

For example, if 1M nonpageable Large Pages are requested but cannot be allocated, pageable 1M large pages are attempted, and then pageable 4K pages.

• Flash Express is designed to offer exceptional performance for paging spikes by reducing paging latency. Flash can be allocated before the LPARs are activated and detected by z/OS during IPL, or configured dynamically after IPL.

• Use TSO RMF to see Storage Memory Objects usage.
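The fallback order described above can be modeled in a few lines. This is only an illustrative sketch, not the JVM's actual allocation logic: the `LargePageFallback` class, the `PageKind` enum, and the assumption that 4K pageable storage is always usable as the last resort are all hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the fallback order listed on the slide.
class LargePageFallback {
    // Declared in fallback order: each kind falls back to the kinds after it.
    enum PageKind { NONPAGEABLE_2G, NONPAGEABLE_1M, PAGEABLE_1M, PAGEABLE_4K }

    // Returns the first kind at or after 'requested' that is currently available.
    static PageKind resolve(PageKind requested, Set<PageKind> available) {
        for (PageKind kind : PageKind.values()) {
            if (kind.ordinal() >= requested.ordinal() && available.contains(kind)) {
                return kind;
            }
        }
        // Assumption: 4K pageable storage can always be used as the last resort.
        return PageKind.PAGEABLE_4K;
    }

    public static void main(String[] args) {
        // 1M nonpageable frames are exhausted; 1M pageable and 4K remain.
        Set<PageKind> available = EnumSet.of(PageKind.PAGEABLE_1M, PageKind.PAGEABLE_4K);
        System.out.println(resolve(PageKind.NONPAGEABLE_1M, available)); // prints PAGEABLE_1M
    }
}
```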


Page 44: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Exploitation of Large Page Support

• An example of the setup needed to enable 1M Pageable Large Page support:

-Xlp:codecache:pagesize=1m,pageable -Xlp:objectheap:pagesize=1m,pageable

• -Xlp:codecache - Requests the JVM to allocate the JIT code cache by using pageable 1M large page sizes.

• -Xlp:codecache:pagesize=1m,pageable - default for Java V7, needs to be set for Java V6.0.1.

• -Xlp:objectheap - Requests the JVM to allocate the Java object heap by using pageable 1M large page sizes.

• -Xlp:objectheap:pagesize=1m,pageable - default for Java V7, needs to be set for Java V6.0.1.

Note that -Xlp will override the default -Xlp:objectheap:pagesize=1m,pageable.

• If no 1M pageable frames are available, Java is not aware of it. RSM acts outside of Java and allocates dynamically from 1M fixed frames or, as a last resort, from 4K frames.

• To check what page size you are using, look at the WAS servant log.


You can also set -verbose:sizes in IBM_JAVA_OPTIONS; it displays the default Java settings used for that JVM.

<attribute name="pageSize" value="0x1000" /> (getting 4K page size)
<attribute name="requestedPageSize" value="0x1000" /> (requested 4K page size)

<attribute name="pageSize" value="0x100000" /> (getting 1M page size)
<attribute name="requestedPageSize" value="0x100000" /> (requested 1M page size)
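The hexadecimal values in these attributes are byte counts, so 0x1000 is 4096 bytes (4K) and 0x100000 is 1,048,576 bytes (1M). As a small convenience sketch, the hypothetical helper below (the `PageSizeDecoder` name is an assumption, not part of any IBM tooling) converts such a value into a readable label:

```java
// Hypothetical helper for reading pageSize values copied from the log.
class PageSizeDecoder {
    // Converts a hex byte count such as "0x100000" into a label such as "1M".
    static String describe(String hexValue) {
        long bytes = Long.decode(hexValue.trim()); // Long.decode accepts the 0x prefix
        if (bytes > 0 && bytes % (1L << 30) == 0) return (bytes >> 30) + "G";
        if (bytes > 0 && bytes % (1L << 20) == 0) return (bytes >> 20) + "M";
        if (bytes > 0 && bytes % (1L << 10) == 0) return (bytes >> 10) + "K";
        return bytes + "B";
    }

    public static void main(String[] args) {
        System.out.println(describe("0x1000"));   // prints 4K
        System.out.println(describe("0x100000")); // prints 1M
    }
}
```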

Page 45: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Issue to Watch Out For - High CPU with Intelligent Management Enabled

ISSUE

• The Virtual Enterprise product was integrated into WAS V8.5.x and is now referred to as Intelligent Management. It is enabled by default, which might increase idle server CPU time considerably.

• Major increase in the number of TCP/IP SMF 119 records. We have seen millions of SMF type 119 TCP/IP subtype 1 and 2 records (open and close connections) created, instead of thousands prior to the WASz V8.5.5 upgrade.

• Related APAR - http://www-01.ibm.com/support/docview.wss?uid=swg1PM79754

• Review IBM WASz doc "Idle WebSphere Tuning Considerations" - http://www-01.ibm.com/support/docview.wss?uid=tss1wp101894&aid=1

SOLUTION

• In WAS V8.5.5 a new custom property (LargeTopologyOptimization) was added to disable Intelligent Management for those who do not use the functionality.

• To configure the Cell custom property via the administrative console, go to System Administration, Cell, Configuration, Additional Properties, Custom Properties, and create a new entry with Name LargeTopologyOptimization and Value false.


Page 46: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Issue to Watch Out For - Higher CPU Usage by Spring Applications Using @Async Annotation

ISSUE

A Spring Application calling ApplicationContext.getBean() on a prototype bean using the @Async annotation caused an increase in Spring method calls. This increase in Spring method calls resulted in higher CPU usage when moving to WebSphere V8.0 and above.

CAUSE

WebSphere V8 and above contain interface classes for the EJB 3.1 specification level which includes the support for using the @Asynchronous annotation. Prior to WebSphere V8.0 Spring was only searching for the @Async annotation. At WebSphere V8.0 and above Spring was searching for both @Async and @Asynchronous annotations. The additional searching for the added support for @Asynchronous was the cause of the higher CPU.

SOLUTION

• Change the logic to cache the bean instance returned from the ApplicationContext.getBean() call.

• Remove annotation-driven @Async searches by Spring. This was accomplished by removing the configuration option "task:annotation-driven".

More info at http://www-01.ibm.com/support/docview.wss?uid=swg21648523&acss=danl_335_email
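The first solution, caching the looked-up instance, can be sketched without a Spring dependency. In this minimal example the `CachedBeanLookup` class is hypothetical, and a plain `Function` stands in for `ApplicationContext.getBean()`. Note that caching means the same instance is reused on every call, so for a prototype-scoped bean this is only appropriate when the bean is safe to share.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical wrapper; 'lookup' stands in for ApplicationContext::getBean.
class CachedBeanLookup {
    private final Function<String, Object> lookup;
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    CachedBeanLookup(Function<String, Object> lookup) {
        this.lookup = lookup;
    }

    // Calls the container once per bean name and reuses the instance afterwards.
    Object get(String beanName) {
        return cache.computeIfAbsent(beanName, lookup);
    }

    public static void main(String[] args) {
        AtomicInteger containerCalls = new AtomicInteger();
        CachedBeanLookup beans = new CachedBeanLookup(name -> {
            containerCalls.incrementAndGet(); // counts the expensive lookups
            return new Object();
        });
        beans.get("worker");
        beans.get("worker");
        System.out.println("container lookups: " + containerCalls.get()); // prints container lookups: 1
    }
}
```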


Page 47: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Issue to Watch Out For - Memory Leak Using ThreadLocals

• ThreadLocal variables don't work well with thread pools in a J2EE environment

We observed memory leaks in Native Storage caused by using ThreadLocal variables. ThreadLocals are only garbage collected if their owning thread is destroyed. Thread pooling in WebSphere Application Server keeps threads alive indefinitely, and as such, ThreadLocal variables remain alive even after the application is stopped. This problem is compounded by ThreadLocal variables that consume native storage, such as classloaders.

• Best coding practice recommendations

To avoid storage leaking (native or heap):

• Use ThreadPool threads, which are managed by WebSphere on z/OS

• Avoid the use of ThreadLocal variables

• Clear all ThreadLocal variables before returning control from an EJB or Servlet invocation

• An example of the error to watch out for:


FFDC Exception:java.lang.Exception SourceId:com.ibm.ejs.j2c.PoolManager$2 ProbeId:50 Reporter:[email protected]: WSThreadLocal: instance count = 200: Potential memory leak; verify usage.
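The recommendation to clear ThreadLocal variables before returning control can be sketched as follows. This is a minimal illustration, not code from the application discussed here: the `RequestContextDemo` class, its `CONTEXT` field, and `handleRequest` are hypothetical names, and a single-thread `ExecutorService` stands in for the WebSphere-managed thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical example; names are illustrative only.
class RequestContextDemo {
    // Per-thread state that would otherwise outlive the request on a pooled thread.
    static final ThreadLocal<StringBuilder> CONTEXT = new ThreadLocal<>();

    // Simulates one servlet or EJB invocation running on a pooled thread.
    static String handleRequest(String user) {
        CONTEXT.set(new StringBuilder("user=").append(user));
        try {
            return CONTEXT.get().toString();
        } finally {
            // Clear before returning control so the pooled thread keeps no reference.
            CONTEXT.remove();
        }
    }

    // Returns true if the pooled thread still holds a value after a request.
    static boolean leaksAfterRequest() {
        ExecutorService pool = Executors.newFixedThreadPool(1); // one reused worker thread
        try {
            Callable<String> request = () -> handleRequest("alice");
            Callable<Boolean> check = () -> CONTEXT.get() != null;
            pool.submit(request).get();
            return pool.submit(check).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("leak after request: " + leaksAfterRequest()); // prints leak after request: false
    }
}
```

Placing `remove()` in a `finally` block ensures the value is cleared even when the request logic throws.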

Page 48: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Useful Links

Recommended fixes for WebSphere Application Server

http://www-01.ibm.com/support/docview.wss?uid=swg27004980

Which version of WebSphere MQ is shipped with WebSphere Application Server?

http://www-01.ibm.com/support/docview.wss?uid=swg21248089

Knowledge Collection: Migrating to WebSphere Application Server V8.5

http://www-01.ibm.com/support/docview.wss?uid=swg27008727

Introduction to Flash Express: Improving Availability with Flash Express

http://public.dhe.ibm.com/common/ssi/ecm/en/zsl03189usen/ZSL03189USEN.PDF

The Flash Express Feature on IBM zEnterprise EC12 and z/OS exploitation of flash storage

http://www-03.ibm.com/systems/resources/flash.pdf


Page 49: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Join your local WUG - http://www.websphereusergroup.org/

Join the Chicago North-West Integration and Cloud Computing WebSphere User Group, led by Cindy Schmoeller (CSC) and Elena Nanos (HCSC) - http://www.websphereusergroup.org/chicagonw

Join us for the annual on-site meeting on June 12th, 2014 in the Chicago area

Tentative Agenda

• WebSphere Application Server V8.5.x update - Paul Lucas (IBM)

• WebSphere Liberty Profile and demo - Bill Killer (IBM)

• WebSphere MQ Next.x update - Mitch Johnson (IBM)

• Demo: watch how CICS applications can be integrated with Portal to create an intuitive, "point and click," mobile front end - Chris Ganim (IBM)

• Advantages of a Private Cloud on zEnterprise - Michael J. Casile (IBM)

• Demos - CSL Wave, DB2 Analytics Adapter performance demonstration and Mobile - Michael J. Casile (IBM)

• Modernized Digital Experiences on System z - Chris Ganim (IBM)

Global WebSphere Community


Page 50: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Questions?

Page 51: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

We Value Your Feedback

Don’t forget to submit your Impact session and speaker feedback! Your feedback is very important to us – we use it to continually improve the conference.

Use the Conference Mobile App or the online Agenda Builder to quickly submit your survey

• Navigate to “Surveys” to see a view of surveys for sessions you’ve attended


Page 52: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Thank You

Page 53: Impact2014  session # 1523 performance optimization using ibm java on z and wa sz v8.5.5  final 04 21_14

Legal Disclaimer

• © IBM Corporation 2014. All Rights Reserved.

• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy, which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth or other results.

• If the text contains performance statistics or references to benchmarks, insert the following language; otherwise delete: Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.

• If the text includes any customer examples, please confirm we have prior written approval from such customer and insert the following language; otherwise delete: All customer examples described are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics may vary by customer.

• Please review text for proper trademark attribution of IBM products. At first use, each product name must be the full name and include appropriate trademark symbols (e.g., IBM Lotus® Sametime® Unyte™). Subsequent references can drop “IBM” but should include the proper branding (e.g., Lotus Sametime Gateway, or WebSphere Application Server). Please refer to http://www.ibm.com/legal/copytrade.shtml for guidance on which trademarks require the ® or ™ symbol. Do not use abbreviations for IBM product names in your presentation. All product names must be used as adjectives rather than nouns. Please list all of the trademarks that you use in your presentation as follows; delete any not included in your presentation. IBM, the IBM logo, Lotus, Lotus Notes, Notes, Domino, Quickr, Sametime, WebSphere, UC2, PartnerWorld and Lotusphere are trademarks of International Business Machines Corporation in the United States, other countries, or both. Unyte is a trademark of WebDialogs, Inc., in the United States, other countries, or both.

• If you reference Adobe® in the text, please mark the first use and include the following; otherwise delete: Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

• If you reference Java™ in the text, please mark the first use and include the following; otherwise delete: Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

• If you reference Microsoft® and/or Windows® in the text, please mark the first use and include the following, as applicable; otherwise delete: Microsoft and Windows are trademarks of Microsoft Corporation in the United States, other countries, or both.

• If you reference Intel® and/or any of the following Intel products in the text, please mark the first use and include those that you use as follows; otherwise delete: Intel, Intel Centrino, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

• If you reference UNIX® in the text, please mark the first use and include the following; otherwise delete: UNIX is a registered trademark of The Open Group in the United States and other countries.

• If you reference Linux® in your presentation, please mark the first use and include the following; otherwise delete: Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both. Other company, product, or service names may be trademarks or service marks of others.

• If the text/graphics include screenshots, no actual IBM employee names may be used (even your own), if your screenshots include fictitious company names (e.g., Renovations, Zeta Bank, Acme) please update and insert the following; otherwise delete: All references to [insert fictitious company name] refer to a fictitious company and are used for illustration purposes only.
