ThreadLogic-v0.9

Page 1: ThreadLogic-v0.9

We'll do the analysis for you!

Thread Dump Analysis is a key tool for performance tuning and troubleshooting of Java-based applications. The current set of TDA tools (Samurai/TDA) do not mine the thread dumps or provide a detailed view of what each thread is doing; they limit themselves to reporting the thread state (locked/waiting/running) or the lock information. Most tools do not identify the type of activity within a thread. Should it be treated as normal, or does it deserve a closer look? Can a pattern or anti-pattern be applied against it? Are there possible optimizations? Are there any hot spots? Can threads be classified based on their execution cycles?

We decided to create ThreadLogic to address these deficiencies. Instead of reinventing the wheel, we forked the existing open source TDA version 2.2 and leveraged its ability to parse thread dumps and handle the UI. Eric Gross built the support for JRockit (which was only partial in base TDA v2.2) and IBM JVM thread dumps. Sabha Parameswaran added analytics: grouping of threads based on functionality, and tagging of threads with advisories using predefined rules and patterns, which can be extended to handle additional patterns. In-depth handling and analysis of WebLogic Server thread dumps is built into the tool. We wish to thank Ingo Rockel, Robert Whitehurst and the numerous others who contributed to the original TDA, which allowed us to build on their work in delivering a more powerful tool for the entire Java community.

Once a thread dump is parsed and the thread details are populated, each thread is analyzed against matching advisories and tagged appropriately. The threads are also associated with specific Thread Groups based on functionality or thread group name. Both the advisories and the grouping are managed via XML definition files, which can be modified or extended. Each advisory has a health level indicating the severity of the issue found, a pattern, a name, a keyword and related advice.

Samples of advisories:

Thread Advisory Name: WLS JMS Paging
Health Level: FATAL
Keyword: MessageHandle.setPagingInProgress
Description: WebLogic JMS paging messages to disk
Advice: WLS has started paging messages to disk because consumers cannot keep up with producers and messages have started accumulating. Add, speed up or tune consumers; introduce flow controls or quotas to slow down producers and inflow rates; or increase the number of servers to spread the load.

Thread Advisory Name: Web Application Bottleneck
Health Level: WARNING
Keyword: WebLayerBlocked
Description: Web Application is waiting for an Event
Advice: A Web Application should not go into the WAIT state, as it means the end user has to wait an indeterminate time for a synchronous response. Change the code or logic to return the results or response right away instead of blocking or waiting for an event.

Each advisory is triggered either by call execution patterns observed in the thread stack or by the presence of other conditions (for example, a blocked thread, or multiple threads blocked on the same lock, can trigger the BlockedThreads advisory). Sometimes a thread is tagged as IGNORE or NORMAL based on its execution logic, or tagged more specifically, for example as a JMS send or receive client or as a Servlet thread. The advisories are generated from an advisory XML map that is extensible. The health levels, in descending order of severity, are FATAL (meant for deadlocks, STUCK threads, a blocked Finalizer and so on), WARNING, WATCH (worth watching), NORMAL and IGNORE. The highest severity among the threads within a group is promoted to the Thread Group's health level, and the same is repeated at the thread dump level. Multiple advisories can be tagged to a Thread, Thread Group and Thread Dump.

<Advisory>
  <Name>EOF Exception in socket read</Name>
  <Health>WARNING</Health>
  <Keyword>SocketMuxer.deliverEndOfStream</Keyword>
  <Descrp>WLS Muxer got an abrupt End of Stream while reading from a Socket</Descrp>
  <Advice>Check for connection disruptions between Server and Client (or other server instances)</Advice>
</Advisory>
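As a rough illustration of how an advisory definition like the one above could drive tagging, here is a minimal Python sketch. It is not ThreadLogic's code: the `Advisory` class, the two helper functions and the sample stack trace (including its `example.net` package) are invented for illustration, and plain substring matching of the keyword is an assumption about how the match works.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Advisory definition copied from the sample above.
ADVISORY_XML = """
<Advisory>
  <Name>EOF Exception in socket read</Name>
  <Health>WARNING</Health>
  <Keyword>SocketMuxer.deliverEndOfStream</Keyword>
  <Descrp>WLS Muxer got an abrupt End of Stream while reading from a Socket</Descrp>
  <Advice>Check for connection disruptions between Server and Client (or other server instances)</Advice>
</Advisory>
"""

# Invented stack trace, for illustration only.
SAMPLE_STACK = """\
"ExampleThread-1" daemon prio=5
    at example.net.SocketMuxer.deliverEndOfStream(SocketMuxer.java)
    at example.net.SocketMuxer.processSockets(SocketMuxer.java)
"""


@dataclass(frozen=True)
class Advisory:
    name: str
    health: str
    keyword: str
    description: str
    advice: str


def parse_advisory(xml_text: str) -> Advisory:
    """Build an Advisory from one <Advisory> element."""
    node = ET.fromstring(xml_text.strip())

    def text(tag: str) -> str:
        return (node.findtext(tag) or "").strip()

    return Advisory(text("Name"), text("Health"), text("Keyword"),
                    text("Descrp"), text("Advice"))


def matching_advisories(stack_trace: str, advisories: list) -> list:
    """Return the advisories whose keyword occurs in the stack trace.

    Assumption: plain substring matching; the real tool may match differently.
    """
    return [a for a in advisories if a.keyword and a.keyword in stack_trace]


advisory = parse_advisory(ADVISORY_XML)
for hit in matching_advisories(SAMPLE_STACK, [advisory]):
    print(f"[{hit.health}] {hit.name}: {hit.advice}")
```

Keeping the definitions in XML and the matching in one generic function is what makes the advisory list extensible: a new advisory needs only a new XML entry.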

Snapshot of Advisory Map

Snapshot of Threads tagged with advisories in the thread dump

Threads in a thread dump tagged with Advisories/Health Levels

Thread Groups Summary

The threads are associated with thread groups based on functionality or thread names. Additional patterns exist to tag other threads (such as iWay Adapter, SAP and Tibco threads) and group them. The summary page reports the health level of the group, the total number of threads, the number of blocked threads, the critical advisories found and so on. The grouping is managed by group definition XML files that specify the patterns for matching threads to specific groups. A group can be simple (match a set of patterns) or complex (include some groups while excluding others). A set of advisories can also be marked as ignorable, or excluded, when determining the health of a thread or group.

<SimpleGroup>
  <Name>Oracle Service Bus (OSB)</Name>
  <Visible>true</Visible>
  <Inclusion>true</Inclusion>
  <MatchLocation>stack</MatchLocation>
  <PatternList>
    <Pattern>com.bea.wli.sb.transports</Pattern>
    <Pattern>com.bea.wli.sb.pipeline</Pattern>
  </PatternList>
</SimpleGroup>

<ComplexGroup>
  <Name>Oracle AQ Adapter</Name>
  <Visible>true</Visible>
  <Inclusions>
    <SimpleGroupId>Oracle AQ AdapterTemp</SimpleGroupId>
  </Inclusions>
  <Exclusions>
    <SimpleGroupId>Oracle SOA DFW</SimpleGroupId>
  </Exclusions>
  <ExcludedAdvisories>
    <AdvisoryId>Database Query Execution</AdvisoryId>
    <AdvisoryId>Socket Read</AdvisoryId>
  </ExcludedAdvisories>
</ComplexGroup>
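The simple and complex group definitions above can be read as membership rules. The Python sketch below shows one plausible interpretation, not the tool's actual implementation. It assumes that a simple group matches when any pattern occurs in the configured location, that any `MatchLocation` other than `stack` means the thread name, and that a complex group accepts a thread matched by at least one inclusion and by no exclusion. The patterns given to the two referenced simple groups are hypothetical placeholders, since the source does not show them.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Simple group definition copied from the sample above.
SIMPLE_GROUP_XML = """
<SimpleGroup>
  <Name>Oracle Service Bus (OSB)</Name>
  <Visible>true</Visible>
  <Inclusion>true</Inclusion>
  <MatchLocation>stack</MatchLocation>
  <PatternList>
    <Pattern>com.bea.wli.sb.transports</Pattern>
    <Pattern>com.bea.wli.sb.pipeline</Pattern>
  </PatternList>
</SimpleGroup>
"""


@dataclass
class SimpleGroup:
    name: str
    match_location: str
    patterns: list

    def matches(self, thread_name: str, stack: str) -> bool:
        # Assumption: anything other than "stack" means "match the thread name".
        haystack = stack if self.match_location == "stack" else thread_name
        return any(p in haystack for p in self.patterns)


@dataclass
class ComplexGroup:
    name: str
    inclusions: list  # SimpleGroup objects a thread must match (any of them)
    exclusions: list  # SimpleGroup objects a thread must not match

    def matches(self, thread_name: str, stack: str) -> bool:
        included = any(g.matches(thread_name, stack) for g in self.inclusions)
        excluded = any(g.matches(thread_name, stack) for g in self.exclusions)
        return included and not excluded


def parse_simple_group(xml_text: str) -> SimpleGroup:
    node = ET.fromstring(xml_text.strip())
    return SimpleGroup(
        name=node.findtext("Name").strip(),
        match_location=node.findtext("MatchLocation").strip(),
        patterns=[p.text.strip() for p in node.iter("Pattern")],
    )


osb = parse_simple_group(SIMPLE_GROUP_XML)

# Hypothetical stand-ins: the source does not show these groups' patterns.
aq_temp = SimpleGroup("Oracle AQ AdapterTemp", "stack", ["example.aq.adapter"])
soa_dfw = SimpleGroup("Oracle SOA DFW", "stack", ["example.soa.dfw"])
aq = ComplexGroup("Oracle AQ Adapter", [aq_temp], [soa_dfw])

print(osb.matches("worker-0", "at com.bea.wli.sb.pipeline.Router.run(Router.java)"))
print(aq.matches("worker-1", "at example.aq.adapter.Poller.run(Poller.java)"))
```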

Thread Groups Summary

Critical Advisories per thread group

The critical advisories (at WARNING or FATAL health levels) found in individual threads are promoted to the parent thread group and reported in the thread group summary page.

Critical Advisories for Thread Group
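The promotion of health levels from thread to thread group to thread dump can be sketched as taking a maximum over an ordered set of levels. The Python below is an illustrative sketch only: the data layout, function names and thread names are invented, and treating a thread with no advisories as NORMAL is an assumption. The advisory names and their health levels are the ones quoted in this document.

```python
from enum import IntEnum


class Health(IntEnum):
    """Health levels from the document, ordered by increasing severity."""
    IGNORE = 0
    NORMAL = 1
    WATCH = 2
    WARNING = 3
    FATAL = 4


def promote(levels, default=Health.NORMAL):
    """Return the most severe level; `default` covers the empty case (assumption)."""
    return max(levels, default=default)


def thread_health(advisories):
    # advisories: list of (advisory name, Health) pairs tagged to one thread
    return promote(level for _, level in advisories)


def group_health(threads):
    # threads: {thread name: list of (advisory name, Health)}
    return promote(thread_health(advs) for advs in threads.values())


def dump_health(groups):
    # groups: {group name: threads dict}
    return promote(group_health(threads) for threads in groups.values())


def critical_advisories(threads):
    """Names of advisories at WARNING or FATAL found in any thread of the group."""
    return sorted({name
                   for advs in threads.values()
                   for name, level in advs
                   if level >= Health.WARNING})


# Invented thread and group names; advisory names and levels are from the document.
dump = {
    "JMS": {
        "jms-consumer-1": [("WLS JMS Paging", Health.FATAL)],
        "jms-consumer-2": [],
    },
    "Web": {
        "servlet-1": [("Web Application Bottleneck", Health.WARNING)],
        "servlet-2": [("EOF Exception in socket read", Health.WARNING)],
    },
}

for group, threads in dump.items():
    print(group, group_health(threads).name, critical_advisories(threads))
print("dump:", dump_health(dump).name)
```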

Thread Groups

The thread groups are divided into two buckets: WLS and non-WLS threads. The JVM, LDAP and other unknown custom threads go under the non-WLS bucket, while the WLS, Muxer, ADF, Coherence, Oracle, SOA, JMS and Oracle Adapter threads all go under the WLS bucket. The classification can be changed by modifying the GroupsDefn XML files.

Individual Thread tagging with Advisories

Clicking on an individual thread displays its advisories and thread stack.

Advisories and details at thread level

The details of an advisory pop up on mouse-over of the advisory link.

The Advisories are color coded and details can be highlighted.

Color coded advisories for individual threads

Sub-groups are also created within individual Thread Groups based on Warning Levels, Hot call patterns (multiple threads executing same code section), threads doing remote I/O (socket or db reads) etc.

The following snapshot shows an example of a Hot call pattern, where multiple threads are executing the same code path (all are attempting to get a lock on a Queue instance).

Hot Call Pattern - multiple threads exhibiting similar code execution
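One simple way to detect such a hot call pattern is to bucket threads by the top few frames of their stacks and report the buckets shared by several threads. The Python sketch below illustrates the idea under stated assumptions: the function name, the `depth` and `min_threads` parameters and the sample frames are all invented, and comparing only the top frames for exact equality is a guess at what "same code section" means, not a description of ThreadLogic's algorithm.

```python
from collections import defaultdict


def hot_call_patterns(stacks, depth=3, min_threads=2):
    """Group threads whose top `depth` frames are identical.

    stacks: {thread name: list of frames, innermost frame first}
    Returns {frame signature (tuple): sorted thread names} for every
    signature shared by at least `min_threads` threads.
    """
    buckets = defaultdict(list)
    for thread_name, frames in stacks.items():
        if frames:  # skip threads without a stack
            buckets[tuple(frames[:depth])].append(thread_name)
    return {signature: sorted(names)
            for signature, names in buckets.items()
            if len(names) >= min_threads}


# Invented frames: three workers contending for the same queue lock.
stacks = {
    "worker-1": ["example.Queue.lock", "example.Queue.put", "example.Worker.run"],
    "worker-2": ["example.Queue.lock", "example.Queue.put", "example.Worker.run"],
    "worker-3": ["example.Queue.lock", "example.Queue.put", "example.Worker.run"],
    "timer-1": ["java.lang.Thread.sleep", "example.Timer.run"],
}

for signature, names in hot_call_patterns(stacks).items():
    print(f"{len(names)} threads at {signature[0]}: {', '.join(names)}")
```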

Dynamic Filtering based on Thread Health

It's also possible to view just a subset of threads based on health levels by using the top-level Minimum Health Level option.

Threads at IGNORE or higher health levels

Threads at FATAL health level
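Conceptually, the Minimum Health Level option is a threshold over the ordered health levels: choosing IGNORE keeps every thread, while choosing FATAL keeps only the most severe ones. A minimal Python sketch of that idea follows; the function name, thread names and levels are invented for illustration.

```python
from enum import IntEnum


class Health(IntEnum):
    """Health levels from the document, ordered by increasing severity."""
    IGNORE = 0
    NORMAL = 1
    WATCH = 2
    WARNING = 3
    FATAL = 4


def filter_by_min_health(threads, minimum):
    """Keep only the threads whose health is at or above `minimum`.

    threads: {thread name: Health}
    """
    return {name: level for name, level in threads.items() if level >= minimum}


# Invented thread names and health levels.
threads = {
    "gc-worker": Health.IGNORE,
    "servlet-1": Health.WARNING,
    "jms-consumer-1": Health.FATAL,
    "timer-1": Health.NORMAL,
}

print(sorted(filter_by_min_health(threads, Health.IGNORE)))
print(sorted(filter_by_min_health(threads, Health.FATAL)))
```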

Merging of threads across multiple thread dumps and reporting of progress in the thread state

Merge has been enhanced to report on the progress of each thread across the thread dumps. Based on the order of the thread dumps, the thread stack traces are compared between each pair of consecutive thread dumps.

Merged view showing progress information for individual threads
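The consecutive comparison described above can be sketched as follows. This Python example is only an illustration: the function name, the "changed"/"unchanged" labels and the sample frames are invented, and the choices to compare whole stacks for equality and to skip threads that are missing from either dump of a pair are assumptions. An unchanged stack across several dumps suggests that the thread is not progressing.

```python
def progress_report(dumps):
    """Compare each thread's stack between consecutive thread dumps.

    dumps: list of {thread name: list of frames}, in the order the dumps
    were taken. Returns {thread name: one label per consecutive pair}.
    """
    report = {}
    for earlier, later in zip(dumps, dumps[1:]):
        # Assumption: only threads present in both dumps of a pair are compared.
        for name in sorted(earlier.keys() & later.keys()):
            label = "changed" if earlier[name] != later[name] else "unchanged"
            report.setdefault(name, []).append(label)
    return report


# Invented frames: worker-1 never moves, worker-2 moves once and then stalls.
dumps = [
    {"worker-1": ["example.Queue.lock", "example.Worker.run"],
     "worker-2": ["example.Db.read", "example.Worker.run"]},
    {"worker-1": ["example.Queue.lock", "example.Worker.run"],
     "worker-2": ["example.Db.write", "example.Worker.run"]},
    {"worker-1": ["example.Queue.lock", "example.Worker.run"],
     "worker-2": ["example.Db.write", "example.Worker.run"]},
]

for name, labels in progress_report(dumps).items():
    print(name, labels)
```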

Merged reporting of individual thread stack traces is also available (carried over from base TDA version 2.2).

Merged Thread stack traces across thread dumps

Merging can also be done across multiple thread dump log files (as in the case of IBM, which creates a new log file containing the thread dump every time a request is issued).

Usability benefits of ThreadLogic

Thanks to the advisories and health levels, it's easy for users to quickly understand the usage patterns, hot spots and thread groups, and to spot the patterns or anti-patterns already implemented in the advisory list. An example of an anti-pattern: a Servlet thread should not be waiting for an event to occur, as this translates to bad performance for the end user. Similarly, synchronous JMS consumers might be less performant than async consumers. Too many WLS Muxer threads is not advisable. If WLS Muxer or Finalizer threads are blocked on unrelated locks, this is a fatal condition. It is okay to ignore a STUCK warning issued by WLS Server for pollers like AQ Adapter threads, but not for other threads that are handling servlet requests.

The thread groups help in bunching together related threads: SOA Suite users can see how many BPEL Invoke and Engine threads are in use, B2B users can see the number of JMS consumers and producers, WLS users can look at the condition and health of Muxer threads, and similarly for JVM, Coherence, LDAP and other thread groups. The merged report lets the user see the critical threads at a glance and check whether they are progressing, instead of wading through numerous threads and associated thread dumps.

We hope this tool can really help both beginners and experts do their jobs more quickly and efficiently when it comes to thread dumps.