Post on 08-May-2015
Caucho Home | Contact Us | Caucho Blog | Wiki | Application Server / Web Server
Copyright (c) 1998-2012 Caucho Technology, Inc. All rights reserved.
caucho®, resin® and quercus® are registered trademarks of Caucho Technology, Inc.
Resin Health System: Beyond Java Monitoring and Server Monitoring
Health Checks, Watchdog and Snapshot Report
Gartner names Caucho in "Cool Vendors in Platform and Integration Middleware"
Java EE Certified
Resin Health System (RHS) Overview
• Resin Health System (RHS) goes beyond just monitoring the server and JVM
• It can respond to conditions with actions
• Actions can remediate problems
• If the server is about to go down due to a bug, denial of service, or spike, RHS triggers diagnostics and then restarts it
• The Resin Application Server keeps running
RHS : Reliability and System Transparency
RHS born from need
• The idea for RHS came from doing Resin support
• Thread lock? Can you do a thread dump when you see the problem?
• Running out of memory? Can you do a heap dump?
• How is your machine configured? What version?
• What OS?
RHS By Engineers for Engineers
Major features of Resin Health System (RHS)
• Ability to respond to problems
• Detect JVM and OS issues
• Avoid zombie processes
• Restarts Resin if there are major problems
• Internal monitoring
• Resin Internal WatchDog Thread
• Watches internal meters for problems
• Periodic Thread
• External Monitoring
• Resin WatchDog Process
• Uses process control, a socket connection, and a periodic ping to determine uptime status
• Advanced Reporting PDF
• Post-mortem analysis
• Thread Dump/Log Dump
• Meters and Graphs
• Heap Dump
RHS Tracks Metrics
• Metrics are things like Available Memory, Number of Requests Per Minute, Garbage Collection Time, CPU Load, etc.
• Metrics can be graphed
• Tracks Historical Data for Trends
• Can determine Anomalies
• Can determine Trends
• Can compare current data with baseline data
Visualization
• You can view the data that the Health System collects
• Resin Web Admin
• Watchdog Report
• Post mortem PDF Report
• Snapshot Report
• PDF Report you can generate anytime
• Trigger via CLI, REST, or Web Admin
RHS and Web Admin
RHS: Health Checks
• RHS is highly configurable
• Similar to Resin's "URL Rewrite" rules
• Rules are built from configurable checks, conditions, and actions
• The internal watchdog runs the checks periodically
Watchdog process
• Lightweight process used to stop and start Resin instances
• Can restart an instance when Java monitoring, server monitoring, or a health check reports an issue
• Parent process of the Resin server
• Opens a socket to the Resin server
• Sends a periodic "are you alive?" ping
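The ping-and-restart behavior listed above can be sketched roughly as follows. This is a minimal illustration, not Resin's implementation: the `Watchdog` class, the `tcp_ping` helper, and the `max_missed` threshold are all hypothetical names and assumptions introduced here.

```python
import socket
from dataclasses import dataclass
from typing import Callable


def tcp_ping(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the server opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@dataclass
class Watchdog:
    """Hypothetical watchdog: pings a child server and restarts it when it stops answering."""
    ping: Callable[[], bool]        # e.g. lambda: tcp_ping("127.0.0.1", 6800)
    restart: Callable[[], None]     # how the child process gets restarted
    max_missed: int = 3             # assumed tolerance before a restart
    missed: int = 0

    def tick(self) -> str:
        """Run one ping cycle and report what happened."""
        if self.ping():
            self.missed = 0
            return "up"
        self.missed += 1
        if self.missed >= self.max_missed:
            self.restart()
            self.missed = 0
            return "restarted"
        return "suspect"
```

The ping and restart steps are passed in as callables so the decision logic can be exercised without a real server; a real watchdog would call `tick()` on a timer.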
Watchdog Non Stop Mode
• Resin is resilient
• If a denial-of-service attack, unexpected spike, or bug knocks down the JVM, Resin restarts
• Beyond that, Resin can detect critical problems and run critical diagnostics so DevOps and developers can get to the root of the problem
• Resin has long been a product of choice for embedded devices, network appliances, and large deployments
• Non Stop mode makes Resin perfect for cloud deployments
Resin Watch-Dog
[Diagram sequence: the Watchdog Process and Resin, connected by Process Ownership and a TCP Link, shown while Starting Resin and in the Non-Stop Up State.]
Internal Watchdog Thread Inside of Resin
[Diagram: the Watchdog Process alongside the Resin Process, which contains the Resin Health System Watchdog Thread.]
Internal Watchdog Health Thread
• Runs inside of Resin Server
• Runs periodically
• Collects data
• Collects baseline data
• Executes a series of checks
• Rechecks failed conditions
• Performs actions when conditions are critical or fatal
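One cycle of such a health thread might look like the sketch below. The `Status` levels, their ordering, and the function names are assumptions made for illustration. The recheck here runs immediately, whereas a real system would wait for a recheck period first.

```python
from enum import IntEnum
from typing import Callable, Dict, List


class Status(IntEnum):
    """Assumed status levels, ordered by severity."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    FATAL = 3


Check = Callable[[], Status]


def run_cycle(checks: Dict[str, Check], rechecks: int = 1) -> Dict[str, Status]:
    """Execute every check once, re-running any check that did not return OK."""
    results: Dict[str, Status] = {}
    for name, check in checks.items():
        status = check()
        attempts = 0
        # Recheck a failed condition before trusting the result.
        while status != Status.OK and attempts < rechecks:
            status = check()
            attempts += 1
        results[name] = status
    return results


def act_on(results: Dict[str, Status],
           action: Callable[[str, Status], None]) -> List[str]:
    """Perform the action for every check that ended up critical or fatal."""
    acted = []
    for name, status in results.items():
        if status >= Status.CRITICAL:
            action(name, status)
            acted.append(name)
    return acted
```

A transient failure that clears on the recheck is reported as OK and triggers nothing; only a check that stays critical or fatal reaches `act_on`.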
Resin Java CDI / CanDI and Resin Conf based
• RHS configuration extends the Resin configuration file, resin.xml
• RHS uses CanDI (Resin’s Java CDI) to create and update Java objects
• Each XML tag exactly matches either a Java class or a Java property
• Because of CanDI, the JavaDocs for the classes also document the configuration
• Use the HealthSystem JavaDoc
• Use the JavaDoc of the various checks and actions
Java Doc / XML conf of RHS
• Startup delay: how long to wait for baseline data before recording
• Period: how often to check metrics
• Recheck period: once a level has been crossed, how often RHS rechecks to see whether things have improved
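How the three timing settings interact can be sketched as a small schedule calculation. The `HealthSchedule` class, the `check_times` function, and the default values in seconds are illustrative assumptions, not Resin's configuration API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class HealthSchedule:
    """Hypothetical timing settings, all in seconds (values are illustrative)."""
    startup_delay: float = 900.0   # wait before the first check
    period: float = 300.0          # normal interval between checks
    recheck_period: float = 30.0   # shorter interval after a failed check


def check_times(schedule: HealthSchedule, failed: Sequence[bool]) -> List[float]:
    """Return the time of each check, given whether each check in turn failed."""
    times = []
    t = schedule.startup_delay
    for check_failed in failed:
        times.append(t)
        # A failed check is followed by a quick recheck instead of a full period.
        t += schedule.recheck_period if check_failed else schedule.period
    return times
```

With a 60 second startup delay, a 300 second period, and a 30 second recheck period, a run whose second and third checks fail is checked at 60, 360, 390, and 420 seconds.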
Types of Health Checks
Health Checks produce Status
Resin Checks and Responds
Health System Actions
Actions based on condition
• Actions can be grouped
• If the server stays in a critical state for two minutes, perform a group of actions
• Dump JMX values, Dump Threads, Dump Heap, CPU Profile, Restart
• If the actions take longer than 10 minutes, restart
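The "critical for two minutes, then run a group of actions" rule can be sketched like this. The `CriticalForRule` class and its fields are hypothetical, and the 10 minute timeout on the actions themselves is left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CriticalForRule:
    """Hypothetical rule: run a group of actions once a critical state has lasted long enough."""
    actions: List[Callable[[], None]]
    hold_seconds: float = 120.0              # two minutes, as on the slide
    critical_since: Optional[float] = None   # when the current critical episode began
    fired: bool = False                      # whether the group already ran this episode

    def observe(self, now: float, critical: bool) -> bool:
        """Record one health observation; return True if the action group ran."""
        if not critical:
            # Recovery resets the timer and re-arms the rule.
            self.critical_since = None
            self.fired = False
            return False
        if self.critical_since is None:
            self.critical_since = now
        if not self.fired and now - self.critical_since >= self.hold_seconds:
            for action in self.actions:
                action()
            self.fired = True
            return True
        return False
```

The actions run in list order, so diagnostics such as thread and heap dumps can be placed ahead of the restart that would otherwise destroy the evidence.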
Collect data needed to diagnose
• When something goes wrong
• Denial of Service Attack
• Application Bug
• Unexpected Spike
• RHS collects the metrics you need to diagnose the problem
• Without collection, you are flying blind
Actions better than just watching
Watchdog report (PDF)
• Post Mortem Report
• Environment Info
• Server Metrics
• JVM Metrics
• Thread Dump
• Heap Dump
• Metrics Graph
Environment Data
• Collects critical information about the environment
• When
• What OS
• What version of Resin
• How Resin was started
• And much more
Health Status
• Status of Health Checks in Report
Recent Errors and Warnings
• Recent errors and warnings in the report
Anomalies
• Health checking stores a baseline
• Anomalies are configurable triggers based on large changes from the expected baseline
• Anomaly detection is configurable and can trigger actions
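One common way to flag "large changes from the expected baseline" is a standard-deviation test, sketched below. This is a generic technique chosen for illustration; the slides do not say which method RHS uses, and the `is_anomaly` function and its `threshold` parameter are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def is_anomaly(baseline: Sequence[float], value: float, threshold: float = 3.0) -> bool:
    """Flag a value that lies more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change at all counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold
```

For a baseline of request times around 100 ms, a reading of 101 stays within the threshold while a reading of 110 is flagged, and a flagged reading could then trigger a configured action.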
Understanding Anomaly Detection
Types of Metric Graphs in Report
• Cluster Status
• Request Count
• Request Time
• HTTP Request Errors
• Log Warnings
• Threads
• CPU Usage
• Database Connection Active
• Database Query Time
• NetStat
• JVM Memory
• Heap Used
• Tenured Used
• PermGen Used
• GC Time
Sample Graphs Memory and GC Time
Sample Metric Graphs Request
GC and Memory Metrics
Heap Dump
• A heap dump is critical for tracking down memory leaks
• Also generates an hprof file, which can be analyzed by many third-party tools
CPU Profile / Thread Dumps
• Critical for debugging thread deadlock issues
Snapshot report
• Reports the same type of data as the watchdog report
• The watchdog report is a post-mortem analysis
• Snapshots can be taken whenever you like
• e.g., during a stress test
• Triggered via REST, CLI, or Web Admin
Conclusion
More Background Info About Health System
• Resin Health System : Java Monitoring and Server Monitoring built into Resin Application Server
• Resin Health System : Current and Into the Future
• Resin Application Server Fulfills Vision of Cloud Computing
• Resin Health System Enhancements
More Info
• Caucho Technology | Home Page
• Resin | Application Server
• Resin | Java EE Web Profile Application Server
• Resin - Cloud Support | 3G - Java Clustering
• Resin | Java CDI | Dependency Injection / IoC
• Resin - Health System | Java Monitoring and Server Monitoring
• Download Resin | Application Server
• Watch Resin | Application Server Featured Video
Resin Java Application Server