
Alfresco Best Practices – Alfresco Effective Testing AET – Alfresco Effective testing

[email protected]

2016-09-15

page 1/32 © 2016

Alfresco Effective Testing

Best Practices by Alfresco Consulting



1 Overview
2 What is Effective Performance Testing (AET)?
3 Change history
4 AET Strategy
4.1 Test Types
4.1.1 Load Tests
4.1.1.1 Baseline tests
4.1.1.2 Benchmark
4.1.1.3 Concurrency 50,75,100
4.1.2 Stress Tests
4.1.3 Stability Tests
4.2 Monitoring Plan
4.2.1 Application Monitoring Plan
4.2.1.1 Application Server Monitoring
4.2.1.2 Java Virtual Machine Monitoring
4.2.2 External Components Monitoring Plan
4.2.2.1 Database Monitoring targets
4.2.2.2 Network Monitoring targets
4.2.2.3 File systems Monitoring targets
4.2.3 Monitoring Data collection agent
4.2.4 Logs collection agent
4.3 Application usage patterns
4.3.1 Work with the business to Identify application usage patterns
4.3.1.1 Determine Key Usage Scenarios
4.3.2 Use existing production log data when applicable
4.3.3 Document the application usage patterns
4.3.4 Identify Performance test targets
4.3.5 Re-Engage with Business
4.4 Load Test Infrastructure and testing toolkit
4.4.1 Load Test Infrastructure
4.4.2 Performance Test tool
4.4.2.1 What is the best tool for your use case?
4.4.2.2 Work with real browsers if your application does
4.4.2.3 Alfresco Benchmark Framework
4.4.2.4 Apache JMeter
4.4.2.5 HP Load Runner
4.5 Test Scenarios Definition
4.5.1 Target Environment
4.5.2 Test Parameters
4.5.2.1 About think time



4.5.3 Execution Strategy
4.5.3.1 Execution Strategy Golden Rules
4.5.4 Test Scripts Development
4.5.4.1 Use Visual Models
4.5.4.2 Define Success Criteria’s
4.6 Reporting Templates
4.6.1 End User response times
4.6.2 Resource Utilizations
4.6.3 Volumes Capacity and Rates
4.6.4 Component Response Times
4.6.5 Trends
5 Tuning using the test cycle wheel
6 Appendix A
6.1 Data Collection
6.1.1 Server Processes
6.1.2 The Swiss Army Knife
6.1.3 JVM Heap Memory Usage
6.1.4 Java Stack Traces
6.1.5 Database Slow Queries
6.1.6 Document Transformations
6.1.7 Solr Searches
6.1.8 HTTP Requests
6.1.9 Log files
6.2 Extra Considerations for your testing strategy
6.2.1 Network Latency
6.2.2 Traffic Load Balancing
6.2.3 Java Profiling
6.3 Sample Monitoring Plan based on Elastic Search
6.4 Using JavaMelody on your Monitoring Plan



1 Overview

Effectively testing an application for performance is one of the foundations of a healthy, well-performing implementation. Introducing a reproducible and scalable way to execute performance testing supports the growth and maturity of the application.

This document defines a practice for executing performance testing of Alfresco-based applications. It was designed with Alfresco in mind, but the concepts and performance testing strategies of this methodology can be applied to other technologies.

AET will help your application meet its performance goals.

2 What is Alfresco Effective Testing (AET)?

AET is a methodology that defines how to manage, design and execute performance testing of Alfresco-based business applications. The aim is to determine how a system performs in terms of responsiveness and stability under controlled and expected usage and concurrency scenarios. It also acts as a valuable tool to investigate, measure, validate or verify other quality attributes of the system, such as scalability, reliability and resource usage.

Testing application performance should be initiated as early as possible in the application development lifecycle. To comprehensively assure performance quality, a holistic, layered approach should be used that integrates protocol-level performance testing (traditional performance and load testing) with application-level performance testing, resource monitoring and usage-pattern data collection.

AET represents our perspective on how performance testing should be conducted in order to achieve the best results and to get the most value from performance testing by integrating it into the lifecycle of your project.



3 Ownership of AET projects

Implementing AET requires tight collaboration and input from many different teams within the target company. For this reason, ownership of the performance project should be the responsibility of a designated customer management team.

The AET management team is responsible for:

• coordinating the communication between the several teams
• making sure all necessary resources are provisioned
• participating in and providing input on all stages of the implementation
• making sure all involved teams are available and can collaborate on request within expected timeframes
• setting up and maintaining a continuous project communication plan

4 Change history

Version | Date | Author | Reviewed by | Approved by | Change history
0.1 | 2016-09-13 | Luis Cabaceira | <name> | <name> | First draft
 | | | | | Methodology wireframes, general idea
 | | | | | Consolidation, visual support
 | | | | | Fine tuning, corrections



5 AET Strategy

The AET strategy is composed of a well-defined set of work items.

5.1 Test Types

Every application should define and execute 3 types of tests as part of its performance test strategy. AET defines the following 3 root types of tests:

• Load Tests
• Stress Tests
• Stability and Endurance Tests



For every application being tested, AET defines that each test run consists of 3 cycles of load tests, followed by stress tests and an endurance test run (12 hours of execution).

5.1.1 Load Tests

Load tests consist of 5 independent executions, each with its own test result output: an initial baseline test run, followed by a benchmark test run (25% of target load) and 3 further executions (50%, 75% and 100% of target load).



5.1.1.1 Baseline tests

A baseline test simulates a single user and is used to verify the correctness of the test scripts and the readiness of the system before running target load tests. Baseline tests also check whether the application meets the expected response times for a 1-user load. They are often used to benchmark newer versions of the codebase and to validate performance improvements in crucial areas of the application.

5.1.1.2 Benchmark

The benchmark cycle tests the application with 25% of the target load. This verifies that the system is ready to receive higher levels of concurrency.

5.1.1.3 Concurrency 50,75,100

Irrespective of whether a slow user ramp-up is used, we should execute 3 individual load tests at 50%, 75% and 100% of the target load. The load levels should be adjusted based on system performance and existing target load levels.
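As an illustration of the five executions in one load test cycle, the following minimal Python sketch derives the number of concurrent users for each execution from a target load. The function name `load_test_plan`, the execution labels and the rounding rule (round up, never below 1 user) are assumptions made for this example and are not prescribed by AET.

```python
import math

# The five executions of a load cycle: baseline (single user), benchmark (25%),
# then 50%, 75% and 100% of the target load. Labels are illustrative only.
LOAD_CYCLE = [
    ("baseline", None),
    ("benchmark", 0.25),
    ("load-50", 0.50),
    ("load-75", 0.75),
    ("load-100", 1.00),
]


def load_test_plan(target_users):
    """Return a list of (execution_name, concurrent_users) for one load cycle."""
    if target_users < 1:
        raise ValueError("target_users must be at least 1")
    plan = []
    for name, fraction in LOAD_CYCLE:
        if fraction is None:
            users = 1  # baseline is always a single-user run
        else:
            # Assumption: round up so that small targets never yield 0 users.
            users = max(1, math.ceil(target_users * fraction))
        plan.append((name, users))
    return plan


for name, users in load_test_plan(400):
    print(name, users)
```

Deriving every level from a single target value keeps the 25/50/75/100 proportions consistent when the target load is adjusted between cycles.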

5.1.2 Stress Tests

Stress testing is used to understand the upper limits of capacity within the system.

Based on the load test results, we slowly increase the server load step by step until we find the server break point. For this test, realistic think time and cache settings are not required, because the objective is to find the server break point and observe how the system fails.

This kind of test determines the system's robustness under extreme load and helps application administrators determine whether the system will perform sufficiently if the load goes well above the expected maximum.

Stress tests make it possible to determine which components fail first and to identify the bottlenecks of the application.
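The step-by-step search for the break point can be sketched as follows. This is only a sketch under stated assumptions: `run_step` is a hypothetical callable that would drive your load tool at a given concurrency and return the observed error rate, and the 5% error-rate threshold used to declare a "break" is an arbitrary example value, not an AET rule.

```python
def find_break_point(run_step, start_users, step_users, max_users, max_error_rate=0.05):
    """Increase the load step by step until the error rate exceeds the threshold.

    run_step(users) is a hypothetical callable: it runs one stress step at the
    given concurrency and returns the observed error rate (0.0 to 1.0).
    Returns (last_good_users, first_failing_users); the second element is
    None when no break point was reached up to max_users.
    """
    last_good = None
    users = start_users
    while users <= max_users:
        error_rate = run_step(users)
        if error_rate > max_error_rate:
            return last_good, users
        last_good = users
        users += step_users
    return last_good, None


def simulated_run_step(users):
    """Stand-in for a real load tool: starts failing above 450 users (made-up number)."""
    return 0.0 if users <= 450 else 0.30


print(find_break_point(simulated_run_step, start_users=400, step_users=25, max_users=1000))
```

Starting the search at the 100% load level reuses what the load tests already established, and a small step size narrows down the range in which the system breaks.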

5.1.3 Stability Tests



“Run, Forrest, run”

Stability and endurance tests are run to determine whether the system can sustain the expected load continuously. During stability tests, memory and critical resource utilization are monitored to detect potential leaks. Also important, but often overlooked, is performance degradation: the test should confirm that throughput and response times after a long period of sustained activity are as good as or better than at the beginning of the test. In essence, a stability test applies a significant load to the system for an extended period of time in order to discover how the system behaves under sustained use.

Stability tests should be run for at least 10-12 hours in order to identify memory bottlenecks, garbage collection problems and resource leaks. They are run at average load levels (50%).
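Two simple checks on stability test data can be sketched in Python: comparing response times at the end of the run with those at the beginning, and estimating the growth trend of a resource such as heap usage. The function names, the 10% window and the use of a least-squares slope are assumptions chosen for this illustration; the sample values are made up.

```python
from statistics import median


def degradation_ratio(response_times, window_fraction=0.1):
    """Median response time of the last window divided by that of the first window.

    A ratio clearly above 1.0 suggests performance degraded during the run.
    """
    n = max(1, int(len(response_times) * window_fraction))
    first = median(response_times[:n])
    last = median(response_times[-n:])
    return last / first


def slope_per_sample(values):
    """Least-squares slope of the values against their sample index.

    Applied to periodic memory samples, a persistently positive slope over
    many hours is a hint (not proof) of a leak.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Made-up samples: response times in seconds, heap usage in MB.
print(degradation_ratio([1.0] * 50 + [2.0] * 50))
print(slope_per_sample([100, 110, 120, 130]))
```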



5.2 Monitoring Plan

It is very important to have a well-defined monitoring plan that verifies the performance of the most important hardware and software components of the solution.

5.2.1 Application Monitoring Plan

The monitoring plan allows us to track and store relevant system metrics and events that help with:

• Troubleshooting possible problems
• Verifying system health and resource outages
• Checking the impact of user behavior on specific components
• Predicting future scalability needs

We should monitor all relevant layers of the application, producing data on all critical aspects of the infrastructure. This allows proactive system administration, as opposed to reacting to problems after they occur: we can anticipate problems before they happen and take the necessary measures to maintain a healthy system on all layers.



5.2.1.1 Application Server Monitoring

In the application server (normally Tomcat) we need to monitor the following:

• Requests, response times, sessions and response types (e.g. HTTP 200, 400 and 500 status codes)
• CPU utilization
• Disk IO
• Physical memory utilization (free, used, cached and buffers)
• Network IO
• Threads (concurrent threads, busy threads)
• Connection pool / data pool when applicable

5.2.1.2 Java Virtual Machine Monitoring

At the JVM level we should monitor the following:

• Garbage collection (gc.log)
• Caches (cache usage, invalidation, cache sizes)
• JVM memory/CPU utilization
• Cluster member subscription analysis
• Cluster cache invalidation strategy and shared cache performance

5.2.2 External Components Monitoring Plan

If we ignore the performance of the database, file systems and network that support the application, the user experience will suffer. Setting and monitoring performance baselines for these external components (as standalone units under test and as part of the system) throughout roll-out is absolutely essential. This may include running production system stress tests to ensure the external components can handle the new data loads, setting thresholds to avoid inefficient or poorly performing operations, and tracking real user response times to ensure a consistent user experience throughout the roll-out process. We take an end-to-end approach, but spelling out the elements that consistently need to be tested ensures that all of the bases are covered. These are the 3 main external components for which a monitoring plan must be defined:

1. Database performance (CPU, memory, IO, slow queries, …)
2. File system performance (indexes and content store)
3. Network performance (load balancer algorithms, SSL, SSO, network latency)

We should watch server and component performance for at least 5 iterations at the same load level (during the stable load period) and use the 90th percentile response time to report the response time metrics. Note that any externally generated traffic, such as ingestion nodes or LDAP sync, should also be monitored.
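The reporting rule above (at least 5 iterations at the same load level, reported as the 90th percentile) can be sketched as follows. The nearest-rank definition of the percentile and the pooling of all iterations into one sample set are assumptions of this example; load testing tools may use other percentile definitions.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least pct%
    of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def report_p90(iterations, min_iterations=5):
    """Pool response times from several iterations at the same load level
    and return their 90th percentile.

    iterations is a list of lists, one list of response times per iteration.
    """
    if len(iterations) < min_iterations:
        raise ValueError("need at least %d iterations" % min_iterations)
    pooled = [t for iteration in iterations for t in iteration]
    return percentile_nearest_rank(pooled, 90)


# Made-up response times in seconds for 5 iterations.
print(report_p90([[1.1, 1.3], [1.2, 1.4], [1.0, 1.2], [1.3, 2.9], [1.1, 1.2]]))
```

Reporting a high percentile rather than the average keeps a few very fast requests from hiding the response times that the slowest users actually experience.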

5.2.2.1 Database Monitoring targets

The database is a very important factor in overall performance; therefore, active monitoring is necessary during the performance tests. A process should be responsible for collecting database monitoring information during the test cycle executions.



• Transactions per second
• Number of connections
• Slow queries
• Query plans (only needed once slow queries have been identified, rather than during performance testing)
• Database server health (CPU, memory, disk IO, network IO)
• Database statistics integration

5.2.2.2 Network Monitoring targets

The network should be monitored during the execution of the tests.

• Input/output on the server network cards
• High availability
• TCP errors / network errors at network protocol level

5.2.2.3 File systems Monitoring targets

The file systems containing the content store and indexes should be monitored during the execution of the tests.

• Disk IO performance on index and content store disks
• Available free space percentage
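The free space target in the list above can be checked with the Python standard library alone. In this minimal sketch, the 20% threshold and the helper names are assumptions for illustration; in a real plan the paths passed in would be the content store and index locations of your installation.

```python
import shutil


def free_space_percent(path):
    """Return the percentage of free space on the file system holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def low_space_paths(paths, min_free_percent=20.0):
    """Return the paths whose file system has less free space than the threshold.

    The 20% default is an example value, not an AET recommendation.
    """
    return [p for p in paths if free_space_percent(p) < min_free_percent]


# "." stands in for the real content store and index directories.
print(round(free_space_percent("."), 1))
print(low_space_paths(["."]))
```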

5.2.3 Monitoring Data collection agent

A process should be defined to collect the monitoring information during the test cycle executions. This is known as the monitoring data collection agent. Since the data may reside on multiple servers or locations, the monitoring data needs to be stored centrally after the test, together with the test scripts, so that they can be cross-referenced at a later stage.

5.2.4 Logs collection agent

Configuring and collecting the log files that are generated during the tests is a vital task in performance testing and is considered part of the monitoring plan. Log files contain valuable information on the performance of your application. Not all problems are visible, and not all users are able to understand the root cause of a problem, especially a non-reproducible one. Log analysis tools let you monitor servers and services, generate usage and error reports on the client side, build compliance analytics and more. Collecting log data from servers, networks, devices and security systems gives us as full a picture as possible of what is going on when an application is launched and used. Log collection and analysis should be defined, implemented and deployed as part of the performance testing execution cycles. It is better to gain a comprehensive understanding of the error logs before users encounter those errors in production.

A process should be defined to collect the log files during the test cycle executions; note that only the information related to each test execution should be collected and stored. This process is known as the logs collection agent. It plays the same role for log files that the monitoring data collection agent plays for monitoring data.
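Keeping only the log entries that belong to one test execution can be sketched as a timestamp filter. This example assumes each log entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp (an assumption about your logging pattern, adjust `TIMESTAMP_FORMAT` to match) and treats lines without a timestamp, such as stack trace lines, as part of the preceding entry. The function name and sample lines are hypothetical.

```python
from datetime import datetime

# Assumed prefix of every log entry; change to match your logging pattern.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
TIMESTAMP_LENGTH = 19


def lines_in_window(lines, start, end):
    """Yield the log lines whose entry timestamp lies within [start, end].

    Lines without a parsable timestamp (for example stack trace lines)
    follow the decision made for the entry that precedes them.
    """
    keep = False
    for line in lines:
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
            keep = start <= stamp <= end
        except ValueError:
            pass  # continuation line: keep the previous decision
        if keep:
            yield line


# Made-up log lines and test window.
sample = [
    "2016-09-15 09:59:59,100 INFO before the test",
    "2016-09-15 10:00:01,000 ERROR something failed",
    "\tat com.example.Foo.bar(Foo.java:1)",
    "2016-09-15 11:00:01,000 INFO after the test",
]
window = (datetime(2016, 9, 15, 10, 0, 0), datetime(2016, 9, 15, 11, 0, 0))
for line in lines_in_window(sample, *window):
    print(line)
```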



5.3 Application usage patterns

For accurate, predictive test results, user behavior must be modeled as application usage patterns based on page flow, frequency of hits, the length of time that users pause between pages, and any other factor specific to how users interact with the application.

5.3.1 Work with the business to Identify application usage patterns

The first step in understanding the application usage patterns is to engage with the business analyst and collect all relevant information about expected usage patterns, taking the use case into consideration. This step is vital to segment and determine the context of the tests. Throughout the process of creating workload models, remember to share your assumptions and drafts with the team and solicit their feedback. Do not get overly caught up in striving for perfection, and do not fall into the trap of oversimplification. In general, it is a good idea to start executing tests when you have a testable model and then enhance the model incrementally while collecting results.

5.3.1.1 Determine Key Usage Scenarios

Simulating every possible user task or activity in a performance test is impractical. As a result, no matter what method you use to identify key scenarios, you will probably want to apply some limiting heuristic to the number of activities or key scenarios you identify for performance testing.



5.3.2 Use existing production log data when applicable

When possible, we should collect real information from well-known sources, such as web server log files, that reveal specific details about the server load. There are several ‘log analyzers’ that parse the web log files and produce a report containing the number of user sessions, number of unique users, number of errors, peak day traffic, peak hour traffic, user request arrival pattern, frequently accessed pages, user navigation pattern, etc.

There are multiple solutions today (usually SDKs, both open source and commercial) that enable tracking of usage patterns inside the application. These statistics, collected both with and without load, can be used to gain a better understanding of the stability of the application and to answer questions such as:

• What are the most commonly used areas of the application?
• Where does the application crash the most?
• How does the application render for requests from different locations?
• Where does the application respond most slowly?
• How does the latest deployment/build compare with previous builds in terms of response times?
• Does the application behave more slowly while backend tasks are running?
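To show the kind of report a log analyzer derives from web server logs, here is a minimal Python sketch that extracts the peak hour, the most frequently accessed pages, the number of distinct client addresses and the number of server errors. It assumes access log lines in Common Log Format; the regular expression, the function name `summarize` and the sample lines are assumptions for this example, and distinct client addresses are only a rough proxy for unique users.

```python
import re
from collections import Counter

# Assumed Common Log Format:
# host ident user [timestamp] "METHOD path protocol" status bytes
CLF = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) \S+')


def summarize(lines, top=3):
    """Summarize access log lines into a small usage report."""
    hours = Counter()
    pages = Counter()
    hosts = set()
    server_errors = 0
    for line in lines:
        match = CLF.match(line)
        if not match:
            continue  # skip lines that do not follow the assumed format
        host, stamp, _method, path, status = match.groups()
        hours[stamp[:14]] += 1            # "dd/Mon/yyyy:HH" identifies the hour
        pages[path.split("?")[0]] += 1    # ignore query strings
        hosts.add(host)
        if status.startswith("5"):
            server_errors += 1
    return {
        "unique_hosts": len(hosts),
        "peak_hour": hours.most_common(1)[0] if hours else None,
        "top_pages": pages.most_common(top),
        "server_errors": server_errors,
    }


# Made-up access log lines.
sample = [
    '10.0.0.1 - - [15/Sep/2016:10:00:01 +0000] "GET /share/page/dashboard HTTP/1.1" 200 512',
    '10.0.0.2 - - [15/Sep/2016:10:30:00 +0000] "GET /share/page/dashboard?x=1 HTTP/1.1" 200 512',
    '10.0.0.1 - - [15/Sep/2016:11:00:00 +0000] "POST /alfresco/s/api/upload HTTP/1.1" 500 0',
]
print(summarize(sample))
```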

5.3.3 Document the application usage patterns

From the analysis of the information collected, the actual usage patterns of the application and infrastructure components can be identified. It is important to produce a document describing the application usage patterns. This document will be the root source of information for the test script definitions and will help to fine-tune the test scenarios.

5.3.4 Identify Performance test targets

Performance testing can be executed against a single component or against the whole infrastructure; therefore, a clear definition of the performance test targets is key to ensuring that we get the information we need from all relevant components. Depending on the use case, identify the target components of the infrastructure that will be monitored and tested.

5.3.5 Re-Engage with Business

After you have collected a list of what you believe are the key usage scenarios, review them with the business and stakeholders. Ask what they think is missing, what they think can be de-prioritized and, most importantly, why. What does not seem to matter to one person may still be critical to include in the performance test, because of the potential side effects that the activity may have on the system as a whole, and because the individual who suggests that the activity is unimportant may be unaware of those consequences.



5.4 Load Test Infrastructure and testing toolkit

We try to be as agnostic as possible about the specific tools used to implement this methodology. In technology, there is always more than one way to achieve the same result, so rather than focusing on specific tools, we focus on the performance testing processes and on the characteristics that each tool must have to successfully implement AET.

We do mention some open-source tools that can be used, but the choice of the toolkit to implement the testing processes is left to the user.

5.4.1 Load Test Infrastructure

The load test environment is composed of a set of machines that form the load-test infrastructure. Each machine normally holds a load-generator node that works as part of a cluster that simulates the required load.

You can run your own test environment on premise, but there is also the option of running load generators from a cloud service such as AWS. This option involves configuring security and networking on the target test environment to allow connections from the cloud.

5.4.2 Performance Test tool

We need to choose the performance test tool based on the use case we are simulating. There are many performance test tools available on the market (licensed or freeware).

5.4.2.1 What is the best tool for your use case?

The choice of tool is highly dependent on the use case. Alfresco benchmarking and high-load testing activities were historically performed using a custom suite of open-source products that, along the way, have shown some limitations in complex contexts such as Alfresco.



Alfresco Share relies heavily on AJAX browser interactions (i.e. asynchronous JavaScript requests). Those tools did not implement a full browser simulation and were therefore not capable of interpreting and executing asynchronous requests. Reproducing the real user interaction pattern with tools that do not emulate real browser instances is fairly complex to develop and maintain, especially across Alfresco versions. For this reason, it is preferable to use a tool that can drive and simulate real browser instances. When choosing your testing tool, take the following features into consideration:

• Interprets browser JavaScript
• Supports asynchronous calls
• Resource caching
• Scaling (can work in a distributed test infrastructure)
• Scales to thousands of active sessions
• Elastic client load drivers
• Produces readable results
• Durable and searchable results
• Supports real-time observation and analysis of results
• Reusable code or components
• Remote control from the desktop

5.4.2.2 Work with real browsers if your application does

The Alfresco Benchmark Framework and JMeter can use Selenium to simulate real user interaction with Alfresco Share via a browser. Selenium-Grid supports distributed test execution: it allows us to run multiple tests in parallel on different machines running different browsers and operating systems.

A Selenium-Grid consists of a single hub and one or more nodes. Both are started using the selenium-server.jar executable. The hub receives a test to be executed, along with information on the browser and ‘platform’ (e.g. WINDOWS, LINUX) on which the test should be run. It ‘knows’ the configuration of each node that has been ‘registered’ to the hub. Using this information, it selects an available node that has the requested browser-platform combination. Once a node has been selected, Selenium commands initiated by the test are sent to the hub, which passes them to the node assigned to that test. The node runs the browser and executes the Selenium commands within that browser against the application under test.
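The hub's node selection described above can be pictured with a toy model. The following Python sketch is not Selenium code and does not use the Selenium API; the `Node` and `Hub` classes are hypothetical and only mimic the idea of registering nodes and picking an available one with the requested browser-platform combination.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a registered grid node (hypothetical, not Selenium's class)."""
    name: str
    browser: str
    platform: str
    busy: bool = False


@dataclass
class Hub:
    """Toy stand-in for the hub: it knows every registered node's configuration."""
    nodes: list = field(default_factory=list)

    def register(self, node):
        self.nodes.append(node)

    def assign(self, browser, platform):
        """Return an available node matching the request, or None if there is none."""
        for node in self.nodes:
            if not node.busy and node.browser == browser and node.platform == platform:
                node.busy = True  # the node now serves this test
                return node
        return None


hub = Hub()
hub.register(Node("node-1", "chrome", "WINDOWS"))
hub.register(Node("node-2", "firefox", "LINUX"))
print(hub.assign("firefox", "LINUX"))
print(hub.assign("firefox", "LINUX"))
```

The second request finds no free matching node in this toy model, which illustrates why the number of nodes per browser-platform combination limits how many tests can run in parallel.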



JMeter and the Alfresco Benchmark Framework are open-source tools that can be used to execute performance tests. HP Load Runner is a commercial tool that can also be used to write and execute performance testing scripts.

5.4.2.3 Alfresco Benchmark Framework

https://github.com/AlfrescoBenchmark/alfresco-benchmark

This project provides a management application and a supporting library for the development of highly scalable, easy-to-run Java-based load and benchmark tests. Maven and Java development patterns are employed so that load tests can be included in automated build plans, both for the product they are testing and to prevent regressions in the tests themselves. Alfresco Benchmark can run on Windows, Linux or Apple OS X.

5.4.2.4 Apache JMeter

http://jmeter.apache.org/

The Apache JMeter™ application is open source software, a 100% pure Java application designed to load test functional behaviour and measure performance. It was originally designed for testing web applications but has since expanded to other test functions. Apache JMeter may be used to test performance on both static and dynamic resources (web services (SOAP/REST), dynamic web languages such as PHP, Java and ASP.NET, files, Java objects, databases and queries, FTP servers and more). It can be used to simulate a heavy load on a server, group of servers, network or object to test its strength or to analyse overall performance under different load types. You can use it to make a graphical analysis of performance or to test your server/script/object behavior under heavy concurrent load. JMeter can run on Windows, Linux or Apple OS X.

5.4.2.5 HP Load Runner



https://saas.hpe.com/en-us/download/loadrunner

HPE LoadRunner is a software testing tool from Hewlett Packard Enterprise. It is used to test applications, measuring system behavior and performance under load. HPE LoadRunner can simulate thousands of users concurrently using application software, recording and later analyzing the performance of key components of the application. LoadRunner simulates user activity by generating messages between application components or by simulating interactions with the user interface such as keypresses or mouse movements. The messages/interactions to be generated are stored in scripts. LoadRunner can generate the scripts by recording them, such as logging HTTP requests between a client web browser and an application's web server.

5.5 Test Scenarios Definition

Ensure that one or more scenarios represent the difference between “quarterly close-out” period usage patterns and “typical business day” usage patterns. You may find the following limiting heuristics useful:

• Include contractually obligated usage scenario(s).
• Include usage scenarios implied or mandated by performance testing goals and objectives.
• Include most common usage scenario(s).
• Include business-critical usage scenario(s).
• Include usage scenarios of technical concern.
• Include usage scenarios of stakeholder concern.
• Include high-visibility usage scenarios.
• Take multiple physical locations into consideration.

A test scenario is composed of the following areas:

• Target Environment
• Test Parameters
• Execution Strategy
• Test Scripts

5.5.1 Target Environment

Variables such as firewalls, specific load balancing configurations and server memory configuration may severely influence performance. It is recommended to run performance test cycles in production or in a simulated production environment.



In case of using a simulated production environment to run the tests, ensure that the target environment contains all of the supplementary data necessary to create the actual test and it’s a faithful replication of the production environment.

5.5.2 Test Parameters

Define the test parameters as follows:

• ramp up strategy
• test duration
• think time settings
• different load conditions
• types of tests
• pass/fail criteria

5.5.2.1 About think time

Think time is the time users take to think or to navigate between pages in the application. It is the most important parameter in automated testing because it has a huge influence on the system load. Think time varies with the application context, so it is not advisable to use the same default think time for every application under test. For ECM applications, think times between 30 and 60 seconds are normal.

To measure the response time of a transaction (a user-defined action) accurately, place the think time outside the transaction points, never between the start and end of a transaction.

By changing the think time, we can simulate the load of a large number of users while running the test with fewer users. With the same number of users hitting the server, changing the think time has a huge impact on the server load: decreasing the think time increases the server load, and vice versa.
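The relationship between think time and load can be estimated with the interactive response time law from queueing theory. This is an addition for illustration, not part of the original methodology. The sketch assumes a closed workload in which every virtual user sends one request, waits for the response, then pauses for the think time; the function name and the sample figures are hypothetical.

```python
def requests_per_second(users: int, response_time_s: float, think_time_s: float) -> float:
    """Approximate steady-state request rate generated by a fixed pool of users.

    Each user completes one request every (response time + think time) seconds,
    so the pool as a whole produces users / cycle requests per second.
    """
    cycle_s = response_time_s + think_time_s
    if cycle_s <= 0:
        raise ValueError("response time plus think time must be positive")
    return users / cycle_s


if __name__ == "__main__":
    # Assumed figures: 2 s response time, think times of 30 s and 14 s.
    print(requests_per_second(200, 2.0, 30.0))
    print(requests_per_second(100, 2.0, 14.0))
```

Under these assumptions, 100 users with a 14-second think time generate the same request rate as 200 users with a 30-second think time. In practice the response time changes with load, so the figure is only a first estimate.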

5.5.3 Execution Strategy

A good execution strategy is a slow ramp-up followed by a stable period and a ramp-down. During the stable period, the target user load performs various operations on the system with realistic think times. All measured metrics should correspond only to the stable period, not to the ramp-up or ramp-down periods. Ensure that no other users are accessing the application during the test cycles and that the servers are isolated from any usage that is not part of the performance tests.

We do not draw conclusions about any transaction response time from just 1 or 2 iterations. The server should be monitored for a minimum of 5 iterations (at the same load level) before concluding the response time metrics, because a single point in time can show unusually high or low response times for unrelated reasons.

Leverage your performance scripts to run constantly. With proper correlations, you will be able to derive business intelligence, such as the impact of poor or good performance on revenue and the impact of different usage scenarios on your system. The reports from your load testing tool will help you answer the important questions, such as “what, when and why”.
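Restricting metrics to the stable period can be sketched as a simple filter over timestamped samples. The function name, the sample layout (seconds since test start, response time in milliseconds) and the minimum of 5 samples as a stand-in for 5 iterations are assumptions made for this illustration.

```python
from statistics import mean


def stable_period_summary(samples, ramp_up_end_s, ramp_down_start_s, min_samples=5):
    """Summarise response times recorded during the stable period only.

    samples: iterable of (seconds_since_test_start, response_time_ms) pairs.
    Samples taken during ramp-up or ramp-down are ignored.
    """
    stable = [ms for t, ms in samples if ramp_up_end_s <= t < ramp_down_start_s]
    if len(stable) < min_samples:
        raise ValueError(
            f"only {len(stable)} stable-period samples; need at least {min_samples}"
        )
    return {"count": len(stable), "mean_ms": mean(stable), "max_ms": max(stable)}
```

Raising an error when too few samples remain makes it harder to report a response time that rests on one or two iterations.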

5.5.3.1 Execution Strategy Golden Rules

• Execute individual tests before combining them into a single test run
• Execute 2 or 3 test runs to confirm the results
• Execute the tests at different hours of the day

5.5.4 Test Scripts Development

The test scripts aim to simulate the most common user operations executed against the application, as captured in the usage patterns. They are defined by analysing the usage patterns document produced in section 4.3.

5.5.4.1 Use Visual Models

One highly effective method to support test script creation is to create visual models of navigation paths and user actions that are intuitive to the entire team, including end users, developers, testers, analysts, and executive stakeholders. Also determine the percentage of users we anticipate to be performing each activity.

The key is to use language and visual representations that make sense to your team. Visual models should be circulated to both users and stakeholders for review/comment.

Following the steps taken to define key usage scenarios, ask the team members what they think is missing, what they think can be de-prioritized, and why. Often, team members will simply write new percentages on the visual model, making it very easy for everyone to see which activities have achieved a consensus, and which have not.
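Once the team agrees on the percentages, they can be turned into a number of virtual users per activity. The sketch below is one possible way to do that; the activity names and percentages are hypothetical, and leftover users are assigned by the largest fractional part.

```python
def users_per_activity(total_users, mix_percent):
    """Split total_users across activities according to integer percentages."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("activity percentages must add up to 100")
    exact = {name: total_users * pct / 100 for name, pct in mix_percent.items()}
    counts = {name: int(value) for name, value in exact.items()}
    leftover = total_users - sum(counts.values())
    # Hand the remaining users to the activities with the largest fractional part.
    by_fraction = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_fraction[:leftover]:
        counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical mix agreed on the visual model.
    print(users_per_activity(75, {"browse": 50, "search": 30, "upload": 20}))
```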

5.5.4.2 Define Success Criteria

After we have defined the test scripts, we engage with the business to define success criteria for business-related operations. These can be end-to-end process throughput measurements or success criteria for individual processes or operations. Based on the business forecast of user load growth, we specify the service level agreements (SLAs) that the application has to meet in order to handle the user load in the target environment. This analysis should be done as the first step, and a clear set of quantitative SLAs for the application should be established.
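A quantitative SLA is often phrased as a percentile target, for example "95% of searches complete within 3 seconds". That phrasing and the threshold are assumptions for this sketch, not values from the paper. The check below uses the nearest-rank percentile on measured response times.

```python
def percentile_ms(samples_ms, percent):
    """Nearest-rank percentile of response times; percent is an integer 1..100."""
    ordered = sorted(samples_ms)
    if not ordered:
        raise ValueError("no samples")
    # Integer ceiling of percent * n / 100, clamped to at least the first rank.
    rank = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def meets_sla(samples_ms, percent, limit_ms):
    """True when the given percentile of the samples is within the SLA limit."""
    return percentile_ms(samples_ms, percent) <= limit_ms
```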


5.6 Reporting Templates

Defining the reporting templates (format, diagrams and dashboards) and the storage criteria is the last stage of AET. Managers and stakeholders need more than a big bundle of aggregated results from various tests: they need conclusions based on those results, and consolidated data that supports those conclusions. Technical team members need analysis, comparisons, and details of how the results were obtained. Team members of all types get value from performance results being shared more frequently, so we need to satisfy the needs of all the consumers of performance test results and data. We achieve this by employing a variety of reporting and results-sharing techniques. The key to effective reporting is to present information of interest to the intended audience in a quick, simple, and intuitive manner.

• Report early, report often, report visually and report intuitively
• Use the right statistics
• Consolidate and summarize data effectively
• Customize reports for different audience segments
• Use concise verbal summaries
• Make the data available
• Provide recommendations when necessary

Data and statistics reported in a graphical format are easier to digest. This is especially true of performance results data, where the volume of data is frequently very large and most significant findings result from detecting patterns in the data. While reports do not need to provide the answers to issues to be effective, the issues should be quickly and intuitively clear from the presentation.

Summarizing results frequently makes it much easier to demonstrate meaningful patterns in the test results. Summary charts and tables present data from different test executions side by side so that trends and patterns are easy to identify. The goal of these tables and charts is to show team members how the test results compare to the performance goals of the system so they can make important decisions about taking the system live, upgrading the system, or even, in some cases, completely reevaluating the project.

Performance test results are normally read by one of three audiences: technical team members, non-technical team members, and stakeholders outside of the core team. These three groups tend to look for very different things in a performance report and are inclined to prefer different presentation methods. When reporting, make sure that you identify which group or groups you are reporting to and what their expectations are before deciding on the best way to present the results you have collected.


5.6.1 End User response times

End-user response time is by far the most commonly requested and reported metric in performance testing. If you have captured goals and requirements effectively, this is a measure of presumed user satisfaction with the performance characteristics of the system or application. Stakeholders are interested in end-user response times to judge the degree to which users will be satisfied with the application. Technical team members are interested because they want to know if they are achieving the overall performance goals from a user’s perspective, and if not, in what areas those goals are not being met.

5.6.2 Resource Utilizations

Resource utilizations are the second most requested and reported metrics in performance testing. Most frequently, resource utilization metrics are reported verbally or in a narrative fashion, for example: “The CPU utilization of the application server never exceeded 45 percent. The target is to stay below 70 percent.” It is generally valuable to report resource utilizations graphically when there is an issue to be communicated.

Overlay resource utilization metrics with other load and response data. Resource utilization metrics are most powerful when presented on the same graph as load and/or response time data. If there is a performance issue, this helps to identify relationships between the degraded performance and resource exhaustion.

5.6.3 Volumes, Capacity and Rates

Volume, capacity, and rate metrics are other important performance criteria and can indicate scalability needs. Some common volume, capacity, and rate metrics include:

• Bandwidth consumed
• Throughput (transactions per second, hits per second)

5.6.4 Component Response Times

Component response times should be collected and shared with the technical team. These response times help developers, architects, database administrators (DBAs), and administrators determine what sub-part or parts of the system are responsible for the majority of end-user response times.

It is important to relate component response times to end-user activities. Because it is not always obvious which end-user activities are impacted by a component’s response time, it is a good idea to include those relationships in the report. Explain the degree to which the component response time matters. Sometimes the concern is that a component might become a bottleneck under load because it is processing too slowly; at other times, the concern is that end-user response times are noticeably degraded as a result of the component. Knowing which of these conditions applies to your project enables you to make effective decisions.


5.6.5 Trends

Trends are one of the most powerful data-reporting methods. Trends show whether performance is improving or degrading from build to build, or the rate of degradation as load increases. Trends can help technical team members quickly understand whether the changes they recently made achieved the desired performance impact.

6 Tuning using the test cycle wheel

We can tune the performance of our application by changing configuration parameters on several components of the infrastructure. The best approach for a tuning exercise is to execute the performance tests several times in sequence.

After each test cycle we analyze the results, apply our tuning and optimizations, and repeat the process until we have reached our performance goal.


7 Appendix A

This appendix contains suggestions on how to implement some parts of AET.

7.1 Data Collection

It is important to collect all data (logs, test scripts, test script output, monitoring data, etc.) for each test run and store it in a predefined location. Make sure you can clearly identify the specific test run. Some customers may already have tools that help to collect information about the platform, such as NewRelic or AppDynamics. If that is not the case, we have documented below a few commands that can be used to get the relevant information.

7.1.1 Server Processes

Monitor all processes running on the servers on a regular basis, e.g. every minute, with commands such as top:

date&&top -b -n 1

This will provide valuable information about CPU, memory and running processes:

Thu Jul 14 19:34:08 BST 2016
top - 19:34:08 up 4:51, 1 user, load average: 0.00, 0.02, 0.00
Tasks: 142 total, 1 running, 141 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.2%us, 0.8%sy, 0.0%ni, 89.8%id, 0.1%wa, 0.1%hi, 0.1%si, 0.0%st
Mem: 5845760k total, 3717792k used, 2127968k free, 144716k buffers
Swap: 524280k total, 0k used, 524280k free, 2064000k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26621 alfresco  20   0 5544m 941m  20m S  9.9 16.5   7:00.61 java
    1 root      20   0 19356 1536 1232 S  0.0  0.0   0:00.62 init
    2 root      20   0     0    0    0 S  0.0  0.0   0:00.00 kthreadd
    3 root      RT   0     0    0    0 S  0.0  0.0   0:00.07 migration/0
    4 root      20   0     0    0    0 S  0.0  0.0   0:24.39 ksoftirqd/0
    5 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/0
    6 root      RT   0     0    0    0 S  0.0  0.0   0:00.07 watchdog/0
    7 root      RT   0     0    0    0 S  0.0  0.0   0:00.38 migration/1
    8 root      RT   0     0    0    0 S  0.0  0.0   0:00.00 migration/1
    9 root      20   0     0    0    0 S  0.0  0.0   0:03.98 ksoftirqd/1
   10 root      RT   0     0    0    0 S  0.0  0.0   0:00.07 watchdog/1
   11 root      20   0     0    0    0 S  0.0  0.0   0:05.90 events/0
   12 root      20   0     0    0    0 S  0.0  0.0   0:01.07 events/1
   13 root      20   0     0    0    0 S  0.0  0.0   0:00.00 cgroup
   14 root      20   0     0    0    0 S  0.0  0.0   0:00.00 khelper
   15 root      20   0     0    0    0 S  0.0  0.0   0:00.00 netns
   16 root      20   0     0    0    0 S  0.0  0.0   0:00.00 async/mgr
   17 root      20   0     0    0    0 S  0.0  0.0   0:00.00 pm
   18 root      20   0     0    0    0 S  0.0  0.0   0:00.08 sync_supers
   19 root      20   0     0    0    0 S  0.0  0.0   0:00.07 bdi-default
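Collected top snapshots are easier to chart once the summary lines are reduced to numbers. The sketch below extracts the load averages and the idle CPU percentage from output shaped like the sample above. The function name is hypothetical, and the patterns are written against this sample; other top versions may format the Cpu(s) line differently.

```python
import re

LOAD_RE = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")
# Matches "89.8%id" as in the sample; also tolerates "89.8 id".
IDLE_RE = re.compile(r"Cpu\(s\):.*?([\d.]+)\s*%?\s*id")


def parse_top_summary(text):
    """Return load averages and idle CPU percentage from one top snapshot."""
    load = LOAD_RE.search(text)
    idle = IDLE_RE.search(text)
    if load is None or idle is None:
        raise ValueError("top summary lines not found")
    return {
        "load_avg": tuple(float(value) for value in load.groups()),
        "cpu_idle_pct": float(idle.group(1)),
    }
```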

7.1.2 The Swiss Army Knife

There are a few tools that can collect information about multiple resources at once. dstat is one of these tools and should be run on all nodes during performance testing. dstat can monitor CPU, disk I/O, network I/O, etc.

Run as: dstat -tam --output dstat.out 5

This will collect statistics every 5 seconds and send the output to standard output and to a file named dstat.out.

7.1.3 JVM Heap Memory Usage

JVM heap memory usage is one of the most important resources to monitor. Enable GC logging on the JVM. The configuration below should create up to 5 files of 100 MB each containing GC details. The GC logs can then be parsed by other tools, e.g. http://gceasy.io/, to provide details on GC performance.

JAVA_OPTS="$JAVA_OPTS -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=100M"

7.1.4 Java Stack Traces

The jstack command prints the Java stack traces of the Java threads for a given Java process. If Alfresco’s Java process is running high on CPU over long periods of time, or the application is not responding, it is recommended to capture a few Java stack traces to analyse what the application was doing at the time.

To obtain java stack traces run the following command 5 times over a period of 1 minute:

jstack <alfresco java pid> >> /path/jstack.out


Make sure you are familiar with capturing Java stack traces before running the performance test.
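The capture of 5 thread dumps over 1 minute can be scripted so that it is ready before the test starts. The sketch below is one possible wrapper around the jstack command; the function name and the `run` and `sleep` parameters (hooks that make the loop testable without a JVM) are hypothetical, and the dumps are spaced so that the first and last are one period apart.

```python
import subprocess
import time


def capture_thread_dumps(pid, out_path, count=5, period_s=60.0,
                         run=subprocess.run, sleep=time.sleep):
    """Append `count` jstack dumps for `pid` to `out_path`, spread over `period_s`."""
    interval_s = period_s / (count - 1) if count > 1 else 0.0
    with open(out_path, "a", encoding="utf-8") as out:
        for index in range(count):
            # Equivalent to: jstack <alfresco java pid> >> /path/jstack.out
            result = run(["jstack", str(pid)], capture_output=True, text=True)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out.write(f"=== dump {index + 1}/{count} at {stamp} ===\n")
            out.write(result.stdout)
            if index < count - 1:
                sleep(interval_s)
```

The timestamp header written before each dump makes it possible to line the dumps up with the other monitoring data for the same test run.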

7.1.5 Database Slow Queries

Most database servers can be configured to log slow queries; this should be enabled before running performance tests. The slow query log file should be collected and kept after the performance test for analysis.

7.1.6 Document Transformations

Enable logging for class: org.alfresco.repo.content.transform.TransformerLog

by adding the following line to tomcat/shared/classes/alfresco/extension/custom-log4j.properties on all alfresco nodes:

log4j.logger.org.alfresco.repo.content.transform.TransformerLog=debug

Sample output from alfresco.log file showing document transformation times:

2016-07-14 18:24:56,003 DEBUG [content.transform.TransformerLog] [pool-14-thread-1] 0 xlsx png INFO Calculate_Memory_Solr Beta 0.2.xlsx 200.6 KB 897 ms complex.JodConverter.Image<<Complex>>
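Transformation times can be pulled out of alfresco.log for reporting. The sketch below is written against the single sample line above, so the field layout (source extension, target extension, level, file name, size, duration, transformer name) is an assumption, and lines that do not follow it are skipped. The function name is hypothetical.

```python
import re

TRANSFORM_RE = re.compile(
    r"(?P<source>\w+) (?P<target>\w+) INFO .* "
    r"(?P<millis>\d+) ms (?P<transformer>\S+)\s*$"
)


def parse_transformer_line(line):
    """Return transformation details from one TransformerLog line, or None."""
    if "TransformerLog" not in line:
        return None
    match = TRANSFORM_RE.search(line)
    if match is None:
        return None
    return {
        "source": match.group("source"),
        "target": match.group("target"),
        "millis": int(match.group("millis")),
        "transformer": match.group("transformer"),
    }
```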

7.1.7 Solr Searches

Enable logging for class: org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient

by adding the following line to tomcat/shared/classes/alfresco/extension/custom-log4j.properties on all Share (front end) nodes:

log4j.logger.org.alfresco.repo.search.impl.solr.SolrQueryHTTPClient=debug

Sample output from alfresco.log file showing Solr searches response times:

DEBUG [impl.solr.SolrQueryHTTPClient] [http-apr-8080-exec-6] with: {"queryConsistency":"DEFAULT","textAttributes":[],"allAttributes":[],"templates":[{"template":"%(cm:name cm:title cm:description ia:whatEvent ia:descriptionEvent lnk:title lnk:description TEXT TAG)","name":"keywords"}],"authorities":["GROUP_EVERYONE","ROLE_ADMINISTRATOR","ROLE_AUTHENTICATED","admin"],"tenants":[""],"query":"((test.txt AND (+TYPE:\"cm:content\" +TYPE:\"cm:folder\")) AND -TYPE:\"cm:thumbnail\" AND -TYPE:\"cm:failedThumbnail\" AND -TYPE:\"cm:rating\") AND NOT ASPECT:\"sys:hidden\"","locales":["en"],"defaultNamespace":"http://www.alfresco.org/model/content/1.0","defaultFTSFieldOperator":"OR","defaultFTSOperator":"OR"}

2016-03-19 19:55:54,106 DEBUG [impl.solr.SolrQueryHTTPClient] [http-apr-8080-exec-6] Got: 1 in 21 ms
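The "Got: N in M ms" lines carry the search response time, so they can be collected into a list for later statistics. The sketch below relies only on that fragment of the sample output; the function name is hypothetical.

```python
import re

SOLR_GOT_RE = re.compile(
    r"SolrQueryHTTPClient\].*Got: (?P<hits>\d+) in (?P<millis>\d+) ms"
)


def solr_query_times(lines):
    """Return the response time in ms of every Solr query result line."""
    times = []
    for line in lines:
        match = SOLR_GOT_RE.search(line)
        if match:
            times.append(int(match.group("millis")))
    return times
```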

7.1.8 HTTP Requests

It is very important to measure the response time for all HTTP requests. To achieve this, enable the following valve in tomcat/conf/server.xml.

<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="access-" suffix=".log" pattern='%a %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" %D "%I"' resolveHosts="false"/>

For further clarification on the log pattern refer to: https://tomcat.apache.org/tomcat-7.0-doc/config/valve.html#Access_Logging

Sample output from the Tomcat access log under the tomcat/logs directory. The important fields here are the HTTP request, the HTTP response status (e.g. 200) and the time taken to process the request (e.g. 33 milliseconds):

127.0.0.1 - CN=Alfresco Repository Client, OU=Unknown, O=Alfresco Software Ltd., L=Maidenhead, ST=UK, C=GB [14/Jul/2016:18:49:45 +0100] "POST /alfresco/service/api/solr/modelsdiff HTTP/1.1" 200 37 "-" "Spring Surf via Apache HttpClient/3.1" 33 "http-bio-8443-exec-10"
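Because the remote user field in this sample contains spaces and commas, splitting the line on whitespace is unreliable. The sketch below anchors on the quoted fields at the end of the line instead, following the valve pattern configured above. The function name is hypothetical and only the request, status and duration are returned.

```python
import re

# Tail of the pattern: "%r" %s %b "%{Referer}i" "%{User-agent}i" %D "%I"
ACCESS_RE = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" (?P<millis>\d+) "(?P<thread>[^"]*)"\s*$'
)


def parse_access_line(line):
    """Return request, status and processing time (ms) from one access log line."""
    match = ACCESS_RE.search(line)
    if match is None:
        return None
    return {
        "request": match.group("request"),
        "status": int(match.group("status")),
        "millis": int(match.group("millis")),
    }
```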

7.1.9 Log files

All log files covering the time when the test was run should be collected, as they may contain relevant information such as errors during test execution. This includes Alfresco logs, catalina.out, Tomcat access logs, database server logs and Apache server logs.


7.2 Extra Considerations for your testing strategy

7.2.1 Network Latency

It is important to identify network latency before starting the performance tests. Basic tests can achieve this, such as uploading and downloading content from different regions and measuring the time it takes to complete each task.

7.2.2 Traffic Load Balancing

When testing on an Alfresco cluster, make sure the requests are load balanced equally across all relevant nodes. Unequal distribution is a common issue we identify when analysing test results. Even when requests are sent through a load balancer, we often find that it has not been configured correctly and the traffic distribution is not equal.
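One way to verify the distribution is to count the requests recorded in each node's access log for the test window and compare the shares. In the sketch below the node names, the request counts and the 5 percentage point tolerance are all assumptions chosen for illustration.

```python
def traffic_shares(requests_per_node):
    """Return the fraction of total requests handled by each node."""
    total = sum(requests_per_node.values())
    if total == 0:
        raise ValueError("no requests recorded")
    return {node: count / total for node, count in requests_per_node.items()}


def is_balanced(requests_per_node, tolerance=0.05):
    """True when every node's share is within `tolerance` of an equal split."""
    shares = traffic_shares(requests_per_node)
    expected = 1 / len(shares)
    return all(abs(share - expected) <= tolerance for share in shares.values())


if __name__ == "__main__":
    # Hypothetical request counts taken from two nodes' access logs.
    print(is_balanced({"node1": 5200, "node2": 4800}))
    print(is_balanced({"node1": 9000, "node2": 1000}))
```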

7.2.3 Java Profiling

If we identify areas of Alfresco that need deeper analysis, we may need to introduce other tools such as Java profilers, e.g. YourKit (https://www.yourkit.com/), so a Java profiler should be available if necessary.

7.3 Sample Monitoring Plan based on Elastic Search

The diagram below shows one example of how the different components of an Alfresco solution integrate and how a possible monitoring implementation would work. It is important to centralize data from all nodes and the various layers of the application in a single location so that the aggregated data can be collected when we analyse test run results.


The above diagram is just one example. Customers can have their own in-house monitoring solutions, which can be adjusted when necessary to produce the relevant monitoring data to support the test executions.

7.4 Using JavaMelody on your Monitoring Plan

JavaMelody is used to monitor Java or Java EE application servers in QA and production environments. It measures and calculates statistics on the real operation of an application, based on how users actually use it. It is very easy to integrate into most applications and is lightweight, with almost no impact on target systems.


This tool is mainly based on statistics of requests and on evolution charts. For that reason it is an important addition to our testing methodology, as it allows us to see in real time the evolution charts of the most important aspects of our application. JavaMelody includes summary charts showing the evolution over time of the following indicators:

• Number of executions, mean execution times and percentage of errors of HTTP requests, SQL requests, JSP pages or methods of business façades (if EJB3, Spring or Guice)
• Java memory
• Java CPU
• Number of user sessions
• Number of JDBC connections

These charts can be viewed on the current day, week, month, year or custom period.

Detailed information about JavaMelody is available at https://code.google.com/p/javamelody/

There is also a blog post that explains how to set up JavaMelody for Alfresco: https://www.alfresco.com/blogs/lcabaceira/2015/11/11/monitoring-alfresco-with-javamelody/

For a video course walkthrough of this paper, check out Alfresco Unplugged: Alfresco Effective Testing at https://university.alfresco.com/course/alfresco-unplugged/alfresco-effective-testing-aet
