Post on 07-May-2022
Application Monitoring Best Practices
o Scripts
o Script objectives – Monitoring scripts need not be complex to achieve their goals. When a script
becomes too complex, the risk of script failures is greater. An effective script primarily needs to
answer the following questions.
o What is the performance of the application?
o What is the availability of the application?
Additionally, a successful monitoring script needs to be tuned to minimize alerts that are
not actual failures, therefore assuring that alerts sent to recipients are useful.
o Transactions – A transaction should be a test of the primary functionality
within the application. Transactions should measure performance time
and availability by validating an expected checkpoint. They should always use
visual analysis as a checkpoint, not native objects.
o Timing transactions - It is important that a transaction is placed in its own group,
directly after the click that initiates the measured action. If an additional step occurs
between the click and the transaction checkpoint, the timers will not be accurate and may
show a value of zero. Users can think of the transaction as a stopwatch: the checkpoint
stops the watch when it succeeds.
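The stopwatch idea can be sketched in Python. This is a minimal illustration under stated assumptions, not Perfecto's implementation: `timed_transaction`, `click`, and `checkpoint` are hypothetical names, and the checkpoint is assumed to be a callable that returns True once the expected text is visible.

```python
import time


def timed_transaction(click, checkpoint, timeout_s=30.0, poll_s=0.5,
                      clock=time.monotonic, sleep=time.sleep):
    """Perform the click, then time how long the checkpoint takes to succeed.

    Returns (succeeded, elapsed_seconds). Nothing runs between the click
    and the first checkpoint attempt, so the elapsed time reflects only
    the measured action.
    """
    click()                      # the action that initiates the measurement
    start = clock()              # stopwatch starts directly after the click
    deadline = start + timeout_s
    while True:
        if checkpoint():         # a successful checkpoint stops the watch
            return True, clock() - start
        if clock() >= deadline:  # give up once the timeout has passed
            return False, clock() - start
        sleep(poll_s)
```

Passing `clock` and `sleep` as parameters is only a convenience so the timing logic can be exercised without a real device.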
Part 1: Basic Monitoring Tenets
o Proper Script Flow
Script Stability Achieved by Checkpoints – Whether or not a checkpoint is used as a
transaction, it is crucial to perform text checkpoints throughout the script to be sure
the actions are unfolding as they should. Anytime that the screen changes, a
checkpoint should be used to be sure the device is ready for the next action.
Timeouts can be set on these checkpoints to values appropriate to the specific
action. Wait commands should not be used for this purpose as a wait command
does not check the status of the screen.
Unique Text Checkpoints - Checkpoints should consist of new text so they can prove
that the screen has changed. In other words, the checkpoint should look for text that
is not shown on the previous screen.
Visual vs Native Objects - Checkpoints should always be completed using visual
analysis, not native objects. The idea of a checkpoint is to assess what is visible on a
screen. Native objects are not always visible and the presence of a native object
does not guarantee that the page has correctly rendered. Non-checkpoint actions,
such as page navigation, should employ native objects when possible as this
improves script stability.
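The "Unique Text Checkpoints" guideline can be illustrated with a small Python sketch. It assumes the visible text of the previous and next screens is available as plain strings; the function name and the sample screen text are hypothetical and not part of any Perfecto API.

```python
def unique_checkpoint_candidates(previous_screen, next_screen):
    """Return words shown on next_screen that did not appear on previous_screen.

    Only these words can prove that the screen actually changed, so they
    are the safe choices for a text checkpoint.
    """
    previous_words = {word.lower() for word in previous_screen.split()}
    seen = set()
    candidates = []
    for word in next_screen.split():
        key = word.lower()
        if key not in previous_words and key not in seen:
            seen.add(key)
            candidates.append(word)
    return candidates
```

For example, if the previous screen shows "Indycar Home Schedule" and the next shows "Indycar Social Retweets", the application name is excluded because it appears on both screens.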
o Script reusability across devices
User Functions – User functions allow the scripter to write many subscripts that are
compatible with various devices. This allows the user to simply call a user function
from a parent script. Ultimately, this approach allows a single script to be used for
many devices which means that maintenance is easier and scripts look cleaner.
Text Checkpoints are Common – Script reusability works best when leveraging text
checkpoints, which tend to be common across devices and operating systems.
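A device-specific user function called from one parent script might be organized as in the Python sketch below. It is an assumption-laden illustration: the registry, the `perform` callback, and the individual navigation steps are all hypothetical; only the user function name NavigateToSocial comes from this document.

```python
def _android_navigate_to_social(perform):
    # Hypothetical steps for an Android device.
    perform("tap menu button")
    perform("tap 'Social'")


def _ios_navigate_to_social(perform):
    # Hypothetical steps for an iOS device.
    perform("tap 'More' tab")
    perform("tap 'Social'")


# One entry per user function, one implementation per device family.
USER_FUNCTIONS = {
    "NavigateToSocial": {
        "android": _android_navigate_to_social,
        "ios": _ios_navigate_to_social,
    },
}


def call_user_function(name, device_os, perform):
    """Run the implementation of a user function that matches the device OS."""
    try:
        implementation = USER_FUNCTIONS[name][device_os.lower()]
    except KeyError:
        raise LookupError(f"no {name!r} implementation for {device_os!r}") from None
    implementation(perform)
```

The parent script only ever calls `call_user_function("NavigateToSocial", ...)`, so supporting a new device means adding one subscript rather than editing every script.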
o Devices
Redundant Pairs – A device that runs 24 hours a day, constantly performing
application transactions, sending and receiving communications, and playing video,
is likely to accumulate an unwieldy cache and consume memory. This can impact
device readiness to execute the monitoring script and lead to failures.
Also, a script can be interrupted by an incoming SMS, an OS update notification, an
AMBER Alert, or a similar event on a specific device.
A proven way to handle these scenarios and reduce false alerts is to provide device
redundancy. Perfecto monitoring uses pairs of devices, set up identically, and a
script that automatically tries again upon failure.
Device pairs are defined using the Description field. The monitoring tool will call the
device using that field and the monitoring system script will run the script on either
one or both of the devices as needed.
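The retry-on-the-paired-device behavior can be sketched in Python as follows. This is not how Perfecto or the monitoring tool implements it; `run_on_pair` and the device names are hypothetical, and the script is assumed to be a callable that returns True on success.

```python
def run_on_pair(script, devices):
    """Run the script on each device of a redundant pair until one succeeds.

    Returns (status, attempts), where attempts lists (device, succeeded)
    in the order the devices were tried. An alert is warranted only when
    the status is "error", meaning every device in the pair failed.
    """
    attempts = []
    for device in devices:
        succeeded = script(device)
        attempts.append((device, succeeded))
        if succeeded:
            return "success", attempts
    return "error", attempts
```

With a pair such as `["primary", "redundant"]`, a failure that occurs only on the first device is absorbed by the second run instead of raising a false alert.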
Self-healing monitoring environment - Completing regular power cycles on devices
will help devices to be more stable. It is a good practice to have a script that runs
once a day to restart devices and their cradles. Many application failures will be
related to a device that needs to be refreshed rather than an application that is
malfunctioning. Therefore it is important to dynamically reboot the device when an
application fails to load. Perfecto allows us to create subscripts. A maintenance
subscript that automatically restarts the device and its cradle can be added to a
monitoring script.
A conditional statement will tell the script when the device has failed and
automatically run the maintenance script.
A maintenance script should be set to ‘Async’ so that the monitoring script will continue
while the maintenance script runs.
Immediately after the maintenance script is called, the monitoring script should exit
with the status ‘error’. This way, the device will have been restarted by the next time
a script calls it.
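The conditional, asynchronous maintenance call followed by an immediate exit can be sketched in Python. A background thread stands in for the ‘Async’ setting here; the function names and the `maintenance` callable are hypothetical, not Perfecto commands.

```python
import threading


def handle_launch_result(app_started, maintenance, device):
    """Decide what the monitoring script does after the launch checkpoint.

    If the application started, the script continues. Otherwise the
    maintenance routine is started in the background and the script
    exits with status "error" without waiting for the restart.
    Returns (status, worker), where worker is the background thread or None.
    """
    if app_started:
        return "continue", None
    worker = threading.Thread(target=maintenance, args=(device,))
    worker.start()          # asynchronous: do not wait for the restart
    return "error", worker  # exit right away so the failure is reported
```

Returning the worker thread is only so a caller or test can wait for the maintenance to finish; the monitoring script itself does not need it.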
Handling Common Errors – It is a good practice to include subscripts that will handle
common device, application or OS related issues. Good examples are dismissing popups
that can cause scripts to fail, or turning off Wi-Fi if the test is meant for wireless
data testing.
o Scheduling
When monitoring an application, it is best to run the script as many times as
possible, thus giving it more chances to detect a failure. However, since we
know that overstressing devices may cause false alarms, the schedule must also allow
the script enough time to run on two devices.
Ideally a script should run in less than 4 minutes. Some scripts can take up to five or
even six minutes to run, depending on the complexity of the script.
HP BSM and the VuGen templates that Perfecto uses allow us to run scripts
concurrently. This allows us greater scheduling flexibility, allowing for “rest time” on
the devices before the next script runs. There is also less of a risk that a device will
still be in use when the next script is scheduled to run.
AlertSite does not currently support running devices concurrently so the following
type of schedule is recommended.
In the case of AlertSite, where scripts need to be run one after the other, it is best to
allow 15 minutes per scheduled run. A good monitoring framework should always
employ a well-maintained script device schedule.
Step                                                                            Time allowed
Run script on primary device                                                    Up to 6 minutes
Run script on redundant device and run maintenance script on the first device   Up to 6 minutes
Allow time for monitoring tool overhead                                         Up to 3 minutes
Total                                                                           15 minutes
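The arithmetic behind the 15-minute slot can be checked with a short Python sketch. The durations are the ones listed in the schedule; the names and the runs-per-day calculation are illustrative additions and assume each scheduled run uses its full slot.

```python
# Worst-case minutes per step, taken from the schedule.
SLOT_BUDGET_MINUTES = {
    "run script on primary device": 6,
    "run script on redundant device + maintenance on first device": 6,
    "monitoring tool overhead": 3,
}


def slot_minutes(budget=SLOT_BUDGET_MINUTES):
    """Total minutes to reserve for one scheduled run."""
    return sum(budget.values())


def max_runs_per_day(slot):
    """How many back-to-back slots of this length fit in 24 hours."""
    return (24 * 60) // slot


print(slot_minutes())                    # 15
print(max_runs_per_day(slot_minutes()))  # 96
```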
2. Types of Errors
Device – Device related errors occur because of a problem with the device itself.
Perhaps the device has crashed and is in need of a power cycle and recovery. Perhaps
there is a persistent popup that needs to be cleared outside of the script. These
errors need to be triaged and corrected as quickly as possible. Most can be
corrected through the cloud interface, but some require hands-on attention at the
data center.
Monitoring – These are errors related to the monitoring software or scheduling and
also should be remedied as quickly as possible. If a device cannot be called by the
script because it is already in use, the scheduling should be checked. If there is a
technical issue with the monitoring software, a case should be opened immediately
with Perfecto Support.
Real Errors – The goal of monitoring is finding issues related to a failure of the
service being tested. These should be tracked and reported immediately.
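A simple triage rule for these three error types could look like the Python sketch below. The input signals and the order of the checks are assumptions made for illustration; the document does not prescribe how a failure is detected, only how each type should be handled.

```python
from enum import Enum


class ErrorType(Enum):
    DEVICE = "device"          # fix the device (cloud interface or data center)
    MONITORING = "monitoring"  # check scheduling or open a support case
    REAL = "real"              # track and report the service failure


def classify_failure(monitoring_tool_ok, device_was_available, device_responsive):
    """Classify a failed run using three hypothetical yes/no signals."""
    if not monitoring_tool_ok or not device_was_available:
        # The tool failed, or the device was still in use by another run.
        return ErrorType.MONITORING
    if not device_responsive:
        # The device itself crashed or is blocked, for example by a popup.
        return ErrorType.DEVICE
    # Tool and device are healthy, so the application or service failed.
    return ErrorType.REAL
```

Ruling out monitoring and device problems first keeps "real" errors meaningful, since those are the ones reported to alert recipients.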
This tutorial will use a script that tests the Verizon Wireless Indycar application.
Transactions:
Check Indycar Started
Social Open
Tweets Loaded
The Script:
1. Prepare device
a. Open Device – This function is found in the devices category. It can be set for any device
assigned to a device variable.
b. Prepare Device – The Open Device command allocates the device to the user, and the
Home command ensures that the device is on the home screen.
2. Prepare applications
a. Close Application – Make certain that the application is closed at the beginning of the script
run to ensure consistent results when the application is launched.
b. Launch Application - Use the native Start Application function. In order for some
applications to function as expected, it is best to first completely close the application on
the device. In the Start Application function parameters, enable the timeout and set it to 0.
This will force the script to move directly to the checkpoint and more accurately measure the
time to open the application.
Part 2: Writing a Monitoring Script
3. Transaction 1 – Check Indycar Started
Groups can be defined as transactions in the Parameters tab,
accessed by double-clicking the group header.
a. Text Checkpoint – Use a visual text checkpoint to ensure that the application has
properly opened. Be sure that the text you search for is not displayed on the device home
screen, as this may result in a false positive.
b. Similarly, do not choose text that appears on the application splash screen
since apps can get “stuck” on a splash screen. The application name is not a
good text checkpoint for these reasons.
c. Maintenance Script on Failure – The Start application checkpoint should
always be followed by a conditional statement. If the checkpoint fails, the
script should move directly to a subscript that reboots and recovers the
device.
d. This maintenance script should be run in the “Async” mode so that the script will
proceed to the next step while the device restarts. That next step should be an exit
function with status set to Error. By following this protocol, the device which failed to
start the application will be in a better state when the next script is executed.
4. Perform test actions
a. Whatever actions are needed to complete the measured transaction should
happen at this time. These script actions should be completed using native
objects via XPath when possible. Performance using XPath is far better than
performance using visual objects.
b. In the case of the Indycar-Social script, the test actions consist of navigating to the
“Social” page so it is possible to test the appearance of page elements. For this, we employ
a user function, NavigateToSocial. The steps within the subscript are listed below.
c. Now that we have accessed the Social page in the Indycar app, we can
validate 2 checkpoints.
5. Transaction 2 – Social Open, and Transaction 3 – Tweets Loaded
a. As mentioned earlier, checkpoints for transactions within a monitoring script
should be done via visual analysis of elements that can be seen on the screen.
The following two checkpoints provide two validation points. The first
verifies that we have opened the Social page, and the second verifies that
the tweets are loading.
b. The only element on the page that appears with or without a
data connection is the Twitter icon.
c. Once we have seen that, we can look for the word “Retweets”, which only
shows if a tweet has been loaded on the page and appears regardless of the
content of that tweet.
d. Grid label – In the checkpoint configuration, add a grid label with a meaningful
name so that your script report will display errors on the grid tab. This makes
error triage fast and easy.
6. End
a. Close apps – Again, use the Close application function when possible.
b. Home
c. Close device - Always close the device at the end of each script to avoid device
allocation errors.
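The overall flow of the script, from opening the device to closing it, can be summarized in a Python sketch. Everything here is a stand-in: `FakeDevice` only records command names and is not a Perfecto API, and `start_text` is a hypothetical parameter because the document does not name the text used for the first checkpoint.

```python
class FakeDevice:
    """Stand-in device that logs commands and answers checkpoints from a dict."""

    def __init__(self, visible):
        self.visible = visible  # maps checkpoint target -> True/False
        self.log = []

    def command(self, name):
        self.log.append(name)

    def sees(self, target):
        self.log.append(f"checkpoint: {target}")
        return self.visible.get(target, False)


def run_indycar_social(device, start_text):
    """Follow steps 1 to 6 and return "success" or "error"."""
    device.command("open device")                 # 1. prepare device
    try:
        device.command("home")
        device.command("close application")       # 2. prepare applications
        device.command("start application")
        if not device.sees(start_text):           # 3. Check Indycar Started
            device.command("maintenance (async)")
            return "error"
        device.command("NavigateToSocial")        # 4. perform test actions
        if not device.sees("Twitter icon"):       # 5. Social Open
            return "error"
        if not device.sees("Retweets"):           # 5. Tweets Loaded
            return "error"
        device.command("close application")       # 6. end
        device.command("home")
        return "success"
    finally:
        device.command("close device")            # always release the device
```

Placing the close-device step in `finally` reflects the last guideline: the device is released on every exit path, which is one way to avoid allocation errors on the next run.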