Chapter 6: Performance tracking and usage tracking in...


Chapter 6: Performance tracking and usage tracking in Cloud Infrastructure

Topics covered:

6.1 Objective of performance tracking and usage tracking in Cloud Infrastructure

6.2 Methods employed in the Cloud to track performance and usage

6.3 Critical resource that will degrade performance

6.4 Escalation process and workflow to aid resource planning

6.5 Benefits of resource and performance tracking

6.6 Managing vCloud Director


Objective of performance tracking and usage tracking in Cloud Infrastructure

Cloud computing deployments must be monitored and managed in order to be optimized for best performance. Cloud management software provides capabilities for managing faults, configuration, accounting, performance, and security; these five areas are referred to as FCAPS.

Many products address one or more of these areas, and network management systems are often described in terms of the acronym FCAPS. Most network management packages cover one or more of the five areas, but no single package provides all of them. To get the complete set of all five management areas from a single vendor, you would need to adopt a network management framework. These large network management frameworks were industry leaders several years back: BMC PATROL, CA Unicenter, IBM Tivoli, HP OpenView, and Microsoft System Center. These five vendors have (or soon will have) products for cloud management, and their framework products are being repositioned to work with cloud systems.

Efforts are underway to develop cloud management interoperability standards. One effort is the DMTF's (Distributed Management Task Force) Open Cloud Standards Incubator, which aims to develop management tools that work with any cloud type. Another group, the Cloud Commons, is developing a technology called the Service Measurement Index (SMI), which aims to provide methods for measuring various aspects of cloud performance in a standard way.

What separates a network management package from a cloud computing management package is the “cloudly” characteristics that a cloud management service must have:

• Billing is on a pay-as-you-go basis.
• The management service is extremely scalable.
• The management service is ubiquitous (it can be found everywhere).
• Communication between the cloud and other systems uses cloud networking standards.


To monitor an entire cloud computing deployment stack, you monitor six different categories:

1. End-user services such as HTTP, TCP, POP3/SMTP, and others
2. Browser performance on the client
3. Application monitoring in the cloud, such as Apache, MySQL, and so on
4. Cloud infrastructure monitoring of services
5. Machine instance monitoring, where the service measures processor utilization, memory usage, disk consumption, queue lengths, and other important parameters
6. Network monitoring and discovery using standard protocols and technologies such as the Simple Network Management Protocol (SNMP), Configuration Management Database (CMDB), and Windows Management Instrumentation (WMI)

Cloud services have a defined lifecycle. There are six stages in the lifecycle. A performance and tracking program has to touch on each of the six different stages to provide useful utilization data within the Cloud Infrastructure.

Methods employed in the Cloud to track performance and usage

The core management features offered by most cloud management service products include the following:

• Support of different cloud types
• Creation and provisioning of different types of cloud resources, such as machine instances, storage, or staged applications
• Performance reporting, including availability and uptime, response time, resource quota usage, and other characteristics
• The creation of dashboards that can be customized for a particular client's needs

All of the service models support monitoring solutions, most often through interaction with the service API. Tapping into a service API allows management software to perform command actions that a user would normally perform. Some of these APIs are themselves scriptable. In some cases, scripting is supported in the management software.
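As a rough illustration of the performance reporting feature listed above, the following Python sketch derives availability and average response time from a series of periodic checks. The Check record and the sample values are hypothetical; a real management product would obtain such data through the provider's service API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Check:
    """One periodic probe of a service (hypothetical record layout)."""
    ok: bool                        # True if the service answered
    response_ms: Optional[float]    # response time, None if the probe failed


def availability_percent(checks: List[Check]) -> float:
    """Share of probes that succeeded, as a percentage."""
    if not checks:
        raise ValueError("no checks recorded")
    succeeded = sum(1 for c in checks if c.ok)
    return 100.0 * succeeded / len(checks)


def mean_response_ms(checks: List[Check]) -> Optional[float]:
    """Average response time over successful probes only."""
    times = [c.response_ms for c in checks if c.ok and c.response_ms is not None]
    return sum(times) / len(times) if times else None


# Illustrative data: four probes, one of which failed.
checks = [Check(True, 120.0), Check(True, 80.0), Check(False, None), Check(True, 100.0)]
report = {
    "availability_percent": availability_percent(checks),
    "mean_response_ms": mean_response_ms(checks),
}
```

Failed probes are excluded from the response-time average so that an outage lowers availability without distorting the latency figure.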


Example of cloud management software

Emerging Cloud Management Standards

Different cloud service providers use different technologies for creating and managing cloud resources. A number of large industry players such as VMware, IBM, Microsoft, Citrix, and HP have come together to create standards that can be used to promote cloud interoperability.

DMTF cloud management standards

The Distributed Management Task Force (DMTF) is an industry organization that develops system management standards for platform interoperability. The group has been responsible for the Common Information Model (CIM) standard. A recent standard called the Virtualization Management Initiative (VMAN) was developed to extend CIM to virtual computer system management. VMAN has resulted in the creation of the Open Virtualization Format (OVF), which describes a standard method for creating, packaging, and provisioning virtual appliances. OVF was announced in 2009. Vendors such as VirtualBox, AbiCloud, IBM, Red Hat, and VMware have announced or introduced products that use OVF.

DMTF has created a working group called the Open Cloud Standards Incubator (OCSI) to help develop interoperability standards for managing interactions between and within public, private, and hybrid cloud systems. The group is focused on describing resource management and security protocols, packaging methods, and network management technologies.


Cloud Commons and SMI

CA Technologies (http://www.ca.com), the company once known as Computer Associates, has repositioned its products as the following:

• CA Cloud Insight, a cloud metrics measurement service
• CA Cloud Compose, a deployment service
• CA Cloud Optimize, a cloud optimization service
• CA Cloud Orchestrate, a workflow control and policy-based automation service

Taken together, these products form the basis for CA's Cloud Connected Management Suite (http://www.ca.com/us/cloud-solutions.aspx). At the heart of CA Cloud Insight is a method for measuring different cloud metrics to create a Service Measurement Index, or SMI. The SMI measures things like SLA compliance, cost, and other values and rolls them up into a score.

Example of an SLA measurement and compliance tool:

http://h30499.www3.hp.com/t5/Business-Service-Management-BAC/Are-cloud-services-affecting-application-performance-7-questions/ba-p/6344619#.U8I1UvmSxCA

CA funds an industry online community called the Cloud Commons, which has built a dashboard called the CloudSensor that monitors the performance of the major cloud-based services in real time.


Example of a dashboard shown below

This tool measures the performance of the following:

• RackSpace file creation and deletion
• E-mail availability (system uptime) based on Google Gmail, Windows Live Hotmail, and Yahoo! Mail
• Amazon Web Services server creation/destruction times at four AWS sites
• Dashboard response times for the consoles of Amazon AWS, Google App Status, RackSpace Cloud, and Salesforce
• Windows Azure storage benchmarks
• Windows Azure SQL benchmarks

It is meant to demonstrate the value of cloud performance measurements. These metrics are based on real-time data derived from real transactions. Each chart shows the last two hours of activity.

The Service Measurement Index (SMI) is based on a set of measurement technologies forming the SMI Framework. It measures cloud-based services in six areas:

• Agility
• Capability
• Cost
• Quality
• Risk
• Security
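The text says that SMI measurements are rolled up into a score but does not say how. Purely as an illustration of what a roll-up across the six areas could look like, the Python sketch below takes a weighted average of per-area scores. The 0 to 100 scale, the weights, and the sample scores are all assumptions, and this should not be read as the actual SMI calculation.

```python
from typing import Dict

# The six SMI areas named in the text.
SMI_AREAS = ("agility", "capability", "cost", "quality", "risk", "security")


def rollup_score(area_scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-area scores (assumed 0-100 scale).

    The weighting scheme is an assumption for illustration, not the SMI method.
    """
    missing = [a for a in SMI_AREAS if a not in area_scores or a not in weights]
    if missing:
        raise ValueError(f"missing areas: {missing}")
    total_weight = sum(weights[a] for a in SMI_AREAS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(area_scores[a] * weights[a] for a in SMI_AREAS) / total_weight


# Hypothetical scores for one service, with equal weights.
scores = {"agility": 80, "capability": 90, "cost": 60,
          "quality": 70, "risk": 50, "security": 100}
equal_weights = {area: 1.0 for area in SMI_AREAS}
overall = rollup_score(scores, equal_weights)
```

Changing the weights lets a customer who cares more about, say, security than cost see a different ranking of the same services.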


Cloud management is an important and growing area of technology. Among the management tasks are deployment, monitoring, configuration, optimization, and security.

Nearly all network management software vendors are repositioning their products to work with cloud systems. Some network management is available from within the cloud service providers' platforms. Many of the software systems utilize the service provider's API to manage, monitor, and control resources. The use of virtualization has resulted in many new products in this area.

Critical resource that will degrade performance

Capacity planning is an iterative process with the following steps:

1. Determine the characteristics of the present system.

2. Measure the workload for the different resources in the system: CPU, RAM, disk, network, and so forth.

3. Load the system until it is overloaded to determine when it breaks, and specify what is required to maintain acceptable performance. Knowing when systems fail under load and what factor(s) is responsible for the failure is the critical step in capacity planning.

4. Predict the future based on historical trends and other factors.

Defining a performance baseline

Capacity planning needs some data to compare against. The first task is to determine the current system capacity or workload as a measurable quantity over time (the baseline). Many developers create cloud-based applications and Web sites based on a LAMP solution stack. LAMP stands for:

• Linux, the operating system
• Apache HTTP Server (http://httpd.apache.org/), the Web server based on the work of the Apache Software Foundation
• MySQL (http://www.mysql.com), the database server developed by the Swedish company MySQL AB, now owned by Oracle Corporation through its acquisition of Sun Microsystems
• PHP (http://www.php.net/), the Hypertext Preprocessor scripting language developed by The PHP Group


Baseline measurements

Let's assume that a capacity planner is working with a system that has a Web site based on Apache, and that the site is processing database transactions using MySQL.

There are two important overall workload metrics in this LAMP system:

• Page views or hits on the Web site, as measured in hits per second
• Transactions completed on the database server, as measured by transactions per second or perhaps by queries per second
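To make the baseline and prediction steps more concrete, here is a minimal Python sketch that summarises samples of a workload metric (such as hits per second) into a baseline and fits a straight line to them to project future demand. The sample values are hypothetical, and the linear trend is an assumption; real demand may be seasonal or grow non-linearly.

```python
from statistics import mean
from typing import List, Tuple


def baseline(samples: List[float]) -> Tuple[float, float]:
    """Return (average, peak) of a workload metric such as hits per second."""
    if not samples:
        raise ValueError("no samples")
    return mean(samples), max(samples)


def linear_trend(values: List[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for values observed at t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    covariance = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    return slope, v_mean - slope * t_mean


def forecast(values: List[float], periods_ahead: int) -> float:
    """Project the metric periods_ahead steps past the last observation."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


# Hypothetical daily averages of hits per second over five days.
daily_hits_per_second = [100.0, 110.0, 120.0, 130.0, 140.0]
average, peak = baseline(daily_hits_per_second)
projected = forecast(daily_hits_per_second, periods_ahead=5)
```

With these sample values the metric grows by 10 hits per second per day, so the projection five days past the last observation is 190 hits per second.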

The goal of a capacity planning exercise is to accommodate:

• spikes in demand
• overall growth of demand over time

Of these two factors, the growth in demand over time is the more important consideration because it represents the ability of a business to grow. A spike in demand may or may not be important enough to justify trying to capture the full demand that the spike represents.

Resources that affect performance

System resources:

• CPU
• Memory (RAM)
• Disk
• Network connectivity
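As one hedged example of measuring the first resource in the list, the Python sketch below computes CPU utilization on Linux from two snapshots of the aggregate "cpu" line in /proc/stat. The snapshot strings are made-up sample values, and the choice to count iowait as idle time is an assumption that some tools make differently.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Parse the aggregate 'cpu' line of /proc/stat into its counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected a line starting with 'cpu'")
    return [int(value) for value in fields[1:]]


def cpu_utilization_percent(before: str, after: str) -> float:
    """Percentage of non-idle CPU time between two /proc/stat snapshots.

    Counters are, in order: user, nice, system, idle, iowait, irq, softirq,
    steal, guest, guest_nice. Guest time is already included in user and nice,
    so only the first eight counters are summed.
    """
    first = parse_cpu_line(before)[:8]
    second = parse_cpu_line(after)[:8]
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("snapshots must be taken at different times")
    # Treat iowait as idle time (an assumption; conventions vary).
    idle = deltas[3] + (deltas[4] if len(deltas) > 4 else 0)
    return 100.0 * (total - idle) / total


# Made-up snapshots taken a short interval apart.
snapshot_1 = "cpu  1000 0 500 8000 500 0 0 0 0 0"
snapshot_2 = "cpu  1300 0 600 8500 600 0 0 0 0 0"
busy_percent = cpu_utilization_percent(snapshot_1, snapshot_2)
```

On a live Linux system the two snapshots would be the first line of /proc/stat read a few seconds apart; logging the result at regular intervals gives the kind of performance log the baseline is built from.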

In Linux/UNIX, you might use the sar command to display the level of CPU activity. In Windows, CPU activity may be viewed using the Task Manager. The data can be dumped to a performance log and/or displayed in a graph.

Network I/O is often a bottleneck in Web servers. To reduce the bottleneck, Web sites prefer to scale out using many low-powered servers instead of scaling up with fewer but more powerful servers.

Network capacity: A cloud-computing system resource that is difficult to plan for is network capacity. There are three aspects of assessing network capacity:

• Network traffic to and from the network interface at the server, be it a physical or virtual interface or server
• Network traffic from the cloud to the network interface
• Network traffic from the cloud through your ISP to your local network interface (your computer)


Microsoft includes a utility called the Microsoft Network Monitor as part of its server utilities. There are many third-party products. The site Sectools.org has a list of packet sniffers at http://sectools.org/sniffers.html. Here are some:

• Wireshark (http://www.wireshark.org/), formerly called Ethereal
• Kismet (http://www.kismetwireless.net/), a Wi-Fi sniffer
• TCPdump (http://www.tcpdump.org/)
• Dsniff (http://www.monkey.org/~dugsong/dsniff/)
• Ntop (http://www.ntop.org/)
• EtherApe (http://etherape.sourceforge.net/)

The statistics functions of these tools provide measurements for planning your resources. You can analyse the data in a number of ways, including by:

• specific applications used
• network protocols
• traffic by system or user
• the content of the individual packets crossing the wire
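The kinds of breakdown listed above can be sketched in a few lines of Python. The PacketRecord layout and the sample capture below are hypothetical and do not correspond to the export format of any particular sniffer; the point is only to show traffic being totalled by protocol and by host.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class PacketRecord(NamedTuple):
    """Hypothetical summary of one captured packet."""
    source: str      # sending host
    protocol: str    # for example "HTTP" or "DNS"
    size_bytes: int


def bytes_by(records: Iterable[PacketRecord], field: str) -> Dict[str, int]:
    """Total bytes grouped by the named record field ('protocol' or 'source')."""
    totals: Dict[str, int] = defaultdict(int)
    for record in records:
        totals[getattr(record, field)] += record.size_bytes
    return dict(totals)


# Illustrative capture of three packets.
capture = [
    PacketRecord("10.0.0.5", "HTTP", 1500),
    PacketRecord("10.0.0.5", "DNS", 80),
    PacketRecord("10.0.0.9", "HTTP", 700),
]
per_protocol = bytes_by(capture, "protocol")
per_host = bytes_by(capture, "source")
```

Totals like these, collected over time, show which protocols and systems drive network demand and therefore where extra capacity is likely to be needed.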

Escalation process and workflow to aid resource planning

Escalation processes can help to improve resource planning and reduce error in Cloud deployment. An escalation plan is a set of procedures set in place to deal with potential problems in a variety of contexts.

In resource planning, the escalation plan normally handles a shortage of resources (CPU, memory, and storage capacity) in a datacentre, triggered by a report from a system log or by the workflow process.

Upon escalation, key personnel with the necessary authority act to ensure the resource shortage is dealt with as soon as possible. This minimises or resolves potential problems that would affect the services offered to customers.
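A simple way to picture such a plan is as a ladder of utilisation thresholds, each mapped to the person or team who must act. The Python sketch below is only an illustration: the thresholds, the contact names, and the sample readings are all hypothetical and would be set by each organisation's own escalation procedures.

```python
from typing import Dict, List, Tuple

# Hypothetical escalation ladder: (utilisation threshold in percent, who is notified).
# Ordered from most to least severe so that the first match wins.
ESCALATION_LEVELS: List[Tuple[float, str]] = [
    (95.0, "datacentre manager"),
    (85.0, "capacity planning team"),
    (75.0, "on-call administrator"),
]


def escalate(used_percent: float) -> str:
    """Return who should be notified for the given utilisation, or 'none'."""
    for threshold, contact in ESCALATION_LEVELS:
        if used_percent >= threshold:
            return contact
    return "none"


def review(utilisation: Dict[str, float]) -> Dict[str, str]:
    """Map each resource to the contact its current utilisation escalates to."""
    return {name: escalate(used) for name, used in utilisation.items()}


# Illustrative readings, as might come from a system log or a workflow trigger.
actions = review({"cpu": 60.0, "memory": 88.0, "storage": 97.0})
```

Keeping the ladder in one table makes it easy to review and adjust the thresholds as the datacentre's capacity changes.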

Example: Cloud Life cycle


Benefits of an escalation plan:

• Reduced service outages
• Faster resolution time
• The datacentre continues to function
• Increased customer confidence and satisfaction

Benefits of resource and performance tracking:

Continuous system performance monitoring can do the following:

• Sometimes detect underlying problems before they have an adverse effect
• Detect problems that affect a user's productivity
• Collect data when a problem occurs for the first time
• Allow you to establish a baseline for comparison

Successful monitoring involves the following:

• Periodically obtaining performance-related information from the operating system
• Storing the information for future use in problem diagnosis
• Displaying the information for the benefit of the system administrator
• Detecting situations that require additional data collection or responding to directions from the system administrator to collect such data, or both
• Collecting and storing the necessary detail data
• Tracking changes made to the system and applications
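Several of these activities (storing samples, establishing a baseline, and detecting situations that call for more data) can be sketched together. The Python class below is a minimal illustration under assumed parameters: the baseline is the mean of the first few samples, and a sample more than 1.5 times the baseline is flagged. Both choices are hypothetical, not a recommended policy.

```python
from collections import deque
from statistics import mean
from typing import Deque, Optional


class MetricMonitor:
    """Keeps recent samples of one metric and flags departures from a baseline."""

    def __init__(self, baseline_size: int = 5, history_size: int = 100,
                 tolerance: float = 1.5) -> None:
        if history_size < baseline_size:
            raise ValueError("history_size must be at least baseline_size")
        self.baseline_size = baseline_size    # samples used to fix the baseline
        self.tolerance = tolerance            # allowed multiple of the baseline
        self.history: Deque[float] = deque(maxlen=history_size)
        self.baseline: Optional[float] = None

    def record(self, value: float) -> bool:
        """Store a sample; return True if more detailed data should be collected."""
        self.history.append(value)
        if self.baseline is None:
            # Still learning: fix the baseline once enough samples are stored.
            if len(self.history) >= self.baseline_size:
                self.baseline = mean(self.history)
            return False
        return value > self.baseline * self.tolerance


# Illustrative CPU utilisation samples; the last one is well above the baseline.
monitor = MetricMonitor(baseline_size=3)
flags = [monitor.record(v) for v in [40.0, 50.0, 60.0, 55.0, 90.0]]
```

The bounded history keeps storage predictable, while the returned flag is the hook where a real system would start collecting the detailed data mentioned in the list.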


Managing vCloud Director

vCD Logs

Administrators must know where log files are stored and check them regularly. That way, if VMware Support asks you to send the log files or look into them, you’ll know where they are.

For vCloud Director, if you connect to the console of the vCloud Director virtual machine, you can go to:

/opt/vmware/vcloud-director/logs

to view the log files.

http://www.petri.co.il/wp-content/uploads/17-vcd-log-files.png

Let’s take a look at the contents of one of these logs. The command # cat cell.log produces the output shown in the following figure:

http://www.petri.co.il/wp-content/uploads/18-cat-cell_log.png
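Checking logs regularly can be partly scripted. The Python sketch below filters a log for lines containing a keyword. The directory and file name come from the text above, but the "ERROR" marker is an assumption about how problems are labelled in the log and should be adjusted to what the file actually contains.

```python
from pathlib import Path
from typing import Iterable, List

# Log directory given in the text; cell.log is the file shown there.
LOG_DIR = Path("/opt/vmware/vcloud-director/logs")


def matching_lines(lines: Iterable[str], keyword: str = "ERROR") -> List[str]:
    """Return the lines containing keyword (the 'ERROR' marker is an assumption)."""
    return [line.rstrip("\n") for line in lines if keyword in line]


def scan_log(path: Path, keyword: str = "ERROR") -> List[str]:
    """Read a log file and return its matching lines."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return matching_lines(handle, keyword)
```

On the vCloud Director virtual machine, calling scan_log(LOG_DIR / "cell.log") would return the matching lines, which could then be reviewed or attached to a support request.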
