
Towards Autonomic Hosting of Multi-tier Internet Services

Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen

Vrije Universiteit, Amsterdam, The Netherlands.

Hosting Large-Scale Internet Services

Large-scale e-commerce enterprises use complex software systems.
Sites are built of numerous applications called services.
A request to amazon.com leads to requests to hundreds of services [Vogels, ACM Queue, 2006].

Each site has an SLA (latency and availability targets).
Global optimization-based hosting is intractable.
Convert the global SLA into per-service SLAs.
Host each service scalably.

Problem in focus: efficient hosting of an Internet service.

Web Services: Background

Services are multi-tiered applications.
They perform business logic on data from their own data store and from other services.
E.g., shopping cart service, recommender service, page generator.

Exposed and restricted through well-defined interfaces.
Usually accessible over the network.
Do not allow direct access to their internal database.

[Figure: an application server (e.g., JBoss, Tomcat/Axis, WebSphere) backed by a DB (e.g., DB2, Oracle, MySQL) and connected to Service X and Service Y. Labeled flows: service request (XML), DB queries, service request, service response, DB response, service response (XML).]

Scalability techniques applied to service hosting

[Figure: multiple application servers, with the DB, Service X, and Service Y.]

Useful for compute-intensive services (e.g., page generators).

[Figure: response caches added to the application server, DB, Service X, and Service Y setup.]

Cache service responses.
Reduces load on the application (if the hit ratio is good).

[Figure: DB caches added to the application server, DB, Service X, and Service Y setup.]

Cache query results, e.g., IBM's DBCache, GlobeCBC.
Reduces DB load (if the hit ratio is good).

[Figure: response caches for another service's responses, with the application server, DB, Service X, and Service Y.]

Useful if the other service is across a WAN or does not meet its SLA.
Reduces response time.


[Figure: combined setup with response caches, DB caches, application servers, and the DB.]

Resource provisioning for a service

A wide variety of techniques at different tiers to consider.
What is the right (set of) technique(s) for a given service?
Depends on: locality, update workload, code execution time, query time, external service dependencies.
Too many parameters for an administrator to manage!

Can we automate it (at least to a large extent)?

Autonomic Hosting: Initial Objective

“To find the minimum set of resources to host a given service such that its end-to-end latency is maintained between [Lat_min, Lat_max].”

We pose it as: “To find the minimum number of resources (servers) to provision in each tier for a service to meet its SLA.”

Proposed Approach

Get a model of end-to-end latency:
Lat = f(hr_server, t_App, hr_cli, t_db, hr_dbcache, ReqRate)
where hr = hit ratio and t = execution time.

f is the latency modeling function:
Little's law-based network of queues
MVA (mean value analysis) on the network of queues
Or other models?
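As an illustration of what an MVA-based f could look like, here is a minimal Python sketch of textbook exact MVA for a closed network of single-server queues. It is not the authors' model. The function names, the `n_clients` and `think_time` parameters, and the way hit ratios scale per-tier demands in `tier_demands` are all assumptions made for this example: cache lookups are treated as free, every application execution is assumed to issue one DB query, and load is expressed as a client population rather than the ReqRate used on the slide.

```python
def mva_latency(demands, n_clients, think_time=0.0):
    """Exact MVA for a closed network of single-server queueing stations.

    demands[k] is the mean service demand (seconds per request) at tier k.
    Returns the mean end-to-end response time with n_clients in the system.
    """
    queue = [0.0] * len(demands)
    response = 0.0
    for n in range(1, n_clients + 1):
        # An arriving request sees the queues left by a population of n - 1.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        if think_time + response == 0.0:
            return 0.0  # every demand is zero, so nothing ever queues
        # Little's law on the whole system gives the throughput.
        throughput = n / (think_time + response)
        # Little's law per station gives the new mean queue lengths.
        queue = [throughput * r for r in residence]
    return response


def tier_demands(hr_server, t_app, hr_dbcache, t_db):
    """Hypothetical mapping from hit ratios to per-tier demands (assumption)."""
    # Only response-cache misses reach the application server.
    app_demand = (1.0 - hr_server) * t_app
    # Only misses in both caches reach the database.
    db_demand = (1.0 - hr_server) * (1.0 - hr_dbcache) * t_db
    return [app_demand, db_demand]


if __name__ == "__main__":
    demands = tier_demands(hr_server=0.5, t_app=0.02, hr_dbcache=0.5, t_db=0.08)
    print(mva_latency(demands, n_clients=20, think_time=1.0))
```

With a model of this shape, a higher hit ratio at any tier lowers the demand on the tiers behind it, which is why the hit ratios appear as inputs to f.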

Proposed Approach (contd..)

Fit a service to the model.

Parameters such as execution time can be obtained through log analysis and server instrumentation.

Estimating hr at different tiers is harder:
Request patterns and update patterns vary.
Fluid-based cache models assume infinite cache memory.
Need a technique that predicts hr for a given cache size.

Virtual Caches

Virtual cache (VC): a means to predict hr.
A cache that stores just the metadata [Wong et al., 2002].
Takes the original request and update stream to compute hr.
Smaller footprint.
Can be added in different tiers such as app servers, client stubs, JDBC drivers.

What will hr be if another server with memory d is added to a cache pool with memory M?
Run a VC with M+d memory.
A VC with M-d memory gives hr when a server is removed.

Running VCs for distributed caches:
N cache servers, each with M memory.
Run a VC in each server with M + M/N memory => average hr when a new server is added.
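The virtual cache idea can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes LRU replacement, measures capacity in number of entries rather than bytes, and treats an update as invalidating the cached entry. The class name `VirtualCache`, the function `predict_hit_ratios`, and the `("read" | "update", key)` trace format are hypothetical.

```python
from collections import OrderedDict


class VirtualCache:
    """Tracks only keys (metadata), never the cached values themselves."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumption: capacity in entries, not bytes
        self.keys = OrderedDict()  # least recently used key comes first
        self.hits = 0
        self.requests = 0

    def read(self, key):
        self.requests += 1
        if key in self.keys:
            self.hits += 1
            self.keys.move_to_end(key)         # mark as most recently used
        else:
            self.keys[key] = None              # a real cache would store data here
            if len(self.keys) > self.capacity:
                self.keys.popitem(last=False)  # evict the LRU key

    def update(self, key):
        # Assumption: an update invalidates the cached entry.
        self.keys.pop(key, None)

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


def predict_hit_ratios(trace, sizes):
    """Replay one request/update stream against several virtual cache sizes."""
    caches = {size: VirtualCache(size) for size in sizes}
    for op, key in trace:
        for vc in caches.values():
            if op == "read":
                vc.read(key)
            else:
                vc.update(key)
    return {size: vc.hit_ratio() for size, vc in caches.items()}


if __name__ == "__main__":
    trace = [("read", "a"), ("read", "b"), ("read", "a"), ("update", "a"),
             ("read", "a"), ("read", "c"), ("read", "b")]
    # Sizes play the role of M - d, M, and M + d from the slide.
    print(predict_hit_ratios(trace, sizes=[1, 2, 3]))
```

Running the sizes M-d, M, and M+d side by side on the same stream gives the hit ratio for removing a server, keeping the pool as is, and adding a server, without storing any response data.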

Resource Provisioning

To provision a service:
Obtain (hr and t) values from the different tiers of the service.
Estimate latency for different resource configurations.
Find the best configuration that meets its latency SLA.

For a running service:
If the SLA is violated, find the best tier to add a server.

Switching time?
Adding servers takes time (e.g., cache warm-up, reconfiguration).
Right now, assumed negligible.
Need to investigate prediction algorithms.
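The two provisioning steps above can be sketched as a search over configurations. The code below is an illustrative Python sketch under stated assumptions, not the authors' algorithm: `estimate_latency` stands in for whatever latency model is used, the exhaustive search and the `max_per_tier` bound are choices made for this example, and the toy estimator in the usage section is invented purely to make the sketch runnable.

```python
from itertools import product


def min_servers_config(estimate_latency, tiers, max_per_tier, lat_max):
    """Return the configuration with the fewest servers whose estimated
    latency is within lat_max, or None if no configuration qualifies."""
    best = None
    for counts in product(range(1, max_per_tier + 1), repeat=len(tiers)):
        config = dict(zip(tiers, counts))
        if estimate_latency(config) <= lat_max:
            if best is None or sum(counts) < sum(best.values()):
                best = config
    return best


def best_tier_to_add(estimate_latency, config):
    """For a running service: the tier where one extra server yields the
    lowest estimated latency."""
    def latency_after_adding(tier):
        trial = dict(config)
        trial[tier] += 1
        return estimate_latency(trial)
    return min(config, key=latency_after_adding)


if __name__ == "__main__":
    # Toy estimator (assumption): each tier's delay shrinks with its server count.
    def toy_latency(config):
        return 0.04 / config["app"] + 0.09 / config["db"]

    print(min_servers_config(toy_latency, ["app", "db"], 4, lat_max=0.06))
    print(best_tier_to_add(toy_latency, {"app": 1, "db": 1}))
```

Exhaustive search is only practical for a handful of tiers and servers; it is used here because it makes the "minimum number of servers that meets the SLA" objective explicit.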

Current Status & Limitations

Goal: to build an autonomic hosting platform for multi-tier Internet applications.
Multi-queue model with online cache simulations has been a good start.
Prototyped with Apache, Tomcat/Axis, MySQL.
Integrating with our CDN, Globule.
Experiments with TPC-App -> encouraging.
Experimented with other services.

Current work:
Refining queueing models for accurate latency estimation.
Investigating availability issues.

Discussion Points

Utilization-based SLAs
Other prediction models
Does cache behavior vary with req. rate?
Failures
How to provision for availability targets?
Multiple service classes

Availability-aware provisioning

To provision for a required uptime:
Must consider MTTF and MTTR for servers in each tier.
Caches have a different MTTR than app servers.

How to provision?
Strategy 1: Perform latency-based provisioning. For each tier, add additional resources to reach the target uptime.
Strategy 2: Formulate as a dual-constrained optimization problem.
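A rough sketch of Strategy 1 in Python follows. It relies on assumptions that the slides do not state: server failures are independent, a tier counts as up when at least one of its servers is up, the service is up only when every tier is up, and extra servers are added greedily to the currently least available tier. The function names and the MTTF/MTTR figures in the usage section are hypothetical.

```python
def server_availability(mttf, mttr):
    """Steady-state availability of one server: MTTF / (MTTF + MTTR)."""
    return mttf / (mttf + mttr)


def tier_availability(mttf, mttr, n_servers):
    """Assumption: the tier is up if at least one server is up and
    servers fail independently."""
    return 1.0 - (1.0 - server_availability(mttf, mttr)) ** n_servers


def service_availability(config, reliability):
    """Assumption: the service needs every tier to be up."""
    total = 1.0
    for tier, n_servers in config.items():
        mttf, mttr = reliability[tier]
        total *= tier_availability(mttf, mttr, n_servers)
    return total


def add_for_uptime(latency_config, reliability, target, max_extra=100):
    """Start from the latency-based configuration and add servers to the
    least available tier until the target uptime is reached."""
    config = dict(latency_config)
    for _ in range(max_extra):
        if service_availability(config, reliability) >= target:
            return config
        weakest = min(config, key=lambda t: tier_availability(*reliability[t], config[t]))
        config[weakest] += 1
    raise ValueError("target uptime not reached within max_extra servers")


if __name__ == "__main__":
    # Hypothetical (MTTF, MTTR) pairs in hours for each tier.
    reliability = {"app": (1000.0, 10.0), "db": (2000.0, 50.0)}
    print(add_for_uptime({"app": 1, "db": 1}, reliability, target=0.999))
```

Because this treats any single surviving server as sufficient, it ignores the latency degradation after a failure; handling both constraints together is what Strategy 2 would address.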

Dynamic Provisioning

For handling dynamic load changes.
Need to predict workload changes:
Allows us to be prepared earlier.
Adding/reconfiguring servers takes time.
The prediction window should be greater than the server addition time.
Load prediction is relatively well understood.
Prediction of temporal effects?
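To make the prediction-window point concrete, here is a small Python sketch that extrapolates the request rate far enough ahead to cover the server addition time. The slides do not name a prediction method; the least-squares linear trend, the function names, and the notion of a fixed `capacity` threshold are assumptions chosen for this illustration.

```python
def forecast_rate(samples, steps_ahead):
    """Fit a least-squares line to equally spaced request-rate samples and
    extrapolate it steps_ahead intervals past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = (n - 1) / 2.0
    mean_y = sum(samples) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(samples))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return intercept + slope * (n - 1 + steps_ahead)


def must_add_server_now(samples, capacity, addition_steps):
    """Look ahead by the server addition time: if the forecast load exceeds
    the current capacity by then, provisioning has to start now."""
    return forecast_rate(samples, addition_steps) > capacity


if __name__ == "__main__":
    rates = [100, 110, 120, 130]  # hypothetical requests/s, one sample per interval
    print(forecast_rate(rates, steps_ahead=3))
    print(must_add_server_now(rates, capacity=150, addition_steps=3))
```

Looking ahead by at least `addition_steps` reflects the requirement that the prediction window exceed the server addition time; a shorter window would detect the overload only after it is too late to react.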

Thank You!

More info: http://www.globule.org

Questions?
