Towards Autonomic Hosting of Multi-tier Internet Services
Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen
Vrije Universiteit, Amsterdam, The Netherlands.
Hosting Large-Scale Internet Services
  Large-scale e-commerce enterprises use complex software systems
    Sites are built of numerous applications called services.
    A request to amazon.com leads to requests to hundreds of services [Vogels, ACM Queue, 2006].
  Each site has an SLA (latency, availability targets)
    Global optimization-based hosting is intractable
    Convert the global SLA to per-service SLAs
    Host each service scalably.
  Problem in focus: Efficient hosting of an Internet service.
Web Services: Background
  Services – Multi-tiered Applications
    Perform business logic on data from their own data store and from other services.
    E.g., shopping cart service, recommender service, page generator.
  Exposed and restricted through well-defined interfaces
    Usually accessible over the network
    Do not allow direct access to their internal database
[Figure: a service built from an application server (e.g., JBoss, Tomcat/Axis, WebSphere) and a DB (e.g., DB2, Oracle, MySQL). Labels: Service X, Service Y, Service Req. (XML), DB Queries, DB Response, Service Response (XML).]
Scalability techniques applied to service hosting
[Figure: replicated application servers. Labels: DB, Service X, Service Y, Application Server (three copies).]
  Useful for compute-intensive services (e.g., page generators)
Scalability techniques applied to service hosting
[Figure: response caches. Labels: DB, Service X, Service Y, Application Server, Response Cache (two copies).]
  Cache service responses
  Reduces load on the application (if the hit ratio is good)
Scalability techniques applied to service hosting
[Figure: database caches. Labels: DB, Service X, Service Y, Application Server, DB Cache (multiple copies).]
  Cache query results, e.g., IBM's DBCache, GlobeCBC
  Reduces DB load (if the hit ratio is good)
Scalability techniques applied to service hosting
[Figure: caching the responses of other services. Labels: DB, Service X, Service Y, Application Server, Response Cache (three copies).]
  Useful if the other service is across a WAN or does not meet its SLA
  Reduces response time
Scalability techniques applied to service hosting
[Figure: the techniques combined. Labels: DB, DB Cache, Application Server, Response Cache (each except DB in multiple copies).]
Resource provisioning for a service
  Wide variety of techniques at different tiers to consider
  What is the right (set of) technique(s) for a given service?
    Depends on: locality, update workload, code execution time, query time, external service dependencies
  Too many parameters for an administrator to manage!
  Can we automate it (at least to a large extent)?
Autonomic Hosting: Initial Objective
  “To find the minimum set of resources to host a given service such that its end-to-end latency is maintained between [Lat_min, Lat_max].”
  We pose it as: “To find the minimum number of resources (servers) to provision in each tier for a service to meet its SLA.”
Proposed Approach
  Get a model of end-to-end latency
    Lat = f(hr_server, t_App, hr_cli, t_db, hr_dbcache, ReqRate)
    hr = hit ratio, t = execution time
  f – latency modeling function
    Little's law based network of queues
    MVA (mean value analysis) on network of queues
    Or other models?
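One possible shape for f is sketched below: exact mean value analysis over a closed network of single-server queues. This is an illustrative sketch, not the authors' model. The function names, the default `t_cache` value, and the mapping from hit ratios to per-tier service demands are assumptions made here. It also assumes one DB query per application request, leaves out hr_cli, and expresses load as a number of concurrent clients plus a think time rather than as ReqRate.

```python
def mva_latency(demands, n_clients, think_time=0.0):
    """Exact MVA for a closed network of single-server queueing stations.

    demands[k] is the mean service demand (seconds per request) at tier k.
    Returns the mean end-to-end latency with n_clients concurrent clients.
    """
    queue_len = [0.0] * len(demands)
    latency = 0.0
    for n in range(1, n_clients + 1):
        # Arrival theorem: a new arrival sees the queue of a network with n-1 clients.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
        latency = sum(residence)
        # Little's law applied to the whole network gives the throughput.
        throughput = n / (think_time + latency)
        # Little's law applied per tier gives the new mean queue lengths.
        queue_len = [throughput * r for r in residence]
    return latency


def tier_demands(hr_server, t_app, hr_dbcache, t_db, t_cache=0.001):
    """Hypothetical mapping from hit ratios to per-tier demands (assumption).

    Every request visits the response cache; only response-cache misses reach
    the application server and the DB cache; only DB-cache misses reach the DB.
    """
    miss_server = 1.0 - hr_server
    miss_dbcache = 1.0 - hr_dbcache
    return [
        t_cache,                            # response cache lookup
        miss_server * t_app,                # application server
        miss_server * t_cache,              # DB cache lookup
        miss_server * miss_dbcache * t_db,  # database
    ]


if __name__ == "__main__":
    demands = tier_demands(hr_server=0.5, t_app=0.02, hr_dbcache=0.5, t_db=0.04)
    for clients in (1, 10, 50):
        print(clients, mva_latency(demands, clients, think_time=1.0))
```

With a single client there is no queueing, so the latency equals the sum of the demands. As clients are added, the tier with the largest demand dominates, which is the tier a provisioning step would look at first.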
Proposed Approach (contd.)
  Fit a service to the model
  Parameters such as execution time can be obtained
    Log analysis, server instrumentation
  Estimating hr at different tiers is harder
    Request patterns and update patterns vary
    Fluid-based cache models assume infinite cache memory
    Need a technique that predicts hr for a given cache size
Virtual Caches
  Virtual cache (VC) – means to predict hr
    Cache that stores just the metadata [Wong et al., 2002]
    Takes the original request and update stream to compute hr
    Smaller footprint
    Can be added in different tiers such as app servers, client stubs, JDBC drivers.
  What will hr be if another server with memory d is added to a cache pool with M memory?
    Run a VC with M+d memory
    A VC with M-d memory gives hr when a server is removed.
  Running VCs for distributed caches
    N cache servers, each with M memory
    Run a VC in each server with M + M/N memory => avg. hr when a new server is added
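The virtual-cache idea can be sketched as a cache that tracks only keys and object sizes while replaying the request and update stream. The sketch below is a minimal illustration under assumptions made here: LRU replacement, invalidation on update, and the `VirtualCache` and `what_if` names are all hypothetical, and the slides do not specify any of them.

```python
from collections import OrderedDict


class VirtualCache:
    """Metadata-only LRU cache: stores keys and sizes, never the objects."""

    def __init__(self, capacity):
        self.capacity = capacity      # simulated memory, in the same unit as sizes
        self.used = 0
        self.entries = OrderedDict()  # key -> size, least recently used first
        self.hits = 0
        self.requests = 0

    def request(self, key, size):
        """Replay one read request; returns True on a (virtual) hit."""
        self.requests += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)
            return True
        if size <= self.capacity:
            self.entries[key] = size
            self.used += size
            while self.used > self.capacity:
                _, evicted_size = self.entries.popitem(last=False)
                self.used -= evicted_size
        return False

    def update(self, key):
        """Replay one update: the cached entry, if any, becomes invalid."""
        size = self.entries.pop(key, None)
        if size is not None:
            self.used -= size

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


def what_if(stream, capacities):
    """Replay one stream against several virtual caches, e.g. M-d, M, M+d.

    stream items are ("req", key, size) or ("upd", key).
    """
    caches = {c: VirtualCache(c) for c in capacities}
    for event in stream:
        for cache in caches.values():
            if event[0] == "req":
                cache.request(event[1], event[2])
            else:
                cache.update(event[1])
    return {c: cache.hit_ratio() for c, cache in caches.items()}


if __name__ == "__main__":
    stream = [("req", k, 1) for k in "abcabc"]
    print(what_if(stream, [2, 3]))
```

For the distributed case on the slide, each of the N servers would run such a virtual cache with capacity M + M/N next to its real cache, and the per-server hit ratios would be averaged.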
Resource Provisioning
  To provision a service
    Obtain (hr & t) values from different tiers of the service
    Estimate latency for different resource configurations
    Find the best configuration that meets its latency SLA
  For a running service
    If the SLA is violated, find the best tier to add a server
  Switching time?
    Addition of servers takes time (e.g., cache warm-up, reconfiguration)
    Right now, assumed negligible
    Need to investigate prediction algorithms
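The two provisioning steps above can be sketched as a small search over server counts per tier. Everything in this sketch is an assumption for illustration: the latency estimator is a simple open M/M/1-style approximation rather than the model used in the talk, each tier is described by a hypothetical (visit ratio, service time) pair, and load is assumed to be spread evenly over the servers of a tier.

```python
from itertools import product


def estimate_latency(tiers, servers, req_rate):
    """Approximate end-to-end latency (assumed M/M/1 per server, even load split).

    tiers:   list of (visit_ratio, service_time_seconds), one per tier
    servers: number of servers in each tier
    """
    total = 0.0
    for (visit, service_time), n in zip(tiers, servers):
        utilization = req_rate * visit * service_time / n
        if utilization >= 1.0:
            return float("inf")  # the tier is saturated
        total += visit * service_time / (1.0 - utilization)
    return total


def cheapest_configuration(tiers, req_rate, lat_max, max_per_tier=8):
    """Exhaustively find a configuration with the fewest servers meeting lat_max."""
    best = None
    for servers in product(range(1, max_per_tier + 1), repeat=len(tiers)):
        if best is not None and sum(servers) >= sum(best):
            continue  # cannot improve on the best configuration found so far
        if estimate_latency(tiers, servers, req_rate) <= lat_max:
            best = servers
    return best


def best_tier_to_grow(tiers, servers, req_rate):
    """For a running service: index of the tier where one extra server helps most."""
    candidates = []
    for i in range(len(tiers)):
        trial = list(servers)
        trial[i] += 1
        candidates.append((estimate_latency(tiers, trial, req_rate), i))
    return min(candidates)[1]


if __name__ == "__main__":
    tiers = [(1.0, 0.01), (1.0, 0.02)]  # e.g. application tier, database tier
    print(cheapest_configuration(tiers, req_rate=40, lat_max=0.06))
    print(best_tier_to_grow(tiers, (1, 1), req_rate=40))
```

The exhaustive search is only practical for a handful of tiers and small server counts. It also ignores switching time, matching the slide's current assumption that it is negligible.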
Current Status & Limitations
  Goal: To build an autonomic hosting platform for multi-tier Internet applications
  Multi-queue model w/ online cache simulations has been a good start
    Prototyped with Apache, Tomcat/Axis, MySQL
    Integrating with our CDN, Globule
    Experiments with TPC-App -> encouraging
    Experimented with other services
  Current Work
    Refining queueing models for accurate latency estimation
    Investigating availability issues
Discussion Points
  Utilization-based SLAs
  Other prediction models
    Does cache behavior vary with req. rate?
  Failures
    How to provision for availability targets?
  Multiple service classes
Availability-aware provisioning
  To provision for a required up-time
    Must consider MTTF and MTTR for servers in each tier
    Caches have different MTTR than app servers
  How to provision?
    Strategy 1
      Perform latency-based provisioning.
      For each tier, add additional resources to reach target uptime
    Strategy 2
      Formulate as a dual-constrained optimization problem.
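Strategy 1 can be illustrated with a short calculation. The sketch assumes that servers fail independently, that a server's steady-state availability is MTTF / (MTTF + MTTR), and that a tier counts as up while at least the k servers required by latency-based provisioning are running. These assumptions and the function names are introduced here and do not come from the slides.

```python
from math import comb


def server_availability(mttf, mttr):
    """Steady-state availability of one server (same time unit for both)."""
    return mttf / (mttf + mttr)


def tier_availability(n, k, a):
    """Probability that at least k of n independent servers are up."""
    return sum(comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k, n + 1))


def servers_for_uptime(k, a, target, max_extra=20):
    """Smallest n >= k whose tier availability reaches the target, else None."""
    for n in range(k, k + max_extra + 1):
        if tier_availability(n, k, a) >= target:
            return n
    return None


if __name__ == "__main__":
    a = server_availability(mttf=9.0, mttr=1.0)  # 0.9
    # Latency-based provisioning asked for 1 server; add spares for 99.5% uptime.
    print(servers_for_uptime(k=1, a=a, target=0.995))
```

Because caches and application servers have different MTTR values, the same uptime target can require a different number of spare servers in each tier.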
Dynamic Provisioning
  For handling dynamic load changes
    Need to predict workload changes
    Allows us to be prepared earlier
  Adding/reconfiguring servers takes time
    Prediction window should be greater than server addition time
  Load prediction is relatively well understood
    Prediction of temporal effects?
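The constraint that the prediction window must cover the server addition time can be sketched as follows. The predictor here is a plain least-squares linear trend, chosen only for brevity, since the slides do not name a prediction algorithm. The function names and the single `capacity` threshold are hypothetical.

```python
def forecast_load(samples, interval, horizon):
    """Extrapolate a least-squares linear trend `horizon` seconds past the last sample.

    samples:  request rates measured every `interval` seconds, oldest first
    """
    n = len(samples)
    xs = [i * interval for i in range(n)]
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    if variance == 0:
        slope = 0.0  # a single sample carries no trend
    else:
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, samples)) / variance
    predicted = mean_y + slope * (xs[-1] + horizon - mean_x)
    return max(0.0, predicted)


def should_add_server(samples, interval, addition_time, capacity):
    """Start adding a server now if the load predicted for the moment the
    server would become ready exceeds the current capacity."""
    return forecast_load(samples, interval, addition_time) > capacity


if __name__ == "__main__":
    rates = [10.0, 20.0, 30.0]  # requests/s, sampled once per minute
    print(forecast_load(rates, interval=60, horizon=120))
    print(should_add_server(rates, interval=60, addition_time=120, capacity=45.0))
```

Using the server addition time as the forecast horizon is the smallest window that satisfies the slide's constraint. A longer window gives more slack at the cost of less accurate predictions.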
Thank You!
More info: http://www.globule.org
Questions?