
Towards Autonomic Hosting of Multi-tier Internet Services

Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen

Vrije Universiteit, Amsterdam, The Netherlands.


Hosting Large-Scale Internet Services

Large-scale e-commerce enterprises use complex software systems

Sites are built of numerous applications called services.

A request to amazon.com leads to requests to hundreds of services [Vogels, ACM Queue, 2006].

Each site has an SLA (latency, availability targets)

Global optimization-based hosting is intractable

Convert the global SLA into per-service SLAs

Host each service scalably.

Problem in focus: Efficient hosting of an Internet service.



Web Services: Background

Services

Multi-tiered applications

Perform business logic on data from their own data store and from other services.

E.g., shopping cart service, recommender service, page generator.

Exposed, and restricted, through well-defined interfaces

Usually accessible over the network

Do not allow direct access to their internal database



[Diagram: a service consists of an application server (e.g., JBoss, Tomcat/Axis, WebSphere) in front of a database (e.g., DB2, Oracle, MySQL). A service request (XML) arrives from Service X; the application server issues DB queries, may call another service (Service Y), and returns a service response (XML).]

Scalability techniques applied to service hosting

[Diagram: the application-server tier is replicated across several servers in front of the DB.]

Useful for compute-intensive services (e.g., page generators)


Scalability techniques applied to service hosting

[Diagram: response caches are placed in front of the application server; service responses are cached.]

Reduces load on the application server (if the hit ratio is good)


Scalability techniques applied to service hosting

[Diagram: DB caches are placed between the application server and the DB; query results are cached (e.g., IBM's DBCache, GlobeCBC).]

Reduces DB load (if the hit ratio is good)

Scalability techniques applied to service hosting

[Diagram: response caches are placed on the client side of calls to other services.]

Useful if the other service is across a WAN or does not meet its SLA

Reduces response time


Scalability techniques applied to service hosting

[Diagram: all techniques combined: replicated application servers, response caches, DB caches, and caches on calls to other services.]


Resource provisioning for a service

A wide variety of techniques at different tiers to consider

What is the right (set of) technique(s) for a given service?

Depends on: locality, update workload, code execution time, query time, external service dependencies

Too many parameters for an administrator to manage!

Can we automate it (at least to a large extent)?






Autonomic Hosting: Initial Objective

"To find the minimum set of resources to host a given service such that its end-to-end latency is maintained between [Lat_min, Lat_max]."

We pose it as: "To find the minimum number of resources (servers) to provision in each tier for a service to meet its SLA."




Proposed Approach

Get a model of end-to-end latency

Lat = f(hr_server, t_App, hr_cli, t_db, hr_dbcache, ReqRate)

hr = hit ratio, t = execution time

f: the latency modeling function

Little's law based network of queues

MVA (mean value analysis) on a network of queues

Or other models?
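As a concrete illustration of MVA, a minimal sketch of exact Mean Value Analysis for a closed queueing network with one queue per tier. The per-tier service demands and request population below are illustrative assumptions, not values from this work:

```python
# Exact MVA for a closed queueing network: one queue per tier.
# demands[k] = average service demand at tier k (seconds per request).
# Assumes n_customers >= 1.
def mva(demands, n_customers, think_time=0.0):
    q = [0.0] * len(demands)              # mean queue length per tier
    for n in range(1, n_customers + 1):
        # Residence time at each tier (arrival theorem)
        r = [d * (1 + qk) for d, qk in zip(demands, q)]
        x = n / (think_time + sum(r))     # system throughput
        q = [x * rk for rk in r]          # Little's law per tier
    return sum(r), x                      # end-to-end latency, throughput

# Hypothetical demands: app server 20 ms, DB cache 5 ms, DB 30 ms.
lat, thr = mva([0.020, 0.005, 0.030], n_customers=10)
```

With one customer the latency is just the sum of the demands; as the population grows, queueing at the bottleneck tier (here the DB) dominates.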



Proposed Approach (contd.)

Fit a service to the model

Parameters such as execution time can be obtained

Log analysis, server instrumentation

Estimating hr at different tiers is harder

Request patterns and update patterns vary

Fluid-based cache models assume infinite cache memory

Need a technique that predicts hr for a given cache size








Virtual Caches

Virtual cache (VC): a means to predict hr

A cache that stores just the meta-data [Wong et al., 2002]

Takes the original request & update stream to compute hr

Smaller footprint

Can be added in different tiers such as app servers, client stubs, JDBC drivers.

What will hr be if another server with memory d is added to a cache pool with M memory?

Run a VC with M+d memory

A VC with M-d memory gives hr when a server is removed.
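A virtual cache can be sketched as a replacement-policy simulation that stores only keys and entry sizes, never the payloads. The LRU policy and byte-based capacity here are illustrative assumptions; the cited work may use a different policy:

```python
from collections import OrderedDict

class VirtualCache:
    """Metadata-only LRU cache: tracks keys and sizes to predict hit ratio."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()       # key -> size (no payload stored)
        self.hits = self.requests = 0

    def access(self, key, size):
        self.requests += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return
        # Miss: admit the entry, evicting least-recently-used metadata.
        while self.used + size > self.capacity and self.entries:
            _, evicted_size = self.entries.popitem(last=False)
            self.used -= evicted_size
        if size <= self.capacity:
            self.entries[key] = size
            self.used += size

    def invalidate(self, key):
        """Apply the update stream: drop stale entries."""
        size = self.entries.pop(key, None)
        if size is not None:
            self.used -= size

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0
```

Feeding the same request and update stream into VCs of capacity M+d and M-d then predicts hr after adding or removing a cache server.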



Running a VC for distributed caches

N cache servers, each with M memory

Run a VC in each server with M + M/N memory

=> Avg. hr when a new server is added



Resource Provisioning

To provision a service

Obtain (hr & t) values from the different tiers of the service

Estimate latency for different resource configurations

Find the best configuration that meets its latency SLA

For a running service

If the SLA is violated, find the best tier to add a server

Switching time?

Adding servers takes time (e.g., cache warm-up, reconfiguration)

Right now, assumed negligible

Need to investigate prediction algorithms
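The configuration search above can be sketched as an exhaustive scan over small per-tier server counts, scored by any latency model. The `estimate_latency` callback and the toy model below are hypothetical stand-ins for the queueing model, not the authors' implementation:

```python
from itertools import product

def provision(estimate_latency, sla_latency, max_per_tier=5, n_tiers=3):
    """Return the cheapest (fewest total servers) per-tier configuration
    whose estimated latency meets the SLA, or None if none does."""
    best = None
    for config in product(range(1, max_per_tier + 1), repeat=n_tiers):
        if estimate_latency(config) <= sla_latency:
            if best is None or sum(config) < sum(best):
                best = config
    return best

# Toy model: each added server at a tier divides that tier's latency.
base = [80, 20, 120]  # hypothetical per-tier latencies (ms)
model = lambda cfg: sum(b / n for b, n in zip(base, cfg))
provision(model, sla_latency=100)   # → (2, 1, 3)
```

The same model can answer the online question, "which tier should get the next server", by comparing the latency drop from adding one server to each tier in turn.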







Current Status & Limitations

Goal: to build an autonomic hosting platform for multi-tier Internet applications

The multi-queue model with online cache simulations has been a good start

Prototyped with Apache, Tomcat/Axis, MySQL

Integrating with our CDN, Globule

Experiments with TPC-App -> encouraging

Experimented with other services



Current Work

Refining queueing models for accurate latency estimation

Investigating availability issues




Discussion Points

Utilization-based SLAs

Other prediction models

Does cache behavior vary with the request rate?

Failures

How to provision for availability targets?

Multiple service classes


Availability-aware provisioning


To provision for a required up-time

Must consider MTTF and MTTR for the servers in each tier

Caches have a different MTTR than app servers

How to provision?

Strategy 1

Perform latency-based provisioning.

For each tier, add additional resources to reach the target uptime

Strategy 2

Formulate it as a dual-constrained optimization problem.
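Strategy 1's per-tier step can be sketched with the standard steady-state availability formulas, assuming independent failures and that one live server per tier suffices. The MTTF/MTTR numbers are illustrative assumptions:

```python
def server_availability(mttf_hours, mttr_hours):
    """Steady-state availability of a single server."""
    return mttf_hours / (mttf_hours + mttr_hours)

def servers_for_uptime(target, mttf_hours, mttr_hours, min_servers=1):
    """Smallest replica count whose tier availability meets the target.
    Tier availability with n independent replicas: 1 - (1 - a)^n."""
    a = server_availability(mttf_hours, mttr_hours)
    n = min_servers
    while 1 - (1 - a) ** n < target:
        n += 1
    return n

# Hypothetical figures: app servers MTTF 1000 h, MTTR 2 h;
# caches repair much faster (warm-up aside): MTTR 0.1 h.
n_app = servers_for_uptime(0.9999, 1000, 2)      # → 2
n_cache = servers_for_uptime(0.9999, 1000, 0.1)  # → 1
```

This illustrates the slide's point that caches and app servers need different provisioning: the faster MTTR of a cache lets a single server already meet the same uptime target.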




Dynamic Provisioning

For handling dynamic load changes

Need to predict workload changes

Allows us to be prepared earlier

Adding/reconfiguring servers takes time

The prediction window should be greater than the server-addition time

Load prediction is relatively well understood

Prediction of temporal effects?
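One common load predictor (not necessarily the one used in this work) is a least-squares linear trend over a sliding window of request-rate samples, extrapolated one provisioning horizon ahead; the sample values and horizon below are illustrative:

```python
def predict_load(samples, horizon):
    """Least-squares linear trend over recent request-rate samples,
    extrapolated `horizon` steps ahead. The horizon should exceed the
    server-addition time. Needs at least 2 samples."""
    n = len(samples)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(samples) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(xs, samples)) / var_x
    # Extrapolate from the last sample index (n - 1) to n - 1 + horizon.
    return mean_y + slope * (n - 1 - mean_x + horizon)

# Request rate (req/s) rising ~10 per interval; predict 3 intervals ahead.
rates = [100, 110, 120, 130, 140]
predict_load(rates, horizon=3)   # → 170.0
```

If the predicted load exceeds what the current configuration sustains, the provisioning search can be re-run ahead of time, before the SLA is actually violated.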

Thank You!



More info: http://www.globule.org


Questions?