Meta-Simulation Design and Analysis for Large Scale Networks


Meta-Simulation Design and Analysis for Large Scale Networks

David W. Bauer Jr
Department of Computer Science
Rensselaer Polytechnic Institute

OUTLINE

- Motivation
- Contributions
- Meta-simulation
  - ROSS.Net
  - BGP4-OSPFv2 Investigation
- Simulation
  - Kernel Processes
  - Seven O’clock Algorithm
- Conclusion

“…objective as a quest for general invariant relationships between network parameters and protocol dynamics…”

High-Level Motivation: to gain varying degrees of qualitative and quantitative understanding of the behavior of the system-under-test:

- Parameter Sensitivity
- Protocol Stability and Dynamics
- Feature Interactions

Meta-Simulation: capabilities to extract and interpret meaningful performance data from the results of multiple simulations.

- Individual experiment cost is high
- Developing useful interpretations
- Protocol performance modeling

Experiment Design Goal: identify the minimum-cardinality set of meta-metrics that maximally models the system.


Contributions: Meta-Simulation: OSPF

Problem: which meta-metrics are most important in determining OSPF convergence?

- Step 1: Re-parameterize and re-scale.
- Step 2: Optimization-based experiment design (ED): 750 experiments, versus a Full-Factorial ED (FFED) of 16,384 experiments.
- Step 3: Search the complete model space; negligible metrics are identified and isolated.

Our approach comes within 7% of the Full-Factorial design using two orders of magnitude fewer experiments.

Contributions: Meta-Simulation: OSPF/BGP

Ability: model the BGP and OSPF control plane.

Problem: which meta-metrics are most important in minimizing control plane dynamics (i.e., updates)?

- Optimized with respect to various metrics, each equivalent to a particular management approach.
- The importance of parameters differs for each metric.
- For minimal total updates: local perspectives are 20-25% worse than the global perspective.
- For minimal total interactions: results 15-25% worse can occur with other metrics.
- OB updates matter more than BO updates (~50% vs. ~0.1% of total updates).

Meta-Simulation perspective: a complete view of all domains. All updates belong to one of four categories:

- OO: OSPF-caused OSPF update
- OB: OSPF-caused BGP update
- BO: BGP-caused OSPF update
- BB: BGP-caused BGP update

Findings: OB accounts for ~50% of total updates and BO for ~0.1%; the global perspective is 20-25% better than local perspectives; minimizing total BO+OB is 15-25% better than other metrics.


Contributions: Simulation: Kernel Process

Parallel Discrete Event Simulation:

- Conservative Simulation: wait until it is safe to process the next event, so that events are processed in time-stamp order.
- Optimistic Simulation: allow violations of time-stamp order to occur, but detect them and recover.

Benefits of Optimistic Simulation:
i. Not dependent on the network topology simulated
ii. As-fast-as-possible forward execution of events

Contributions: Simulation: Kernel Process

Problem: parallelizing a simulation requires 1.5 to 2 times more memory than sequential execution, and the additional memory requirement affects performance and scalability.

[Plot: 4 processors used; scalability decreases as model size increases, due to the increased memory required to support the model.]

Solution: Kernel Processes (KPs), a new data structure that supports parallelism and increases scalability.

Contributions: Simulation: Seven O’clock

Problem: distributing a simulation requires efficient global synchronization.

- Inefficient solution: barrier synchronization between all nodes while performing computation.
- Efficient solution: pass messages between nodes, and synchronize in the background of the main simulation.
- Seven O’clock Algorithm: eliminate message passing, reducing cost from O(n) or O(log n) to O(1).


ROSS.Net: Big Picture

Goal: an integrated simulation and experiment design environment.

[Diagram: ROSS.Net (simulation & meta-simulation) sits at the center, exchanging protocol metrics and protocol parameters with Protocol Design; drawing measured topology data, traffic and router stats, etc. from Measurement Data-sets (Rocketfuel); and drawing protocol models (OSPFv2, BGP4, TCP Reno, IPv4, etc.) from Modeling.]

ROSS.Net couples the Design of Experiments Tool (DOT) with parallel discrete event network simulation: DOT supplies input parameters to the simulator and receives output metric(s) back.

Meta-Simulation:
- Experiment design
- Statistical analysis
- Optimization heuristic search: Recursive Random Search
- Sparse empirical modeling

Simulation:
- Optimistic parallel simulation: ROSS
- Memory-efficient network protocol models


ROSS.Net: Big Picture

Design of Experiments Tool (DOT):

- Traditional experiment design (full/fractional factorial) with statistical or regression analysis (R, STRESS) produces an empirical model. It suits small-scale systems, linear parameter interactions, and a small number of parameters.
- Optimization search with statistical or regression analysis (R, STRESS) produces a sparse empirical model. It suits large-scale systems, non-linear parameter interactions, and a large number of parameters (the curse of dimensionality).
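The Recursive Random Search used by DOT can be illustrated with a toy sketch. This is not the DOT implementation: the response function here stands in for one simulation run, and the halving shrink factor and uniform sampling scheme are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy 1-D response surface to minimize; stands in for one simulation run. */
static double response(double x) { return (x - 3.0) * (x - 3.0) + 1.0; }

/* Recursive random search (sketch): sample uniformly around the current
 * center; each time a better point is found, recenter on it and shrink
 * the search radius, restarting exploration around the new best. */
double rrs_minimize(double lo, double hi, int trials, double *best_x) {
    double center = 0.5 * (lo + hi);
    double radius = 0.5 * (hi - lo);
    double best = response(center);
    *best_x = center;
    for (int i = 0; i < trials; i++) {
        double u = (double)rand() / RAND_MAX;          /* uniform in [0,1] */
        double x = center + (2.0 * u - 1.0) * radius;  /* sample around center */
        if (x < lo) x = lo;
        if (x > hi) x = hi;
        double y = response(x);
        if (y < best) {            /* exploit: recenter and shrink */
            best = y;
            *best_x = x;
            center = x;
            radius *= 0.5;
        }
    }
    return best;
}
```

The same exploit-and-restart pattern applies in the real 15-parameter search space; only the dimensionality and the cost per sample change.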

ROSS.Net: Meta-Simulation Components

- Router topology from Rocketfuel trace data
  - took each ISP map as a single OSPF area
- Created a BGP domain between ISP maps
  - hierarchical mapping of routers

AT&T’s US Router Network Topology: 8 levels of routers:

- Levels 0 and 1: 155 Mb/s, 4 ms delay
- Levels 2 and 3: 45 Mb/s, 4 ms delay
- Levels 4 and 5: 1.5 Mb/s, 10 ms delay
- Levels 6 and 7: 0.5 Mb/s, 10 ms delay

Meta-Simulation: OSPF/BGP Interactions

- OSPF: intra-domain, link-state routing; path costs matter.
- Border Gateway Protocol (BGP): inter-domain, distance-vector, policy routing; reachability matters.

BGP decision-making steps:

1. Highest LOCAL_PREF
2. Lowest AS path length
3. Lowest origin type (0 = IGP, 1 = EGP, 2 = Incomplete)
4. Lowest MED
5. Lowest IGP cost
6. Lowest router ID

[Diagram: iBGP connectivity, eBGP connectivity, and OSPF domains]
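The decision steps above amount to an ordered tie-break over route attributes. A minimal sketch in C (the struct layout and function names are illustrative, not the ROSS.Net model's actual data structures):

```c
#include <assert.h>

/* Candidate BGP route attributes, following the decision steps listed
 * above; this struct is illustrative, not the simulator's layout. */
typedef struct {
    int local_pref;     /* higher wins */
    int as_path_len;    /* lower wins */
    int origin;         /* 0 = IGP, 1 = EGP, 2 = Incomplete; lower wins */
    int med;            /* lower wins */
    int igp_cost;       /* lower wins */
    int router_id;      /* lower wins; final tie-breaker */
} bgp_route;

/* Returns 1 if route a is preferred over route b, applying the decision
 * steps in order and falling through to the next step on ties. */
int bgp_prefer(const bgp_route *a, const bgp_route *b) {
    if (a->local_pref != b->local_pref) return a->local_pref > b->local_pref;
    if (a->as_path_len != b->as_path_len) return a->as_path_len < b->as_path_len;
    if (a->origin != b->origin) return a->origin < b->origin;
    if (a->med != b->med) return a->med < b->med;
    if (a->igp_cost != b->igp_cost) return a->igp_cost < b->igp_cost;
    return a->router_id < b->router_id;
}
```

Note that the IGP cost step is exactly where OSPF reaches into BGP's decision: a change in intra-domain path cost can flip this comparison and emit an OB update.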

Meta-Simulation: OSPF/BGP Interactions

Intra-domain routing decisions can affect inter-domain behavior, and vice versa. All updates belong to one of four categories:

- OO: OSPF-caused OSPF update
- OB: OSPF-caused BGP update (interaction)
- BO: BGP-caused OSPF update (interaction)
- BB: BGP-caused BGP update

[Figure: a link failure or cost increase (e.g., maintenance) inside one domain triggers an OB update toward a destination in another domain.]

[Figure: eBGP connectivity becoming available triggers a BO update toward the destination.]

These interactions cause route changes to thousands of IP prefixes, i.e., huge traffic shifts!

Meta-Simulation: OSPF/BGP Interactions

- Three classes of protocol parameters: OSPF timers, BGP timers, BGP decision parameters.
- Maximum search space size: 14,348,907.
- RRS was allowed 200 trials to optimize (minimize) each response surface: OO, OB, BO, BB, OB+BO, and ALL updates.
- Multiple linear regression analysis was applied to the results.

Meta-Simulation: OSPF/BGP Interactions

- Optimized with respect to the OB+BO response surface.
- BGP timers play the major role, i.e., ~15% improvement in the optimal response.
- The BGP KeepAlive timer appears to be the dominant parameter, in contrast to the expectation that MRAI would dominate.
- OSPF timers have little effect, i.e., at most 5%: low time-scale OSPF updates do not affect BGP.

Meta-Simulation: OSPF/BGP Interactions

~15% improvement when BGP timers are included in the search space.

- Varied response surfaces, each equivalent to a particular management approach.
- The importance of parameters differs for each metric.
- For minimal total updates: local perspectives are 20-25% worse than the global perspective.
- For minimal total interactions: results 15-25% worse can occur with other metrics.
- OB updates matter more than BO updates (~50% vs. ~0.1% of total updates).

[Figure: in each case it is important to optimize OSPF.]

Summary: OB is ~50% and BO ~0.1% of total updates; the global perspective is 20-25% better than local perspectives; minimizing total BO+OB is 15-25% better than other metrics.


Meta-Simulation Conclusions:

- The number of experiments was reduced by an order of magnitude in comparison to Full Factorial.
- Experiment design and statistical analysis enabled rapid elimination of insignificant parameters.
- Several qualitative statements and system characterizations could be obtained with few experiments.


Simulation: Overview

Parallel Discrete Event Simulation:

- A Logical Process (LP) for each relatively parallelizable simulation model, e.g., a router or a TCP host.
- Local Causality Constraint (LCC): events within each LP must be processed in time-stamp order.
- Observation: adherence to the LCC is sufficient to ensure that a parallel simulation will produce the same result as a sequential simulation.

Conservative Simulation: avoid violating the local causality constraint (wait until it’s safe).
i. Null messages (deadlock avoidance) (Chandy/Misra/Bryant)
ii. Time-stamp of next event

Optimistic Simulation: allow violations of local causality to occur, but detect them and recover using a rollback mechanism.
i. Time Warp protocol (Jefferson, 1985)
ii. Limiting the amount of optimistic execution
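For reference, the baseline both schemes must reproduce is a sequential event loop that trivially processes events in time-stamp order. A minimal sketch (illustrative only, not ROSS code; a binary min-heap keyed on the time-stamp):

```c
#include <assert.h>

/* Minimal sequential discrete event core: events are always dequeued in
 * time-stamp order, which is the behavior the Local Causality Constraint
 * preserves per-LP in a parallel run. */
#define MAX_EV 64

typedef struct { double ts; int lp; } event;

static event heap[MAX_EV];
static int n_ev = 0;

void schedule(double ts, int lp) {           /* sift-up insert */
    int i = n_ev++;
    heap[i].ts = ts; heap[i].lp = lp;
    while (i > 0 && heap[(i - 1) / 2].ts > heap[i].ts) {
        event t = heap[i]; heap[i] = heap[(i - 1) / 2]; heap[(i - 1) / 2] = t;
        i = (i - 1) / 2;
    }
}

event next_event(void) {                     /* pop the minimum time-stamp */
    event min = heap[0];
    heap[0] = heap[--n_ev];
    int i = 0;
    for (;;) {                               /* sift-down to restore heap */
        int l = 2 * i + 1, r = l + 1, s = i;
        if (l < n_ev && heap[l].ts < heap[s].ts) s = l;
        if (r < n_ev && heap[r].ts < heap[s].ts) s = r;
        if (s == i) break;
        event t = heap[i]; heap[i] = heap[s]; heap[s] = t;
        i = s;
    }
    return min;
}
```

Conservative simulators only dequeue when the minimum is provably safe; optimistic simulators dequeue eagerly and roll back when a smaller time-stamp arrives later.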

ROSS: Rensselaer’s Optimistic Simulation System

[Diagram: ROSS data structures. Each tw_event carries its message (receive_ts, src/dest_lp, user data). Each tw_pe owns the event queue, cancel queue, free event list head and tail, and lp_list. Each tw_lp holds its pe pointer, lp number, type, and processed-event queue head and tail.]

[Diagram: GTW data structures. A global array PEState GState[NPE] holds, per PE, the event queue, cancel queue, free event list[ ][ ], and lplist[MAX_LP]. Each LPState carries init, process, reverse, and final proc pointers; each Event carries its message and lp number.]

Example Accesses

- GTW: top-down hierarchy
  lp_ptr = GState[LP[i].Map].lplist[LPNum[i]]
- ROSS: bottom-up hierarchy
  lp_ptr = event->src_lp; or pe_ptr = event->src_lp->pe;

Key advantages of the bottom-up approach:

- reduces access overheads
- improves locality and processor cache performance
- memory usage is only 1% more than sequential, and independent of LP count
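The bottom-up chain can be sketched with simplified versions of the structures above (field set reduced for illustration; the real tw_event, tw_lp, and tw_pe carry many more fields):

```c
#include <assert.h>

/* Sketch of the bottom-up pointer hierarchy: field names follow the
 * slides (tw_event, tw_lp, tw_pe) but are heavily simplified. */
typedef struct tw_pe tw_pe;
typedef struct tw_lp tw_lp;
typedef struct tw_event tw_event;

struct tw_pe    { int id; };
struct tw_lp    { int id; tw_pe *pe; };   /* each LP points up to its PE */
struct tw_event { double recv_ts; tw_lp *src_lp; tw_lp *dest_lp; };

/* Bottom-up access: any event reaches its LP and PE through direct
 * pointer dereferences, with no global table indexed by LP number. */
tw_pe *event_pe(const tw_event *e) { return e->src_lp->pe; }
```

Because the pointer is embedded in the event the simulator is already touching, the lookup costs one dereference and stays in cache, versus GTW's indexed walk through a global array.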

“On the Fly” Fossil Collection

[Figure: snapshot of PE 0’s internal state at time 15.0, and again after rollback of LP A and re-execution; LPs A, B, and C each hold processed events at 5.0, 10.0, and 15.0, feeding FreeList[0] and FreeList[1].]

Key observation: rollbacks cause the free list to become UNSORTED in virtual time.

Result: event buffers that could be allocated are not, so the user must over-allocate the free list.

OTFFC works by allocating from the free list only events with a time-stamp less than GVT. As events are processed they are immediately placed at the end of the free list.
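A minimal sketch of the OTFFC allocation rule (simplified; the buffer layout and function names are assumptions for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* On-the-fly fossil collection sketch: the free list is a FIFO of event
 * buffers; a buffer may be reused only when its receive time-stamp is
 * below GVT, since anything at or above GVT could still be rolled back. */
typedef struct buf { double recv_ts; struct buf *next; } buf;

typedef struct { buf *head, *tail; } free_list;

/* Processed events go straight to the tail; rollbacks re-append events
 * out of order, which is what leaves the list unsorted in virtual time. */
void fl_append(free_list *fl, buf *b) {
    b->next = NULL;
    if (fl->tail) fl->tail->next = b; else fl->head = b;
    fl->tail = b;
}

/* Allocate from the head only if the head is fossilized (ts < GVT);
 * otherwise report exhaustion even though usable buffers may sit
 * further down the list. */
buf *fl_alloc(free_list *fl, double gvt) {
    buf *b = fl->head;
    if (!b || b->recv_ts >= gvt) return NULL;
    fl->head = b->next;
    if (!fl->head) fl->tail = NULL;
    return b;
}
```

The NULL return in the middle case is exactly the over-allocation pressure described above: buffers exist, but the head is not yet fossilized.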

Contributions: Simulation: Kernel Process

[Figure: Logical Processes (LPs) are grouped under Kernel Processes (KPs), which belong to a Processing Element (PE, one per CPU utilized); fossil collection and rollback operate at the KP level.]

Advantages:
i. significantly lowers fossil collection overheads
ii. lowers memory usage by aggregating LP statistics into KP statistics
iii. retains the ability to process events on an LP-by-LP basis in the forward computation

Disadvantages:
i. potential for “false rollbacks”
ii. care must be taken when deciding how to map LPs to KPs
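A sketch of the KP idea, including the false-rollback effect (the struct layout and mapping function are assumptions for illustration, not ROSS's actual code):

```c
#include <assert.h>

/* Kernel process sketch: many LPs share one KP, so fossil collection and
 * rollback bookkeeping is kept once per KP rather than once per LP. */
#define LPS_PER_KP 4

typedef struct { int id; } lp_t;

typedef struct {
    lp_t   lps[LPS_PER_KP];
    int    n_lps;
    double last_event_ts;   /* aggregated per KP, not per LP */
} kp_t;

/* Illustrative round-robin mapping of an LP id onto one of n_kp KPs. */
int map_lp_to_kp(int lp_id, int n_kp) { return lp_id % n_kp; }

/* The "false rollback" disadvantage: a straggler at time ts forces the
 * whole KP back whenever the KP has processed past ts, even for LPs in
 * the KP that never touched events after ts. */
int kp_needs_rollback(const kp_t *kp, double straggler_ts) {
    return straggler_ts < kp->last_event_ts;
}
```

This is why the LP-to-KP mapping matters: grouping LPs with very different event rates under one KP maximizes the chance that a straggler for one LP drags the others back with it.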

ROSS: KP Efficiency

[Plot: with too few LPs there is not enough work in the system; a small trade-off exists between longer rollbacks and faster fossil collection.]

ROSS: KP Performance Impact

[Plot: the number of KPs does not negatively impact performance.]

ROSS: Performance vs GTW

[Plots: ROSS outperforms GTW 2:1 in sequential execution, and 2:1 at best parallel execution.]

The optimistic approach relies on a global virtual time (GVT) algorithm to perform fossil collection at regular intervals. Events with a time-stamp less than GVT will not be rolled back and can be freed.

GVT calculation:

- Synchronous algorithms: LPs stop event processing during the GVT calculation. The cost of synchronization may be higher than the positive work done per interval, and processes waste time waiting.
- Asynchronous algorithms: LPs continue processing events while the GVT calculation continues in the background.

Goal: create a consistent cut among the LPs that divides events into past and future in wall-clock time.

Two problems: (i) the transient message problem, and (ii) the simultaneous reporting problem.

Simulation: Mattern’s GVT

- Construct the cut via message-passing.
- Cost: O(log n) with a tree, O(N) with a ring.
- Problem: with a large number of processors, the free event pool is exhausted while waiting for GVT to complete.

Simulation: Fujimoto’s GVT

- Construct the cut using a shared memory flag; cost: O(1).
- Limited to shared-memory architectures.
- The sequentially consistent memory model ensures proper causal order.

Simulation: Memory Model

Sequentially consistent does not mean instantaneous: memory events are only guaranteed to be causally ordered. Is there a method to achieve sequentially consistent shared memory in a loosely coordinated, distributed environment?

Key observations:

- An operation can occur atomically within a network of processors if all processors observe that the event occurred at the same time.
- The CPU clock time scale (ns) is significantly smaller than the network time-scale (ms).

Network Atomic Operations (NAOs):

- an agreed-upon frequency in wall-clock time at which some event is logically observed to have happened across a distributed system.
- a subset of the possible operations provided by a complete sequentially consistent memory model.
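The NAO idea can be sketched as pure clock arithmetic (illustrative only; the real algorithm must also account for the clock-error bound and the transient message problem):

```c
#include <assert.h>

/* Network atomic operation sketch: given (approximately) synchronized
 * wall clocks and an agreed-upon period, every processor computes the
 * current NAO epoch locally. Events before an epoch boundary are "past",
 * events after it are "future", yielding a consistent cut with no
 * message passing -- the O(1) property claimed above. */

/* Epoch number for a wall-clock time, with the NAO firing every
 * period_ns nanoseconds starting at start_ns. */
long nao_epoch(long now_ns, long start_ns, long period_ns) {
    return (now_ns - start_ns) / period_ns;
}

/* Two processors whose clock readings fall inside the same epoch agree
 * on the cut without exchanging any messages; disagreement is possible
 * only within the clock-error window around a boundary. */
int same_cut(long t_a, long t_b, long start_ns, long period_ns) {
    return nao_epoch(t_a, start_ns, period_ns) ==
           nao_epoch(t_b, start_ns, period_ns);
}
```

The key-observation above is what makes this safe: the clock-error window (ns scale) is far smaller than the network time-scale (ms), so the period can be chosen to keep boundary ambiguity harmless.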

[Timeline figure: along wall-clock time, each processor repeats Compute / GVT / Update Tables phases at the agreed NAO frequency.]

Simulation: Seven O’clock GVT

[Figure: an example cut across processors A-E in wall-clock time; pending events carry time-stamps 7, 5, 10, and 9; each processor reports its LVT (7, 5, min(5, 9)) and GVT = min(5, 7) = 5.]


Simulation: Seven O’clock GVT

[Plot: linear performance.]

- Itanium-2 cluster, r-PHOLD
- 1,000,000 LPs
- 10% remote events, 16 start events
- 4 machines, 1-4 CPUs each, 1.3 GHz
- Round-robin LP-to-PE mapping


Simulation: Seven O’clock GVT

- Netfinity cluster, r-PHOLD
- 1,000,000 LPs
- 10% and 25% remote events, 16 start events
- 4 machines
- 2 CPUs, 36 nodes, 800 MHz


Simulation: Seven O’clock GVT: TCP

- Itanium-2 cluster
- 1,000,000 LPs, each modeling a TCP host (i.e., one end of a TCP connection)
- 2 or 4 machines, 1-4 CPUs each, 1.3 GHz
- Poorly mapped LP/KP/PE

Simulation: Seven O’clock GVT: TCP

[Plot: linear performance.]

- Netfinity cluster
- 1,000,000 LPs, each modeling a TCP host (i.e., one end of a TCP connection)
- 4-36 machines, 1-2 CPUs each
- Pentium III, 800 MHz


Simulation: Seven O’clock GVT: TCP

- Sith Itanium-2 cluster
- 1,000,000 LPs, each modeling a TCP host (i.e., one end of a TCP connection)
- 4-36 machines, 1-2 CPUs each, 900 MHz

Summary

Seven O’Clock Algorithm:

- a clock-based algorithm for distributed processors
- creates a sequentially consistent view of distributed memory
- zero-cost consistent cut
- highly scalable, and independent of event memory limits

Simulation: Seven O’clock GVT

Algorithm       | Cut Calculation Complexity | Parallel / Distributed | Global Invariant          | Independent of Event Memory
Fujimoto’s      | O(1)                       | P                      | Shared memory flag        | N
Seven O’Clock   | O(1)                       | P & D                  | Clock synchronization     | Y
Mattern’s       | O(n) or O(log n)           | P & D                  | Message passing interface | N
Samadi’s        | O(n) or O(log n)           | P & D                  | Message passing interface | N

Summary: Contributions

- Meta-simulation
  - ROSS.Net: a platform for large-scale network simulation, experiment design, and analysis
  - OSPFv2 protocol performance analysis
  - BGP4/OSPFv2 protocol interactions
- Simulation
  - Kernel processes: memory-efficient, large-scale simulation
  - Seven O’clock GVT algorithm: zero-cost consistent cut; high-performance distributed execution


Summary: Future Work

- Meta-simulation
  - ROSS.Net: platform for large-scale networks
  - incorporate more realistic measurement data and protocol models (CAIDA, multicast, UDP, other TCP variants)
  - more complex experiment designs; better qualitative analysis
- Simulation
  - Seven O’clock GVT algorithm
  - compute FFT and analyze the “power” of different models
  - attempt to eliminate the GVT algorithm by determining the maximum rollback length