CSC 2203: Packet Switch and Network Architectures
University of Toronto, Fall 2012

Professor Yashar Ganjali

Department of Computer Science

University of Toronto


yganjali@cs.toronto.edu

http://www.cs.toronto.edu/~yganjali


Announcements


Final Project


Intermediate report due: Fri. Nov. 9th
Don't wait till the last minute



Volunteer for next week’s presentation?


Outline


Uniform traffic
Uniform cyclic
Random permutation
Wait-until-full

Non-uniform traffic, known traffic matrix
Birkhoff-von Neumann

Unknown traffic matrix
Maximum Size Matching
Maximum Weight Matching


Maximum Size Matching: Instability

Counter-example for maximum size matching stability.

Consider a 3x3 switch with the following non-uniform traffic pattern, with Bernoulli IID arrivals (the traffic-matrix figure is lost; the rates below are reconstructed from the text that follows):

    λ = [ 1/2−δ   1/2−δ   0
          1/2−δ   0       0
          0       1/2−δ   0 ]

Consider the case when Q21 and Q32 both have arrivals, which happens w.p. (1/2 − δ)².
In this case, input 1 is served w.p. at most 2/3.

Overall, the service rate μ1 for input 1 is at most

    μ1 ≤ (2/3)·(1/2 − δ)² + 1·[1 − (1/2 − δ)²],

i.e.

    μ1 ≤ 1 − (1/3)·(1/2 − δ)².

Since input 1 receives packets at rate λ1 = 1 − 2δ, the switch is unstable for δ below roughly 0.0358. (The arithmetic is worked out below.)
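Filling in the arithmetic behind the threshold, which the slide leaves implicit:

```latex
% Instability: the arrival rate at input 1 exceeds its service rate.
\lambda_1 = 1 - 2\delta \;>\; 1 - \tfrac{1}{3}\left(\tfrac{1}{2}-\delta\right)^2 \;\ge\; \mu_1
% Equivalently (1/2 - \delta)^2 > 6\delta, i.e.
\delta^2 - 7\delta + \tfrac{1}{4} > 0
\quad\Longrightarrow\quad
\delta < \tfrac{7 - \sqrt{48}}{2} \approx 0.0359,
% which matches the slide's threshold of roughly 0.0358 up to rounding.
```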

Three possible matches, S(n). [Figure: the three maximum size matches of the request graph.]

Problem. Maximum size matching maximizes instantaneous throughput, but does not take into account VOQ backlogs.

Solution. Give higher priority to VOQs which have more packets.


Scheduling in Input-Queued Switches


[Figure: an N×N input-queued switch. Arrivals A_1(n) … A_N(n) are demultiplexed into VOQs Q_11(n) … Q_NN(n) (per-VOQ arrivals A_ij(n)); the crossbar applies the match S*(n), producing departures D_1(n) … D_N(n).]

Maximum Weight Matching (MWM)


Assign weights to the edges of the request graph.








Find the matching with maximum weight.


[Figure: request graph with an edge (i, j) whenever Q_ij(n) > 0; assigning weights W_11 … W_N1 yields the weighted request graph.]

MWM Scheduling


Create the request graph.
Find the associated link weights.
Find the matching with maximum weight. How? (See the sketch below.)
Transfer packets from the ingress lines to egress lines based on the matching.

Question. How often do we need to calculate MWM?
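A minimal sketch of one way to compute the maximum weight match, assuming the weights are available as an N×N matrix. The slides do not prescribe an algorithm; SciPy's assignment solver is just an illustrative stand-in:

```python
# Sketch: compute a maximum weight matching for an N x N request graph.
# Assumes weights[i][j] = w_ij(n), e.g. the VOQ length L_ij(n) for LQF.
import numpy as np
from scipy.optimize import linear_sum_assignment

def mwm_schedule(weights: np.ndarray):
    """Return the match S*(n) as a list of (input, output) pairs."""
    rows, cols = linear_sum_assignment(weights, maximize=True)
    # Drop zero-weight pairs: there is no request on those edges.
    return [(i, j) for i, j in zip(rows, cols) if weights[i, j] > 0]

# Example: 3x3 switch, weights = VOQ lengths.
L = np.array([[3, 0, 1],
              [0, 2, 0],
              [4, 0, 0]])
print(mwm_schedule(L))  # [(0, 2), (1, 1), (2, 0)], total weight 7
```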


Weights


Longest Queue First (LQF)
Weight associated with each link is the length of the corresponding VOQ.
MWM then tends to give priority to long queues.
Does not necessarily serve the longest queue.

Oldest Cell First (OCF)
Weight of each link is the waiting time of the HoL packet in the corresponding queue.

Longest Queue First (LQF)


LQF is the name given to the maximum weight matching where the weight is w_ij(n) = L_ij(n).
The name is misleading, so people keep the name "MWM": LQF doesn't necessarily serve the longest queue, and it can leave a short queue unserved indefinitely.

Theorem. MWM-LQF scheduling provides 100% throughput.

MWM-LQF is also very important theoretically: most (if not all) scheduling algorithms that provide 100% throughput for unknown traffic matrices are variants of MWM!


Proof Idea: Use Lyapunov Functions

Basic idea: when queues become large, the MWM schedule tends to give them a negative drift. (A worked drift bound follows.)
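A sketch of the standard argument, not spelled out on the slide; the quadratic Lyapunov function below is the one used in the McKeown et al. paper cited in the references:

```latex
% Quadratic Lyapunov function over the VOQ occupancies
V(n) = \sum_{i,j} Q_{ij}(n)^2
% For admissible traffic (\sum_j \lambda_{ij} < 1 and \sum_i \lambda_{ij} < 1),
% MWM gives a negative expected drift once queues are large:
\mathbb{E}\bigl[\, V(n+1) - V(n) \mid Q(n) \,\bigr]
  \;\le\; -\epsilon \sum_{i,j} Q_{ij}(n) + k
% for some \epsilon > 0 and constant k, which implies positive recurrence
% of the queue process, i.e. 100% throughput.
```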




Lyapunov Analysis: Simple Example
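The slide's worked example is lost in extraction; a minimal stand-in example, assuming a single discrete-time queue (my choice, not necessarily the slide's):

```latex
% Single queue: Q(n+1) = \max(Q(n) + A(n) - 1, 0), with E[A(n)] = \lambda < 1.
% Take V(Q) = Q^2. For Q(n) \ge 1:
\mathbb{E}\bigl[V(n+1) - V(n) \mid Q(n)\bigr]
  = \mathbb{E}\bigl[(Q(n) + A(n) - 1)^2 - Q(n)^2 \mid Q(n)\bigr]
  = 2\,Q(n)\,(\lambda - 1) + c ,
% where c = E[(A(n)-1)^2] is bounded. Since \lambda < 1, the drift is
% negative once Q(n) > c / (2(1-\lambda)), so the queue is stable.
```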


Lyapunov Example, Cont'd


Lyapunov Functions
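The body of this slide is lost; the standard statement it presumably contained is the Foster-Lyapunov stability criterion, reconstructed here as an assumption:

```latex
% Foster-Lyapunov criterion (standard form): if V : \mathbb{Z}_+^K \to \mathbb{R}_+
% and there exist \epsilon > 0 and a finite set B such that
\mathbb{E}\bigl[V(X(n+1)) - V(X(n)) \mid X(n) = x\bigr] \le -\epsilon
  \quad \text{for all } x \notin B,
% with bounded drift on B, then the Markov chain X(n) is positive
% recurrent, i.e. the queueing system is stable.
```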


Back to the Proof


Outline of Proof

Note: the proof is based on the paper by McKeown et al. (see References).


LQF Variants


Question: what if … or … ? [The two variant weight formulas on the slide are lost.]

What if weight w_ij(n) = W_ij(n) (waiting time)?
Preference is given to cells that have waited a long time.
Is it stable?
We call the algorithm OCF (Oldest Cell First).
Remember that it doesn't guarantee to serve the oldest cell!

Summary of MWM Scheduling


MWM-LQF scheduling provides 100% throughput.
It can starve some of the packets.

MWM-OCF scheduling gives 100% throughput.
No starvation.

Question. Are these fast enough to implement in real switches?


References


“Achieving 100% Throughput in an Input-Queued Switch (Extended Version)”. Nick McKeown, Adisak Mekkittikul, Venkat Anantharam and Jean Walrand. IEEE Transactions on Communications, Vol. 47, No. 8, August 1999.

“A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches”. Adisak Mekkittikul and Nick McKeown. IEEE Infocom 98, Vol. 2, pp. 792-799, April 1998, San Francisco.


The Story So Far


Output-queued switches
Best performance
Impractical: need speedup of N

Input-queued switches
Head-of-line blocking
VOQs

Known traffic matrix: BvN
Unknown traffic matrix: MWM

Complexity of Maximum Matchings


Maximum Size Matchings:
Typical complexity O(N^2.5)

Maximum Weight Matchings:
Typical complexity O(N^3)

In general:
Hard to implement in hardware
Slooooow

Can we find a faster algorithm?


Maximal Matching


A maximal matching is a matching in which each edge is added one at a time, and is not later removed from the matching.

No augmenting paths allowed (they remove edges added earlier).

Consequence: no input and output are left unnecessarily idle.


Example of Maximal Matching

[Figure: request graph on inputs A-F and outputs 1-6; left, a maximal size matching; right, a maximum size matching of the same graph.]

Properties of Maximal Matchings


In general, maximal matching is much simpler to implement, and has a much faster running time.

A maximal size matching is at least half the size of a maximum size matching. (Why?)

We'll study the following algorithms:
Greedy LQF
WFA
PIM
iSLIP


Greedy LQF


Greedy LQF (Greedy Longest Queue First) is defined as follows:
Pick the VOQ with the largest number of packets (if there are ties, pick at random among the VOQs that are tied). Say it is VOQ(i1, j1).
Then, among all free VOQs, pick again the VOQ with the largest number of packets (say VOQ(i2, j2), with i2 ≠ i1, j2 ≠ j1).
Continue likewise until the algorithm converges. (See the sketch below.)

Greedy LQF is also called iLQF (iterative LQF) and Greedy Maximal Weight Matching.
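A minimal sketch of Greedy LQF as defined above, assuming VOQ lengths are given as an N×N matrix (the function and variable names are mine):

```python
import random

def greedy_lqf(L):
    """Greedy LQF: repeatedly match the longest VOQ among free ports.

    L[i][j] = number of packets in VOQ(i, j). Returns a maximal match
    as a list of (input, output) pairs.
    """
    n = len(L)
    free_in, free_out = set(range(n)), set(range(n))
    match = []
    while True:
        # Candidate VOQs: non-empty, with both endpoints still free.
        cands = [(i, j) for i in free_in for j in free_out if L[i][j] > 0]
        if not cands:
            break  # no edge can be added: the match is maximal
        longest = max(L[i][j] for i, j in cands)
        i, j = random.choice([c for c in cands if L[c[0]][c[1]] == longest])
        match.append((i, j))
        free_in.discard(i)
        free_out.discard(j)
    return match

# Example: the longest VOQ (2,0) is picked first, then (1,1), then (0,2).
print(greedy_lqf([[3, 0, 1], [0, 2, 0], [4, 0, 0]]))
```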


Properties of Greedy LQF


The algorithm converges in at most N iterations.
(Why?)


Greedy LQF results in a maximal size matching.
(Why?)


Greedy LQF produces a matching that has at least half
the size and half the weight of a maximum weight
matching. (Why?)


Wave Front Arbiter (WFA) [Tamir and Chi, 1993]

[Figure: a 4×4 request matrix resolved in diagonal waves; requests on the left, resulting match on the right.]


Wave Front Arbiter

[Figure: a second requests/match example.]


Wave Front Arbiter: Implementation

[Figure: a 4×4 array of cells (1,1) … (4,4) built from simple combinational logic blocks.]


Wave Front Arbiter: Wrapped WFA (WWFA)

[Figure: wrapped diagonals; requests and resulting match.]

N steps instead of 2N-1.

Properties of Wave Front Arbiters


Feed-forward (i.e. non-iterative) design lends itself to pipelining.
Always finds a maximal match.
Usually requires a mechanism to prevent Q11 from getting preferential service.
In principle, can be distributed over multiple chips.
(A software sketch of the wave sweep follows.)
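A minimal software sketch of the (unwrapped) wavefront sweep, assuming requests arrive as an N×N boolean matrix; names are mine, and a real WFA is combinational hardware as the figures above suggest:

```python
def wavefront_arbiter(req):
    """Sweep diagonals of the request matrix; grant (i, j) if neither
    input i nor output j was granted by an earlier wave.

    req[i][j] is truthy if input i has a packet for output j.
    Returns a maximal match as a list of (input, output) pairs.
    """
    n = len(req)
    in_free = [True] * n
    out_free = [True] * n
    match = []
    # Wave k visits all cells with i + j == k: 2N - 1 waves in total.
    # Cell (0, 0) is always considered first, hence the preferential
    # service noted above unless the starting diagonal is rotated.
    for k in range(2 * n - 1):
        for i in range(max(0, k - n + 1), min(n, k + 1)):
            j = k - i
            if req[i][j] and in_free[i] and out_free[j]:
                match.append((i, j))
                in_free[i] = out_free[j] = False
    return match

# Cells within one wave never share an input or an output,
# so in hardware each wave is resolved in parallel.
print(wavefront_arbiter([[1, 1, 0], [1, 0, 0], [0, 1, 0]]))
```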


Parallel Iterative Matching (PIM) [Anderson et al., 1993]

[Figure: one PIM iteration on a 4×4 switch. Phase 1: Requests. Phase 2: Grant, each output granting uniformly at random (uar) among its requesters. Phase 3: Accept/Match, each input accepting uar among its grants. Iterations #1 and #2 operate on the still-unmatched ports.]

PIM Properties


Guaranteed to find a maximal match in at most N
iterations. (Why?)


In each phase, each input and output arbiter can
make decisions independently.


In general, will converge to a maximal match in < N
iterations.


How many iterations should we run?
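A minimal sketch of PIM's three phases, assuming an N×N boolean request matrix; the uar choices mirror the figure above, and the names are mine:

```python
import random

def pim(req, iterations=4):
    """Parallel Iterative Matching: request / grant / accept phases
    with uniform-at-random (uar) choices, repeated on unmatched ports."""
    n = len(req)
    match = {}                      # input -> output
    matched_out = set()
    for _ in range(iterations):
        # Phase 1: requests flow from unmatched inputs to unmatched outputs.
        # Phase 2: each unmatched output grants uar among its requesters.
        grants = {}                 # input -> list of granting outputs
        for j in range(n):
            if j in matched_out:
                continue
            reqs = [i for i in range(n) if req[i][j] and i not in match]
            if reqs:
                grants.setdefault(random.choice(reqs), []).append(j)
        if not grants:
            break                   # converged: the match is maximal
        # Phase 3: each input accepts uar among the grants it received.
        for i, outs in grants.items():
            j = random.choice(outs)
            match[i] = j
            matched_out.add(j)
    return sorted(match.items())

print(pim([[1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]))
```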


Parallel Iterative Matching: Convergence Time


Number of iterations to converge: Anderson et al. show that the expected number of iterations is at most log2(N) + 4/3.

Anderson et al., “High-Speed Switch Scheduling for Local Area Networks,” 1993.


Parallel Iterative Matching


Parallel Iterative Matching

[Figure: performance of PIM with a single iteration.]


Parallel Iterative Matching

[Figure: performance of PIM with 4 iterations.]


iSLIP [McKeown et al., 1993]

[Figure: one iSLIP iteration on a 4×4 switch. Phase 1: Requests. Phase 2: Grant, via each output's round-robin pointer. Phase 3: Accept/Match, via each input's round-robin pointer. Iterations #1 and #2 operate on the still-unmatched ports.]

iSLIP Operation


Grant phase: Each output selects the requesting input at the pointer, or the next input in round-robin order. It only updates its pointer if the grant is accepted.

Accept phase: Each input selects the granting output at the pointer, or the next output in round-robin order.

Consequence: Under high load, grant pointers tend to move to unique values. (See the sketch below.)
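A minimal sketch of one iSLIP iteration with the two pointer arrays, assuming a boolean request matrix; the names are mine, and the pointer-update rule follows the grant/accept description above:

```python
def islip_iteration(req, grant_ptr, accept_ptr, match, matched_out):
    """One iSLIP iteration: grant (round-robin from grant_ptr),
    then accept (round-robin from accept_ptr).

    req[i][j]: input i requests output j.  match: input -> output.
    Pointers advance only for grants that are accepted (in full iSLIP,
    only during the first iteration of a time slot, omitted here).
    """
    n = len(req)
    # Grant phase: each unmatched output picks the first requesting,
    # unmatched input at or after its pointer.
    grants = {}                          # input -> list of outputs
    for j in range(n):
        if j in matched_out:
            continue
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if req[i][j] and i not in match:
                grants.setdefault(i, []).append(j)
                break
    # Accept phase: each input picks the first granting output at or
    # after its pointer; both pointers advance one past the match.
    for i, outs in grants.items():
        for k in range(n):
            j = (accept_ptr[i] + k) % n
            if j in outs:
                match[i] = j
                matched_out.add(j)
                grant_ptr[j] = (i + 1) % n
                accept_ptr[i] = (j + 1) % n
                break
    return match

# Repeating the iteration on the remaining unmatched ports until no
# new grants are issued yields a maximal match.
```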


iSLIP Properties

Random under low load.
TDM under high load.
Lowest priority to MRU (most recently used).
1 iteration: fair to outputs.
Converges in at most N iterations. (On average, simulations suggest < log2(N).)
Implementation: N priority encoders.
100% throughput for uniform i.i.d. traffic.
But… some pathological patterns can lead to low throughput.


iSLIP


iSLIP


iSLIP Implementation

[Figure: N grant arbiters and N accept arbiters, each built around a programmable priority encoder; N-bit request inputs and state, log2(N)-bit decisions.]
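Behaviorally, each arbiter above reduces to a programmable priority encoder; a minimal sketch of that primitive (hardware would be a combinational circuit, as the figure notes):

```python
def priority_encode(requests, pointer):
    """Programmable priority encoder: index of the first set bit in
    `requests` at or after `pointer` (wrapping around), or None."""
    n = len(requests)
    for k in range(n):
        idx = (pointer + k) % n
        if requests[idx]:
            return idx
    return None

print(priority_encode([0, 1, 0, 1], pointer=2))  # -> 3
```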

Maximal Matches


Maximal matching algorithms are widely used in industry (especially algorithms based on WFA and iSLIP).

PIM and iSLIP are rarely run to completion (i.e. they are sub-maximal).

We will see that a maximal match with a speedup of 2 is stable for non-uniform traffic.

References


A. Schrijver, “Combinatorial Optimization: Polyhedra and Efficiency,” Springer-Verlag, 2003.

T. Anderson, S. Owicki, J. Saxe, and C. Thacker, “High-Speed Switch Scheduling for Local-Area Networks,” ACM Transactions on Computer Systems, 11(4):319-352, November 1993.

Y. Tamir and H.-C. Chi, “Symmetric Crossbar Arbiters for VLSI Communication Switches,” IEEE Transactions on Parallel and Distributed Systems, 4(1):13-27, 1993.

N. McKeown, “The iSLIP Scheduling Algorithm for Input-Queued Switches,” IEEE/ACM Transactions on Networking, 7(2):188-201, April 1999.
