# CSC 2203: Packet Switch and Network Architectures

Scheduling in Input-Queued Switches

Professor Yashar Ganjali

Department of Computer Science

University of Toronto

yganjali@cs.toronto.edu

http://www.cs.toronto.edu/~yganjali


## Announcements

- Final Project
  - Intermediate report due: Fri. Nov. 9th
  - Don't wait till the last minute
- Volunteer for next week's presentation?

CSC 2203: Packet Switch and Network Architectures
University of Toronto, Fall 2012

## Outline

- Uniform traffic
  - Uniform cyclic
  - Random permutation
  - Wait-until-full
- Non-uniform traffic, known traffic matrix
  - Birkhoff-von Neumann
- Unknown traffic matrix
  - Maximum Size Matching
  - Maximum Weight Matching


## Maximum Size Matching: Instability

Counter-example for maximum size matching stability. Consider the following non-uniform traffic pattern, with Bernoulli IID arrivals:

[Figure: the traffic matrix, and the three possible matches S(n).]

Consider the case when Q_21 and Q_32 both have arrivals, which happens w.p. (1/2 − δ)². In this case, input 1 is served w.p. at most 2/3. Overall, the service rate for input 1, μ_1, is at most

    (2/3)·(1/2 − δ)² + 1·[1 − (1/2 − δ)²]

i.e. μ_1 ≤ 1 − (1/3)·(1/2 − δ)². The switch is unstable for δ < 0.0358.

Problem: Maximum size matching maximizes instantaneous throughput, but does not take into account VOQ backlogs.

Solution: Give higher priority to VOQs which have more packets.

## Scheduling in Input-Queued Switches

[Figure: an N × N input-queued switch. Arrivals A_11(n) … A_NN(n) feed VOQs Q_11(n) … Q_NN(n) at inputs 1 … N; the scheduler picks a matching S*(n) that determines the departures D_1(n) … D_N(n).]

## Maximum Weight Matching (MWM)

- Assign weights to the edges of the request graph.
- Find the matching with maximum weight.

[Figure: request graph with edges for Q_11(n) > 0, …, Q_N1(n) > 0; assigning weights W_11, …, W_N1 gives the weighted request graph.]

## MWM Scheduling

1. Create the request graph.
2. Find the associated link weights.
3. Find the matching with maximum weight. How?
4. Transfer packets from the ingress lines to egress lines based on the matching.

Question: How often do we need to calculate MWM?
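The steps above can be sketched in a few lines. This is an illustrative toy, not a switch implementation: it brute-forces the maximum weight matching over all input-to-output permutations, which is O(N!) and only sensible for tiny N (real MWM algorithms run in roughly O(N³) time). The weight of edge (i, j) is taken to be the VOQ length, anticipating the LQF weighting introduced below.

```python
# A minimal sketch of one MWM scheduling step for a small N x N switch.
# w[i][j] is the weight of edge (i, j) in the request graph (here, the
# VOQ length). Brute force over all permutations: for illustration only.
from itertools import permutations

def mwm_schedule(w):
    n = len(w)
    best_perm, best_weight = None, -1
    for perm in permutations(range(n)):  # perm[i] = output matched to input i
        weight = sum(w[i][perm[i]] for i in range(n))
        if weight > best_weight:
            best_weight, best_perm = weight, perm
    # Only transfer a packet across (i, j) if VOQ(i, j) is non-empty.
    return [(i, j) for i, j in enumerate(best_perm) if w[i][j] > 0]

voq_lengths = [
    [3, 0, 1],
    [0, 2, 0],
    [4, 0, 0],
]
print(mwm_schedule(voq_lengths))  # -> [(0, 2), (1, 1), (2, 0)], total weight 7
```

Note that the max-weight match serves the length-1 and length-4 queues rather than the size-2 matching {(0, 0), (1, 1)} through the longest-column conflicts; weight, not size, drives the choice.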

## Weights

Longest Queue First (LQF)

- Weight associated with each link is the length of the corresponding VOQ.
- MWM here tends to give priority to long queues.
- Does not necessarily serve the longest queue.

Oldest Cell First (OCF)

- Weight of each link is the waiting time of the HoL packet in the corresponding queue.

## Longest Queue First (LQF)

LQF is the name given to the maximum weight matching where weight w_ij(n) = L_ij(n). But the name is so bad that people keep the name "MWM"!

- LQF doesn't necessarily serve the longest queue.
- LQF can leave a short queue unserved indefinitely.

However, MWM-LQF is very important theoretically: most (if not all) scheduling algorithms that provide 100% throughput for unknown traffic matrices are variants of MWM!

Theorem. MWM-LQF scheduling provides 100% throughput.

## Proof Idea: Use Lyapunov Functions

Basic idea: when queues become large, the MWM schedule tends to give them a negative drift.
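The drift argument can be written out as follows. This is a standard sketch with the usual quadratic Lyapunov function; the specific constants B and ε are generic Foster-Lyapunov notation, not values from the slides:

```latex
% Quadratic Lyapunov function over all VOQ lengths
V(Q(n)) = \sum_{i,j} Q_{ij}(n)^2
% Stability (positive recurrence of the queue process) follows if, for
% some \epsilon > 0 and finite B, the one-step expected drift satisfies
\mathbb{E}\!\left[\, V(Q(n+1)) - V(Q(n)) \mid Q(n) \,\right]
    \;\le\; B - \epsilon \sum_{i,j} Q_{ij}(n)
```

The proof then shows that the MWM schedule makes the drift term negative whenever the queues are large enough, because MWM always serves a matching whose total weight is within a bounded gap of the arrivals' demand.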

## Lyapunov Analysis: Simple Example

## Lyapunov Functions

## Outline of Proof

[The worked example, definitions, and proof steps were presented as equations on the slides and are not recoverable from this transcript.]

Note: proof based on the paper by McKeown et al. (see References).

## LQF Variants

Question: what if … or …? (The alternative weight choices were given as formulas on the slide.)

What if weight w_ij(n) = W_ij(n) (waiting time)?

- Preference is given to cells that have waited a long time.
- Is it stable?
- We call the algorithm OCF (Oldest Cell First).
- Remember that it doesn't guarantee to serve the oldest cell!

## Summary of MWM Scheduling

- MWM-LQF scheduling provides 100% throughput.
  - It can starve some of the packets.
- MWM-OCF scheduling gives 100% throughput.
  - No starvation.

Question: Are these fast enough to implement in real switches?

## References

- "Achieving 100% Throughput in an Input-Queued Switch (Extended Version)". Nick McKeown, Adisak Mekkittikul, Venkat Anantharam and Jean Walrand. IEEE Transactions on Communications, Vol. 47, No. 8, August 1999.
- "A Practical Scheduling Algorithm to Achieve 100% Throughput in Input-Queued Switches". Adisak Mekkittikul and Nick McKeown. IEEE Infocom 98, Vol. 2, pp. 792-799, April 1998, San Francisco.

## The Story So Far

- Output-queued switches
  - Best performance
  - Impractical: need speedup of N
- Input-queued switches
  - Head-of-line blocking
  - VOQs
  - Known traffic matrix: BvN
  - Unknown traffic matrix: MWM

## Complexity of Maximum Matchings

- Maximum Size Matchings: typical complexity O(N^2.5)
- Maximum Weight Matchings: typical complexity O(N^3)
- In general:
  - Hard to implement in hardware
  - Slooooow

Can we find a faster algorithm?

## Maximal Matching

A maximal matching is a matching in which each edge is added one at a time, and is not later removed from the matching.

- No augmenting paths allowed (they would remove edges already in the match).
- Consequence: no input and output are left unnecessarily idle.

## Example of Maximal Matching

[Figure: bipartite graphs on inputs A-F and outputs 1-6, showing a maximal size matching next to a maximum size matching of the same request graph.]

## Properties of Maximal Matchings

- In general, maximal matching is much simpler to implement, and has a much faster running time.
- A maximal size matching is at least half the size of a maximum size matching. (Why?)

We'll study the following algorithms:

- Greedy LQF
- WFA
- PIM
- iSLIP

## Greedy LQF

Greedy LQF (Greedy Longest Queue First) is defined as follows:

1. Pick the VOQ with the most packets (if there are ties, pick at random among the VOQs that are tied). Say it is VOQ(i1, j1).
2. Then, among all free VOQs, pick again the VOQ with the most packets (say VOQ(i2, j2), with i2 ≠ i1, j2 ≠ j1).
3. Continue likewise until the algorithm converges.

Greedy LQF is also called iLQF (iterative LQF) and Greedy Maximal Weight Matching.
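A minimal sketch of the three steps above, assuming VOQ lengths are given as an N × N matrix. One deviation from the definition: ties are broken deterministically by index rather than at random, to keep the example reproducible.

```python
# Greedy LQF sketch: q[i][j] = packets queued at input i for output j.
# Repeatedly match the largest remaining VOQ whose input and output are
# both still free; stop when no candidate remains (the match is maximal).
def greedy_lqf(q):
    n = len(q)
    free_in, free_out = set(range(n)), set(range(n))
    match = []
    while True:
        best = max(
            ((q[i][j], i, j) for i in free_in for j in free_out if q[i][j] > 0),
            default=None,  # ties broken by index via tuple comparison
        )
        if best is None:          # no non-empty VOQ with free input and output
            return match          # converged: the match is maximal
        _, i, j = best
        match.append((i, j))
        free_in.remove(i)
        free_out.remove(j)

voq = [
    [3, 0, 1],
    [0, 2, 0],
    [4, 0, 0],
]
print(greedy_lqf(voq))  # -> [(2, 0), (1, 1), (0, 2)]
```

On this example greedy happens to find the same edges as full MWM; in general it only guarantees at least half the maximum weight, as the next slide notes.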

## Properties of Greedy LQF

- The algorithm converges in at most N iterations. (Why?)
- Greedy LQF results in a maximal size matching. (Why?)
- Greedy LQF produces a matching that has at least half the size and half the weight of a maximum weight matching. (Why?)

## Wave Front Arbiter (WFA) [Tamir and Chi, 1993]

[Figure: a 4 × 4 request matrix; diagonal wavefronts sweep across the grid, and a cell joins the match only if its row (input) and column (output) are still unmatched, turning the requests into a match.]
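The wavefront sweep can be sketched in software like this (an illustrative model, not the combinational circuit: hardware evaluates each wavefront's cells in parallel, while this loop visits them sequentially):

```python
# Wave Front Arbiter sketch for an N x N boolean request matrix.
# Cells on the same anti-diagonal ("wavefront") share no row or column,
# so they can never conflict; cell (i, j) joins the match iff it has a
# request and no earlier wavefront already used input i or output j.
def wavefront_arbiter(req):
    n = len(req)
    row_free = [True] * n
    col_free = [True] * n
    match = []
    for wave in range(2 * n - 1):          # plain WFA needs 2N-1 wavefronts
        for i in range(n):
            j = wave - i                   # cells with i + j == wave
            if 0 <= j < n and req[i][j] and row_free[i] and col_free[j]:
                match.append((i, j))
                row_free[i] = col_free[j] = False
    return match

requests = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
print(wavefront_arbiter(requests))  # -> [(0, 0), (1, 1), (2, 3), (3, 2)]
```

Because the sweep always starts from the top-left diagonal, cell (1,1) of the grid wins every contested round; this is exactly the preferential-service bias the properties slide below says must be compensated for.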

## Wave Front Arbiter: Implementation

[Figure: a 4 × 4 grid of cells (1,1) … (4,4), each a simple combinational logic block.]

## Wave Front Arbiter: Wrapped WFA (WWFA)

[Figure: requests and the resulting match for the wrapped arbiter; wrapping the diagonals reduces the number of wavefronts from 2N − 1 to N.]

## Properties of Wave Front Arbiters

- Feed-forward (i.e. non-iterative) design lends itself to pipelining.
- Always finds a maximal match.
- Usually requires a mechanism to prevent Q_11 from getting preferential service.
- In principle, can be distributed over multiple chips.

## Parallel Iterative Matching [Anderson et al., 1993]

[Figure: one PIM iteration on a 4 × 4 switch. 1: Requests - each unmatched input requests every output it has cells for. 2: Grant - each output grants one request, selected uniformly at random (uar). 3: Accept/Match - each input accepts one grant, selected uar. Iterations #1 and #2 are shown.]
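The three phases can be sketched as follows. This is a software model under simplifying assumptions: the request matrix is boolean, and the per-port arbiters (which decide in parallel in hardware) are simulated sequentially with a shared random source.

```python
# Parallel Iterative Matching (PIM) sketch for an N x N switch.
# Each iteration: unmatched inputs request, each unmatched output grants
# one requester uniformly at random (uar), each input accepts one grant
# uar. Matched ports drop out; repeat until an iteration adds nothing.
import random

def pim(req, rng=random):
    n = len(req)
    in_match = [None] * n            # in_match[i] = output matched to input i
    out_match = [None] * n
    while True:
        grants = {}                  # grants[i] = outputs that granted input i
        for j in range(n):
            if out_match[j] is None:
                requesters = [i for i in range(n)
                              if in_match[i] is None and req[i][j]]
                if requesters:       # grant phase: one request, chosen uar
                    i = rng.choice(requesters)
                    grants.setdefault(i, []).append(j)
        if not grants:               # no progress possible: match is maximal
            return [(i, j) for i, j in enumerate(in_match) if j is not None]
        for i, granted in grants.items():   # accept phase: one grant, uar
            j = rng.choice(granted)
            in_match[i] = j
            out_match[j] = i

requests = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
print(sorted(pim(requests)))
```

Every iteration with a non-empty grant set matches at least one new input-output pair, which is why at most N iterations are ever needed (the first property on the next slide).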

## PIM Properties

- Guaranteed to find a maximal match in at most N iterations. (Why?)
- In each phase, each input and output arbiter can make decisions independently.
- In general, will converge to a maximal match in < N iterations.
- How many iterations should we run?

## Parallel Iterative Matching: Convergence Time

Number of iterations to converge: see Anderson et al., "High-Speed Switch Scheduling for Local Area Networks," 1993.

[Simulation figures on the following slides compare PIM with a single iteration against PIM with 4 iterations.]

## iSLIP [McKeown et al., 1993]

[Figure: two iSLIP iterations on a 4 × 4 switch, showing the 1: Requests, 2: Grant, and 3: Accept/Match phases; unlike PIM, grants and accepts are selected by round-robin pointers rather than at random.]

## iSLIP Operation

- Grant phase: Each output selects the requesting input at the pointer, or the next input in round-robin order. It only updates its pointer if the grant is accepted.
- Accept phase: Each input selects the granting output at the pointer, or the next output in round-robin order.
- Consequence: Under high load, grant pointers tend to move to unique values.
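The grant/accept mechanics above can be sketched as a single iSLIP iteration. This is a toy model under stated assumptions: a boolean request matrix, explicit pointer arrays passed in by the caller, and one iteration per cell time (real schedulers typically run a few iterations).

```python
# One iSLIP iteration. grant_ptr[j] / accept_ptr[i] are the round-robin
# pointers of output j and input i. Each output grants the first
# requesting input at or after its pointer; each input accepts the first
# granting output at or after its pointer. Pointers advance to one past
# the match only when a grant is accepted - this is what makes the
# pointers desynchronize to unique values under high load.
def islip_iteration(req, grant_ptr, accept_ptr):
    n = len(req)
    grants = {}                      # grants[i] = outputs that granted input i
    for j in range(n):               # grant phase
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if req[i][j]:
                grants.setdefault(i, []).append(j)
                break
    match = []
    for i, granted in grants.items():  # accept phase
        for k in range(n):
            j = (accept_ptr[i] + k) % n
            if j in granted:
                match.append((i, j))
                grant_ptr[j] = (i + 1) % n     # update only on accepted grant
                accept_ptr[i] = (j + 1) % n
                break
    return match

req = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
gp, ap = [0] * 4, [0] * 4
print(sorted(islip_iteration(req, gp, ap)))  # -> [(0, 0), (1, 2), (2, 3)]
```

With all pointers at 0, input 3 goes unserved this cell time because output 2 granted input 1; running further iterations (or further cell times, with the now-rotated pointers) picks up such leftover requests.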

## iSLIP Properties

- Random under low load
- TDM under high load
- Lowest priority to MRU (most recently used)
- 1 iteration: fair to outputs
- Converges in at most N iterations. (On average, simulations suggest < log2 N.)
- Implementation: N priority encoders
- 100% throughput for uniform i.i.d. traffic.
- But… some pathological patterns can lead to low throughput.


## iSLIP Implementation

[Figure: N grant arbiters and N accept arbiters, each a programmable priority encoder. Each arbiter takes N request lines plus its pointer state and outputs a log2 N-bit decision.]

## Maximal Matches

- Maximal matching algorithms are widely used in industry (especially algorithms based on WFA and iSLIP).
- PIM and iSLIP are rarely run to completion (i.e. they are sub-maximal).
- We will see that a maximal match with a speedup of 2 is stable for non-uniform traffic.

## References

- A. Schrijver, "Combinatorial Optimization: Polyhedra and Efficiency", Springer-Verlag, 2003.
- T. Anderson, S. Owicki, J. Saxe, and C. Thacker, "High-Speed Switch Scheduling for Local-Area Networks," ACM Transactions on Computer Systems, 11(4):319-352, November 1993.
- Y. Tamir and H.-C. Chi, "Symmetric Crossbar Arbiters for VLSI Communication Switches," IEEE Transactions on Parallel and Distributed Systems, 4(1):13-27, 1993.
- N. McKeown, "The iSLIP Scheduling Algorithm for Input-Queued Switches," IEEE/ACM Transactions on Networking, 7(2):188-201, April 1999.