# Routing Algorithms

Networking and Communications

Jul 13, 2012 (5 years and 10 months ago)


Seif Haridi 1
Routing algorithms
Seif Haridi
The routing problem
• Routing is the decision-making procedure by which a node selects one (or more) of its neighbors to forward a packet towards its ultimate destination. It has two parts:
– Routing-table computation.
– Packet forwarding.
Criteria for good routing:
– Correctness: each packet is delivered.
– Complexity: computing the tables takes little time, little storage, and few messages.
– Efficiency: routing through "best" paths. The choice of "good paths" may aim at small delay or high bandwidth.
– Robustness of the table computation under changes in topology: tables are updated when a channel/node is added or removed.
– Fairness in the delivery of packets.
Some graph theory
A path of length $k$ between $v_0$ and $v_k$ is a sequence $P = v_0, \dots, v_k$ such that $v_i v_{i+1} \in E$ for all $0 \le i < k$.
A path is simple if the nodes $v_0$ through $v_k$ are all different.
A cycle is a path whose begin node is equal to its end node.
[Figure: three example graphs, labeled Weighted (with edge weights), Directed, and Undirected.]
An undirected graph is $G = (V, E)$, where $V$ is the node set and $E$ is a collection of unordered pairs from $V$.
The degree of a node $v \in V$ is the number of edges incident to $v$ (the number of neighbors).
Graphs
• A cycle is simple if the nodes $v_1$ through $v_k$ are different.
• The distance between u and v, $d(u,v)$, is the length of the shortest path between u and v.
• The diameter of a graph G is the largest distance between any two nodes.
• An undirected graph is connected if there is a path between any two nodes.
• An undirected graph is acyclic if it contains no simple cycles of length 3 or more.
• A tree is an undirected, connected, acyclic graph.
Graphs
• Trees, G with nodes $\{v_1, \dots, v_N\}$
– A tree is an undirected, connected, acyclic graph.
• Equivalent statements
– Between any two nodes there is a unique simple path.
– G is connected but becomes disconnected if any edge is removed.
– G is connected and $|E| = N-1$.
– G is acyclic and $|E| = N-1$.
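The equivalence "connected and |E| = N-1" gives a simple test for tree-ness. The following is a minimal Python sketch under the assumption that nodes are numbered 0 to N-1 and edges are given as pairs; the name `is_tree` is hypothetical and not part of the slides.

```python
from collections import deque

def is_tree(num_nodes, edges):
    """Return True if the undirected graph on nodes 0..num_nodes-1 is a tree."""
    # A tree on N nodes has exactly N-1 edges.
    if num_nodes == 0 or len(edges) != num_nodes - 1:
        return False
    adjacency = {v: [] for v in range(num_nodes)}
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)
    # Breadth-first search from node 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        v = queue.popleft()
        for n in adjacency[v]:
            if n not in seen:
                seen.add(n)
                queue.append(n)
    return len(seen) == num_nodes
```

A triangle plus an isolated node has N-1 = 3 edges but is not connected, so the sketch rejects it.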
What is a best-path algorithm
(1) Minimum Hop.
(2) Shortest path, given that each channel is assigned a
weight.
(3) Minimum delay, the weight depends on the load of the
channel. Tables are revised to take into account the load.
Summary
• Section 4.1
– For minimum hop and shortest path, there are routing algorithms that route all packets for the same destination d optimally via a spanning tree rooted at d. The source of the packets can be ignored (destination-based routing).
• Section 4.2
– A distributed algorithm that computes the routing tables for a static network. It stores the first neighbor on the path to each destination in the node's routing table. The tables must be recomputed on a topological change in the network.
• Section 4.3
– The NETCHANGE algorithm, which does partial recomputation of routing tables.
• Section 4.4
– Coding topological information in the node addresses.
• Section 4.5
– Hierarchical routing methods.
Destination-based routing
• An optimal routing algorithm exists if the following conditions are satisfied:
– The cost of sending a packet P via a path is independent of the actual utilization of the path (no load is involved).
– The cost of the concatenation of two paths equals the sum of the costs of the two paths: for all $i = 0, \dots, k$,
$$C(u_0, \dots, u_k) = C(u_0, \dots, u_i) + C(u_i, \dots, u_k)$$
– The graph does not contain any cycle of negative cost.
• A path from u to v is optimal if there is no path from u to v with lower cost.
Existence of optimal paths
• Lemma 4.1
– Let u, v be in V. If a path from u to v exists in G, then there is a simple path from u to v that is optimal.
• Proof
– There is a finite number of simple paths in G.
– Hence there is a finite number of simple paths from any u to v.
– Choose a simple path $S$ from u to v of minimal cost among them.
– For all non-simple paths $P_i$ from u to v, $C(S)$ is a lower bound on $C(P_i)$.
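Existence of optimal paths
Assume a non-simple path from u to v, call it $P_0$, and remove its cycles one at a time, resulting in the simple path $P_N$. No cycle has negative cost, so $C(P_N) \le C(P_0)$, and since $S$ is minimal among the simple paths, $C(S) \le C(P_N)$.
[Figure: the path $P_{i-1}$ from u to v visits $v_i$ more than once; cutting out the part between the first and the last occurrence of $v_i$ leaves the parts F and G, which together form $P_i$. Here $V = \{v_1, \dots, v_N\}$.]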
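The cycle-removal argument can be made concrete. The following is a minimal Python sketch, assuming a path is given as a list of node names; the name `remove_cycles` is hypothetical. It cuts a cycle as soon as a node repeats, which is a slightly different order of removal than in the figure but also ends in a simple path with the same end points.

```python
def remove_cycles(path):
    """Return a simple path with the same end points as `path`."""
    simple = []
    position = {}  # node -> index of that node in `simple`
    for node in path:
        if node in position:
            # Node seen before: drop the cycle that started at its first visit.
            cut = position[node]
            for dropped in simple[cut + 1:]:
                del position[dropped]
            simple = simple[:cut + 1]
        else:
            position[node] = len(simple)
            simple.append(node)
    return simple

print(remove_cycles(["u", "a", "b", "a", "c", "v"]))
```

Since every removed cycle has non-negative cost, the result is never more expensive than the input path.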
Minimal Spanning Trees
Theorem 4.2. For each $d \in V$ there exists a tree $T_d = (V, E_d)$, $E_d \subseteq E$, such that for each node $v \in V$ the path from v to d in $T_d$ is an optimal path from v to d in G.
Construction. Let $V = \{v_1, \dots, v_N\}$. Construct a series of trees $T_0 = (V_0, E_0), \dots, T_N = (V_N, E_N)$ with properties:
1. $d \in V_i$ and, for $i > 0$, $v_i \in V_i$.
2. $T_i$ is a tree and a subtree of G: $V_i \subseteq V$, $E_i \subseteq E$.
3. $T_i$ is a subtree of $T_{i+1}$.
4. For all $w \in V_i$, the simple path from w to d in $T_i$ is an optimal path from w to d in G.
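The construction
• Set $V_0$ to $\{d\}$ and $E_0$ to $\emptyset$.
• Construct $T_{i+1}$ from $T_i$: pick $v_{i+1} \in V$ with $v_{i+1} \notin V_i$.
• Choose an optimal path from $v_{i+1}$ to d, call it $P = u_0, \dots, u_k$.
• $u_l$ is the first node in the path such that $u_l \in V_i$.
• $V_{i+1} = V_i \cup {}$ the set of nodes in the prefix $u_0, \dots, u_{l-1}$ of P.
• $E_{i+1} = E_i \cup {}$ the set of edges $u_j u_{j+1}$, $j < l$, on the prefix $u_0, \dots, u_l$ of P.
[Figure: the path P from $v_{i+1} = u_0$ to $d = u_k$, which first meets the tree $T_i$ at $u_l$.]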
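The construction
$T_{i+1}$ is a tree: it is connected and the number of nodes exceeds the number of edges by one.
For all $w \in \{u_0, \dots, u_{l-1}\}$, the path from w to d in $T_{i+1}$ is optimal.
[Figure: the path P from $v_{i+1} = u_0$ through w and $u_l$ to $d = u_k$; P' is the part of P up to $u_l$, P'' the part of P from $u_l$ to d, and Q the path from $u_l$ to d in $T_i$.]
We know Q is optimal, so $C(Q) \le C(P'')$, i.e. $C(P') + C(Q) \le C(P') + C(P'')$.
Therefore $C(P'Q) = C(P)$, because P is optimal.
Now assume some path from w to d is better than the part of $P'Q$ from w onwards. Replacing that part would give a path from $v_{i+1}$ to d with cost below $C(P)$, which contradicts the optimality of P.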
Destination-based routing
• An optimal sink tree for d is a spanning tree rooted at d in which the path from any node to d is optimal.
• Compute the sink tree for every destination in the network, and store in each node u a table $T_u$ indexed by all destination nodes.
• For each node u, $T_u[d]$ is the parent node of u in the optimal sink tree for d.
• Algorithm:
– /* A packet with destination d received or generated at node u */
– if d == u then deliver the packet locally
– else send the packet to $T_u[d]$ end
• The algorithm delivers each packet, because the routing tables are cycle-free.
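The table lookup and forwarding rule can be tried out on a small example. The following is a minimal Python sketch, assuming an undirected graph with non-negative symmetric weights stored as nested dictionaries, and using Dijkstra's algorithm as one possible way to obtain a sink tree; the names `sink_tree`, `build_tables` and `route` are hypothetical.

```python
import heapq

def sink_tree(graph, d):
    """Parent pointers of an optimal sink tree for destination d (Dijkstra)."""
    dist = {d: 0}
    parent = {}
    heap = [(0, d)]
    while heap:
        cost, u = heapq.heappop(heap)
        if cost > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            if cost + w < dist.get(v, float("inf")):
                dist[v] = cost + w
                parent[v] = u  # v forwards to u on its way to d
                heapq.heappush(heap, (cost + w, v))
    return parent

def build_tables(graph):
    """tables[u][d] is the neighbor to which u forwards packets for d."""
    tables = {u: {} for u in graph}
    for d in graph:
        for u, p in sink_tree(graph, d).items():
            tables[u][d] = p
    return tables

def route(tables, source, d):
    """Apply the forwarding rule hop by hop and return the visited nodes."""
    hops = [source]
    u = source
    while u != d:          # if d == u the packet is delivered locally
        u = tables[u][d]   # else it is sent to T_u[d]
        hops.append(u)
    return hops

graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
tables = build_tables(graph)
print(route(tables, "a", "d"))
```

The loop in `route` terminates because, for a fixed destination, the table entries follow the parent pointers of one tree.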
Bifurcated Routing
• Traffic splits and takes multiple paths for each source-destination pair.
[Figure: traffic between u and v split over two paths, one via x and one via y.]
All-pairs shortest path problem
• An algorithm that simultaneously computes the routing tables for all nodes in a network.
• Computes for each pair (u,v) of nodes the shortest path from u to v and stores the first channel of the path in u.
• Each edge uv has weight $\omega_{uv}$.
• The weight of a path $u_0, \dots, u_k$ is $\sum_{i=0}^{k-1} \omega_{u_i u_{i+1}}$.
• The distance from u to v, $d(u,v)$, is the lowest weight of all paths from u to v.
S-paths
• The algorithm starts by considering all $\emptyset$-paths and incrementally computes larger S-paths, until all V-paths are considered.
Let $S \subseteq V$.
– A path $u_0, \dots, u_k$ is an S-path if $u_1, \dots, u_{k-1} \in S$.
– The S-distance from u to v, $d^S(u,v)$, is the lowest weight of any S-path from u to v, and $\infty$ if no such path exists.
Properties:
1. $d^S(u,u) = 0$.
2. For $u \neq v$, there is a $\emptyset$-path from u to v iff $uv \in E$.
3. If $uv \in E$ then $d^{\emptyset}(u,v) = \omega_{uv}$.
4. If $S' = S \cup \{w\}$, then a simple $S'$-path from u to v is either an S-path from u to v, or an S-path from u to w concatenated with an S-path from w to v.
5. If $S' = S \cup \{w\}$, then $d^{S'}(u,v) = \min(d^S(u,v),\ d^S(u,w) + d^S(w,v))$.
6. A path from u to v exists iff a V-path from u to v exists.
7. $d(u,v) = d^V(u,v)$.
S-paths
[Figure: an example network with nodes 1 to 8, drawn once for each pivot set $S_0 = \emptyset$, $S_1 = \{1\}$, $S_2 = \{1,2\}$, $S_3 = \{1,2,3\}$, $S_4 = \{1,2,3,4\}$ and $S_5 = \{1,2,3,4,5\}$.]
Floyd-Warshall sequential algorithm
    % Initialize S to ∅ and D to the ∅-distance
    S := ∅
    forall u, v do
        if u = v then D[u,v] := 0
        elseif uv ∈ E then D[u,v] := ω_uv
        else D[u,v] := ∞
        end
    end
    % Expand S by pivoting
    while S ≠ V do
        % Loop invariant: ∀u,v: D[u,v] = d^S(u,v)
        pick w from V \ S
        % Execute a global w-pivot
        forall u ∈ V do
            forall v ∈ V do
                D[u,v] := min(D[u,v], D[u,w] + D[w,v])
            end
        end
        S := S ∪ {w}
    end
The algorithm
• The initialization of D takes $N^2$ steps.
• Each of the N pivot rounds takes $N^2$ steps, $N \cdot N^2$ in total.
• The algorithm computes in $\Theta(N^3)$ steps.
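The sequential Floyd-Warshall algorithm translates almost line by line into code. The following is a minimal Python sketch, assuming the graph is given as a dictionary from directed edges (u, v) to weights, without self-loops or negative cycles. Keeping a first-hop table `Nb` next to `D` is an addition made here so the result can serve as a routing table; the names are hypothetical.

```python
INF = float("inf")

def floyd_warshall(nodes, weight):
    """All-pairs distances D and first hops Nb for a weighted graph.

    `weight` maps a directed edge (u, v) to its weight.
    """
    # Initialize D to the empty-set distance.
    D = {u: {v: INF for v in nodes} for u in nodes}
    Nb = {u: {v: None for v in nodes} for u in nodes}
    for u in nodes:
        D[u][u] = 0
    for (u, v), w in weight.items():
        D[u][v] = w
        Nb[u][v] = v
    # Expand S by pivoting: one global w-pivot per node w.
    for w in nodes:
        for u in nodes:
            for v in nodes:
                if D[u][w] + D[w][v] < D[u][v]:
                    D[u][v] = D[u][w] + D[w][v]
                    Nb[u][v] = Nb[u][w]
    return D, Nb

nodes = [1, 2, 3, 4]
weight = {(1, 2): 1, (2, 1): 1, (2, 3): 2, (3, 2): 2,
          (1, 3): 5, (3, 1): 5, (3, 4): 1, (4, 3): 1}
D, Nb = floyd_warshall(nodes, weight)
print(D[1][4], Nb[1][4])
```

The three nested loops over the N nodes make the cubic step count directly visible.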
Toueg's shortest path
• A distributed version of the Floyd-Warshall algorithm.
• Assumptions:
– Each cycle in the network has a positive weight.
– Each node initially knows the identities of all nodes (the set V).
– Each node u knows its neighbors, stored in $Neigh_u$, and the weights of its outgoing channels.
• Described in two refinement steps.
• Variables at node u:
– $S_u$: set of nodes.
– $D_u$: array of weights.
– $Nb_u$: array of nodes.
Version 1
    S_u := ∅
    forall v ∈ V do
        if v = u then D_u[v] := 0; Nb_u[v] := udef
        elseif v ∈ Neigh_u then D_u[v] := ω_uv; Nb_u[v] := v
        else D_u[v] := ∞; Nb_u[v] := udef
        end
    end
    while S_u ≠ V do
        % Loop invariant: ∀u,v: D_u[v] = d^{S_u}(u,v)
        pick w from V \ S_u    % All nodes uniformly pick the same element
        if u = w then "broadcast the table D_w"
        else "receive the table D_w"
        end
        forall v ∈ V do
            if D_u[w] + D_w[v] < D_u[v] then
                D_u[v] := D_u[w] + D_w[v]; Nb_u[v] := Nb_u[w]
            end
        end
        S_u := S_u ∪ {w}
    end
Version 1 Contd.
• After each pivot round, with S the set of pivots used so far and $w \in S$:
– For all u, $D_u[w] = d^S(u,w)$.
– If $d^S(u,w) < \infty$ and $u \neq w$, then $Nb_u[w]$ is the first channel of a shortest S-path to w.
– The directed graph $T_w = (V_w, E_w)$, where $V_w = \{u : D_u[w] < \infty\}$ and $ux \in E_w$ if x is the first channel from u to w ($x = Nb_u[w]$), is a tree rooted at w.
For each destination w, the nodes that have computed a route to w form a spanning tree rooted at w.
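Version 1 and the tree property can be checked in a small simulation. The following is a minimal Python sketch, assuming all nodes pick pivots in the order of the `nodes` list and that the broadcast of $D_w$ is modeled by handing every node a copy of w's table; the names `toueg_version1` and `follow` are hypothetical.

```python
INF = float("inf")

def toueg_version1(nodes, weight):
    """Simulate the simple distributed algorithm, one pivot round per node.

    `weight[(u, v)]` is the weight of channel uv. Returns (D, Nb), where
    D[u] and Nb[u] are the private tables of node u.
    """
    D, Nb = {}, {}
    for u in nodes:  # local initialization at every node
        D[u] = {v: INF for v in nodes}
        Nb[u] = {v: None for v in nodes}  # None plays the role of udef
        D[u][u] = 0
        for v in nodes:
            if (u, v) in weight:
                D[u][v] = weight[(u, v)]
                Nb[u][v] = v
    for w in nodes:  # all nodes pick the same pivot w
        table_w = dict(D[w])  # "broadcast the table D_w"
        for u in nodes:  # every node performs its local w-pivot
            for v in nodes:
                if D[u][w] + table_w[v] < D[u][v]:
                    D[u][v] = D[u][w] + table_w[v]
                    Nb[u][v] = Nb[u][w]
    return D, Nb

def follow(Nb, u, w):
    """Walk the Nb pointers for destination w, starting at node u."""
    hops = [u]
    while u != w:
        u = Nb[u][w]
        hops.append(u)
    return hops

nodes = [1, 2, 3, 4]
weight = {(1, 2): 1, (2, 1): 1, (2, 3): 2, (3, 2): 2,
          (1, 3): 5, (3, 1): 5, (3, 4): 1, (4, 3): 1}
D, Nb = toueg_version1(nodes, weight)
print(follow(Nb, 1, 4))
```

Following the `Nb` pointers from any node that has a finite distance to w always ends at w, which is the tree $T_w$ in miniature.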
The improved algorithm
• At the start of the w-pivot round, a node u with $D_u[w] = \infty$ does not improve its table.
• Only the nodes in $T_w$ need to receive w's table to extend their own tables.
• The table is sent via the channels of the tree $T_w$.
• Each node knows its father in $T_w$ but not its sons; therefore the sons must inform the father (this is needed to do the broadcast).
The skeleton at node u
• Initialize the $D_u$ and $Nb_u$ tables with u itself and its immediate neighbors.
• Start the w-pivot rounds. For each round w:
• Establish the son-father chain in $T_w$.
– Send a (ys,w) message to the father neighbor and (nys,w) to the non-father neighbors.
– Receive (ys,w) and (nys,w) messages from the neighbors.
• Participate in the w-pivot round.
– Receive (dtab,w,D) from the father in $T_w$ (if $u \neq w$).
– Send (dtab,w,D) to the sons in $T_w$.
– Extend the $D_u$ and $Nb_u$ tables.
– Extend $S_u$ with w.
Messages
• (ys,w): your-son message in the spanning tree of w.
• (nys,w): not-your-son message in the spanning tree of w.
• (dtab,w,D): the D-table of w.
• Requires FIFO channels so that rounds are not mixed; otherwise a message for a later round w' must be stored until round w has finished.
[Figure: nodes u, x and w; u sends ys on the channel towards w and nys on its other channels.]
Tree construction phase
    % Round w at node u: tell every neighbor whether it is u's father in T_w
    forall x ∈ Neigh_u do
        if Nb_u[w] = x then send (ys,w) to x
        else send (nys,w) to x
        end
    end
    num_rec_u := 0
    while num_rec_u < |Neigh_u| do
        receive a (ys,w) or (nys,w) message
        num_rec_u := num_rec_u + 1
    end
    % Pivot phase of round w
    if D_u[w] < ∞ then
        if u ≠ w then receive (dtab,w,D) from Nb_u[w] end
        forall x ∈ Neigh_u do
            if (ys,w) was received from x then send (dtab,w,D) to x end
        end
        {* local table computation *}
    end
    S_u := S_u ∪ {w}
Complexity
• We have N rounds.
• At each round, each edge carries at most 2 ys/nys messages and 1 dtab message.
• Since a dtab message traverses a spanning tree, we have at most N dtab messages per round and $N^2$ dtab messages in total.
• Number of messages: $O(\underbrace{N^2}_{\text{dtab}} + \underbrace{2N|E|}_{\text{ys/nys}})$.
• Assume each entry (node id or weight) takes W bits.
• Number of bits transferred: $O(2N|E|W + N^3 W) = O(N^3 W)$.
[Figure: channels between w, x and u carrying ys/nys messages and a dtab message (a table of N entries).]
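To get a feeling for these bounds, the counts can be evaluated for concrete numbers. The following is a small Python sketch that only restates the arithmetic above; the function name `toueg_cost` and the sample values are assumptions chosen for illustration.

```python
def toueg_cost(n, e, w):
    """Upper bounds for N = n nodes, |E| = e channels and W = w bits per entry."""
    ys_nys_messages = 2 * n * e  # 2 per channel per round, n rounds
    dtab_messages = n * n        # at most n per round, n rounds
    messages = ys_nys_messages + dtab_messages
    # A ys/nys message carries one entry, a dtab message carries n entries.
    bits = ys_nys_messages * w + dtab_messages * n * w
    return messages, bits

print(toueg_cost(10, 15, 32))
```

For 10 nodes, 15 channels and 32-bit entries the dtab traffic already dominates the bit count, as the $N^3 W$ term suggests.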
From sequential to distributed algorithms
• The variables of a sequential algorithm are distributed over a number of nodes. Computation on the variables is done locally.
• Whenever a remote variable is needed, communication is performed.
• Minimize the amount of communication by exploiting properties of the sequential algorithm.
• Two bad properties of Toueg's algorithm:
– Agreement on the pivot nodes requires knowledge of all the nodes in the system. In general we first need to execute a wave algorithm to acquire this knowledge.
– It requires information that is available neither in the node nor in its neighbors.
• The pivot test $d(u,w) + d(w,v) < d(u,v)$ uses $d(w,v)$, which is stored at the possibly remote node w.
Alternative solutions
• Communication is local (only information from neighbors).
• Computing the routes to different destinations is independent.
• Requires more total computation ??.
• Locality makes it easy to design an algorithm that adapts to topological changes.
Computing distances from u to v can be based on the following equation:
$$d(u,v) = \begin{cases} 0 & \text{if } u = v \\ \min_{w \in Neigh_u} \bigl(\omega_{uw} + d(w,v)\bigr) & \text{otherwise} \end{cases}$$
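The equation can be turned into a computation by applying it repeatedly until nothing changes. The following is a minimal Python sketch for one fixed destination, assuming positive weights stored as nested dictionaries; it is a centralized illustration of the equation, not one of the distributed algorithms, and the name `distances_to` is hypothetical.

```python
INF = float("inf")

def distances_to(graph, v):
    """Iterate d(u,v) = min over neighbors w of (weight(u,w) + d(w,v))."""
    d = {u: INF for u in graph}
    d[v] = 0  # the case u = v
    changed = True
    while changed:
        changed = False
        for u in graph:
            if u == v:
                continue
            best = min((wt + d[w] for w, wt in graph[u].items()), default=INF)
            if best < d[u]:
                d[u] = best
                changed = True
    return d

graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
print(distances_to(graph, "d"))
```

Each node only reads the estimates of its own neighbors, which is the locality the slide refers to.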
Chandy-Misra Algorithm
    Variables:
        D_u[v_0]  : weight, initially ∞.
        Nb_u[v_0] : node, initially undef.

    For node v_0 (the root of the spanning tree):
        D_{v_0}[v_0] := 0
        forall w ∈ Neigh_{v_0} do
            send (mydist, v_0, 0) to w
        end

    Processing a (mydist, v_0, d) message from neighbor w to u:
        if d + ω_uw < D_u[v_0] then
            D_u[v_0] := d + ω_uw
            Nb_u[v_0] := w
            forall x ∈ Neigh_u do
                send (mydist, v_0, D_u[v_0]) to x
            end
        end
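The message exchange can be simulated with a queue of pending messages. The following is a minimal Python sketch, assuming positive symmetric weights and a single global FIFO queue as one possible delivery order (real channels deliver independently); termination detection is left out, and the name `chandy_misra` is hypothetical.

```python
from collections import deque

INF = float("inf")

def chandy_misra(graph, v0):
    """Simulate the mydist exchange for destination v0.

    `graph[u][w]` is the weight of channel uw. Returns (D, Nb, messages).
    """
    D = {u: INF for u in graph}
    Nb = {u: None for u in graph}  # None plays the role of undef
    queue = deque()                # pending (receiver, sender, d) messages
    D[v0] = 0
    for w in graph[v0]:
        queue.append((w, v0, 0))
    messages = len(queue)
    while queue:
        u, w, d = queue.popleft()  # u processes (mydist, v0, d) from w
        if d + graph[u][w] < D[u]:
            D[u] = d + graph[u][w]
            Nb[u] = w
            for x in graph[u]:
                queue.append((x, u, D[u]))
                messages += 1
    return D, Nb, messages

graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
D, Nb, messages = chandy_misra(graph, "d")
print(D, Nb)
```

In this run node b first adopts the direct estimate 5 and later improves it to 3 via c, which shows why a node may send several mydist messages for the same destination.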
Illustration
[Figure: successive snapshots of a run started by $v_0$. All estimates begin at $\infty$ except 0 at $v_0$; the nodes then obtain the estimates 1, 2, 3, 4 and 5, keeping the smaller of two competing offers ($a < b$, $d < c$). The last snapshot shows the resulting spanning tree.]
Reasoning
[Figure: an example network with nodes $v_0, \dots, v_7$.]
• For all $i \le N$, a configuration is reached where $D_{v_j}[v_0] = d(v_j, v_0)$ for all $j \le i$.
• Proof by induction: assume it holds for i, and prove it for $i+1$.
• Example: assume it holds for $v_0, \dots, v_4$, and consider $v_5$.
The complete algorithm also contains a termination detection mechanism.
• $O(N \cdot |E|)$ messages to compute the routes to $v_0$.
• $O(N^2 \cdot |E|)$ messages to compute the routes to all nodes.
• $O(N^2 \cdot |E| \cdot W)$ bits in total.
Netchange algorithm
• Assumptions
– The nodes know the size of the network (N).
– The channels are FIFO.
– Nodes are notified of the failure and repair of their adjacent channels.
– The cost of a path equals the number of channels in the path.
– The failure of a node is observed as a failure of its connecting channels.
• If the topology of the network remains constant after a finite number of topological changes, the algorithm terminates after a finite number of steps.
• When the algorithm terminates, the following holds for node u:
– $Nb_u[v]$ = local, if u = v,
– $Nb_u[v]$ = w, where w is the first neighbor on a shortest path to v,
– $Nb_u[v]$ = udef, if there is no path from u to v.
Description
• Network of N nodes.
• Initial estimates are $d(u,u) = 0$ and $d(u,v) = N$ where $u \neq v$.
• Each node maintains an estimate of each neighbor's distance to v, initially N.
• Initially (mydist,u,0) is sent to all neighbors.
[Figure: node u, its neighbor w, and a destination v.]
Variables at node u:
– $Neigh_u$: the neighbors of u.
– $D_u[v]$: estimate of $d(u,v)$; $D_u$ is an array of $1 \dots N$.
– $Nb_u[v]$: preferred neighbor in the path to v.
– $ndis_u[w,v]$: estimate of $d(w,v)$.
The following relation has to be maintained:
$$d(u,v) = 1 + \min_{w \in Neigh_u} d(w,v)$$
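Receiving mydist
• On receiving (mydist,v,d) from neighbor w, node u stores d in $ndis_u[w,v]$. If the previous estimate $ndis_u[w,v]$ was different from d:
– $d(u,v)$ is recomputed;
– if $d(u,v)$ has changed, a mydist message with the new estimate of $d(u,v)$ is sent to all neighbors.
[Figure: w sends (mydist,v,d) to its neighbor u.]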
Channel failure
• Messages may be lost; therefore the distances to all nodes have to be recomputed after removing w as a neighbor.
[Figure: the channel between u and w fails; u is notified with (fail,w) and w with (fail,u).]
Channel repair
• u uses N as its estimate of $d(w,v)$.
• u sends its estimate of $d(u,v)$ to w, for all v.
[Figure: the channel between u and w is repaired; u is notified with (repair,w) and w with (repair,u).]
Variables
    Neigh_u      : the neighbors of u
    D_u[v]       : estimate of d(u,v); D_u is an array of 1..N
    Nb_u[v]      : preferred neighbor in the path to v
    ndis_u[w,v]  : estimate of d(w,v)
Initializations
    forall w ∈ Neigh_u, v ∈ V do ndis_u[w,v] := N end
    forall v ∈ V do D_u[v] := N; Nb_u[v] := udef end
    D_u[u] := 0; Nb_u[u] := local    % d(u,u) = 0, as in the description
    forall w ∈ Neigh_u do send (mydist, u, 0) to w end
Recompute(v)
    if v = u then
        D_u[v] := 0; Nb_u[v] := local
    else
        d := 1 + min{ndis_u[w,v] : w ∈ Neigh_u}
        if d < N then
            D_u[v] := d; Nb_u[v] := w such that 1 + ndis_u[w,v] = d
        else
            D_u[v] := N; Nb_u[v] := udef
        end
    end
    if D_u[v] has changed then
        forall x ∈ Neigh_u do send (mydist, v, D_u[v]) to x end
    end
Netchange part two
    Node u receiving the message (mydist, v, d) from neighbor w:
        ndis_u[w,v] := d; recompute(v)

    Failure of a channel uw:
        Neigh_u := Neigh_u \ {w}
        forall v ∈ V do recompute(v) end

    Upon repair of channel uw:
        Neigh_u := Neigh_u ∪ {w}
        forall v ∈ V do
            ndis_u[w,v] := N; send (mydist, v, D_u[v]) to w
        end
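The Netchange procedures (initialization, Recompute, and the handlers for mydist, failure and repair) can be exercised in a toy simulation. The following is a minimal Python sketch under several assumptions: one global FIFO queue stands in for the FIFO channels, both end points of a channel are notified at once, and nodes are comparable values such as integers so that ties are broken by the smallest neighbor. The class name `Netchange` and its methods are hypothetical.

```python
from collections import deque

class Netchange:
    """Toy simulation of Netchange; path cost is the hop count."""

    def __init__(self, nodes, edges):
        self.nodes = list(nodes)
        self.N = len(self.nodes)
        self.neigh = {u: set() for u in self.nodes}
        for a, b in edges:
            self.neigh[a].add(b)
            self.neigh[b].add(a)
        self.queue = deque()  # one global FIFO of (sender, receiver, v, d)
        # Initialization: every estimate starts at N, except d(u,u) = 0.
        self.D = {u: {v: self.N for v in self.nodes} for u in self.nodes}
        self.Nb = {u: {v: None for v in self.nodes} for u in self.nodes}
        self.ndis = {u: {w: {v: self.N for v in self.nodes}
                         for w in self.neigh[u]} for u in self.nodes}
        for u in self.nodes:
            self.D[u][u], self.Nb[u][u] = 0, "local"
            for w in self.neigh[u]:
                self.queue.append((u, w, u, 0))

    def recompute(self, u, v):
        old = self.D[u][v]
        if v == u:
            self.D[u][v], self.Nb[u][v] = 0, "local"
        else:
            d, best = min(((1 + self.ndis[u][w][v], w) for w in self.neigh[u]),
                          default=(self.N, None))
            if d < self.N:
                self.D[u][v], self.Nb[u][v] = d, best
            else:
                self.D[u][v], self.Nb[u][v] = self.N, None  # None = udef
        if self.D[u][v] != old:
            for x in self.neigh[u]:
                self.queue.append((u, x, v, self.D[u][v]))

    def run(self):
        """Deliver pending mydist messages until none are left."""
        while self.queue:
            w, u, v, d = self.queue.popleft()
            self.ndis[u][w][v] = d
            self.recompute(u, v)

    def fail(self, a, b):
        # Messages still in transit on the failed channel are lost.
        self.queue = deque(m for m in self.queue if {m[0], m[1]} != {a, b})
        for u, w in ((a, b), (b, a)):
            self.neigh[u].discard(w)
            del self.ndis[u][w]
            for v in self.nodes:
                self.recompute(u, v)

    def repair(self, a, b):
        for u, w in ((a, b), (b, a)):
            self.neigh[u].add(w)
            self.ndis[u][w] = {v: self.N for v in self.nodes}
            for v in self.nodes:
                self.queue.append((u, w, v, self.D[u][v]))

net = Netchange(range(4), [(0, 1), (1, 2), (2, 3), (3, 0)])  # a ring
net.run()
net.fail(0, 3)
net.run()
print(net.D[0][3], net.Nb[0][3])
```

After the failure, node 0 reaches node 3 the long way around the ring; a second failure that disconnects the two would drive the estimate up to N, which the algorithm treats as "no path".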
Tree labeling scheme
in the packet to reduce table
size.
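• The tree labeling scheme routes a whole interval of destination addresses through one channel.
• Assume the network has a tree structure (or route via a logical tree structure, e.g. a spanning tree for a fixed root).
[Figure: node u with four channels $w_1$, $w_2$, $w_3$ and $w_4$.]
A routing table indexed by destination has one entry per node:

| dest. | chan. |
|---|---|
| $v_1$ | $w_2$ |
| u | - |
| $v_j$ | $w_3$ |
| $v_N$ | $w_1$ |

The same information indexed by channel lists the destinations routed through each channel:

| chan. | dest. |
|---|---|
| $w_1$ | …, $v_N$ |
| $w_2$ | $v_1$, … |
| $w_3$ | …, $v_j$, … |
| $w_4$ | … |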
Tree Labeling
• Nodes are labeled in pre-order (root, left subtree, right subtree).
• This classifies packets into classes according to intervals modulo N (the number of nodes).
• Not good if the network is general:
– some channels are not used;
– a single point of failure partitions the network.
• Interval routing extends the scheme so that (almost) every channel is used.
[Figure: a tree with 12 nodes labeled 0 to 11 in pre-order. The channels of one node are labeled 2 for the interval [2,5), 5 for [5,8), and 8 for [8,1).]
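Pre-order labeling and interval-based forwarding can be sketched in a few lines. The following is a minimal Python sketch, assuming the tree is given as a dictionary from a node to its list of children. Instead of a cyclic interval for the channel to the parent, the sketch sends every label outside the node's own subtree upward, which selects the same channel; the names `preorder_labels`, `next_hop` and `route` are hypothetical.

```python
def preorder_labels(children, root):
    """Label nodes 0..N-1 in pre-order; return (label, size) dictionaries."""
    label, size = {}, {}

    def visit(node):
        label[node] = len(label)
        size[node] = 1
        for child in children.get(node, []):
            visit(child)
            size[node] += size[child]

    visit(root)
    return label, size

def next_hop(node, dest_label, children, parent, label, size):
    """Channel on which `node` forwards a packet addressed to `dest_label`."""
    if dest_label == label[node]:
        return None  # deliver locally
    for child in children.get(node, []):
        # The subtree of `child` holds the interval [label, label + size).
        if label[child] <= dest_label < label[child] + size[child]:
            return child
    return parent[node]  # every other label is routed upward

def route(src, dest_label, children, parent, label, size):
    """Forward hop by hop and return the visited nodes."""
    hops = [src]
    while True:
        nxt = next_hop(hops[-1], dest_label, children, parent, label, size)
        if nxt is None:
            return hops
        hops.append(nxt)

children = {"r": ["a", "d"], "a": ["b", "c"], "d": ["e"]}
parent = {"a": "r", "d": "r", "b": "a", "c": "a", "e": "d"}
label, size = preorder_labels(children, "r")
print(route("b", label["e"], children, parent, label, size))
```

Each node needs only one interval per channel, so the table size is proportional to the node's degree rather than to N.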