SYNCHRONOUS DISTRIBUTED LOAD BALANCING ON DYNAMIC NETWORKS

JACQUES BAHI, RAPHAËL COUTURIER, AND FLAVIEN VERNIER

Abstract. In this paper, three distributed load balancing algorithms for dynamic networks are investigated. Dynamic networks are networks in which the topology may change dynamically. The definition of a dynamic network is introduced and its graph model is presented. The main result of this study consists in proving the convergence toward the uniform load distribution of the diffusion algorithm on an arbitrary dynamic network despite communication link failures. We also give two adaptations of this algorithm (the GAE and the relaxed diffusion). Notice that the hypotheses of our result are realistic and that, for example, the network does not have to remain connected. To study the behavior of these algorithms, we compare the load evolution by several simulations.

Key words. load balancing, dynamic networks, iterative algorithm, first order, dimension exchange.
1. Introduction. One of the most important problems in distributed processing consists in balancing the work load among all processors. The purpose of load (work) balancing is to achieve better performance of distributed computations by improving load allocation. The load balancing problem has been studied by several authors from different points of view [1, 2, 3, 4, 5, 6, 7]. Various techniques can be used to perform load balancing; they are categorized in [2] according to different criteria: centralized/distributed, static load/dynamic load, and synchronous/asynchronous.

In distributed systems, the schedules of the load balancing problem are iterative in nature and their behavior can be characterized by iterative methods derived from linear systems theory. Local iterative load balancing algorithms were first proposed by Cybenko in [1]. These algorithms iteratively balance the load of a node with its neighbors until the whole network is globally balanced. There are mainly two iterative load balancing algorithms: the diffusion algorithms [1, 5, 8] and the dimension exchange algorithms [1, 2, 9]. Diffusion algorithms assume that a processor simultaneously exchanges load with all its neighbor processors, whereas dimension exchange algorithms assume that a processor exchanges load with only one neighbor processor in each dimension at each time step. These algorithms, however, have been derived for use on homogeneous or heterogeneous networks [10] with fixed topologies. Nowadays, with grid computing, problems of communication failures or communication time-outs (i.e. low bandwidth communication) appear.

In a network with dynamically changing topology (i.e. a dynamic network), the set of edges in the network may vary at each time step. In this paper, a dynamic network is a network with dynamic links. We suppose that no computer can be added to or permanently removed from a dynamic network. At a given time, we define a living edge as an edge that can transmit one message in each direction [11, 12, 13]. At each time step, each node in a dynamic network knows which of its edges are alive. A dynamic network can be viewed here as a network in which some edges fail during the execution of an algorithm. These considerations lead us to define a new criterion to categorize load balancing techniques: the static/dynamic network criterion. According to the above criteria, the following algorithms are synchronous and distributed. They are models for static load balancing and dynamic networks. A static load is used only to simplify the algorithms; it is easy to apply them with a dynamic load (for more information, see [1]).

Authors' address: Laboratoire d'Informatique de l'Université de Franche-Comté (LIFC), IUT de Belfort-Montbéliard, BP 527, 90016 Belfort CEDEX, France. {bahi,couturie,vernier}@iut-bm.univ-fcomte.fr
The main result of this study consists in proving the convergence toward the uniform load distribution of the diffusion algorithm on an arbitrary dynamic network despite communication link failures. Notice that the hypotheses of our result are realistic (see Theorem 3.2 and Remark 1). Based on this algorithm, two variants are proposed. The former, which we call GAE for Generalized Adaptive Exchange, is a new algorithm. It constrains each node to exchange its load with at most one of its neighbors. Thus, GAE may be considered as the extension of GDE to dynamic networks. The latter, which we call relaxed diffusion, introduces a relaxation parameter in the diffusion algorithm. The relaxation parameter may dramatically speed up the convergence.

The results of this paper unify those of two previous works [14] and [15], and introduce for the first time the GAE algorithm. Indeed, in [14] we only consider the case of hypercube topology networks, and at each time step $t$ there is only one dimension $d(t)$ on which to balance the load; if there is a link failure on this dimension then the processor does not balance its load. In [15], the diffusion and relaxed diffusion were introduced. The GAE algorithm has never been proposed in any paper yet. Moreover, this paper is intended to propose three algorithms based on the first order scheme and designed for any dynamic network.

This paper is organized as follows. Section 2 presents the related works: we review the diffusion and the dimension exchange algorithms on static networks. In Section 3, we introduce a graph model for dynamic networks and the diffusion load balancing algorithm for dynamic networks. Section 4 presents two algorithms derived from the diffusion. The first one is GAE, an adaptation of GDE for dynamic networks, and the second one is the relaxed diffusion load balancing algorithm, which speeds up the convergence of the classical diffusion algorithm. Section 5 illustrates the behavior of our algorithms and presents experiments. In Section 6 we conclude our work, and the last section (7) is devoted to the detailed proof of our main result.
2. Related Works. The algorithms studied in this paper are derived from the diffusion model and some of its variants. This section describes those models.

Classically, a static network topology is represented by a simple undirected connected graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, $E \subseteq V \times V$. Each computing processor is a vertex of the graph and each communication link between two processors $i, j$ is the edge $\{i, j\} \in E$ between the two vertices $i$ and $j$ ($i, j \in V$). By definition, the vertices are labeled from 1 to $n$, where $n$ is the number of processors, so $|V| = n$. Let $m$ be the number of communication links ($|E| = m$).
In [1] Cybenko introduced two distributed load balancing algorithms for static networks. The first one, called diffusion, assumes that a process $i$ balances its load simultaneously with all its neighbors. To balance the load, a ratio $\alpha_{ij}$ of the difference of load between the processes $i$ and $j$ is swapped between $i$ and $j$. For a process $i$, the load balancing step with all its neighbors is given by equation (2.1), where $w_i^{(t)}$ is the work load done by process $i$ at time $t$.

\[ w_i^{(t+1)} = w_i^{(t)} + \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \tag{2.1} \]
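As an illustration only, one synchronous step of equation (2.1) might be sketched as follows. The function name `diffusion_step`, the 0-based process labels, the 4-process line and the uniform ratio of 1/3 are assumptions made for this example and do not come from [1].

```python
def diffusion_step(w, alpha, neighbors):
    """One synchronous step of equation (2.1) on a static network.

    w         -- list of loads, w[i] is the load of process i
    alpha     -- dict mapping frozenset({i, j}) to the ratio alpha_ij
    neighbors -- dict mapping each process to the list of its neighbors
    """
    new_w = []
    for i, w_i in enumerate(w):
        # Every term uses the loads of time t, never partially updated ones.
        flow = sum(alpha[frozenset((i, j))] * (w[j] - w_i) for j in neighbors[i])
        new_w.append(w_i + flow)
    return new_w


# Hypothetical 4-process line 0 - 1 - 2 - 3 with a uniform ratio of 1/3.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
alpha = {frozenset(e): 1.0 / 3.0 for e in [(0, 1), (1, 2), (2, 3)]}
w = [12.0, 0.0, 0.0, 0.0]
for _ in range(3):
    w = diffusion_step(w, alpha, neighbors)
```

Because the ratios are symmetric ($\alpha_{ij} = \alpha_{ji}$ in this sketch), whatever leaves one process arrives at its neighbor, so the total load is unchanged by a step.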
The second algorithm, called Dimension Exchange (DE), is a diffusion algorithm in which a processor communicates with only one processor at a given time step, and balances load after each communication. The DE algorithm assumes that all processes of a $D$-dimensional binary hypercube balance their load on the same dimension $d$ at time $t$. Between two processes $i$ and $j$ linked on the dimension $d$, with $d = t \bmod D$, the DE algorithm for the process $i$ is given by equation (2.2).

\[ w_i^{(t+1)} = w_i^{(t)} + \frac{w_j^{(t)} - w_i^{(t)}}{2} \tag{2.2} \]
In [2, 9] the authors introduced a DE algorithm, called Generalized Dimension Exchange (GDE), for an arbitrary fixed network topology. In order to represent the dimensions used in the classical DE algorithm, they introduce an edge coloring of the graph in which each color represents one dimension. Each edge of $G$ is supposed to be colored with the smallest number of colors (say $k$). If $\Delta(G)$ denotes the maximum degree of $G$, it is proved that the minimum number of colors $k$ satisfies $\Delta(G) \le k \le \Delta(G) + 1$ [16]. A vertex $i$ cannot have two edges with the same color. Each color is indexed with one integer from 0 to $k - 1$.

The graph $G$ of the network topology is transformed into the $k$-color graph $G_k = (V, E_k)$. $E_k$ is a set of 3-tuples $\{i, j, c\}$ where $\{i, j\} \in E$ is an edge between vertices $i$ and $j$, and $c$ is the color of the edge $\{i, j\}$ ($0 \le c \le k - 1$).

When GDE balances the load on a dimension (a color $c$), each vertex with an edge of color $c$ balances its load with its neighbor on this edge. For a process $i$ the GDE algorithm is defined by equation (2.3).

\[ w_i^{(t+1)} = \begin{cases} w_i^{(t)} + \lambda \left( w_j^{(t)} - w_i^{(t)} \right) & \text{if } \exists j \text{ such that } \{i, j, c\} \in E_k, \\ w_i^{(t)} & \text{otherwise,} \end{cases} \tag{2.3} \]

where $c$ is a chromatic index such that $c = t \bmod k$ and $\lambda \in\, ]0, 1[$ is the exchange parameter chosen according to the network topology.

All these algorithms are defined for static networks and require a prescribed order for load balancing (dimension or chromatic order). If the communication medium is the Internet, communication problems can appear: the network topology becomes dynamic and an arbitrary order can be a constraint for the load balancing.
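The color-by-color sweep of GDE can be pictured with the short sketch below. It is a hedged illustration rather than the algorithm of [2, 9] as published: the function name `gde_step`, the 0-based labels, the 4-node ring, its 2-coloring and the value 0.5 for the exchange parameter are all assumptions of this example.

```python
def gde_step(w, colored_edges, k, t, lam):
    """One GDE step (equation 2.3): only edges of color c = t mod k are used."""
    c = t % k
    new_w = list(w)
    for i, j, color in colored_edges:
        if color == c:
            # A proper edge coloring guarantees that i and j appear in no
            # other edge of color c, so the two updates cannot conflict.
            new_w[i] = w[i] + lam * (w[j] - w[i])
            new_w[j] = w[j] + lam * (w[i] - w[j])
    return new_w


# Hypothetical ring of 4 processes, properly colored with k = 2 colors.
colored_edges = [(0, 1, 0), (2, 3, 0), (1, 2, 1), (3, 0, 1)]
w = [8.0, 0.0, 0.0, 0.0]
for t in range(4):
    w = gde_step(w, colored_edges, k=2, t=t, lam=0.5)
```

On this small ring, one sweep over the two colors is already enough to spread the load evenly; larger topologies need repeated sweeps.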
3. Diffusion Load Balancing on Dynamic Networks. This section recalls the work presented in [15]: the definition of dynamic networks, their graph representation and the diffusion load balancing algorithm adapted to dynamic networks.

The diffusion algorithmic model needs a graph representation for the dynamic network. A dynamic network is a network in which some links evolve with time. It can lose some edges due to communication failures or communication time-outs, as may be the case in the Internet, but no computer can be added to or permanently removed from the network. A classical undirected connected graph $G = (V, E)$ is used for the global network and we define the set $E_B^{(t)}$ as the set of broken edges at time $t$. So $G^{(t)} = (V, E, E_B^{(t)})$ is a graph model for dynamic networks. As in a classical graph, $V$ is the set of processors, $E$ is the set of edges, $E$ is a subset of $V \times V$, each edge $\{i, j\} \in E$ is a communication link between processors $i$ and $j$ ($i, j \in V$), $|V| = n$ and $|E| = m$. $E_B^{(t)}$ is a subset of $E$. Figure 3.1 illustrates a possible evolution of a dynamic network. It should be noted that if $E_B^{(t)}$ is empty at every time $t$, $G^{(t)} = (V, E, E_B^{(t)})$ is the static network $G = (V, E)$.

In the context of dynamic networks, the standard diffusion scheme requires some adaptations due to the dynamic nature of the topology. The main difference lies in a relevant adaptation of the diffusion matrix, which needs to dynamically integrate information about link failures.

[Figure 3.1: four drawings of a network on the nodes 1, 2, 3, 4.
(a) Time $t$, $E_B^{(t)} = \{(2,3)\}$.
(b) Time $t+1$, $E_B^{(t+1)} = \{(2,3), (1,3)\}$.
(c) Time $t+2$, $E_B^{(t+2)} = \{(1,3)\}$.
(d) Time $t+3$, $E_B^{(t+3)} = \{\}$.]
Fig. 3.1. Time evolution of a dynamic network.
The following example illustrates a possible situation in the case of dynamic networks: suppose that, using the standard diffusion algorithm, a processor $i$ must balance its load with all its neighbors, called $j$, $k$ and $l$. If the edge between $i$ and $j$ is broken, then $i$ cannot balance its load with its neighbor $j$. Nevertheless $i$ can still balance its load over its living edges, i.e. $(i, k)$ and $(i, l)$.

The diffusion algorithm with dynamic networks may be described as follows: for a processor $i$, the exchange of its workload with its reachable neighbors $j$ is executed as algorithm (3.1).

\[ w_i^{(t+1)} = w_i^{(t)} + \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \quad \text{for all living edges } (i, j) \tag{3.1} \]

where $w_i^{(t)}$ is the workload of processor $i$ at time $t$ and $\alpha_{ij}$ is defined as in (2.1). Equation (3.1) is linear and can be written as the vector equation (3.2), which updates the load of all nodes at time $t$.
\[ W^{(t+1)} = M^{(t)} W^{(t)} \tag{3.2} \]

where $W^{(t)}$ is the vector of the $w_i^{(t)}$ and $M^{(t)}$ is defined by

\[ m_{ij}^{(t)} = \begin{cases} \alpha_{ij} & \text{if } (i, j) \in E \wedge (i, j) \notin E_B^{(t)} \wedge i \neq j, \\ 1 - \sum_k \alpha_{ik}, \ \forall k \mid (i, k) \in E \wedge (i, k) \notin E_B^{(t)} & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]

$M^{(t)}$ is the diffusion matrix at time $t$; it represents the incidence matrix of the communication graph at this time step. The evolution of the workload distribution between time $t$ and $t + 1$ is given by $W^{(t+1)} = M^{(t)} W^{(t)}$.
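In the example of Figure 3.2, matrices $M^{(t+1)}$ and $M^{(t+2)}$ are:

\[ M^{(t+1)} = \begin{bmatrix} 1 - \alpha_{1,2} & \alpha_{1,2} & 0 & 0 \\ \alpha_{2,1} & 1 - \alpha_{2,1} & 0 & 0 \\ 0 & 0 & 1 - \alpha_{3,4} & \alpha_{3,4} \\ 0 & 0 & \alpha_{4,3} & 1 - \alpha_{4,3} \end{bmatrix} \quad \text{cf. Fig. 3.2(b)} \]

\[ M^{(t+2)} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 - \alpha_{2,3} & \alpha_{2,3} & 0 \\ 0 & \alpha_{3,2} & 1 - \alpha_{3,2} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{cf. Fig. 3.2(c)} \]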
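To make the construction of the diffusion matrix concrete, the following sketch builds $M^{(t)}$ from the set of living edges and applies it to a load vector. The helper names `diffusion_matrix` and `apply_matrix`, the edge set chosen for the global network and the uniform ratio 1/4 are assumptions made for illustration; only the zero pattern is meant to match the first example matrix.

```python
def diffusion_matrix(n, edges, broken, alpha):
    """Build M^(t) from the living edges, following its case definition.

    n      -- number of processors, labeled 1..n as in the text
    edges  -- set E of frozenset({i, j}) pairs
    broken -- set E_B^(t) of broken edges at time t, a subset of E
    alpha  -- function (i, j) -> alpha_ij
    """
    m = [[0.0] * n for _ in range(n)]
    for edge in edges - broken:              # living edges only
        i, j = sorted(edge)
        m[i - 1][j - 1] = alpha(i, j)
        m[j - 1][i - 1] = alpha(j, i)
    for i in range(n):
        # Diagonal entry: 1 minus the ratios of the living edges of i.
        m[i][i] = 1.0 - sum(m[i][j] for j in range(n) if j != i)
    return m


def apply_matrix(m, w):
    """Matrix-vector product W^(t+1) = M^(t) W^(t) of equation (3.2)."""
    return [sum(m_ij * w_j for m_ij, w_j in zip(row, w)) for row in m]


# Assumed global network: edges (1,2), (2,3), (3,4), (1,3); alpha_ij = 1/4.
E = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4), (1, 3)]}
E_B = {frozenset((2, 3)), frozenset((1, 3))}     # only (1,2), (3,4) alive
M = diffusion_matrix(4, E, E_B, lambda i, j: 0.25)
W = apply_matrix(M, [8.0, 0.0, 4.0, 0.0])
```

With these broken edges the network splits into the pairs {1, 2} and {3, 4}, and the load only moves inside each pair during this step.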
To give our main result, we need the following definition:

Definition 3.1. At each time step, the communication graph for load balancing is the graph which shows only the edges that are used for load balancing communications at this time. A superposed communication graph for load balancing $G_{t,t+n}$ between the times $t$ and $t + n$ is a graph that shows all edges used for load balancing communications between the times $t$ and $t + n$ (see Figure 3.2).

[Figure 3.2: four drawings of a network on the nodes 1, 2, 3, 4.
(a) Global network before any load balancing.
(b) Time $t$.
(c) Time $t + 1$.
(d) Superposed graph between time $t$ and $t + 1$ ($G_{t,t+1}$).]
Fig. 3.2. These graphs show a global network, two communication graphs and the corresponding superposed communication graph. In this case, the superposed graph is connected: there is a path between any two nodes.
Theorem 3.2. Algorithm (3.1) converges toward the uniform load distribution if and only if to any time $t$ corresponds a time $t + n$ such that the superposed communication graph $G_{t,t+n}$ is a connected graph (see Fig. 3.3).

Remark 1. It should be noted that Theorem 3.2 does not claim that all edges have to be alive during the load balancing process; for example, in Figure 3.2, edge (1,3) is never alive. Moreover, the network can always be disconnected; only the superposed communication graph must be connected to ensure the convergence. This property is checked with the series of graphs (3.2(b), 3.2(c), 3.2(b), 3.2(c), ...). In other words, if the communication graphs always oscillate between 3.2(b) and 3.2(c) then the network graph is always disconnected; nevertheless we are in the hypotheses of Theorem 3.2 and then the convergence is ensured.

The hypothesis of the theorem means that, at any time $t$, we are sure to get a connected graph at a later time $t + n$ by the virtual superposition of the communication graphs between $t$ and $t + n$. Note that the integer $n$ is not necessarily a constant. Thus, whatever the configuration of the network, we have to be certain to construct a path between any two processors by the artificial superposition of communication graphs (see Fig. 3.3).

[Figure 3.3: a time axis marked with the times $t_1, t_2, t_3, t_4, \ldots, t_i$ and the corresponding later times $t_1 + n_1, t_2 + n_2, t_3 + n_3, t_4 + n_4, \ldots, t_i + n_i$.]
Fig. 3.3. Between two time steps $t_i$ and $t_i + n_i$ the superposed communication graph is connected.
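The hypothesis of Theorem 3.2 can be tested mechanically on a recorded sequence of communication graphs. The sketch below uses a union-find structure, which is a choice made for this example and not part of the paper; the function name `superposed_is_connected` is likewise hypothetical, and the two edge lists are read off the example matrices of this section.

```python
def superposed_is_connected(n, communication_graphs):
    """Return True if the union of the given edge sets connects nodes 1..n."""
    parent = list(range(n + 1))          # union-find forest; index 0 unused

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for edges in communication_graphs:
        for i, j in edges:
            parent[find(i)] = find(j)        # superpose edge (i, j)
    return len({find(v) for v in range(1, n + 1)}) == 1


# The two alternating communication graphs discussed in Remark 1.
graph_b = [(1, 2), (3, 4)]
graph_c = [(2, 3)]
each_alone = [superposed_is_connected(4, [g]) for g in (graph_b, graph_c)]
together = superposed_is_connected(4, [graph_b, graph_c])
```

Neither graph is connected on its own, yet their superposition is, which is exactly the situation the remark describes.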
We propose in the next section two algorithms derived from the diffusion for dynamic networks.

4. Derived Algorithms. This section presents two algorithms derived from the diffusion algorithm. The first one is GAE, an adaptation of GDE for dynamic networks, and the second one is the relaxed diffusion algorithm, which speeds up the convergence of classical diffusion.

4.1. GAE. The interesting property of the GDE algorithm lies in the fact that a node communicates with at most one of its neighbors at each iteration (i.e. a node $i$ balances its load with only one neighbor). This property can be preserved for dynamic networks. In the case of dynamic networks, the GDE algorithm will be called GAE (Generalized Adaptive Exchange).

The GAE algorithmic model can be viewed as a diffusion model in which, at time $t$, for each processor $i$, all living edges except one, $(i, j)$, are considered broken. The neighbor $j$ must be chosen by an arbitrary, random or more sophisticated strategy. With this strategy, all the other edges $(i, k)$ with $k \neq j$ are supposed to be broken, so $(i, k) \in E_B^{(t)}$ for all $k \neq j$. Moreover, in the GAE algorithm, $\alpha_{ij}$ is equal to a constant $\lambda$.
The GAE algorithm may be described as follows: for a processor $i$, the exchange of its workload with a neighbor $j$ is executed as algorithm (4.1).

\[ w_i^{(t+1)} = \begin{cases} (1 - \lambda) w_i^{(t)} + \lambda w_j^{(t)} & \text{if } i \text{ communicates with its neighbor } j \text{ at time } t, \\ w_i^{(t)} & \text{if } i \text{ does not communicate with any neighbor at time } t. \end{cases} \tag{4.1} \]

With these conditions, the diffusion matrix $M$ becomes:

\[ m_{ij}^{(t)} = \begin{cases} \lambda & \text{if } (i, j) \in E \wedge (i, j) \notin E_B^{(t)} \wedge i \neq j, \\ 1 - \lambda & \text{if } \exists k \mid (i, k) \in E \wedge (i, k) \notin E_B^{(t)} \text{ and } i = j, \\ 1 & \text{if } \nexists k \mid (i, k) \in E \wedge (i, k) \notin E_B^{(t)} \text{ and } i = j, \\ 0 & \text{otherwise.} \end{cases} \tag{4.2} \]

A processor $i$ can communicate with a neighbor $j$ if and only if the edge between $i$ and $j$ is not broken, $i$ does not communicate with another neighbor $h$, and $j$ does not communicate with another node.

Corollary 4.1. GAE converges under the assumption of Theorem 3.2.

Proof. This is a particular case of Theorem 3.2, using $M$ defined by equation (4.2).
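A GAE step needs some rule that pairs each node with at most one living neighbor. The sketch below uses a greedy pass over the living edges, which is only one possible rule and is an assumption of this example, as are the function name `gae_step`, the 0-based labels, the 4-node line and the exchange ratio 0.5.

```python
def gae_step(w, living_edges, lam, key=None):
    """One GAE step (equation 4.1) with a greedy choice of exchange pairs.

    living_edges -- iterable of (i, j) pairs that are alive at time t
    lam          -- the constant exchange ratio of GAE
    key          -- optional sort key deciding which edges are tried first
    """
    busy = set()                 # nodes that already have a partner
    new_w = list(w)
    for i, j in sorted(living_edges, key=key):
        if i in busy or j in busy:
            continue             # a node exchanges with at most one neighbor
        busy.update((i, j))
        new_w[i] = (1 - lam) * w[i] + lam * w[j]
        new_w[j] = (1 - lam) * w[j] + lam * w[i]
    return new_w


# Hypothetical line 0 - 1 - 2 - 3 where edge (2, 3) is currently broken.
w = [8.0, 0.0, 4.0, 0.0]
living = [(0, 1), (1, 2)]
# Try the edges with the largest load difference first.
w_next = gae_step(w, living, lam=0.5, key=lambda e: -abs(w[e[0]] - w[e[1]]))
```

Here node 1 could exchange with node 0 or node 2; the edge with the larger load difference wins, and node 2 stays idle for this step because its only other edge is broken.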
4.2. Accelerated Relaxed Diffusion. This scheme is also based on the diffusion scheme. It consists in introducing a relaxation parameter $\beta^{(t)}$ in order to speed up the convergence [15]. The relaxed diffusion algorithm may be described as follows: for a processor $i$, the exchange of its workload with its reachable neighbors $j$ is executed as algorithm (4.3).

\[ w_i^{(t+1)} = w_i^{(t)} + \beta^{(t)} \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \quad \text{for all living edges } (i, j) \tag{4.3} \]

In this algorithm, the diffusion matrix is $(1 - \beta^{(t)}) Id + \beta^{(t)} M^{(t)}$, where $Id$ is the identity matrix, $M^{(t)}$ is the diffusion matrix of the diffusion algorithm and $\beta^{(t)}$ is the relaxation parameter at time $t$.

It is known [17] that if $M$ is a stochastic matrix, then the optimal parameter is

\[ \beta = \frac{2}{2 - (s + l)}, \tag{4.4} \]

where $s$ and $l$ are respectively the smallest and the second largest eigenvalue of $M$. This definition of $\beta$ does not imply that each $w_i^{(t)}$ stays positive. Let us define $\beta^{(t)}$ to be the relaxation parameter at time $t$ for the network $G^{(t)}$, such that $W^{(t)}$ stays positive. Let us denote by $R^{(t)}$ the bound such that $W^{(t)}$ stays positive if $\beta^{(t)} < R^{(t)}$; $R^{(t)}$ is defined by

\[ R^{(t)} = \min_i \frac{w_i^{(t)}}{\left( 1 - M_{ii}^{(t)} \right) \left( w_i^{(t)} - w_{\min}^{(t)} \right)}, \]

where $w_{\min}^{(t)} = \min_i (w_i^{(t)})$. With this definition, $\beta^{(t)}$ for the network $G^{(t)}$ is given by

\[ \beta^{(t)} = \min \left\{ R^{(t)}, \ \frac{2}{2 - (s^{(t)} + l^{(t)})} \right\}, \tag{4.5} \]

where $s^{(t)}$ and $l^{(t)}$ are respectively the smallest and the second largest eigenvalue of $M^{(t)}$. For the relaxed diffusion, the vector equation (3.2) becomes

\[ W^{(t+1)} = \left( 1 - \beta^{(t)} \right) W^{(t)} + \beta^{(t)} M^{(t)} W^{(t)}. \]

Corollary 4.2. For $\beta$ chosen according to (4.5), and under the assumption of Theorem 3.2, the relaxed diffusion scheme converges and is optimal.

Proof. It is sufficient to apply the result of [17].
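A small worked instance may help to see how the positivity bound and the optimal parameter interact. In the sketch below, the function names `positivity_bound` and `relaxed_step`, the 3-node line and the ratio 1/4 are assumptions of the example; the eigenvalues are entered by hand for this particular matrix instead of being computed.

```python
def positivity_bound(m, w):
    """R^(t): relaxation below which no load can become negative."""
    w_min = min(w)
    bounds = [
        w_i / ((1.0 - m[i][i]) * (w_i - w_min))
        for i, w_i in enumerate(w)
        # Nodes holding the minimum load, or without any living edge,
        # impose no bound (their denominator would be zero).
        if w_i > w_min and m[i][i] < 1.0
    ]
    return min(bounds) if bounds else float("inf")


def relaxed_step(m, w, beta):
    """W^(t+1) = (1 - beta) W^(t) + beta M^(t) W^(t)."""
    mw = [sum(a * b for a, b in zip(row, w)) for row in m]
    return [(1.0 - beta) * w_i + beta * mw_i for w_i, mw_i in zip(w, mw)]


# Hypothetical 3-node line with alpha = 1/4 on both edges.
M = [[0.75, 0.25, 0.0],
     [0.25, 0.50, 0.25],
     [0.0, 0.25, 0.75]]
W = [8.0, 0.0, 0.0]
# The eigenvalues of this M are 1, 0.75 and 0.25, so s = 0.25 and l = 0.75.
beta_opt = 2.0 / (2.0 - (0.25 + 0.75))
beta = min(positivity_bound(M, W), beta_opt)
W_next = relaxed_step(M, W, beta)
```

With $\beta = 1$ the relaxed step reduces to the plain diffusion step, which gives a quick sanity check of any implementation.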
5. Experimentation. This section shows the results of implementations of the first order scheme (FOS), relaxed first order scheme (RFOS) and GAE algorithms. For GAE, three different neighbor choices are implemented to illustrate the effect of this choice. The goal of these experiments is to illustrate the behavior of these algorithms and to highlight their convergence toward the uniform work load distribution with some broken edges. These experiments were performed with the mathematical software Scilab [18]. To study the behavior of these algorithms without external interaction, the load balancing is not applied to a real work load. The load of each processor is virtual; it is simulated by an integer. Moreover, dynamic networks are simulated by introducing a percentage of reliability for edges.

The simulations are performed on various networks: line (64), ring (64), mesh 2D (8x8), mesh 3D (4x4x4), torus (8x8) and hypercube (d6) with respectively 63, 64, 112, 144, 128 and 192 edges. At the initialization, all the load of the system (6400) is given to node 0. For each network, simulations are run with a percentage of broken edges per iteration (bpi) between 0% and 50% in steps of 5%. The convergence criterion is fixed to 1; in other words, the algorithms are stopped when the difference between the most and the least loaded nodes is less than 1.
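The experimental protocol can be imitated on a small scale with the sketch below. It is not the Scilab code used for the paper: the function name `simulate_fos`, the 8-node ring, the ratio 1/3, the real-valued loads, and the reading of bpi as an independent failure probability per edge and per iteration are all assumptions made for illustration.

```python
import random


def simulate_fos(edges, alpha, bpi, w, rng, max_iters=100000):
    """Count the FOS iterations needed until max(w) - min(w) < 1.

    At every iteration each edge is broken independently with probability
    bpi (an assumed reading of "broken edges per iteration").
    """
    iterations = 0
    while max(w) - min(w) >= 1.0 and iterations < max_iters:
        living = [e for e in edges if rng.random() >= bpi]
        new_w = list(w)
        for i, j in living:
            flow = alpha * (w[j] - w[i])     # uses the loads of time t only
            new_w[i] += flow
            new_w[j] -= flow
        w = new_w
        iterations += 1
    return iterations, w


# Hypothetical small ring of 8 nodes; all the load starts on node 0.
n = 8
ring = [(i, (i + 1) % n) for i in range(n)]
start = [800.0] + [0.0] * (n - 1)
rng = random.Random(0)                       # fixed seed for repeatability
its_0, _ = simulate_fos(ring, 1.0 / 3.0, 0.0, start, rng)
its_30, final = simulate_fos(ring, 1.0 / 3.0, 0.3, start, rng)
```

Comparing `its_0` and `its_30` gives a rough feeling for how broken edges delay convergence on one topology; the figures reported in the tables come from the authors' own simulations on 64-node networks.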
For the relaxed diffusion algorithm, the parameter $\beta^{(t)}$ must be determined at each time step. The calculation of $\beta^{(t)}$ needs global information on the network, which is not convenient on distributed systems. Thus $\beta$ is defined as the optimal parameter corresponding to the initial network (without communication failure) with $W^{(0)}$, and is used at each time step ($\beta^{(t)} = \beta$):

\[ \beta = \min \left\{ R, \ \frac{2}{2 - (s + l)} \right\} \quad \text{with} \quad R = \min_i \frac{w_i^{(0)}}{\left( 1 - M_{ii} \right) \left( w_i^{(0)} - w_{\min}^{(0)} \right)}, \tag{5.1} \]

where $s$ and $l$ are respectively the smallest and the second largest eigenvalue of $M$. Table 5.1 gives $\beta$ computed at the initialization with equation (5.1) for each network.

         line     ring     mesh     cube     torus    hyper.
beta     1.5      1.496    1.25     1.166    1.164    1
Table 5.1
Beta computed with equation (5.1).

The values of $\beta$ given in Table 5.1 are used for each simulation.

The GAE algorithm needs a neighbor choice at each iteration to balance the load. For the simulations, three choices are studied: arbitrary, random, and most to least loaded. The arbitrary choice is equivalent to the graph coloring of GDE: for a given node at a given time step, the edge to use is predefined. If this edge is broken, the two corresponding nodes do not balance their load at this time step. The random choice "RAND" randomly elects a living edge for each node such that, for each chosen edge, the two corresponding nodes are not balanced. In the most to least loaded choice "M2LL", each node chooses the living edge that links it with the neighbor for which the difference of load is the highest.

The results given in Tables 5.2 to 5.7 show the numbers of iterations needed to reach a uniform load distribution for each network and each reliability.
             line     ring     mesh     cube     torus    hyper.
FOS          6595     1648     193      72       49       20
RFOS         4395     1185     154      61       43       20
GAE          4395     1098     150      55       36       6
GAE RAND     6674     1652     218      81       67       36
GAE M2LL     4395     1098     135      55       34       6
Table 5.2
Results of load balancing with 0% bpi.

             line     ring     mesh     cube     torus    hyper.
FOS          7248     1816     215      80       53       22
RFOS         4831     1211     170      69       46       22
GAE          5267     1317     185      70       46       17
GAE RAND     7513     1872     235      84       66       34
GAE M2LL     5635     1394     161      56       41       17
Table 5.3
Results of load balancing with 10% bpi.
             line     ring     mesh     cube     torus    hyper.
FOS          7974     1991     236      89       61       24
RFOS         5310     1332     188      75       51       24
GAE          6227     1562     221      86       56       28
GAE RAND     8182     2041     252      88       74       35
GAE M2LL     6362     1599     173      58       43       15
Table 5.4
Results of load balancing with 20% bpi.

             line     ring     mesh     cube     torus    hyper.
FOS          8758     2226     256      99       69       28
RFOS         5841     1483     208      84       57       27
GAE          7275     1883     262      120      68       35
GAE RAND     8829     2236     261      90       71       38
GAE M2LL     7068     1798     185      59       47       20
Table 5.5
Results of load balancing with 30% bpi.

These tables show that RFOS is always better than FOS for Beta greater than 1. If Beta is equal to 1, RFOS is equivalent to FOS; this is the case with hypercube networks.

These tables illustrate the influence of neighbor choices for GAE algorithms. M2LL is always better than the two other choice strategies or equivalent to GAE in the case of 0% bpi. In most cases, RAND is the least efficient choice. RAND can be better than classical GAE for some network topologies if the percentage of broken edges is high. This is the case for cubes with a bpi greater than 20%, for meshes with a bpi greater than 30%, for hypercubes with a bpi greater than 35% and for tori with a bpi greater than 40%.

Figure 5.1 gives, for each studied network, the number of iterations as a function of bpi. The plots in Figure 5.1 are obtained by linear regression from the data given in Tables 5.2 to 5.7. These plots bring out the effect of bpi on the convergence of the load balancing algorithms for each network. These results show that the most efficient algorithms are RFOS and GAE with the M2LL choice. The comparison of these two algorithms shows that their effectiveness depends on the network topology. RFOS is better suited than GAE M2LL to line and ring topologies, whereas GAE M2LL is better than RFOS for the other studied topologies.

With these simulations, we experimentally show that these load balancing methods converge with broken edges. Moreover, these simulations illustrate the influence of network topologies and of the percentage of broken edges on the effectiveness of load balancing methods.
             line     ring     mesh     cube     torus    hyper.
FOS          9786     2449     286      109      75       30
RFOS         6515     1628     232      93       64       30
GAE          8679     2155     302      122      76       44
GAE RAND     9581     2403     280      95       82       39
GAE M2LL     7880     1970     199      65       51       22
Table 5.6
Results of load balancing with 40% bpi.

             line     ring     mesh     cube     torus    hyper.
FOS          10750    2734     317      123      81       35
RFOS         7155     1824     252      104      71       34
GAE          9989     2563     346      155      95       47
GAE RAND     10322    2624     303      97       85       43
GAE M2LL     8617     2184     217      69       55       20
Table 5.7
Results of load balancing with 50% bpi.

6. Conclusion and future works. Some enhancements of load balancing models (First Order, Relaxed First Order and GDE) are proposed in this paper. These new models can be used on dynamic networks. They are useful when the topology may change due to failures in the communication links and are well suited for large problems that need to share computations among distant processors, as is the case in grid computing.

A main result of this paper is that we have given the necessary and sufficient conditions for convergence in the dynamic networks framework. In other words, we prove that if we work on dynamic networks like the Internet, where the communication links are not 100% reliable (communication failure or low bandwidth communication), these algorithms can balance the loads of the system. Another important result is the enhancement of the diffusion scheme by a relaxation parameter. These algorithms are synchronous, in the sense that, to balance its load at time $t + 1$, a processor needs to have the load of its neighbor at time $t$. Finally, the paper is concluded by significant experiments that lead to instructive observations. The next steps of this work are to apply these algorithms to a real distributed scientific application and to adapt them to an actual grid environment by considering the communication delays between the processors.
7. Proof.

7.1. Technical lemma. At a time $t$, the adjacency matrix $I^{(t)}$ of a diffusion matrix $M^{(t)}$ is defined by

\[ I_{ij}^{(t)} = \begin{cases} 1 & \text{if } i \text{ and } j \text{ communicate at time } t \text{ or if } i = j, \\ 0 & \text{otherwise.} \end{cases} \]

Let us define the adjacency matrix of the superposed communication graph $G_{t,T}$ of the network between times $t$ and $T$ by

\[ I_{ij}^{(t,T)} = I_{ij}^{(t)} + I_{ij}^{(t+1)} + \cdots + I_{ij}^{(T-1)} + I_{ij}^{(T)}, \]

where $1 + 1 = 1$; $1 + 0 = 0 + 1 = 1$; $0 + 0 = 0$.

Lemma 7.1. For any $m$ adjacency matrices $I^{(1)}, \ldots, I^{(m)}$ we have

\[ I^{(1)} + I^{(2)} + \cdots + I^{(m)} \le I^{(m)} I^{(m-1)} \cdots I^{(1)}, \]

where we use the usual product of matrices and where the sum of boolean numbers is defined above. The order relation $\le$ between two matrices is the partial ordering element by element.

Proof. By induction: it is sufficient to prove that if $\left( I^{(k)} I^{(k-1)} \right)_{ij} = 0$ then $\left( I^{(k)} + I^{(k-1)} \right)_{ij} = 0$. We have:

\[ \left( I^{(k)} I^{(k-1)} \right)_{ij} = 0 \iff \sum_{l=1}^{n} \left( I^{(k)} \right)_{il} \left( I^{(k-1)} \right)_{lj} = 0 \iff \left( I^{(k)} \right)_{il} \left( I^{(k-1)} \right)_{lj} = 0 \ \text{for all } l \in \{1, \ldots, n\}, \]

particularly for $l = j$ and $l = i$, so $\left( I^{(k)} \right)_{ij} \left( I^{(k-1)} \right)_{jj} = 0$ and $\left( I^{(k)} \right)_{ii} \left( I^{(k-1)} \right)_{ij} = 0$. As any diagonal entry of an adjacency matrix is 1, we deduce that $\left( I^{(k)} \right)_{ij} = 0$ and $\left( I^{(k-1)} \right)_{ij} = 0$, so that

\[ \left( I^{(k)} + I^{(k-1)} \right)_{ij} = \left( I^{(k)} \right)_{ij} + \left( I^{(k-1)} \right)_{ij} = 0. \]

Lemma 7.2. If the superposed graph $G_{t,t+m}$ of $m + 1$ networks, the diffusion matrices of which are $M^{(t)}, M^{(t+1)}, \ldots, M^{(t+m)}$, is connected, then $M = M^{(t+m)} M^{(t+m-1)} \cdots M^{(t)}$ is an irreducible matrix.

Proof. Indeed, the adjacency matrix of $M$ is $I = I^{(t+m)} \cdots I^{(t)}$; on the other hand, the adjacency matrix of the superposed graph is $I^{(t)} + I^{(t+1)} + \cdots + I^{(t+m)}$. Thanks to Lemma 7.1, we deduce that if any entry of the adjacency matrix of the superposed graph is 1, the corresponding entry of $I$ is also nonzero, so if the superposed graph is connected then $M$ is irreducible.
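Lemma 7.1 can be checked numerically on a small case. The sketch below is only a sanity check on one example, not a proof; the helper names `bool_sum`, `product` and `adjacency` are hypothetical, and the two edge lists are the communication graphs of the example in Section 3.

```python
def bool_sum(a, b):
    """Entrywise boolean sum (1 + 1 = 1), as used for superposed graphs."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def product(a, b):
    """Usual integer matrix product a b."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def adjacency(n, edges):
    """Adjacency matrix with ones on the diagonal, nodes labeled 1..n."""
    m = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        m[i - 1][j - 1] = m[j - 1][i - 1] = 1
    return m


I1 = adjacency(4, [(1, 2), (3, 4)])
I2 = adjacency(4, [(2, 3)])
S = bool_sum(I1, I2)             # adjacency of the superposed graph
P = product(I2, I1)              # ordinary product, later matrix on the left
dominated = all(S[i][j] <= P[i][j] for i in range(4) for j in range(4))
```

Every edge present in the superposed graph also shows up as a nonzero entry of the product, which is the property Lemma 7.2 relies on.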
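7.2. Proof of Theorem 3.2.

- Sufficient condition:
By hypothesis, there always exists a time step $t \in \mathbb{N}$ such that the superposed connection graph of the network is connected, and so by Lemma 7.2, for any time step $t > t_0$ there always exist irreducible matrices $T^{(p_i)}$ such that

\[ M^{(t)} M^{(t-1)} \cdots M^{(1)} = M^{(t)} \cdots M^{(t-L)} \, T^{(p_\theta)} T^{(p_{\theta-1})} \cdots T^{(p_1)}, \]

where

\[ T^{(p_\theta)} = M^{(t-L-1)} \cdots M^{(L_\theta)}, \quad T^{(p_{\theta-1})} = M^{(L_\theta - 1)} \cdots M^{(L_{\theta-1})}, \quad \ldots, \quad T^{(p_1)} = M^{(L_2 - 1)} \cdots M^{(L_1)}, \]

and $L$ and the $L_i$ are finite integers (so when $t$ tends to infinity, $\theta$ also tends to infinity and $\lim_{\theta \to \infty} p_\theta = \infty$).

It can easily be seen that the matrices $T^{(p_i)}$ are doubly stochastic, not bipartite ($-1$ is not an eigenvalue) and irreducible. We know that if a matrix $A$ is irreducible and does not have $-1$ as an eigenvalue, then there exists $l$ such that $A^l$ is a positive matrix. These deductions imply that

\[ \forall i, \quad \lim_{k \to \infty} \left( T^{(p_i)} \right)^k = Q = \frac{1}{n} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}. \]

Let $\lambda^{(p_j)}$ denote the second largest eigenvalue of $T^{(p_j)}$; then $0 \le \lambda^{(p_j)} < 1$.

Recall that for a row stochastic matrix $T$, we have $\|T\|_\infty = 1$, where $\|T\|_\infty$ is the usual compatible maximum norm, and that if $T$ is symmetric and doubly stochastic,

\[ \|T\|_2 \le \sqrt{\|T\|_1 \|T\|_\infty} = \sqrt{\|T'\|_\infty \|T\|_\infty} = 1, \tag{7.1} \]

where $T'$ is the transposed matrix of $T$ and $\|T\|_1$ and $\|T\|_2$ are respectively the usual $l_1$ and the Euclidean matrix norms.

Let $w^{(t)}$ be the load over the network at time $t$, $w^{(0)}$ be the initial load and $\bar{w} = \left( \frac{\sum_i w_i^{(0)}}{n}, \ldots, \frac{\sum_i w_i^{(0)}}{n} \right)$ the uniform distributed load. Then

\[ \begin{aligned} \left\| w^{(t+1)} - \bar{w} \right\|_2 &= \left\| M^{(t)} \cdots M^{(t-L)} T^{(p_\theta)} T^{(p_{\theta-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \\ &\le \left\| M^{(t)} \right\|_2 \cdots \left\| M^{(t-L)} \right\|_2 \left\| T^{(p_\theta)} T^{(p_{\theta-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \\ &\le \left\| T^{(p_\theta)} T^{(p_{\theta-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \quad \text{due to (7.1)} \\ &= \left\| T^{(p_\theta)} \left( T^{(p_{\theta-1})} \cdots T^{(p_1)} w^{(0)} \right) - T^{(p_\theta)} (\bar{w}) \right\|_2 \\ &\le \lambda^{(p_\theta)} \left\| T^{(p_{\theta-1})} \left( T^{(p_{\theta-2})} \cdots T^{(p_1)} w^{(0)} \right) - T^{(p_{\theta-1})} (\bar{w}) \right\|_2 \\ &\ \ \vdots \\ &\le \lambda^{(p_\theta)} \cdots \lambda^{(p_1)} \left\| w^{(0)} - \bar{w} \right\|_2, \end{aligned} \]

so

\[ \lim_{t \to \infty} \left\| w^{(t+1)} - \bar{w} \right\|_2 \le \lim_{\theta \to \infty} \lambda^{(p_\theta)} \cdots \lambda^{(p_1)} \left\| w^{(0)} - \bar{w} \right\|_2. \]

Since the number of edges is finite, the number of connection graphs is finite, so there exists $k$ such that for all $j \in \mathbb{N}$, $\lambda^{(p_j)} \le \lambda^{(p_k)} < 1$; so

\[ \lim_{\theta \to \infty} \lambda^{(p_\theta)} \cdots \lambda^{(p_1)} \le \lim_{\theta \to \infty} \left( \lambda^{(p_k)} \right)^\theta = 0. \]

The last inequality implies that $\lim_{t \to \infty} \left\| w^{(t)} - \bar{w} \right\|_2 = 0$, so $\forall i \in \{1, \ldots, n\}$, $w_i^{(t)} \to \bar{w}_i$; in other words, the load of each processor tends to the uniform distributed load $\bar{w}_i = \frac{\sum_i w_i^{(0)}}{n}$.

- Necessary condition:
It is obvious: otherwise a processor is never reached, implying that its work is never balanced.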
REFERENCES

[1] G. Cybenko. Dynamic load balancing for distributed memory multiprocessors. Jour. of Para. and Dist. Comp., 7:279-301, 1989.
[2] S.H. Hosseini, B. Litow, M. Malkawi, J. McPherson, and K. Vairavan. Analysis of a graph coloring based distributed load balancing algorithm. Jour. of Para. and Dist. Comp., 10:160-166, 1990.
[3] B. Litow, S.H. Hosseini, K. Vairavan, and G.S. Wolffe. Performance characteristics of a load balancing algorithm. Jour. of Para. and Dist. Comp., 31:159-165, 1995.
[4] R. Diekmann, A. Frommer, and B. Monien. Efficient schemes for nearest neighbor load balancing. Para. Comp., 25:289-313, 1998.
[5] J.E. Boillat. Load balancing and Poisson equation in a graph. Concurrency: Prac. and Expe., 2(4):289-313, 1990.
[6] J.M. Bahi and J. Gaber. Load balancing on networks with dynamically changing topology. In Europar 2001 conf., Lect. Notes on Comp. Scie., pages 175-182, 2001.
[7] D.P. Bertsekas and J.N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Englewood Cliffs NJ, Prentice-Hall, 1989.
[8] C.Z. Xu and F.C.M. Lau. Optimal parameters for load balancing with the diffusion method in mesh networks. Parallel Processing Letters, 4(1-2):139-147, 1994.
[9] C.Z. Xu and F.C.M. Lau. Analysis of the generalized dimension exchange method for dynamic load balancing. Jour. of Para. and Dist. Comp., 16(4):385-393, 1992.
[10] R. Elsässer, B. Monien, and R. Preis. Diffusion schemes for load balancing on heterogeneous networks. Theory of Computing Systems, 35:305-320, 2002.
[11] W. Aiello, B. Awerbuch, B. Maggs, and S. Rao. Approximate load balancing on dynamic and asynchronous networks. In Proc. of the 25th Annual ACM Symposium on the Theory of Comp., pages 125-136, San Diego, California, 1993.
[12] B. Aiello and T. Leighton. Coding theory, hypercube embeddings, and fault tolerance. In Proc. of the 3rd Annual ACM Symposium on Para. Algo. and Arch., pages 125-136, Hilton Head, South Carolina, 1991.
[13] F.T. Leighton, B.M. Maggs, and R.K. Sitaraman. On the fault tolerance of some popular bounded-degree networks. SIAM Jour. on Comp., 27(5):1303-1333, 1998.
[14] J.M. Bahi, R. Couturier, and F. Vernier. Broken edges and dimension exchange algorithm on hypercube topology. 11-th Euromicro Conf. on Para. Dist. and Netw. based Proc., 2003.
[15] J.M. Bahi, R. Couturier, and F. Vernier. Accelerated diffusion algorithms on general dynamic networks. In Proc. of 5th Inter. Conf., PPAM Czestochowa, Poland, (to appear), 2003.
[16] S. Fiorini and R.J. Wilson. Edge-coloring of graphs. In L.W. Beineke and R.J. Wilson, editors, Selected topics in graph theory. Academic Press, 1978.
[17] A. Berman and R.J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, Philadelphia, third edition, 1979 edition, 1994.
[18] P.S. Motta Pires and D.A. Rogers. Free/Open Source Software: An Alternative for Engineering Students. In 32nd ASEE/IEEE Frontiers in Education Conference, IEEE, 2002.
[Figure 5.1: six plots, (a) Line, (b) Ring, (c) Mesh, (d) Cube, (e) Torus, (f) Hypercube. Each plot shows the number of iterations (vertical axis, "Iterations") against the percentage of broken edges per iteration from 0 to 50 (horizontal axis, "% B.P.I.") for FOS, RFOS, GAE, GAE_RAND and GAE_M2LL.]
Fig. 5.1. Number of iterations as a function of bpi for each studied network (after a linear interpolation).