SYNCHRONOUS DISTRIBUTED LOAD BALANCING ON DYNAMIC NETWORKS

JACQUES BAHI, RAPHAËL COUTURIER, AND FLAVIEN VERNIER

Abstract. In this paper, three distributed load balancing algorithms for dynamic networks are investigated. Dynamic networks are networks in which the topology may change dynamically. The definition of a dynamic network is introduced and its graph model is presented. The main result of this study is a proof that the diffusion algorithm converges toward the uniform load distribution on an arbitrary dynamic network despite communication link failures. We also give two adaptations of this algorithm (GAE and relaxed diffusion). The hypotheses of our result are realistic; for example, the network does not have to remain connected. To study the behavior of these algorithms, we compare the load evolution in several simulations.

Key words. load balancing, dynamic networks, iterative algorithm, first order, dimension exchange.

1. Introduction. One of the most important problems in distributed processing consists in balancing the work load among all processors. The purpose of load (work) balancing is to achieve better performance of distributed computations by improving load allocation. The load balancing problem has been studied by several authors from different points of view [1,2,3,4,5,6,7]. Various techniques can be used to perform load balancing; they are categorized in [2] according to different criteria: centralized/distributed, static load/dynamic load, and synchronous/asynchronous.

In distributed systems, the schedules of the load balancing problem are iterative in nature and their behavior can be characterized by iterative methods derived from linear systems theory. Local iterative load balancing algorithms were first proposed by Cybenko in [1]. These algorithms iteratively balance the load of a node with its neighbors until the whole network is globally balanced. There are mainly two iterative load balancing algorithms: the diffusion algorithms [1,5,8] and the dimension exchange algorithms [1,2,9]. Diffusion algorithms assume that a processor simultaneously exchanges load with all its neighbor processors, whereas dimension exchange algorithms assume that a processor exchanges load with only one neighbor processor in each dimension at each time step. These algorithms, however, have been derived for use on homogeneous or heterogeneous networks [10] with fixed topologies. Nowadays, with grid computing, problems of communication failures or communication timeouts (i.e. low bandwidth communication) appear.

In a network with dynamically changing topology (i.e. a dynamic network), the set of edges in the network may vary at each time step. In this paper, a dynamic network is a network with dynamic links. We suppose that no computer can be added to or permanently removed from a dynamic network. At a given time, we define a living edge as an edge that can transmit one message in each direction [11,12,13]. At each time step, each node in a dynamic network knows which of its edges are alive. A dynamic network can be viewed here as a network in which some edges fail during the execution of an algorithm. These considerations lead us to define a new criterion to categorize load balancing techniques: the static/dynamic network criterion. According to the above criteria, the following algorithms are synchronous and distributed. They are models for static load balancing and dynamic networks. A static load is used only to simplify the algorithms; it is easy to apply them with dynamic load (for more information, see [1]).

Laboratoire d'Informatique de l'Université de Franche-Comté (LIFC), IUT de Belfort-Montbéliard, BP 527, 90016 Belfort CEDEX, France. {bahi,couturie,vernier}@iut-bm.univ-fcomte.fr

The main result of this study is a proof that the diffusion algorithm converges toward the uniform load distribution on an arbitrary dynamic network despite communication link failures. The hypotheses of our result are realistic (see Theorem 3.2 and Remark 1). Based on this algorithm, two variants are proposed. The former, which we call GAE for Generalized Adaptive Exchange, is a new algorithm. It constrains each node to exchange its load with at most one of its neighbors. Thus, GAE may be considered as the extension of GDE to dynamic networks. The latter, which we call relaxed diffusion, introduces a relaxation parameter in the diffusion algorithm. The relaxation parameter may dramatically speed up the convergence.

The results of this paper unify those of two previous works [14] and [15], and introduce for the first time the GAE algorithm. Indeed, in [14] we only consider the case of hypercube topology networks, and at each time step t there is only one dimension d(t) on which to balance the load; if there is a link failure on this dimension then the processor does not balance its load. In [15], the diffusion and relaxed diffusion were introduced. The GAE algorithm has not been proposed in any earlier paper. Moreover, this paper is intended to propose three first-order algorithms designed for any dynamic network.

This paper is organized as follows. Section 2 presents the related works: we review the diffusion and the dimension exchange on static networks. In Section 3, we introduce a graph model for dynamic networks and the diffusion load balancing algorithm for dynamic networks. Section 4 presents two algorithms derived from the diffusion. The first one is GAE, an adaptation of GDE for dynamic networks, and the second one is the relaxed diffusion load balancing algorithm, which speeds up the convergence of the classical diffusion algorithm. Section 5 illustrates the behavior of our algorithms and presents experiments. In Section 6 we conclude our work, and the last section (7) is devoted to the detailed proof of our main result.

2. Related Works. The algorithms studied in this paper are derived from the diffusion model and some of its variants. This section describes those models.

Classically, a static network topology is represented by a simple undirected connected graph $G = (V,E)$, where $V$ is the set of vertices and $E$ is the set of edges, $E \subseteq V \times V$. Each computing processor is a vertex of the graph and each communication link between two processors $i,j$ is the edge $\{i,j\} \in E$ between the two vertices $i$ and $j$ ($i,j \in V$). By definition, the vertices are labeled from 1 to $n$ where $n$ is the number of processors, so $|V| = n$. Let $m$ be the number of communication links ($|E| = m$).

In [1] Cybenko introduced two distributed load balancing algorithms for static networks. The first one, called diffusion, assumes that a process $i$ balances its load simultaneously with all its neighbors. To balance the load, a ratio $\alpha_{ij}$ of the difference of load between the processes $i$ and $j$ is swapped between $i$ and $j$. For a process $i$, the load balancing step with all its neighbors is given by equation (2.1), where $w_i^{(t)}$ is the work load done by process $i$ at time $t$.

$$w_i^{(t+1)} = w_i^{(t)} + \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \qquad (2.1)$$
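Equation (2.1) can be illustrated with a minimal Python sketch. The function name `diffusion_step`, the neighbor-list representation and the single constant `alpha` (used in place of individual $\alpha_{ij}$) are assumptions made for this illustration only.

```python
def diffusion_step(load, neighbors, alpha):
    """One synchronous step of equation (2.1) on a static network.

    load      -- list of work loads, load[i] is w_i at time t
    neighbors -- dict mapping node i to the list of its neighbors
    alpha     -- assumed constant diffusion ratio, same for every edge
    """
    return [
        w_i + sum(alpha * (load[j] - w_i) for j in neighbors[i])
        for i, w_i in enumerate(load)
    ]


# Hypothetical example: path 0 - 1 - 2 with all the load on node 0.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
first = diffusion_step([9.0, 0.0, 0.0], neighbors, alpha=1 / 3)

load = [9.0, 0.0, 0.0]
for _ in range(50):
    load = diffusion_step(load, neighbors, alpha=1 / 3)
```

Every new value is computed from the loads at time $t$ only, which is what makes the step synchronous. Because each edge moves the same amount in opposite directions, the total load is conserved.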

The second algorithm, called Dimension Exchange (DE), is a diffusion algorithm in which a processor communicates with only one processor at a given time step, and balances load after each communication. The DE algorithm assumes that all processes of a $D$-dimensional binary hypercube balance their load on the same dimension $d$ at time $t$. Between two processes $i$ and $j$ linked on the dimension $d$, with $d = t \bmod D$, the DE algorithm for the process $i$ is given by equation (2.2).

$$w_i^{(t+1)} = w_i^{(t)} + \frac{w_j^{(t)} - w_i^{(t)}}{2} \qquad (2.2)$$
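As a sketch of equation (2.2), the following Python fragment runs DE on a binary hypercube. It assumes that nodes are numbered from 0 to $2^D - 1$ and that the neighbor of node $i$ on dimension $d$ is obtained by flipping bit $d$ of $i$; this numbering is a common convention, not something stated in the text.

```python
def dimension_exchange_step(load, t, D):
    """One DE step (2.2) on a D-dimensional binary hypercube."""
    d = t % D                      # dimension balanced at time t
    new_load = list(load)
    for i in range(len(load)):
        j = i ^ (1 << d)           # assumed labeling: flip bit d to get the neighbor
        new_load[i] = load[i] + (load[j] - load[i]) / 2
    return new_load


D = 3
load = [8.0] + [0.0] * (2 ** D - 1)   # all the load on node 0
for t in range(D):
    load = dimension_exchange_step(load, t, D)
```

On a fault-free hypercube, one sweep over the $D$ dimensions is enough to reach the uniform distribution in this example.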

In [2,9] the authors introduced a DE algorithm, called Generalized Dimension Exchange (GDE), for an arbitrary fixed network topology. In order to represent the dimensions used in the classical DE algorithm, they introduce an edge coloring of the graph in which each color represents one dimension. The edges of $G$ are supposed to be colored with the smallest number of colors (say $k$). If $\Delta(G)$ denotes the degree of $G$, it is proved that the minimum number of colors $k$ satisfies $\Delta(G) \le k \le \Delta(G) + 1$ [16]. It is supposed that a vertex $i$ cannot have two edges with the same color. Each color is indexed with one integer from 0 to $k-1$.

The graph $G$ of the network topology is transformed into the $k$-color graph $G_k = (V,E_k)$. $E_k$ is a set of 3-tuples $\{i,j,c\}$ where $\{i,j\} \in E$ is an edge between vertices $i$ and $j$, and $c$ is the color of the edge $\{i,j\}$ ($0 \le c \le k-1$).

When GDE balances the load on a dimension (a color $c$), each vertex with an edge of color $c$ balances its load with its neighbor on this edge. For a process $i$ the GDE algorithm is defined by equation (2.3).

$$w_i^{(t+1)} = \begin{cases} w_i^{(t)} + \lambda \left( w_j^{(t)} - w_i^{(t)} \right) & \text{if } \exists j \text{ such that } \{i,j,c\} \in E_k, \\ w_i^{(t)} & \text{otherwise,} \end{cases} \qquad (2.3)$$

where $c$ is a chromatic index such that $c = t \bmod k$ and $\lambda \in \, ]0,1[$ is the exchange parameter chosen according to the network topology.
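The GDE step (2.3) can be sketched as follows. The edge list with explicit colors, the name `gde_step` and the parameter name `lam` for the exchange parameter are illustrative assumptions; the coloring is supplied by hand and is assumed to be proper (no node has two edges of the same color).

```python
def gde_step(load, colored_edges, t, k, lam):
    """One GDE step (2.3): only edges whose color is t mod k are used."""
    c = t % k
    new_load = list(load)
    for i, j, color in colored_edges:
        if color == c:
            # Both endpoints apply (2.3) using the loads at time t.
            new_load[i] = load[i] + lam * (load[j] - load[i])
            new_load[j] = load[j] + lam * (load[i] - load[j])
    return new_load


# Hypothetical example: ring of 4 nodes, properly edge-colored with k = 2 colors.
colored_edges = [(0, 1, 0), (1, 2, 1), (2, 3, 0), (3, 0, 1)]
load = [8.0, 0.0, 0.0, 0.0]
for t in range(2):
    load = gde_step(load, colored_edges, t, k=2, lam=0.5)
```

With a proper coloring each node appears in at most one edge of the active color, so no entry of `new_load` is written twice in a step.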

All these algorithms are defined for static networks and require a prescribed order for load balancing (dimension or chromatic order). If the communication medium is the Internet, communication problems can appear; the network topology then becomes dynamic, and an imposed order can be a constraint for the load balancing.

3. Diffusion Load Balancing on Dynamic Networks. This section recalls the work done in [15]: the definition of dynamic networks, their graph representation and the diffusion load balancing algorithm adapted to dynamic networks.

The diffusion algorithmic model needs a graph representation for the dynamic network. A dynamic network is a network in which some links evolve with time. It can lose some edges due to communication failures or communication timeouts, as may be the case on the Internet, but no computer can be added to or permanently removed from the network. A classical undirected connected graph $G = (V,E)$ is used for the global network and we define the set $E_B^{(t)}$ as the set of broken edges at time $t$. So $G^{(t)} = (V,E,E_B^{(t)})$ is a graph model for dynamic networks. As in a classical graph, $V$ is the set of processors, $E$ is the set of edges, $E$ is a subset of $V \times V$, each edge $\{i,j\} \in E$ is a communication link between processors $i$ and $j$ ($i,j \in V$), $|V| = n$ and $|E| = m$. $E_B^{(t)}$ is a subset of $E$. Figure 3.1 illustrates a possible evolution of a dynamic network. It should be noted that if $E_B^{(t)}$ is empty at every time $t$, $G^{(t)} = (V,E,E_B^{(t)})$ is the static network $G = (V,E)$.

In the context of dynamic networks, the standard diffusion scheme requires some adaptations due to the dynamic nature of the topology. The main difference lies in a relevant adaptation of the diffusion matrix, which needs to dynamically integrate information about link failures.

Fig. 3.1. Time evolution of a dynamic network on nodes 1 to 4. (a) Time $t$, $E_B^{(t)} = \{(2,3)\}$. (b) Time $t+1$, $E_B^{(t+1)} = \{(2,3),(1,3)\}$. (c) Time $t+2$, $E_B^{(t+2)} = \{(1,3)\}$. (d) Time $t+3$, $E_B^{(t+3)} = \{\}$.

The following example illustrates a possible situation in the case of dynamic networks: suppose that, using the standard diffusion algorithm, a processor $i$ must balance its load with all its neighbors, called $j$, $k$ and $l$. If the edge between $i$ and $j$ is broken, then $i$ cannot balance its load with its neighbor $j$. Nevertheless $i$ can still balance its load over its living edges, i.e. $(i,k)$ and $(i,l)$.

The diffusion algorithm with dynamic networks may be described as follows: for a processor $i$, the exchange of its workload with its reachable neighbors $j$ is executed as algorithm (3.1).

$$w_i^{(t+1)} = w_i^{(t)} + \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \quad \text{for all living edges } (i,j) \qquad (3.1)$$

where $w_i^{(t)}$ is the workload of processor $i$ at time $t$ and $\alpha_{ij}$ is defined as in (2.1).

Equation (3.1) is linear and can be written as the vector equation (3.2), which updates the load of all nodes at time $t$.

$$W^{(t+1)} = M^{(t)} W^{(t)} \qquad (3.2)$$

where $W^{(t)}$ is the vector of the $w_i^{(t)}$ and $M^{(t)}$ is defined by

$$m_{ij}^{(t)} = \begin{cases} \alpha_{ij} & \text{if } (i,j) \in E \wedge (i,j) \notin E_B^{(t)} \wedge i \ne j, \\ 1 - \sum_k \alpha_{ik} & \forall k \mid (i,k) \in E \wedge (i,k) \notin E_B^{(t)}, \text{ if } i = j, \\ 0 & \text{otherwise.} \end{cases}$$

$M^{(t)}$ is the diffusion matrix at time $t$; it represents the incidence matrix of the communication graph at this time step. The evolution of the workload distribution between times $t$ and $t+1$ is given by $W^{(t+1)} = M^{(t)} W^{(t)}$.

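In the example of Figure 3.2, matrices $M^{(t+1)}$ and $M^{(t+2)}$ are:

$$M^{(t+1)} = \begin{bmatrix} 1-\alpha_{1,2} & \alpha_{1,2} & 0 & 0 \\ \alpha_{2,1} & 1-\alpha_{2,1} & 0 & 0 \\ 0 & 0 & 1-\alpha_{3,4} & \alpha_{3,4} \\ 0 & 0 & \alpha_{4,3} & 1-\alpha_{4,3} \end{bmatrix} \quad \text{cf. Fig. 3.2(b)}$$

$$M^{(t+2)} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1-\alpha_{2,3} & \alpha_{2,3} & 0 \\ 0 & \alpha_{3,2} & 1-\alpha_{3,2} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{cf. Fig. 3.2(c)}$$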
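The construction of the diffusion matrix and the update $W^{(t+1)} = M^{(t)} W^{(t)}$ can be sketched in Python. This is only an illustration under stated assumptions: nodes are renumbered from 0, a single constant `alpha` replaces the $\alpha_{ij}$, and the global edge set is a guess inferred from the two matrices of the example and from the remark that edge (1,3) exists but is never alive.

```python
def diffusion_matrix(n, edges, broken, alpha):
    """Build the diffusion matrix for one time step from the living edges."""
    M = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        if (i, j) in broken or (j, i) in broken:
            continue                      # broken edges contribute nothing
        M[i][j] = M[j][i] = alpha
    for i in range(n):
        # Diagonal entry: 1 minus the sum over the living edges of node i.
        M[i][i] = 1.0 - sum(M[i][k] for k in range(n) if k != i)
    return M


def apply_matrix(M, w):
    """Matrix-vector product, i.e. one step of the vector equation."""
    return [sum(m_ij * w_j for m_ij, w_j in zip(row, w)) for row in M]


# Assumed global network (0-based): edges 0-1, 0-2, 1-2, 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
M_b = diffusion_matrix(4, edges, broken={(0, 2), (1, 2)}, alpha=0.25)
M_c = diffusion_matrix(4, edges, broken={(0, 1), (0, 2), (2, 3)}, alpha=0.25)

w = [8.0, 0.0, 0.0, 0.0]
w = apply_matrix(M_c, apply_matrix(M_b, w))
```

Each row of the resulting matrices sums to 1, and a node whose edges are all broken gets the row of the identity matrix, as in the first and last rows of the second matrix above.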

To give our main result, we need the following definition:

Definition 3.1. At each time step, the communication graph for load balancing is the graph which shows only the edges that are used for load balancing communications at this time. A superposed communication graph for load balancing $G_{t,t+n}$ between the times $t$ and $t+n$ is a graph that shows all edges used for load balancing communications between the times $t$ and $t+n$ (see Figure 3.2).

Fig. 3.2. These graphs show a global network, two communication graphs and the corresponding superposed communication graph. In this case, the superposed graph is connected: there is a path between any two nodes. (a) Global network before any load balancing. (b) Time $t$. (c) Time $t+1$. (d) Superposed graph between times $t$ and $t+1$ ($G_{t,t+1}$).

Theorem 3.2. Algorithm (3.1) converges toward the uniform load distribution if and only if to any time $t$ corresponds a time $t+n$ such that the superposed communication graph $G_{t,t+n}$ is a connected graph (see Fig. 3.3).

Remark 1. It should be noted that Theorem 3.2 does not claim that all edges have to be alive during the load balancing process; for example, in Figure 3.2, edge (1,3) is never alive. Moreover, the network can always be disconnected: only the superposed communication graph must be connected to ensure the convergence. This property holds for the series of graphs (3.2(b), 3.2(c), 3.2(b), 3.2(c), ...). In other words, if the communication graphs always oscillate between 3.2(b) and 3.2(c), then the network graph is always disconnected; nevertheless the hypotheses of Theorem 3.2 are satisfied and so the convergence is ensured.

The hypothesis of the theorem means that, at any time $t$, we are sure to get a connected graph at a later time $t+n$ by the virtual superposition of the communication graphs between $t$ and $t+n$. Note that the integer $n$ is not necessarily a constant. Thus, whatever the configuration of the network, we have to be certain to construct a path between any two processors by the artificial superposition of communication graphs (see Fig. 3.3).

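Fig. 3.3. Between two time steps $t_i$ and $t_i + n_i$ the superposed communication graph is connected. (Time axis showing the intervals from $t_1$ to $t_1 + n_1$, $t_2$ to $t_2 + n_2$, $t_3$ to $t_3 + n_3$, $t_4$ to $t_4 + n_4$, and $t_i$ to $t_i + n_i$.)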
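The hypothesis of Theorem 3.2 can be checked on a finite window of communication graphs with a short Python sketch. The helper names `superposed_edges` and `is_connected` and the 0-based node numbering are assumptions made for this illustration; the two graphs mimic the alternating situation described in Remark 1.

```python
def superposed_edges(comm_graphs):
    """Union of the edges used in a sequence of communication graphs."""
    edges = set()
    for graph in comm_graphs:
        edges |= {frozenset(e) for e in graph}
    return edges


def is_connected(n, edges):
    """Depth-first search from node 0 over undirected edges."""
    adj = {i: set() for i in range(n)}
    for e in edges:
        i, j = tuple(e)
        adj[i].add(j)
        adj[j].add(i)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u] - seen:
            seen.add(v)
            stack.append(v)
    return len(seen) == n


graph_b = [(0, 1), (2, 3)]      # living edges used at time t
graph_c = [(1, 2)]              # living edges used at time t + 1

each_alone = [is_connected(4, superposed_edges([g])) for g in (graph_b, graph_c)]
together = is_connected(4, superposed_edges([graph_b, graph_c]))
```

Neither communication graph is connected on its own, yet their superposition is, which is the situation the theorem allows.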

We propose in the next section two algorithms derived from the diffusion for dynamic networks.

4. Derived Algorithms. This section presents two algorithms derived from the diffusion algorithm. The first one is GAE, an adaptation of GDE for dynamic networks, and the second one is the relaxed diffusion algorithm, which speeds up the convergence of the classical diffusion.

4.1. GAE. The interesting property of the GDE algorithm lies in the fact that a node communicates with at most one of its neighbors at each iteration (i.e. a node $i$ balances its load with only one neighbor). This property can be preserved for dynamic networks. In the case of dynamic networks, the GDE algorithm will be called GAE (Generalized Adaptive Exchange).

The GAE algorithmic model can be viewed as a diffusion model in which, at time $t$, for each processor $i$, all living edges except one, $(i,j)$, are considered broken. The neighbor $j$ must be chosen by an arbitrary, random or more sophisticated strategy. With this strategy, all the other edges $(i,k)$ with $k \ne j$ are supposed to be broken, so $(i,k) \in E_B^{(t)}$ for all $k \ne j$. Moreover, in the GAE algorithm, $\alpha_{ij}$ is equal to a constant $\lambda$.

The GAE algorithm may be described as follows: for a processor $i$, the exchange of its workload with a neighbor $j$ is executed as algorithm (4.1).

$$w_i^{(t+1)} = \begin{cases} (1-\lambda)\, w_i^{(t)} + \lambda\, w_j^{(t)} & \text{if } i \text{ communicates with its neighbor } j \text{ at time } t, \\ w_i^{(t)} & \text{if } i \text{ does not communicate with any neighbor at time } t. \end{cases} \qquad (4.1)$$

With these conditions, the diffusion matrix $M$ becomes:

$$m_{ij}^{(t)} = \begin{cases} \lambda & \text{if } (i,j) \in E \wedge (i,j) \notin E_B^{(t)} \wedge i \ne j, \\ 1-\lambda & \text{if } \exists k \mid (i,k) \in E \wedge (i,k) \notin E_B^{(t)}, \text{ and } i = j, \\ 1 & \text{if } \nexists k \mid (i,k) \in E \wedge (i,k) \notin E_B^{(t)}, \text{ and } i = j, \\ 0 & \text{otherwise.} \end{cases} \qquad (4.2)$$

A processor $i$ can communicate with a neighbor $j$ if and only if the edge between $i$ and $j$ is not broken, $i$ does not communicate with another neighbor $h$, and $j$ does not communicate with another node.

Corollary 4.1. GAE converges under the assumption of Theorem 3.2.

Proof. This is a particular case of Theorem 3.2, using $M$ defined by equation (4.2).
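A GAE step (4.1) can be sketched in Python as an exchange over a set of disjoint pairs. The name `gae_step`, the `pairs` argument (the edges actually used at this time step) and the parameter name `lam` for the constant exchange ratio are assumptions of this sketch; how the pairs are chosen is left to a separate strategy.

```python
def gae_step(load, pairs, lam):
    """One GAE step (4.1): every node exchanges with at most one neighbor."""
    new_load = list(load)
    busy = set()
    for i, j in pairs:
        if i in busy or j in busy:
            raise ValueError("a node may exchange with at most one neighbor")
        busy |= {i, j}
        new_load[i] = (1 - lam) * load[i] + lam * load[j]
        new_load[j] = (1 - lam) * load[j] + lam * load[i]
    return new_load          # nodes outside every pair keep their load


result = gae_step([8.0, 0.0, 4.0, 0.0], pairs=[(0, 1)], lam=0.5)
```

Nodes 2 and 3 do not communicate in this example, so their rows in the matrix (4.2) are rows of the identity.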


4.2. Accelerated Relaxed Diffusion. This scheme is also based on the diffusion scheme. It consists in introducing a relaxation parameter $\beta^{(t)}$ in order to speed up the convergence [15]. The relaxed diffusion algorithm may be described as follows: for a processor $i$, the exchange of its workload with its reachable neighbors $j$ is executed as algorithm (4.3).

$$w_i^{(t+1)} = w_i^{(t)} + \beta^{(t)} \sum_j \alpha_{ij} \left( w_j^{(t)} - w_i^{(t)} \right) \quad \text{for all living edges } (i,j) \qquad (4.3)$$

In this algorithm, the diffusion matrix is $(1-\beta^{(t)})\,Id + \beta^{(t)} M^{(t)}$, where $Id$ is the identity matrix, $M^{(t)}$ is the diffusion matrix of the diffusion algorithm and $\beta^{(t)}$ is the relaxation parameter at time $t$.

It is known [17] that if $M$ is a stochastic matrix, then the optimal parameter is

$$\beta = \frac{2}{2 - (s + l)}, \qquad (4.4)$$

where $s$ and $l$ are respectively the smallest and the second largest eigenvalue of $M$.
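The form of (4.4) can be recovered by a short computation. The following is only a sketch, assuming that $M$ is symmetric with real eigenvalues $1 = \mu_1 > l = \mu_2 \ge \dots \ge \mu_n = s$; the symbols $\mu_k$, $M_\beta$ and $\rho$ are introduced here for illustration and do not appear elsewhere in the paper.

```latex
% eigenvalues of the relaxed matrix
M_\beta = (1-\beta)\,Id + \beta M
\quad\Longrightarrow\quad
\mu_k(M_\beta) = 1 - \beta\,(1 - \mu_k).

% convergence factor away from the uniform vector
\rho(\beta) = \max\bigl(\,|1 - \beta(1 - l)|,\; |1 - \beta(1 - s)|\,\bigr).

% the maximum is smallest when the two extreme terms balance
1 - \beta(1 - l) = -\bigl(1 - \beta(1 - s)\bigr)
\;\Longleftrightarrow\;
\beta\,\bigl(2 - (s + l)\bigr) = 2
\;\Longleftrightarrow\;
\beta = \frac{2}{2 - (s + l)}.
```

In particular $\beta = 1$, which gives back the plain diffusion scheme, is obtained exactly when $s + l = 0$.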

This definition of $\beta$ does not imply that each $w_i^{(t)}$ stays positive. Let us define $\beta^{(t)}$ to be the relaxation parameter at time $t$ for the network $G^{(t)}$, such that $W^{(t)}$ stays positive. Let us denote by $R^{(t)}$ the bound such that $W^{(t)}$ stays positive if $\beta^{(t)} < R^{(t)}$; $R^{(t)}$ is defined by

$$R^{(t)} = \min_i \left\{ \frac{w_i^{(t)}}{\left(1 - M_{ii}^{(t)}\right)\left(w_i^{(t)} - w_{\min}^{(t)}\right)} \right\},$$

where $w_{\min}^{(t)} = \min_i (w_i^{(t)})$. With this definition, $\beta^{(t)}$ for the network $G^{(t)}$ is given by

$$\beta^{(t)} = \min\left\{ R^{(t)},\; \frac{2}{2 - (s^{(t)} + l^{(t)})} \right\}, \qquad (4.5)$$

where $s^{(t)}$ and $l^{(t)}$ are respectively the smallest and the second largest eigenvalue of $M^{(t)}$. For the relaxed diffusion, the vector equation (3.2) becomes

$$W^{(t+1)} = \left(1 - \beta^{(t)}\right) W^{(t)} + \beta^{(t)} M^{(t)} W^{(t)}.$$

Corollary 4.2. For $\beta$ chosen according to (4.5), and under the assumption of Theorem 3.2, the relaxed diffusion scheme converges and is optimal.

Proof. It is sufficient to apply the result of [17].
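The choice of the relaxation parameter in (4.4) and (4.5) can be sketched in Python. Several points are assumptions of this sketch and are not taken from the text: the function names, the constant ratio $\alpha = 1/3$ on a ring of 64 nodes, the closed-form eigenvalues of the ring used in place of a numerical eigenvalue solver, and the convention that nodes with $w_i = w_{\min}$ (for which the ratio in $R^{(t)}$ is undefined) impose no constraint.

```python
import math


def optimal_beta(s, l):
    """Equation (4.4): s = smallest, l = second largest eigenvalue of M."""
    return 2.0 / (2.0 - (s + l))


def positivity_bound(load, diag):
    """Bound R of (4.5); nodes at the minimum load are skipped (assumption)."""
    w_min = min(load)
    bounds = [
        w / ((1.0 - m_ii) * (w - w_min))
        for w, m_ii in zip(load, diag)
        if w > w_min and m_ii < 1.0
    ]
    return min(bounds) if bounds else math.inf


def relaxed_beta(load, diag, s, l):
    """Equation (4.5): the smaller of the positivity bound and (4.4)."""
    return min(positivity_bound(load, diag), optimal_beta(s, l))


# Ring of n nodes with constant alpha: eigenvalues 1 - 2*alpha*(1 - cos(2*pi*k/n)).
n, alpha = 64, 1 / 3
eig = sorted(1 - 2 * alpha * (1 - math.cos(2 * math.pi * k / n)) for k in range(n))
s, l = eig[0], eig[-2]

load = [6400.0] + [0.0] * (n - 1)        # all the load on one node
diag = [1 - 2 * alpha] * n               # every ring node has two neighbors
beta = relaxed_beta(load, diag, s, l)
```

Under these assumptions the sketch yields a value of about 1.496, with the positivity bound equal to 1.5 and therefore not active.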

5. Experimentation. This section shows the results of implementations of the first order scheme (FOS), relaxed first order scheme (RFOS) and GAE algorithms. For GAE, three different neighbor choices are implemented to illustrate the effect of this choice. The goal of these experiments is to illustrate the behavior of these algorithms and to highlight their convergence toward the uniform work load distribution with some broken edges. These experiments were performed with the mathematical software Scilab [18]. To study the behavior of these algorithms without external interaction, the load balancing is not applied to a real work load. The load of each processor is virtual; it is simulated by an integer. Moreover, dynamic networks are simulated by introducing a percentage of reliability for edges.

The simulations are performed on various networks: line (64), ring (64), mesh 2D (8x8), mesh 3D (4x4x4), torus (8x8) and hypercube (d6) with respectively 63, 64, 112, 144, 128 and 192 edges. At the initialization, all the load of the system (6400) is given to node 0. For each network, simulations are run with a percentage of broken edges per iteration (bpi) between 0% and 50% in steps of 5%. The convergence criterion is fixed to 1; in other words, the algorithms are stopped when the difference between the most and the least loaded nodes is less than 1.
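The simulation loop described above can be sketched in Python for the first order scheme. This is not the authors' Scilab code. It assumes that, at each iteration, an exact fraction `bpi` of the edges is chosen uniformly at random and treated as broken, that a constant `alpha` is used on every edge, and it runs on a small 3-dimensional hypercube instead of the networks of the paper.

```python
import random


def living_edges(edges, bpi, rng):
    """Drop a fraction bpi of the edges, chosen uniformly at random (assumption)."""
    n_broken = round(bpi * len(edges))
    broken = set(rng.sample(edges, n_broken))
    return [e for e in edges if e not in broken]


def simulate_fos(n, edges, alpha, bpi, seed=0, max_iter=100000):
    """Number of iterations until max load - min load < 1, or None."""
    rng = random.Random(seed)
    load = [0.0] * n
    load[0] = 6400.0                       # all the load starts on node 0
    for it in range(max_iter):
        if max(load) - min(load) < 1.0:    # convergence criterion of the text
            return it
        new_load = list(load)
        for i, j in living_edges(edges, bpi, rng):
            new_load[i] += alpha * (load[j] - load[i])
            new_load[j] += alpha * (load[i] - load[j])
        load = new_load
    return None


# Hypothetical small test network: hypercube of dimension 3 (8 nodes, 12 edges).
cube = [(i, i ^ (1 << d)) for i in range(8) for d in range(3) if i < i ^ (1 << d)]
iters_static = simulate_fos(8, cube, alpha=0.25, bpi=0.0)
iters_dynamic = simulate_fos(8, cube, alpha=0.25, bpi=0.5)
```

Fixing the seed makes a run reproducible; averaging over several seeds would be a natural extension when comparing bpi levels.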

For the relaxed diffusion algorithm, the parameter $\beta^{(t)}$ must be determined at each time step. The calculation of $\beta^{(t)}$ needs global information on the network, which is not convenient on distributed systems. Thus $\beta$ is defined as the optimal parameter corresponding to the initial network (without communication failure) with $W^{(0)}$, and is used at each time step ($\beta^{(t)} = \beta$):

$$\beta = \min\left\{ R,\; \frac{2}{2 - (s + l)} \right\} \quad \text{with} \quad R = \min_i \left\{ \frac{w_i^{(0)}}{(1 - M_{ii})\left(w_i^{(0)} - w_{\min}^{(0)}\right)} \right\}, \qquad (5.1)$$

where $s$ and $l$ are respectively the smallest and the second largest eigenvalue of $M$. Table 5.1 gives $\beta$ computed at the initialization with equation (5.1) for each network.

          line    ring    mesh    cube    torus   hyper.
beta      1.5     1.496   1.25    1.166   1.164   1.

Table 5.1
Beta computed with equation (5.1).

The values of $\beta$ given in Table 5.1 are used for each simulation.

The GAE algorithm needs a neighbor choice at each iteration to balance the load. For the simulations, three choices are studied: arbitrary, random, and most to least loaded. The arbitrary choice is equivalent to the graph coloring of GDE: for a given node at a given time step, the edge to use is predefined. If this edge is broken, the two corresponding nodes do not balance their load at this time step. The random choice "RAND" randomly elects a living edge for each node such that for each chosen edge, the two corresponding nodes are not balanced. In the most to least loaded choice "M2LL", each node chooses a living edge that links itself with one of its neighbors for which the difference of load is the highest.
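The RAND and M2LL choices can be sketched in Python as two ways of selecting disjoint pairs among the living edges. The text does not say how conflicts between nodes are resolved, so this sketch assumes a simple greedy rule: edges are scanned in random order (RAND) or by decreasing load difference (M2LL), and an edge is kept only if both endpoints are still free. All function names are hypothetical.

```python
import random


def _disjoint(edges):
    """Keep an edge only if neither endpoint is already paired."""
    busy, pairs = set(), []
    for i, j in edges:
        if i not in busy and j not in busy:
            pairs.append((i, j))
            busy |= {i, j}
    return pairs


def rand_pairs(living, rng):
    """RAND: scan the living edges in random order."""
    shuffled = list(living)
    rng.shuffle(shuffled)
    return _disjoint(shuffled)


def m2ll_pairs(load, living):
    """M2LL: scan the living edges by decreasing load difference."""
    ordered = sorted(living, key=lambda e: abs(load[e[0]] - load[e[1]]), reverse=True)
    return _disjoint(ordered)


load = [8.0, 0.0, 4.0, 0.0]
living = [(0, 1), (1, 2), (2, 3)]          # path 0 - 1 - 2 - 3, no broken edge
chosen_m2ll = m2ll_pairs(load, living)
chosen_rand = rand_pairs(living, random.Random(0))
```

The returned pairs satisfy the GAE communication rule, so they can be used directly as the set of exchanges for one time step.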

The results given in Tables 5.2 to 5.7 show the number of iterations needed to reach a uniform load distribution for each network and each reliability.

            line    ring    mesh    cube    torus   hyper.
FOS         6595    1648    193     72      49      20
RFOS        4395    1185    154     61      43      20
GAE         4395    1098    150     55      36      6
GAE RAND    6674    1652    218     81      67      36
GAE M2LL    4395    1098    135     55      34      6

Table 5.2
Results of load balancing with 0% bpi.

            line    ring    mesh    cube    torus   hyper.
FOS         7248    1816    215     80      53      22
RFOS        4831    1211    170     69      46      22
GAE         5267    1317    185     70      46      17
GAE RAND    7513    1872    235     84      66      34
GAE M2LL    5635    1394    161     56      41      17

Table 5.3
Results of load balancing with 10% bpi.

These tables show that RFOS is always better than FOS for Beta greater than 1.

            line    ring    mesh    cube    torus   hyper.
FOS         7974    1991    236     89      61      24
RFOS        5310    1332    188     75      51      24
GAE         6227    1562    221     86      56      28
GAE RAND    8182    2041    252     88      74      35
GAE M2LL    6362    1599    173     58      43      15

Table 5.4
Results of load balancing with 20% bpi.

            line    ring    mesh    cube    torus   hyper.
FOS         8758    2226    256     99      69      28
RFOS        5841    1483    208     84      57      27
GAE         7275    1883    262     120     68      35
GAE RAND    8829    2236    261     90      71      38
GAE M2LL    7068    1798    185     59      47      20

Table 5.5
Results of load balancing with 30% bpi.

If Beta is equal to 1, RFOS is equivalent to FOS; this is the case with hypercube networks.

These tables illustrate the influence of the neighbor choice for the GAE algorithms. M2LL is always better than the two other choice strategies, or equivalent to GAE in the case of 0% bpi. In most cases, RAND is the least efficient choice. RAND can be better than classical GAE for some network topologies if the percentage of broken edges is high. This is the case for cubes with a bpi greater than 20%, for meshes with a bpi greater than 30%, for hypercubes with a bpi greater than 35% and for tori with a bpi greater than 40%.

Figure 5.1 gives, for each studied network, the number of iterations as a function of bpi. The plots in Figure 5.1 are obtained by linear regression from the data given in Tables 5.2 to 5.7. These figures highlight the effect of bpi on the convergence of the load balancing algorithms for each network. These results show that the most efficient algorithms are RFOS and GAE with the M2LL choice. The comparison of these two algorithms shows that their effectiveness depends on the network topology. RFOS is better suited than GAE M2LL to line and ring topologies, whereas GAE M2LL is better than RFOS for the other studied topologies.

With these simulations, we experimentally show that these load balancing methods converge with broken edges. Moreover, these simulations illustrate the influence of network topologies and of the percentage of broken edges on the effectiveness of load balancing methods.

6. Conclusion and future works. Some enhancements of load balancing models (First Order, Relaxed First Order and GDE) are proposed in this paper. These new models can be used on dynamic networks. They are useful when the topology may change due to failures in the communication links and are well-suited for large problems that need to share computations among distant processors, as is the case in grid computing.

A main result of this paper is that we have given the necessary and sufficient conditions for convergence in the dynamic networks framework. In other words, we prove that if we work on dynamic networks like the Internet, where the communication links are not 100% reliable (communication failure or low bandwidth communication), these algorithms can balance the loads of the system. Another important result is the enhancement of the diffusion scheme by a relaxation parameter.

            line    ring    mesh    cube    torus   hyper.
FOS         9786    2449    286     109     75      30
RFOS        6515    1628    232     93      64      30
GAE         8679    2155    302     122     76      44
GAE RAND    9581    2403    280     95      82      39
GAE M2LL    7880    1970    199     65      51      22

Table 5.6
Results of load balancing with 40% bpi.

            line    ring    mesh    cube    torus   hyper.
FOS         10750   2734    317     123     81      35
RFOS        7155    1824    252     104     71      34
GAE         9989    2563    346     155     95      47
GAE RAND    10322   2624    303     97      85      43
GAE M2LL    8617    2184    217     69      55      20

Table 5.7
Results of load balancing with 50% bpi.

These algorithms are synchronous, in the sense that, to balance its load at time $t+1$, a processor needs to have the load of its neighbor at time $t$. Finally, the paper is concluded by significant experiments, which lead to instructive observations. The next steps of this work are to apply these algorithms to a real distributed scientific application and to adapt them to actual grid environments by considering the communication delays between the processors.

7. Proof.

7.1. Technical lemma. At a time $t$, the adjacency matrix $I^{(t)}$ of a diffusion matrix $M^{(t)}$ is defined by

$$I_{ij}^{(t)} = \begin{cases} 1 & \text{if ($i$ and $j$ communicate at time $t$) or if ($i = j$),} \\ 0 & \text{otherwise.} \end{cases}$$

Let us define the adjacency matrix of the superposed communication graph $G_{t,T}$ of the network between times $t$ and $T$ by $I_{ij}^{(t,T)} = I_{ij}^{(t)} + I_{ij}^{(t+1)} + \dots + I_{ij}^{(T-1)} + I_{ij}^{(T)}$, where $1+1 = 1$; $1+0 = 0+1 = 1$; $0+0 = 0$.

Lemma 7.1. For any $m$ adjacency matrices $I^{(1)}, \dots, I^{(m)}$ we have

$$I^{(1)} + I^{(2)} + \dots + I^{(m)} \le I^{(m)} I^{(m-1)} \cdots I^{(1)}$$

where we use the usual product of matrices and where the sum of boolean numbers is defined above. The order relation between two matrices is the partial ordering element by element.

Proof. By induction: it is sufficient to prove that if $\left(I^{(k)} I^{(k-1)}\right)_{ij} = 0$ then $\left(I^{(k)} + I^{(k-1)}\right)_{ij} = 0$. We have:

$$\left(I^{(k)} I^{(k-1)}\right)_{ij} = 0 \;\Longleftrightarrow\; \sum_{l=1}^{n} I_{il}^{(k)} I_{lj}^{(k-1)} = 0 \;\Longleftrightarrow\; I_{il}^{(k)} I_{lj}^{(k-1)} = 0 \text{ for all } l \in \{1,\dots,n\};$$

particularly for $l = j$ and $l = i$, so $I_{ij}^{(k)} I_{jj}^{(k-1)} = 0$ and $I_{ii}^{(k)} I_{ij}^{(k-1)} = 0$. As any diagonal entry of an adjacency matrix is 1, we deduce that $I_{ij}^{(k)} = 0$ and $I_{ij}^{(k-1)} = 0$, so that $\left(I^{(k)} + I^{(k-1)}\right)_{ij} = I_{ij}^{(k)} + I_{ij}^{(k-1)} = 0$.
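Lemma 7.1 can be checked numerically on a small case. The Python sketch below uses two adjacency matrices with unit diagonal for a network of four nodes (one where only the edges 1-2 and 3-4 are used, one where only the edge 2-3 is used); the helper names are hypothetical and the example is an illustration, not a proof.

```python
def bool_sum(A, B):
    """Entrywise boolean sum: 1 + 1 = 1, 1 + 0 = 1, 0 + 0 = 0."""
    return [[a | b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B):
    """Usual matrix product on integer entries."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def leq(A, B):
    """Partial order element by element."""
    return all(a <= b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


I1 = [[1, 1, 0, 0],
      [1, 1, 0, 0],
      [0, 0, 1, 1],
      [0, 0, 1, 1]]
I2 = [[1, 0, 0, 0],
      [0, 1, 1, 0],
      [0, 1, 1, 0],
      [0, 0, 0, 1]]

superposed = bool_sum(I1, I2)
product = mat_mul(I2, I1)
lemma_holds = leq(superposed, product)
```

Every entry equal to 1 in the superposed adjacency matrix corresponds to a nonzero entry of the product, as the lemma states.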

Lemma 7.2. If the superposed graph $G_{t,t+m}$ of $m+1$ networks, the diffusion matrices of which are $M^{(t)}, M^{(t+1)}, \dots, M^{(t+m)}$, is connected, then $M = M^{(t+m)} M^{(t+m-1)} \cdots M^{(t)}$ is an irreducible matrix.

Proof. Indeed, the adjacency matrix of $M$ is $I = I^{(t+m)} \cdots I^{(t)}$; on the other hand, the adjacency matrix of the superposed graph is $I^{(t)} + I^{(t+1)} + \dots + I^{(t+m)}$. Thanks to Lemma 7.1, we deduce that if any entry of the adjacency matrix of the superposed graph is 1, the corresponding entry of $I$ is also nonzero, so if the superposed graph is connected then $M$ is irreducible.

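7.2. Proof of Theorem 3.2.

- Sufficient condition:

By hypothesis, there always exists a time step $t' \in \mathbb{N}$ such that the superposed connection graph of the network is connected, and so by Lemma 7.2 above, for any time step $t > t'$ there always exist irreducible matrices $T^{(p_i)}$ such that

$$M^{(t)} M^{(t-1)} \cdots M^{(1)} = M^{(t)} \cdots M^{(t-L)} \, T^{(p_\tau)} T^{(p_{\tau-1})} \cdots T^{(p_1)}$$

where $T^{(p_\tau)} = M^{(t-L-1)} \cdots M^{(L_\tau)}$, $T^{(p_{\tau-1})} = M^{(L_\tau - 1)} \cdots M^{(L_{\tau-1})}$, ..., $T^{(p_1)} = M^{(L_2 - 1)} \cdots M^{(L_1)}$, and $L$ and $L_i$ are finite integers (so when $t$ tends to infinity, $\tau$ also tends to infinity and $\lim_{\tau \to \infty} p_\tau = \infty$).

It can easily be seen that the matrices $T^{(p_i)}$ are doubly stochastic, not bipartite ($-1$ is not an eigenvalue) and irreducible. We know that if a matrix $A$ is irreducible and does not have $-1$ as an eigenvalue, then there exists $l$ such that $A^l$ is a positive matrix. These deductions imply that

$$\forall i, \quad \lim_{k \to \infty} \left(T^{(p_i)}\right)^k = Q = \frac{1}{n} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix}$$

Let $\gamma^{(p_j)}$ denote the second largest eigenvalue of $T^{(p_j)}$; then $0 \le \gamma^{(p_j)} < 1$.

Recall that for a row stochastic matrix $T$, we have $\|T\|_\infty = 1$ where $\|T\|_\infty$ is the usual compatible maximum norm, and that if $T$ is symmetric and doubly stochastic

$$\|T\|_2 \le \sqrt{\|T\|_1 \|T\|_\infty} = \sqrt{\|T'\|_\infty \|T\|_\infty} = 1 \qquad (7.1)$$

where $T'$ is the transposed matrix of $T$ and $\|T\|_1$ and $\|T\|_2$ are respectively the usual $l_1$ and the Euclidean matrix norms.

Let $w^{(t)}$ be the load over the network at time $t$, $w^{(0)}$ be the initial load and $\bar{w} = \left( \frac{\sum_i w_i^{(0)}}{n}, \dots, \frac{\sum_i w_i^{(0)}}{n} \right)$ the uniform distributed load, then

$$\begin{aligned}
\left\| w^{(t+1)} - \bar{w} \right\|_2 &= \left\| M^{(t)} \cdots M^{(t-L)} T^{(p_\tau)} T^{(p_{\tau-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \\
&\le \left\| M^{(t)} \right\|_2 \cdots \left\| M^{(t-L)} \right\|_2 \left\| T^{(p_\tau)} T^{(p_{\tau-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \\
&\le \left\| T^{(p_\tau)} T^{(p_{\tau-1})} \cdots T^{(p_1)} w^{(0)} - \bar{w} \right\|_2 \quad \text{due to (7.1)} \\
&= \left\| T^{(p_\tau)} \left( T^{(p_{\tau-1})} \cdots T^{(p_1)} w^{(0)} \right) - T^{(p_\tau)} (\bar{w}) \right\|_2 \\
&\le \gamma^{(p_\tau)} \left\| T^{(p_{\tau-1})} \left( T^{(p_{\tau-2})} \cdots T^{(p_1)} w^{(0)} \right) - T^{(p_{\tau-1})} (\bar{w}) \right\|_2 \\
&\;\;\vdots \\
&\le \gamma^{(p_\tau)} \cdots \gamma^{(p_1)} \left\| w^{(0)} - \bar{w} \right\|_2
\end{aligned}$$

so

$$\lim_{t \to \infty} \left\| w^{(t+1)} - \bar{w} \right\|_2 \le \lim_{\tau \to \infty} \gamma^{(p_\tau)} \cdots \gamma^{(p_1)} \left\| w^{(0)} - \bar{w} \right\|_2 .$$

Since the number of edges is finite, the number of the connection graphs is finite, so there exists $k$ such that for all $j \in \mathbb{N}$, $\gamma^{(p_j)} \le \gamma^{(p_k)} < 1$; so

$$\lim_{\tau \to \infty} \gamma^{(p_\tau)} \cdots \gamma^{(p_1)} \le \lim_{\tau \to \infty} \left( \gamma^{(p_k)} \right)^\tau = 0.$$

The last inequality implies that $\lim_{t \to \infty} \left\| w^{(t)} - \bar{w} \right\|_2 = 0$; so $\forall i \in \{1,\dots,n\}$, $w_i^{(t)} \to \bar{w}_i$. In other words, the load of each processor tends to the uniform distributed load $\bar{w}_i = \frac{\sum_i w_i^{(0)}}{n}$.

- Necessary condition:

It is obvious; otherwise a processor is never reached, implying that its work is never balanced.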

REFERENCES

[1] G. Cybenko. Dynamic load balancing for distributed memory multiprocessors. Jour. of Para. and Dist. Comp., 7:279-301, 1989.
[2] S. H. Hosseini, B. Litow, M. Malkawi, J. McPherson, and K. Vairavan. Analysis of a graph coloring based distributed load balancing algorithm. Jour. of Para. and Dist. Comp., 10:160-166, 1990.
[3] B. Litow, S. H. Hosseini, K. Vairavan, and G. S. Wolffe. Performance characteristics of a load balancing algorithm. Jour. of Para. and Dist. Comp., 31:159-165, 1995.
[4] R. Diekmann, A. Frommer, and B. Monien. Efficient schemes for nearest neighbor load balancing. Para. Comp., 25:289-313, 1998.
[5] J. E. Boillat. Load balancing and Poisson equation in a graph. Concurrency: Prac. and Expe., 2(4):289-313, 1990.
[6] J. M. Bahi and J. Gaber. Load balancing on networks with dynamically changing topology. In Europar 2001 conf., Lect. Notes on Comp. Scie., pages 175-182, 2001.
[7] D. P. Bertsekas and J. N. Tsitsiklis. Parallel and Distributed Computation: Numerical Methods. Englewood Cliffs NJ, Prentice-Hall, 1989.
[8] C. Z. Xu and F. C. M. Lau. Optimal parameters for load balancing with the diffusion method in mesh networks. Parallel Processing Letters, 4(1-2):139-147, 1994.
[9] C. Z. Xu and F. C. M. Lau. Analysis of the generalized dimension exchange method for dynamic load balancing. Jour. of Para. and Dist. Comp., 16(4):385-393, 1992.
[10] R. Elsässer, B. Monien, and R. Preis. Diffusion Schemes for Load Balancing on Heterogeneous Networks. Theory of Computing Systems, 35:305-320, 2002.
[11] W. Aiello, B. Awerbuch, B. Maggs, and S. Rao. Approximate load balancing on dynamic and asynchronous networks. In Proc. of the 25th Annual ACM Symposium on the Theory of Comp., pages 125-136, San Diego, California, 1993.
[12] B. Aiello and T. Leighton. Coding theory, hypercube embeddings, and fault tolerance. In Proc. of the 3rd Annual ACM Symposium on Para. Algo. and Arch., pages 125-136, Hilton Head, South Carolina, 1991.
[13] F. T. Leighton, B. M. Maggs, and R. K. Sitaraman. On the fault tolerance of some popular bounded-degree networks. SIAM Jour. on Comp., 27(5):1303-1333, 1998.
[14] J. M. Bahi, R. Couturier, and F. Vernier. Broken edges and dimension exchange algorithm on hypercube topology. 11-th Euromicro Conf. on Para. Dist. and Netw. based Proc., 2003.
[15] J. M. Bahi, R. Couturier, and F. Vernier. Accelerated diffusion algorithms on general dynamic networks. In Proc. of 5th Inter. Conf., PPAM Czestochowa, Poland, (to appear), 2003.
[16] S. Fiorini and R. J. Wilson. Edge-coloring of graphs. In L. W. Beineke and R. J. Wilson, editors, Selected topics in graph theory. Academic Press, 1978.
[17] A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. SIAM, Philadelphia, third edition, 1979 edition, 1994.
[18] P. S. Motta Pires and D. A. Rogers. Free/Open Source Software: An Alternative for Engineering Students. In 32nd ASEE/IEEE Frontiers in Education Conference, IEEE, 2002.

Fig. 5.1. Number of iterations as a function of bpi for each studied network (after a linear interpolation). Panels: (a) Line, (b) Ring, (c) Mesh, (d) Cube, (e) Torus, (f) Hypercube. Horizontal axis: % B.P.I. (0 to 50); vertical axis: Iterations. Curves: FOS, RFOS, GAE, GAE_RAND, GAE_M2LL.
