Efficient Algorithms for Reachability and
Path-Selection Problems with Applications
Final Report of Project Funded by the
John S. Latsis Public Benefit Foundation
December 2010
Team Members
Department of Informatics and Telecommunications Engineering, University of Western Macedonia
Alexandra Galani, Research and Teaching Staff
Loukas Georgiadis, Assistant Professor (Coordinator)
Department of Computer Science, University of Ioannina
Stavros D. Nikolopoulos, Professor
Leonidas Palios, Associate Professor
Abstract
Graphs are mathematical structures that model many important entities such as the World Wide Web, transportation, communication and social networks, databases, and biological systems. The objective of this research project was the design of efficient algorithms for a collection of graph problems related to Reachability and Path-Selection. In Reachability and Path-Selection problems we are given an input graph and wish to efficiently perform queries that report if two vertices are connected by a path, or compute paths connecting specified vertices so that certain requirements are satisfied. Algorithmic problems of this kind have numerous applications, including internet routing, geographical navigation, and knowledge-representation systems. Specifically, in this project we studied the following types of problems:
Join-Reachability: This is a natural extension of the standard Reachability problem for a collection of graphs G. We wish to process G so that we can quickly report the set of vertices that reach a given vertex in all graphs of G.
Computation of Disjoint Paths: Our goal is to compute a pair of disjoint paths from a given source vertex to every other vertex, or to a specific target vertex.
We developed algorithmic techniques and provided efficient algorithms for problems of the above types. We also considered new applications of our techniques and algorithms.
Abstract (Greek version, in translation)
Graphs are mathematical structures that model many important entities such as the World Wide Web, transportation, communication and social networks, databases and biological systems. The aim of the research project was the design of efficient algorithms for a collection of problems related to Reachability and Path-Selection in graphs. In Reachability and Path-Selection problems we are given an input graph for which we wish to efficiently answer queries about whether two of its vertices are connected by some path, or to compute paths that connect specified vertices and at the same time satisfy prescribed requirements. Algorithmic problems of this type have numerous applications, which include network routing, geographical navigation and knowledge-representation systems. Specifically, in this project we explored the following types of problems:
Join-Reachability: This is a natural extension of the standard reachability problem for a collection of graphs G. We wish to process G so that we can quickly report the set of vertices for which there is a path to a given vertex in all graphs of G.
Computation of Disjoint Paths: Our goal is to compute a pair of disjoint paths from a given source vertex to every other vertex, or to a specific target vertex.
We developed algorithmic techniques and presented efficient algorithms for problems of the above types. In addition, we sought new applications of our techniques and algorithms.
Contents
1 Introduction
  1.1 Fundamental Concepts in Graph Theory
  1.2 Reachability
  1.3 Path-Selection
2 Join-Reachability
  2.1 Applications
  2.2 Results
  2.3 Explicit Join-Reachability
    2.3.1 Computational Complexity
    2.3.2 Combinatorial Complexity
  2.4 Implicit Join-Reachability
3 Connectivity and Vertex-Disjoint Paths
  3.1 Vertex Connectivity
  3.2 Dominator Verification
  3.3 Independent Spanning Trees
  3.4 Testing 2-Vertex Connectivity
  3.5 Computing Pairs of Vertex-Disjoint s-t Paths
4 Further Applications
  4.1 Interprocedural Dominance
  4.2 Computational Morphological Analysis
5 Conclusions and Future Work
List of Figures
1.1 A directed graph.
2.1 An instance of join-reachability for two digraphs.
2.2 Reducing the size of a graph with the use of a Steiner vertex.
2.3 Mapping the vertices of two paths to points in the plane.
2.4 A Cartesian tree.
3.1 A strongly connected and a 2-vertex connected digraph.
3.2 A flowgraph G(s) and its dominator tree D(s).
3.3 Two independent spanning trees of a 2-vertex connected graph.
3.4 Two vertex-disjoint paths in a 2-vertex connected graph.
4.1 An interprocedural flowgraph and its dominator dag.
4.2 A graph of a morphological analysis.
Chapter 1
Introduction
The area of graph algorithms is rich and successful, as graphs are mathematical structures that model many important and diverse entities such as the World Wide Web, transportation, communication and social networks, databases, biological systems and the control flow of computer programs. The problems of reachability and path-selection are fundamental in graph algorithms, with numerous application areas, including internet routing, geographical navigation, knowledge representation, computational biology, program optimization and natural language processing. Our project was motivated by recent advances in these areas, as well as emerging applications of graph-based data structures.
We studied a collection of graph problems related to reachability and path-selection. The outcomes of this program were the design of efficient algorithms, the development of new algorithmic techniques, the design and implementation of practical algorithms, and the identification of new applications of our algorithms and techniques. In this report we restrict ourselves to an overview of the research project, in order to make the content comprehensible to non-specialists in theoretical computer science. For the full technical details and proofs of our results we refer to the research articles, which are preliminary versions of [Geo10, GNP10, GT10], posted at the project's website:
http://www.icte.uowm.gr/lgeorg/RPS/
Figure 1.1: A directed graph $G = (V, E)$ with $V = \{a, b, c, d, e\}$ and $E = \{(a,b), (b,c), (c,d), (d,a), (d,e), (e,c)\}$.
In this chapter we introduce the basic terminology and define the problems in our study. Chapter 2 deals with reachability problems and Chapter 3 discusses path-selection problems. In Chapter 4 we present further applications of our techniques. Finally, in Chapter 5 we discuss directions for future research.
1.1 Fundamental Concepts in Graph Theory
A graph $G = (V, E)$ is an abstract representation of a set of objects $V$, called vertices, and a set of links $E$, called edges, which connect pairs of objects. The edges may be directed (asymmetric), in which case we have a directed graph, or undirected (symmetric), in which case we have an undirected graph. Figure 1.1 shows a directed graph with 5 vertices and 6 edges.
A path $v_1, v_2, \ldots, v_k$ in $G$ is a sequence of vertices $v_i \in V$ such that there is an edge in $E$ from $v_i$ to $v_{i+1}$, denoted by $(v_i, v_{i+1})$, for $i = 1, \ldots, k-1$; $v_1$ is the start vertex and $v_k$ is the end vertex of the path. For example, the sequence a, b, c, d is a path from a to d in the graph of Figure 1.1. For any pair of vertices $v, u \in V$, vertex $v$ is reachable from $u$ (equivalently, $u$ reaches $v$) if there is a path with start vertex $u$ and end vertex $v$. A cycle is a path such that the start vertex and the end vertex are the same, e.g., e, c, d, e in the graph of Figure 1.1. A graph with no cycle is called acyclic.
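The definitions of path and reachability can be checked mechanically on the graph of Figure 1.1. The following is a minimal sketch using breadth-first search; the helper names (`adjacency`, `is_path`, `reachable_from`) are illustrative and do not come from the project's code.

```python
from collections import deque

# Edge set E of the directed graph in Figure 1.1.
EDGES = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("d", "e"), ("e", "c")]

def adjacency(edges):
    """Map each vertex to the list of heads of its outgoing edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])
    return adj

def is_path(adj, seq):
    """True if every two consecutive vertices of seq are joined by an edge."""
    return all(v in adj[u] for u, v in zip(seq, seq[1:]))

def reachable_from(adj, u):
    """Set of vertices reachable from u (u itself included), found by BFS."""
    seen = {u}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

adj = adjacency(EDGES)
print(is_path(adj, ["a", "b", "c", "d"]))   # the path a, b, c, d
print(is_path(adj, ["e", "c", "d", "e"]))   # the cycle e, c, d, e
print(sorted(reachable_from(adj, "a")))     # everything a reaches
```

A single search answers one query in time linear in the size of the graph; the data structures discussed in this report aim to avoid repeating such a search for every query.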
An undirected graph $G = (V, E)$ is connected if for every pair of vertices $u, v \in V$ there is a path connecting $u$ and $v$. A tree is an undirected graph that is acyclic and connected. Let $G = (V, E)$ and $G' = (V', E')$ be two undirected graphs. Then $G'$ is a subgraph of $G$ if $V'$ is a subset of $V$ ($V' \subseteq V$) and $E'$ is a subset of $E$ ($E' \subseteq E$). If $V' = V$ then $G'$ is a spanning subgraph of $G$. If, moreover, $G'$ is a tree then $G'$ is a spanning tree of $G$.
A directed graph $G = (V, E)$ is strongly connected if for every pair of vertices $u, v \in V$, $u$ reaches $v$ and $v$ reaches $u$. For example, the graph of Figure 1.1 is strongly connected. A spanning tree of $G$ rooted at a vertex $s$ is a subgraph of $G$ such that for any other vertex $v \in V$ there is exactly one path from $s$ to $v$. For example, the spanning subgraph of the graph of Figure 1.1 that is formed by the subset of edges $\{(a,b), (d,a), (d,e), (e,c)\}$ is a spanning tree with root $d$.
A planar graph is a graph that can be embedded in the plane, i.e., it can be drawn on the plane in such a way that its edges intersect only at their endpoints. The graph of Figure 1.1 is planar.
All the graphs considered in this study are directed, although some of the problems we discuss can also be defined (and are interesting) for undirected graphs.
1.2 Reachability
In the reachability problem our goal is to preprocess an input graph into a data structure so that queries of whether a vertex $b$ is reachable from a vertex $a$ can be answered quickly. This has applications in internet routing, geographical navigation, knowledge-representation systems and other areas [WHY+06]. In this project we introduced the study of a related collection of novel problems, which we call join-reachability problems. These are motivated by recent work on graph-structured databases, social networks and program optimization. Formally, we are given a collection of graphs $\mathcal{G}$, where each graph $G_i \in \mathcal{G}$ represents a binary relation over a set of elements $V$. We define the join-reachability relation $R$ as follows: $a$ is related to $b$ under $R$ if and only if $b$ is reachable from $a$ in all graphs in $\mathcal{G}$. Our goal is to find an efficient representation of $R$ such that, for any given $b \in V$, we can quickly report all the elements that are related to $b$ in $R$. In Chapter 2 we distinguish two versions of this problem, depending on the type of desired representation of $R$, and provide an overview of our results.
1.3 Path-Selection
The second main thread of our project deals with the design of algorithms for a specific type of path-selection problem. Path-selection refers to the computation of paths connecting a given subset of the vertices of a graph, such that certain requirements are satisfied. Typical examples are the computation of a shortest path between two vertices, finding a path connecting two vertices such that a region of the graph is avoided, or computing edge-disjoint or vertex-disjoint paths. This area contains some of the most important network optimization problems, which have been extensively studied.
In this project we explored the computation of pairs of disjoint paths: Given a source vertex, our goal is to compute two disjoint paths to every other vertex, or to a specific target vertex. We also considered the problem of testing if the input graph has the necessary connectivity requirements for such disjoint paths to exist. An overview of this study is presented in Chapter 3.
Chapter 2
Join-Reachability
In the reachability problem our goal is to preprocess a graph $G$ into a data structure that can quickly answer queries that ask if a vertex $b$ is reachable from a vertex $a$. This problem is fundamental for many application areas, including internet routing, geographical navigation, and knowledge-representation systems [WHY+06]. Recently, the interest in graph reachability problems has been rekindled by emerging applications of graph data structures in areas such as the semantic web, bioinformatics and social networks.
The above developments, together with recent applications in graph algorithms [Geo08, Geo10, GT05], have motivated us to introduce the study of the join-reachability problem: We are given a collection $\mathcal{G}$ of $l$ directed graphs $G_i = (V_i, A_i)$, $1 \le i \le l$, where each graph $G_i$ represents a binary relation $R_i$ over a set of elements $V \subseteq V_i$ in the following sense: For any $a, b \in V$, we have that $a$ is related to $b$ under $R_i$, denoted by $a R_i b$, if and only if $b$ is reachable from $a$ in $G_i$. Let $R \equiv R(\mathcal{G})$ be the binary relation over $V$ defined by: $a R b$ if and only if $a R_i b$ for all $i \in \{1, \ldots, l\}$ (i.e., $b$ is reachable from $a$ in all graphs in $\mathcal{G}$). An example is given in Figure 2.1; a join-reachability query for vertex c returns the set {a, b, f, g}, which consists of the vertices reaching c in both $G_1$ and $G_2$. Our objective is to find an efficient representation of this relation. For simplicity, we will restrict our attention to the case of two input graphs ($l = 2$).
Figure 2.1: An instance of join-reachability for two digraphs $G_1$ and $G_2$ on the vertices a, b, c, d, e, f, g, h.
The join-reachability problem admits a simple solution, which is to precompute the answer to all possible join-reachability queries: For each vertex $a \in V$ and for each graph $G_i \in \mathcal{G}$ we can compute the set $\mathit{reach}(a, i)$ consisting of the vertices that reach $a$ in $G_i$. Then we can store the answer to the join-reachability query for $a$ by computing the intersection $\bigcap_{i=1}^{l} \mathit{reach}(a, i)$. With this representation join-reachability queries can be answered in optimal time, but it requires $O(n^2)$ storage space, where $n = |V|$, which is prohibitive for large graphs. Our goal is to construct space-efficient representations that allow fast join-reachability reporting.
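The precomputation described above can be sketched as follows. This is only an illustration of the quadratic-space baseline, not one of the space-efficient structures of this project; the function names are hypothetical, the query vertex is left out of its own answer (an assumption, matching the example of Figure 2.1), and the two example paths are illustrative data.

```python
def reaching_sets(vertices, edges):
    """For every vertex a, the set of other vertices that reach a."""
    preds = {v: [] for v in vertices}
    for u, v in edges:
        preds[v].append(u)
    reach = {}
    for a in vertices:
        seen, stack = {a}, [a]
        while stack:                      # search backwards along the edges
            x = stack.pop()
            for y in preds[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        reach[a] = seen - {a}             # assumption: a is not reported for itself
    return reach

def join_reachability_table(vertices, graphs):
    """Intersect reach(a, i) over all graphs; one stored answer per vertex."""
    tables = [reaching_sets(vertices, edges) for edges in graphs]
    return {a: set.intersection(*(t[a] for t in tables)) for a in vertices}

def path_edges(order):
    """Edges of the directed path that visits the vertices in the given order."""
    return list(zip(order, order[1:]))

V = list("abcdefgh")
G1 = path_edges("abcdefgh")               # illustrative first path
G2 = path_edges("aecgbfdh")               # illustrative second path
table = join_reachability_table(V, [G1, G2])
print(sorted(table["f"]))
```

Each stored answer can be returned directly at query time, but the table as a whole may hold a number of entries quadratic in the number of vertices, which is the drawback noted above.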
2.1 Applications
Instances of the join-reachability problem appear in various applications. For example, in the rank aggregation problem [DKNS01] we are given a collection of rankings of some elements and we may wish to report which (or how many) elements have the same ranking relative to a given element. This is a special version of join-reachability, since the given collection of rankings can be represented by a collection of directed paths with the elements being the vertices of the paths. Similarly, in a graph-structured database with an associated ranking of its vertices we may wish to find the vertices that are related to a query vertex and have higher or lower ranking than this vertex. Instances of join-reachability also appear in graph algorithms arising from program optimization. Specifically, in [Geo08] we need a data structure capable of reporting which vertices satisfy certain ancestor-descendant relations in a collection of rooted trees. Also, in current work in progress, we show that join-reachability structures for two trees can yield efficient solutions to special cases of the interprocedural dominance problem [dSvPdB07]. See Section 4.1.
There are also instances of join-reachability that are related to the topics considered in Chapter 3. In [GT05] (see also [GT10]) it is shown that any directed graph $G$ with a distinguished source vertex $s$ has two spanning trees rooted at $s$ such that a vertex $a$ is a dominator of a vertex $b$ (meaning that all paths in $G$ from $s$ to $b$ pass through $a$) if and only if $a$ is an ancestor of $b$ in both spanning trees. This generalizes the graph-theoretical concept of independent spanning trees. Two spanning trees of a graph $G$ are independent if they are both rooted at the same vertex $s$ and for each vertex $v$ the paths from $s$ to $v$ in the two trees are internally vertex-disjoint. Similarly, $l$ spanning trees of $G$ are independent if they are pairwise independent. In this setting, we can apply a join-reachability structure to decide if $l$ given spanning trees are independent. Moreover, a variant of the join-reachability problem appears in our algorithm for computing pairs of vertex-disjoint paths [Geo10].
2.2 Results
In [GNP10] we explored two versions of the join-reachability problem. In the explicit version we wish to represent $R$ with a directed graph $J \equiv J(\mathcal{G})$, which we call the join-reachability graph of $\mathcal{G}$, i.e., for any $a, b \in V$, we have $a R b$ if and only if $b$ is reachable from $a$ in $J$. Our goal is to minimize the size (i.e., the number of vertices plus edges) of $J$. We presented results on the computational and combinatorial complexity of $J$. In the implicit version we wish to represent $R$ with an efficient data structure (in terms of storage space and query time) that can quickly report all elements $a \in V$ satisfying $a R b$ for any query element $b \in V$. First, we provided efficient join-reachability structures for simple graph classes. Then, based on these results, we considered planar graphs and general directed graphs.
2.3 Explicit Join-Reachability
In the explicit version of join-reachability we wish to construct a join-reachability graph of small size. First we explore the computational complexity of computing the smallest such graph, and then we provide bounds for its size in several cases.
2.3.1 Computational Complexity
We consider the computational complexity of computing the smallest $J(\{G_1, G_2\})$: Given two graphs $G_1 = (V, A_1)$ and $G_2 = (V, A_2)$ we wish to compute a graph $J \equiv J(\{G_1, G_2\})$ of minimum size such that for any $a, b \in V$, $b$ is reachable from $a$ in $J$ if and only if $b$ is reachable from $a$ in both $G_1$ and $G_2$. We can further distinguish two versions of this problem, depending on whether $J$ is allowed to have Steiner vertices (i.e., vertices not in $V$) or not: In the unrestricted version $V(J) \supseteq V$, while in the restricted version $V(J) = V$.
The problem of computing the smallest $J$ in the unrestricted case belongs to the class of NP-hard problems. This is implied by a straightforward reduction from the reachability substitute problem, which was shown to be NP-hard by Katriel et al. [KKS05]. In the restricted case, on the other hand, we can compute $J$ using transitive closure and transitive reduction computations, which can be done in polynomial time [AGU72].
Note that the existence of Steiner vertices can reduce the size of $J$ significantly. Consider for example a complete bipartite digraph $G$ with $V(G) = X \cup Y$ and $A(G) = X \times Y$. This digraph has the same transitive closure (restricted to the vertices of $G$) as the digraph $G'$ with $V(G') = V(G) \cup \{z\}$ and $A(G') = \{(x,z), (z,y) \mid x \in X, y \in Y\}$. While $G$ has $|X| \cdot |Y|$ edges, $G'$ has only $|X| + |Y|$. See Figure 2.2.
Figure 2.2: Reducing the size of a graph with the use of a Steiner vertex $z$.
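The effect of a Steiner vertex can be verified on a small instance. The sketch below builds the complete bipartite digraph and its counterpart through a single extra vertex, then compares edge counts and the reachability relation restricted to $X \cup Y$; the sizes of $X$ and $Y$ and the helper name `reach_pairs` are chosen for illustration only.

```python
from itertools import product

def reach_pairs(edges, terminals):
    """Ordered pairs (a, b) of distinct terminals with b reachable from a."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    pairs = set()
    for a in terminals:
        seen, stack = {a}, [a]
        while stack:
            x = stack.pop()
            for y in adj.get(x, ()):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        pairs |= {(a, b) for b in terminals if b != a and b in seen}
    return pairs

X = [f"x{i}" for i in range(4)]
Y = [f"y{i}" for i in range(5)]
direct = set(product(X, Y))                              # A(G) = X x Y
via_z = {(x, "z") for x in X} | {("z", y) for y in Y}    # A(G') with Steiner vertex z

print(len(direct), len(via_z))
print(reach_pairs(direct, X + Y) == reach_pairs(via_z, X + Y))
```

The Steiner vertex is excluded from the comparison because it is not an element of the original vertex set; only reachability among the original vertices has to be preserved.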
2.3.2 Combinatorial Complexity
The objective here is to develop methods for constructing small join-reachability graphs (but not necessarily optimal), and then provide bounds on the size (number of vertices plus edges) of these constructions. Our starting point is to build join-reachability graphs for paths and trees. The basic idea is to map the vertices of $V$ to geometric objects in a $d$-dimensional space, where $d$ is some constant. Then, the join-reachability relation can be decided from the position of these objects in the $d$-dimensional space.
An example for the case of two paths is depicted in Figure 2.3: Each vertex $v \in V$ receives coordinates $(x_1(v), x_2(v))$, where $x_1(v)$ corresponds to the position of $v$ in the first path, and $x_2(v)$ corresponds to the position of $v$ in the second path. Specifically, $x_1(v)$ is equal to the number of vertices (other than itself) that reach $v$ in $G_1$; $x_2(v)$ is defined analogously. It follows that each vertex is mapped to a point in the two-dimensional space $[0, n-1]^2$ and has integer coordinates. Moreover, for any two vertices $a, b \in V$, we have that $b$ is reachable from $a$ in both paths if and only if $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$, where the comparison is coordinate-wise. In Figure 2.3 the vertices that reach f in both $G_1$ and $G_2$ are inside the dashed rectangle. Based on this geometrical view we can find the necessary edges (and Steiner vertices) that should be included in the join-reachability graph. The bound we derive is $O(n \log n)$, which turns out to be tight in the worst case: We presented examples where the smallest join-reachability graph must have $\Omega(n \log n)$ edges. Based on similar ideas we provide methods for building join-reachability graphs of size $O(n \log^k n)$, for some constant $k \le 3$, when we deal with trees and planar graphs. These methods can also be applied to general graphs, but the quality of the produced structures depends on the number of disjoint paths into which the graphs can be decomposed.
Figure 2.3: Mapping the vertices of two paths to points in the plane. Here $G_1$ is the path a, b, c, d, e, f, g, h and $G_2$ is the path a, e, c, g, b, f, d, h; each vertex $v$ is drawn at the point $(x_1(v), x_2(v))$.
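The mapping of two paths to points can be written down directly. The following sketch uses the two vertex orders that appear in Figure 2.3 and decides join-reachability by a coordinate-wise comparison; the function names are illustrative, and the construction of the join-reachability graph itself is not attempted here.

```python
def to_points(path1, path2):
    """Map each vertex v to (x1(v), x2(v)), its positions in the two paths."""
    x1 = {v: i for i, v in enumerate(path1)}
    x2 = {v: i for i, v in enumerate(path2)}
    return {v: (x1[v], x2[v]) for v in path1}

def reaches_in_both(points, a, b):
    """True if b is reachable from a in both paths (coordinate-wise <=)."""
    (a1, a2), (b1, b2) = points[a], points[b]
    return a1 <= b1 and a2 <= b2

points = to_points("abcdefgh", "aecgbfdh")
inside = sorted(v for v in points if v != "f" and reaches_in_both(points, v, "f"))
print(points["f"], inside)
```

The vertices collected in `inside` are those that would fall in the dashed rectangle of Figure 2.3, i.e., the points dominated by the point of f.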
2.4 Implicit Join-Reachability
In the implicit version of the join-reachability problem our goal is to construct an efficient data structure that supports the following type of query: Given a query vertex $b$, report all vertices $a$ that reach $b$ in $J(\{G_1, G_2\})$. We measure the efficiency of a data structure in terms of the storage space it requires, and the time it needs to answer a join-reachability query (i.e., the time it needs to locate all vertices that reach $b$ in $J(\{G_1, G_2\})$). To that end, we use the notation $\langle s(n), q(n,k) \rangle$ to refer to a data structure with $O(s(n))$ space and $O(q(n,k))$ query time for reporting $k$ elements.
Figure 2.4: A Cartesian tree, drawn for the same paths $G_1$ (a, b, c, d, e, f, g, h) and $G_2$ (a, e, c, g, b, f, d, h) and the same axes $x_1$, $x_2$ as in Figure 2.3.
In order to design efficient join-reachability data structures we apply the techniques of Section 2.3.2 combined with data structures from computational geometry.
Consider, for example, the case of two paths. Using the mapping of Section 2.3.2 we need a data structure that returns the vertices $a$ with $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$. This reporting can be accomplished with a Cartesian tree [GBT84]. A Cartesian tree $T$ is a binary tree defined recursively as follows: The root of $T$ is the point $a$ with minimum $x_2$-coordinate. The left subtree of the root is a Cartesian tree for the points $b$ with $x_1(b) < x_1(a)$ and the right subtree of the root is a Cartesian tree for the points $b$ with $x_1(b) > x_1(a)$. See Figure 2.4. The reporting algorithm uses the following property: Consider two points $a$ and $b$ with $x_1(a) \le x_1(b)$, and let $c$ be the point with minimum $x_2$-coordinate such that $x_1(a) \le x_1(c) \le x_1(b)$. Then $c$ is the nearest common ancestor of $a$ and $b$ in $T$. (The nearest common ancestor of two vertices in a tree is their common ancestor that is farthest from the root. E.g., in Figure 2.4 the nearest common ancestor of d and h is e.) Now let $z$ be the point with the smallest $x_1$-coordinate. In order to find all points $a$ such that $(x_1(a), x_2(a)) \le (x_1(b), x_2(b))$ we first locate the nearest common ancestor of $z$ and $b$ in $T$; call this vertex $y$. The returned point $y$ has the smallest $x_2$-coordinate in the $x_1$-range $[0, x_1(b)]$. If $x_2(y) > x_2(b)$ then the answer is null and we stop our search. Otherwise we return $y$ and search recursively in the $x_1$-ranges $[0, x_1(y) - 1]$ and $[x_1(y) + 1, x_1(b)]$. Using the fact that nearest common ancestor queries in a tree can be answered in constant time after linear-time preprocessing [HT84], it follows that the efficiency of the above data structure is $\langle n, k \rangle$.
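The recursive reporting scheme can be sketched as follows. By the property stated above, the nearest common ancestor of two points in the Cartesian tree is the point of minimum $x_2$-coordinate in the corresponding $x_1$-range, so this sketch substitutes a sparse table for range-minimum queries in place of the constant-time nearest-common-ancestor structure of [HT84]. That substitution is an assumption made for brevity: it keeps constant-time range queries but uses $O(n \log n)$ space rather than linear space. All names are illustrative.

```python
def build_sparse_table(x2):
    """table[j][i] is the index of the minimum of x2[i : i + 2**j]."""
    n = len(x2)
    table = [list(range(n))]
    j = 1
    while (1 << j) <= n:
        prev, half = table[-1], 1 << (j - 1)
        row = []
        for i in range(n - (1 << j) + 1):
            left, right = prev[i], prev[i + half]
            row.append(left if x2[left] <= x2[right] else right)
        table.append(row)
        j += 1
    return table

def argmin(table, x2, lo, hi):
    """Index of the minimum of x2[lo..hi] (inclusive), assuming lo <= hi."""
    j = (hi - lo + 1).bit_length() - 1
    left, right = table[j][lo], table[j][hi - (1 << j) + 1]
    return left if x2[left] <= x2[right] else right

def report_dominated(table, x2, b):
    """All x1-positions a with a <= b and x2[a] <= x2[b] (b itself included)."""
    out, stack = [], [(0, b)]
    while stack:
        lo, hi = stack.pop()
        if lo > hi:
            continue
        y = argmin(table, x2, lo, hi)
        if x2[y] > x2[b]:
            continue                  # nothing in this x1-range qualifies
        out.append(y)
        stack.append((lo, y - 1))     # search to the left of y
        stack.append((y + 1, hi))     # search to the right of y
    return out

# x2 listed by x1-position for the paths a..h and a, e, c, g, b, f, d, h.
names = "abcdefgh"
x2 = [0, 4, 2, 6, 1, 5, 3, 7]
table = build_sparse_table(x2)
answer = sorted(names[i] for i in report_dominated(table, x2, names.index("f")))
print(answer)
```

Every range that is examined either reports a point or terminates immediately, so the number of range queries is proportional to the number of reported points plus one.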
Again, based on similar ideas, we provide data structures for trees, planar graphs, and general graphs.
Chapter 3
Connectivity and Vertex-Disjoint Paths
In this chapter we present algorithms for computing pairs of vertex-disjoint paths in a graph $G = (V, E)$ from a common start vertex. We consider the following two problems:
(a) Compute a pair of vertex-disjoint $s$-$v$ paths for all vertices $v \in V \setminus \{s\}$, where $s \in V$ is a fixed source vertex.
(b) Compute a pair of vertex-disjoint $s$-$t$ paths for a given start vertex $s$ and a given terminal vertex $t$.
We also consider the connectivity requirements that $G$ must satisfy in order for such paths to exist. We remark that the more general problem of computing two vertex-disjoint paths that may connect different start and terminal vertices is NP-hard [BJG02].
3.1 Vertex Connectivity
A directed (undirected) graph is $k$-vertex connected if it has at least $k + 1$ vertices and the removal of any set of at most $k - 1$ vertices leaves the graph strongly connected (connected). See Figure 3.1. The vertex connectivity $\kappa \equiv \kappa(G)$ of a graph $G$ is the maximum $k$ such that $G$ is $k$-vertex connected.
Figure 3.1: A strongly connected and a 2-vertex connected digraph (panels (i) and (ii)).
Graph connectivity is one of the most fundamental concepts in graph theory, with numerous practical applications [BJG02]. Currently, the fastest known algorithm for computing $\kappa$ is due to Gabow [Gab06], with $O((n + \min\{\kappa^{5/2}, \kappa n^{3/4}\}) m)$ running time. A related problem is to test if a graph satisfies $\kappa \ge k$ for a given integer $k$. Henzinger et al. [HRG00] showed how to test $k$-vertex connectivity in time $O(\min\{k^3 + n, kn\} m)$. They also gave a randomized algorithm for computing $\kappa$ with error probability 1/2 in time $O(nm)$. For an undirected graph, a result of Nagamochi and Ibaraki [NI92] allows $m$ to be replaced by $kn$ or $\kappa n$ in the above bounds. Cheriyan and Reif [CR94] showed how to test $k$-vertex connectivity in a directed graph with a Monte Carlo algorithm with running time $O((M(n) + nM(k)) \log n)$ and error probability $1/n$, and with a Las Vegas algorithm with expected running time $O((M(n) + nM(k)) k)$. In these bounds, $M(n)$ is the time to multiply two $n \times n$ matrices, which is $O(n^{2.376})$ [CW90].
3.2 Dominator Verification
A flowgraph $G(s) = (V, A, s)$ is a graph with a distinguished root $s \in V$ such that every vertex is reachable from $s$. The dominance relation in $G$ is defined as follows: A vertex $w$ dominates a vertex $v$ if every path from $s$ to $v$ includes $w$; if $w \notin \{s, v\}$ then $w$ is a proper dominator of $v$; otherwise, $w$ is a trivial dominator of $v$. The dominance relation can be represented compactly by the dominator tree $D$: This is a tree rooted at $s$ that satisfies the following property: For any two vertices $v$ and $w$, $w$ dominates $v$ if and only if $w$ is an ancestor of $v$ in $D$ [ASU86]. See Figure 3.2.
Figure 3.2: A flowgraph $G(s)$ and its dominator tree $D(s)$.
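The definition of dominance translates into a simple, if slow, procedure: $w$ dominates $v$ exactly when $v$ can no longer be reached from $s$ once $w$ is deleted. The sketch below applies this test to every vertex, which takes time proportional to the number of vertices times the size of the graph; it is a direct reading of the definition and not one of the fast algorithms cited in this section. The small flowgraph and the function names are made up for illustration.

```python
def reachable(adj, s, removed=None):
    """Vertices reachable from s when the vertex `removed` is deleted."""
    if s == removed:
        return set()
    seen, stack = {s}, [s]
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y != removed and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def dominators(adj, s):
    """Map each vertex v to the set of all vertices that dominate v."""
    everything = reachable(adj, s)
    dom = {v: {s, v} for v in everything}       # s and v always dominate v
    for w in everything - {s}:
        cut_off = everything - reachable(adj, s, removed=w) - {w}
        for v in cut_off:                       # every s-v path passes through w
            dom[v].add(w)
    return dom

# Hypothetical flowgraph: s -> a, s -> b, a -> c, b -> c, c -> d.
FLOWGRAPH = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
dom = dominators(FLOWGRAPH, "s")
print({v: sorted(ws) for v, ws in sorted(dom.items())})
```

In this example c lies on every path from s to d, so c appears among the dominators of d, while a and b do not dominate c because each can be bypassed through the other.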
The computation of dominators appears in several application areas, such as program optimization and code generation, constraint programming, circuit testing, theoretical biology, and other areas [GTW06]. Dominators can be computed in almost linear time with the algorithm of Lengauer and Tarjan [LT79]. This algorithm has some conceptual complexities, but it is used in many applications because simpler algorithms have quadratic complexity or worse. There are also even more complicated truly linear-time algorithms [AHLT99, BGK+08, GT04].
We define the dominator verification problem as follows: Given a flowgraph $G(s)$ and a tree $T$, test if $T$ is the dominator tree of $G(s)$. An important special case of this problem is the verification of trivial dominators: Given a flowgraph $G(s)$, test if $s$ is the only proper dominator of every vertex $v \ne s$. We have shown that the dominator verification problem can be reduced in linear time to the problem of verifying trivial dominators.
The dominator verification problem was initially motivated by the complexities of the efficient algorithms for computing dominators. Moreover, in the next sections we show that the problems of testing the existence of pairs of vertex-disjoint paths of type (a) to all vertices starting from a fixed source, and (b) from a given source vertex to a given target vertex, can be reduced to the verification of trivial dominators.
Figure 3.3: Two independent spanning trees $T_1$ and $T_2$ of a 2-vertex connected graph $G$.
3.3 Independent Spanning Trees
Let $T_1$ and $T_2$ be two spanning trees of a graph $G = (V, E)$ rooted at a vertex $s \in V$. The spanning trees are independent if for each vertex $v$ the two $s$-$v$ paths in $T_1$ and $T_2$ are internally vertex-disjoint. See Figure 3.3. The spanning trees are strongly independent if they contain an $s$-$v$ path and an $s$-$u$ path that are vertex-disjoint, for all pairs of vertices $u$ and $v$. Independent spanning trees have been used in fault-tolerant communications (see, e.g., [AB00, IR88]).
The existence of two such spanning trees is implied by a result of Whitty [Whi87], when $G$ satisfies the following necessary and sufficient condition: $G$ contains two vertex-disjoint $s$-$v$ paths for all vertices $v \ne s$. This is equivalent to stating that the flowgraph with root $s$ has only trivial dominators. Whitty gave a polynomial-time construction for two strongly independent spanning trees. Simpler constructions were later provided by Plehn [Ple91] and by Cheriyan and Reif [CR94], but the time complexity of these constructions was not specified. Huck [Huc94] gave an $O(mn)$-time construction of two independent spanning trees. In [GT10] we provide linear-time constructions of two strongly independent spanning trees and other related concepts.
3.4 Testing 2-Vertex Connectivity
Consider a 2-vertex connected graph $G = (V, E)$. For $s \in V$, let $G(s)$ be the flowgraph with root $s$. The definition of 2-vertex connectivity implies that $s$ is the only proper dominator in $G(s)$ for all vertices $v \ne s$. The same property holds for the reverse graph $G^r$, which is derived from $G$ after reversing all edge directions.
In [Geo10] we show that for a graph to be 2-vertex connected it is sufficient that the above two properties hold for two arbitrary vertices. Therefore, testing a graph for 2-vertex connectivity can be reduced to testing if a constant number of flowgraphs have trivial dominators only. This reduction together with the results of [GT10] implies a simple linear-time algorithm for testing 2-vertex connectivity.
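For comparison with the linear-time test, 2-vertex connectivity can also be checked straight from the definition of Section 3.1: the graph needs at least three vertices and must stay strongly connected after the removal of any single vertex. The sketch below does exactly that and therefore runs in time proportional to the number of vertices times the size of the graph. It is a baseline written for illustration, not the algorithm of [Geo10], and it assumes that no vertex is named `None`.

```python
def _reach(adj, start, removed):
    """Vertices reachable from start when the vertex `removed` is deleted."""
    seen, stack = {start}, [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y != removed and y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def strongly_connected(vertices, edges, removed=None):
    """Strong connectivity of the graph after deleting `removed` (if any)."""
    rest = [v for v in vertices if v != removed]
    if not rest:
        return True
    fwd = {v: [] for v in vertices}
    bwd = {v: [] for v in vertices}
    for u, v in edges:
        fwd[u].append(v)
        bwd[v].append(u)
    root = rest[0]
    # root must reach every remaining vertex and be reached by each of them
    return (len(_reach(fwd, root, removed)) == len(rest)
            and len(_reach(bwd, root, removed)) == len(rest))

def is_2_vertex_connected(vertices, edges):
    """Definition-based test: at least 3 vertices, no single-vertex cut."""
    if len(vertices) < 3:
        return False
    return all(strongly_connected(vertices, edges, removed=w)
               for w in [None, *vertices])

FIG_1_1 = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("d", "e"), ("e", "c")]
TRIANGLE = [(u, v) for u in "xyz" for v in "xyz" if u != v]
print(is_2_vertex_connected(list("abcde"), FIG_1_1))
print(is_2_vertex_connected(list("xyz"), TRIANGLE))
```

The graph of Figure 1.1 is strongly connected but fails the test, since deleting d leaves c without any outgoing edge; the complete digraph on three vertices passes.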
3.5 Computing Pairs of Vertex-Disjoint s-t Paths
We consider next the problem of computing two internally vertex-disjoint paths directed from $s$ to $t$, for any given source vertex $s$ and target vertex $t$. See Figure 3.4. This problem can be reduced to computing two edge-disjoint paths (by applying a standard vertex-splitting procedure), which in turn can be carried out in $O(m)$ time by computing two flow-augmenting paths [BJG02].
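The reduction just described can be sketched in a few lines. Each vertex other than $s$ and $t$ is split into an "in" copy and an "out" copy joined by a single arc, every original edge goes from an "out" copy to an "in" copy, and two augmenting paths are then searched for in the resulting unit-capacity network. The encoding of the copies as pairs `(v, 0)` and `(v, 1)`, the function name, and the small test graph are assumptions of this sketch; it also assumes $s \ne t$.

```python
from collections import deque

def two_disjoint_paths(edges, s, t):
    """Two internally vertex-disjoint s-t paths, or None if none exist."""
    # Vertex splitting: (v, 0) is the "in" copy of v, (v, 1) the "out" copy.
    arcs = {((u, 1), (v, 0)) for u, v in edges if u != v}
    verts = {x for e in edges for x in e}
    arcs |= {((v, 0), (v, 1)) for v in verts if v not in (s, t)}
    nbrs = {}
    for x, y in arcs:
        nbrs.setdefault(x, []).append(y)    # forward residual arc
        nbrs.setdefault(y, []).append(x)    # possible backward residual arc
    flow, src, dst = set(), (s, 1), (t, 0)

    def residual(x, y):
        return ((x, y) in arcs and (x, y) not in flow) or (y, x) in flow

    for _ in range(2):                      # two flow-augmenting paths
        parent, queue = {src: None}, deque([src])
        while queue and dst not in parent:
            x = queue.popleft()
            for y in nbrs.get(x, ()):
                if y not in parent and residual(x, y):
                    parent[y] = x
                    queue.append(y)
        if dst not in parent:
            return None
        y = dst
        while parent[y] is not None:        # push one unit along the path
            x = parent[y]
            if (y, x) in flow:
                flow.remove((y, x))         # cancel flow on a backward arc
            else:
                flow.add((x, y))
            y = x

    succ = {}
    for x, y in flow:
        succ.setdefault(x, []).append(y)
    paths = []
    for _ in range(2):                      # read the two paths off the flow
        path, x = [s], src
        while x != dst:
            x = succ[x].pop()
            if x[1] == 0:                   # arrived at the "in" copy of a vertex
                path.append(x[0])
        paths.append(path)
    return paths

EXAMPLE = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
print(two_disjoint_paths(EXAMPLE, "s", "t"))
```

Each search and each augmentation takes time linear in the number of edges, and only two rounds are needed, which is where the $O(m)$ bound comes from.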
In [Geo10] we presented a faster algorithm for 2-vertex connected graphs. First we note that our algorithm for testing 2-vertex connectivity allows us to find in linear time a 2-vertex connected spanning subgraph of the input digraph with $O(n)$ edges. Hence, the flow-augmenting algorithm can compute two internally vertex-disjoint $s$-$t$ paths in $O(n)$ time. We can further improve this with the use of independent spanning trees. Based on the results mentioned in Section 3.3, we can construct a linear-space data structure that computes two internally vertex-disjoint $s$-$t$ paths, for any $s$, $t$, in $O(\log^2 n)$ time, so that the two paths can be reported in constant time per vertex. We remark that the reporting algorithm needs to find common ancestors of some vertices in pairs of trees, which is a variant of the join-reachability problem defined in Chapter 2.
Figure 3.4: Two vertex-disjoint paths in a 2-vertex connected graph $G$ (the figure shows two vertex-disjoint d-e paths).
Chapter 4
Further Applications
Now we consider additional applications of our algorithms and techniques. We remark that the material we present here is part of ongoing research.
4.1 Interprocedural Dominance
As we already mentioned in Section 3.2, the computation of dominators is crucial in the analysis and optimization of computer programs. In the context of whole-program analysis and optimization, however, we have to take into account the fact that there are path constraints which make some paths of the flowgraph invalid [RHS95]. As a result, the most efficient algorithms for intraprocedural dominators are unable to handle the interprocedural case.
We formulate the interprocedural dominance problem as in [dSvPdB07]. The vertices of the flowgraph are partitioned into sets corresponding to different procedures. Each procedure $P$ has a unique entry vertex $s(P)$ and a unique exit vertex $t(P)$; the main procedure contains the root vertex $s$ and the terminal vertex $t$. An edge $e$ is directed from $\mathit{tail}(e)$ to $\mathit{head}(e)$. A call edge has the form $(x, s(P))$ with $x \notin P$. Similarly, a return edge has the form $(t(P), y)$ with $y \notin P$. Each call edge has a unique corresponding return edge and vice versa. We let $f$ denote the (bijective) function that maps a call edge to its corresponding return edge; if $f((x, s(P))) = (t(P), y)$ then it is implied that $x$ and $y$ belong to the same procedure. Figure 4.1 gives an example.
Figure 4.1: An interprocedural flowgraph (procedures main, A and B) and its dominator dag. Procedure call edges ($e_1$, $e_3$, $e_5$, and $e_7$) and return edges ($e_2$, $e_4$, $e_6$, and $e_8$) are dotted; the call-return correspondence is given by the function $f$, with $f(e_1) = e_2$, $f(e_3) = e_4$, $f(e_5) = e_6$, and $f(e_7) = e_8$.
A full path starts at s and ends at t. A full path Q is valid if it has a proper nesting of
procedure calls and returns, i.e.,
• if Q contains a return edge e = (t(P), y) then the prefix of Q from s to t(P) contains
  the call edge f^{-1}(e), and
• if Q contains the call edges e and e', where e precedes e', then f(e') precedes f(e)
  in Q.
A valid path is a prefix of a full valid path. A vertex w dominates a vertex v if every valid
path from s to v includes w.
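The nesting condition above can be checked with a stack of pending call edges. The
following Python sketch is an illustration only, not the algorithm developed in this
project. It assumes the usual balanced-parentheses reading of "proper nesting": a
return edge must match the most recent call edge that has not yet returned, and calls
that are still pending are allowed, since a valid path is a prefix of a full valid path.
The function name is_valid_path and the dictionary representation of f are hypothetical.

```python
def is_valid_path(path_edges, f):
    """Check the call/return nesting of a path given as a list of (tail, head) edges.

    f is a dict mapping each call edge to its corresponding return edge.
    Assumption: a return edge must match the most recent unreturned call edge.
    """
    return_edges = set(f.values())
    pending_calls = []                      # stack of call edges not yet returned
    for i, e in enumerate(path_edges):
        if i > 0 and path_edges[i - 1][1] != e[0]:
            return False                    # consecutive edges do not form a path
        if e in f:                          # call edge: remember it
            pending_calls.append(e)
        elif e in return_edges:             # return edge: must match the last call
            if not pending_calls or f[pending_calls[-1]] != e:
                return False
            pending_calls.pop()
    return True


# Hypothetical flowgraph: procedure A (entry "sA", exit "tA") is called from s and from c.
f = {("s", "sA"): ("tA", "c"), ("c", "sA"): ("tA", "d")}
print(is_valid_path([("s", "sA"), ("sA", "tA"), ("tA", "c")], f))  # returns to the caller
print(is_valid_path([("s", "sA"), ("sA", "tA"), ("tA", "d")], f))  # returns to the wrong site
```

The second path exists in the flowgraph but is not valid, which is exactly the kind of
path that intraprocedural dominator algorithms would wrongly take into account.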
The existence of path constraints modifies the structure of the dominance relation
(with respect to the standard problem). Specifically, the transitive reduction of the
interprocedural dominance relation is no longer a tree but a directed acyclic graph.
We have developed efficient algorithms for special cases of this problem by formulating
them in the context of join-reachability. We are currently extending our solutions
to these special cases in order to derive efficient algorithms for computing the
interprocedural dominance relation in the general case.
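To make the remark about the transitive reduction concrete, the following Python sketch
removes from a dominance relation every pair that is implied by transitivity. It assumes
the relation is given explicitly as a set of pairs (w, v), meaning "w dominates v", and
that it is acyclic and transitively closed. The example relation is hypothetical and is
not the one of Figure 4.1; the function name transitive_reduction is likewise an
assumption made for illustration.

```python
def transitive_reduction(relation):
    """Transitive reduction of an acyclic, transitively closed relation.

    relation is a set of pairs (w, v); a pair is dropped when some x
    satisfies (w, x) and (x, v), i.e. when it is implied by transitivity.
    """
    relation = set(relation)
    vertices = {p for pair in relation for p in pair}
    reduced = set()
    for w, v in relation:
        implied = any(
            (w, x) in relation and (x, v) in relation
            for x in vertices
            if x != w and x != v
        )
        if not implied:
            reduced.add((w, v))
    return reduced


# Hypothetical relation: a and b both dominate v, but neither dominates the other.
dominates = {("s", "a"), ("s", "b"), ("s", "v"), ("a", "v"), ("b", "v")}
dag = transitive_reduction(dominates)
parents_of_v = sorted(w for (w, v) in dag if v == "v")
print(parents_of_v)  # v has two parents, so the reduction is a dag and not a tree
```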
4.2 Computational Morphological Analysis
Morphology is the study of the internal structure of words. Morphological analysis
consists of the identification of the constituents of words. The smallest meaningful
constituents are called morphemes (e.g., dog, dogs). The morphemes have grammatical
functions; they express inflectional properties. For instance, tense and aspect are the
inflectional categories expressed in verbs (play, played, playing).
Lexemes are abstract entities and can be thought of as a set of words (PLAY). Word-forms
are concrete entities and belong to a single lexeme (plays, played belong to the
lexeme PLAY). The set of word-forms that belong to a lexeme is called a paradigm.
The part of a word-form that carries a concrete meaning is called the root (play).
Word-forms may also contain affixes, which carry an abstract meaning (playing, played,
player). Affixes that follow the root are called suffixes (playing, played, player).
Affixes that precede the root are called prefixes (reread).
There are four major theoretical approaches to inflection; see [Se01]. In the present
study, we adopt the framework of Distributed Morphology. For simplicity, we
only provide a brief sketch of a possible morphological analysis of some forms of a
verbal paradigm in Greek [Gal05].
Some of the core questions and issues that any given morphological analysis needs to
address are:
• What morphological units do languages consist of?
• What features are expressed in each morpheme?
• How do different morphemes interact with one another?
• Can all morphemes be matched to one another?
• How do we account for any constraints between the matching of morphemes?

Figure 4.2: A graph of a morphological analysis. (Vertex labels in the figure: apolim, en,
o, a, ome, omun, ontas, thik, th, an, smen, tik, tis, tirio, os, i.)
Derivation is the process by which new words (with a new meaning) are formed
(read, readable, kind, kindness). Different languages employ different processes by which
derivation occurs, for instance affixation. Here, some of the fundamental questions
one needs to answer are:
• How do roots combine with certain prefixes and suffixes?
• What are the constraints in such formations?
• What about the interface of inflection and derivation?
Computational approaches to morphology can provide empirical evidence that can
help in answering such questions. Parts of such approaches can be formulated as graph
reachability and path-selection problems. A simple example is shown in Figure 4.2;
constituents are combined in paths to form a word-form. (We stress that this figure
is not an exhaustive morphological representation. We leave aside any phonological
and/or lexical rules that may further apply.)
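The idea that constituents are combined along paths to form word-forms can be sketched
as a path enumeration in a small directed acyclic graph. The Python example below is
hypothetical: it uses the English forms of PLAY mentioned earlier rather than the Greek
paradigm of Figure 4.2, the sentinel vertices START and END are our own convention, and
it ignores any phonological or lexical rules.

```python
def word_forms(graph, start="START", end="END"):
    """Enumerate all start-to-end paths of a dag and join the morphemes on each path.

    graph maps a vertex (a morpheme, or the START sentinel) to its successors.
    """
    forms = []

    def extend(vertex, morphemes):
        if vertex == end:
            forms.append("".join(morphemes))
            return
        for successor in graph.get(vertex, []):
            # The END sentinel contributes no morpheme to the word-form.
            step = [] if successor == end else [successor]
            extend(successor, morphemes + step)

    extend(start, [])
    return forms


# Hypothetical morpheme graph: the root "play" followed by optional suffixes.
morpheme_graph = {
    "START": ["play"],
    "play": ["END", "s", "ed", "ing", "er"],
    "s": ["END"],
    "ed": ["END"],
    "ing": ["END"],
    "er": ["END", "s"],
}
print(word_forms(morpheme_graph))
```

Constraints between morphemes, such as which suffixes may follow which, correspond to
the presence or absence of edges, so asking whether a word-form can be derived becomes
a reachability or path-selection query on this graph.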
Chapter 5
Conclusions and Future Work
In this project we studied a collection of Reachability and Path-Selection problems, and
designed efficient algorithms for their solution. We believe that several related topics,
some of which are listed below, deserve further investigation.
Problems related to Reachability:
• Determine the computational complexity of constructing the smallest join-reachability
  graph for simple graph classes such as trees.
• Provide bounds for the explicit representation of the join-reachability graph for
  other interesting graph classes.
• Consider the problem of approximating the smallest join-reachability graph for
  specific graph classes.
Problems related to Path-Selection:
• Design fast algorithms for testing k-connectivity for constant k > 2.
• Consider data structures that quickly report more than two disjoint s-t paths.
• Design fast (linear or near-linear time) algorithms for computing a sparse
  2-vertex-connected subgraph of a given graph. The computation of the smallest such
  subgraph is NP-hard, so here we are interested in fast heuristics that achieve good
  approximation guarantees.
We plan to investigate the above topics in our future studies.
Bibliography
[AB00] F. S. Annexstein and K. A. Berman. Directional routing via generalized st-numberings. SIAM J. Discret. Math., 13(2):268–279, 2000.
[AGU72] A. V. Aho, M. R. Garey, and J. D. Ullman. The transitive reduction of a directed graph. SIAM J. Comput., 1(2):131–137, 1972.
[AHLT99] S. Alstrup, D. Harel, P. W. Lauridsen, and M. Thorup. Dominators in linear time. SIAM Journal on Computing, 28(6):2117–32, 1999.
[ASU86] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA, 1986.
[BGK+08] A. L. Buchsbaum, L. Georgiadis, H. Kaplan, A. Rogers, R. E. Tarjan, and J. R. Westbrook. Linear-time algorithms for dominators and other path-evaluation problems. SIAM Journal on Computing, 38(4):1533–1573, 2008.
[BJG02] J. Bang-Jensen and G. Gutin. Digraphs: Theory, Algorithms and Applications (Springer Monographs in Mathematics). Springer, 1st ed. 2001. 3rd printing edition, 2002.
[CR94] J. Cheriyan and J. H. Reif. Directed s-t numberings, rubber bands, and testing digraph k-vertex connectivity. Combinatorica, 14(4):435–451, 1994.
[CW90] D. Coppersmith and S. Winograd. Matrix multiplication via arithmetic progressions. J. Symb. Comput., 9(3):251–280, 1990.
[DKNS01] C. Dwork, R. Kumar, M. Naor, and D. Sivakumar. Rank aggregation methods for the web. In WWW ’01: Proceedings of the 10th International Conference on World Wide Web, pages 613–622, 2001.
[dSvPdB07] B. de Sutter, L. van Put, and K. de Bosschere. A practical interprocedural dominance algorithm. ACM Trans. Program. Lang. Syst., 29(4), 2007.
[Gab06] H. N. Gabow. Using expander graphs to find vertex connectivity. Journal of the ACM, 53(5):800–844, 2006.
[Gal05] A. Galani. The Morphosyntax of Verbs in Modern Greek. PhD thesis, University of York, UK, September 2005.
[GBT84] H. N. Gabow, J. L. Bentley, and R. E. Tarjan. Scaling and related techniques for geometry problems. In Proc. 16th ACM Symp. on Theory of Computing, pages 135–143, 1984.
[Geo08] L. Georgiadis. Computing frequency dominators and related problems. In ISAAC ’08: Proceedings of the 19th International Symposium on Algorithms and Computation, pages 704–715, 2008.
[Geo10] L. Georgiadis. Testing 2-vertex connectivity and computing pairs of vertex-disjoint s-t paths in digraphs. In Proc. 37th Int’l. Coll. on Automata, Languages, and Programming, pages 738–749, 2010.
[GNP10] L. Georgiadis, S. D. Nikolopoulos, and L. Palios. Join-reachability in directed graphs. Manuscript, 2010.
[GT04] L. Georgiadis and R. E. Tarjan. Finding dominators revisited. In Proc. 15th ACM-SIAM Symp. on Discrete Algorithms, pages 862–871, 2004.
[GT05] L. Georgiadis and R. E. Tarjan. Dominator tree verification and vertex-disjoint paths. In Proc. 16th ACM-SIAM Symp. on Discrete Algorithms, pages 433–442, 2005.
[GT10] L. Georgiadis and R. E. Tarjan. Dominator verification and independent spanning trees. Manuscript, 2010.
[GTW06] L. Georgiadis, R. E. Tarjan, and R. F. Werneck. Finding dominators in practice. Journal of Graph Algorithms and Applications (JGAA), 10(1):69–94, 2006.
[HRG00] M. R. Henzinger, S. Rao, and H. N. Gabow. Computing vertex connectivity: New bounds from old techniques. Journal of Algorithms, 34:222–250, 2000.
[HT84] D. Harel and R. E. Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing, 13(2):338–55, 1984.
[Huc94] A. Huck. Independent trees in graphs. Graphs and Combinatorics, 10:29–45, 1994.
[IR88] A. Itai and M. Rodeh. The multi-tree approach to reliability in distributed networks. Information and Computation, 79(1):43–59, 1988.
[KKS05] I. Katriel, M. Kutz, and M. Skutella. Reachability substitutes for planar digraphs. Technical Report MPI-I-2005-1-002, Max-Planck-Institut für Informatik, 2005.
[LT79] T. Lengauer and R. E. Tarjan. A fast algorithm for finding dominators in a flowgraph. ACM Transactions on Programming Languages and Systems, 1(1):121–41, 1979.
[NI92] H. Nagamochi and T. Ibaraki. A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph. Algorithmica, 7:583–596, 1992.
[Ple91] J. Plehn. Über die Existenz und das Finden von Subgraphen. PhD thesis, University of Bonn, Germany, May 1991.
[RHS95] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 49–61, June 1995.
[Se01] A. Spencer and A. Zwicky (eds). The Handbook of Morphology. Blackwell Publishers, 2001.
[Whi87] R. W. Whitty. Vertex-disjoint paths and edge-disjoint branchings in directed graphs. Journal of Graph Theory, 11:349–358, 1987.
[WHY+06] H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu. Dual labeling: Answering graph reachability queries in constant time. In ICDE ’06: Proceedings of the 22nd International Conference on Data Engineering, page 75, 2006.