Scalable Graph Decompositions
Presented by:
Blair D. Sullivan
Complex Systems Group
Center for Engineering Design & Advanced Research
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Research supported by the Department of Energy’s Office of Science
Office of Advanced Scientific Computing Research
Applied Mathematics Program
Motivation
• Massive data with an underlying graph structure is emerging in many fields, including communication and transportation networks, bioinformatics, and the power grid.
• Tree and branch decompositions are specialized mappings of graphs onto trees, with quality measured by width metrics. Many data sets exhibit very low-width decompositions, independent of the number of nodes/edges in the graph.
Yeast Protein Interaction Network
• Tree/branch decompositions may serve a dual role in analyzing such large graphs:
1. A tree or branch decomposition naturally breaks up a graph, allowing data-parallel computation.
2. Thanks to fixed-parameter tractability, many NP-hard problems on a graph can be solved in time that is exponential in the width of the decomposition, but often linear in the size of the graph.
Background:
• In 2003, Cook & Seymour used a branch decomposition approach to discover new best known solutions to several widely-studied large Traveling Salesman Problem instances from the TSPLIB.
• Some recent work in bioinformatics uses tree decompositions for smaller combinatorial problems where graphs have very small width (e.g. RNA sequence alignment and protein side-chain placement).
• Few examples exist of using these decompositions both to subdivide large graphs for parallel computation and to exploit fixed-parameter tractability.
Optimal TSP tour of 15,112 German cities
Approach:
• Use tree decompositions to transform algorithm complexity so that it is exponential in the width, but polynomial in the number of nodes.
• Develop efficient, scalable algorithms for computing low-width decompositions for large graphs.
• Integrate parallel computing with decomposition algorithms and dynamic programming.
Tree Decompositions
• A tree decomposition of a graph G = (V, E) is a pair (X, T), where X is a collection of subsets of V and T is a tree with nodes {1, …, n}, so that each node i of T has an associated set in X, say X_i (its bag), and (X, T) satisfies three conditions:
1) The union of the sets in X is equal to V.
2) For every edge (u, v) in G, {u, v} is a subset of some X_i.
3) For every vertex v in G, the set of nodes whose bags contain v forms a connected subtree of T.
• The width of a tree decomposition is the maximum of |X_i| − 1 over i = 1, 2, …, n.
• The treewidth (tw) of a graph G is the minimum width over all tree decompositions of G.
The Petersen graph and a width 4 tree decomposition are shown at right. The subtree associated with vertex w_3 is shown in orange.
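The three conditions and the width definition above can be checked mechanically. The following is a minimal sketch, not code from the presentation: the function names `is_tree_decomposition` and `width` are hypothetical, bags are given as a dictionary from tree node to vertex set, and T is assumed (not verified) to be a tree. The example uses a 5-cycle rather than the Petersen graph.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check conditions 1-3 for bags (tree node -> set of graph vertices).

    Assumes tree_edges really describes a tree on the keys of `bags`;
    that assumption is not verified here.
    """
    # Condition 1: the union of the bags is the vertex set.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every graph edge lies inside some bag.
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # Condition 3: the tree nodes whose bags contain v are connected in T.
    neighbors = {t: set() for t in bags}
    for a, b in tree_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    for v in vertices:
        holders = {t for t, bag in bags.items() if v in bag}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            t = stack.pop()
            for n in neighbors[t] & holders:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        if seen != holders:
            return False
    return True


def width(bags):
    """Width = (size of the largest bag) - 1."""
    return max(len(bag) for bag in bags.values()) - 1


# Example: the 5-cycle 0-1-2-3-4-0 with a path-shaped decomposition.
cycle_vertices = [0, 1, 2, 3, 4]
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_bags = {0: {0, 1, 2}, 1: {0, 2, 3}, 2: {0, 3, 4}}
cycle_tree = [(0, 1), (1, 2)]
print(is_tree_decomposition(cycle_vertices, cycle_edges, cycle_bags, cycle_tree))  # True
print(width(cycle_bags))  # 2
```

Removing vertex 0 from the middle bag breaks condition 3, because the two remaining bags containing 0 are no longer adjacent in T.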
Finding a Tree Decomposition
• Tree decompositions are usually computed via graph triangulation. Given an elimination ordering (a linear order of the vertices), one triangulates the graph by sequentially adding edges among each vertex’s higher-numbered neighbors.
• The bags of the tree decomposition correspond to sets of higher-numbered neighbors in the triangulation. If the max clique in the triangulated graph has size k + 1, the associated tree decomposition has width k.
• Heuristics vary greatly in computational complexity and in the width of the resulting decomposition. Most were designed to minimize the number of fill edges, not width.
Left: comparison of width and fill from 6 heuristics on graphs known to have tw ≤ 30
Above: A triangulation of the Petersen graph
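The triangulation procedure above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not one of the six heuristics compared in the figure: `eliminate` and `min_degree_order` are hypothetical names, each bag is taken to be a vertex together with its higher-numbered neighbors in the triangulation, and the greedy minimum-degree rule is just one well-known way to pick an ordering.

```python
def eliminate(adj, order):
    """Triangulate along `order`; return (bags, number of fill edges, width)."""
    position = {v: i for i, v in enumerate(order)}
    work = {v: set(ns) for v, ns in adj.items()}  # copy; input is not modified
    bags, fill = {}, 0
    for v in order:
        higher = {u for u in work[v] if position[u] > position[v]}
        # Add fill edges so the higher-numbered neighbors form a clique.
        for a in higher:
            for b in higher:
                if a != b and b not in work[a]:
                    work[a].add(b)
                    work[b].add(a)
                    fill += 1
        bags[v] = higher | {v}  # assumption: the bag includes v itself
    return bags, fill, max(len(bag) for bag in bags.values()) - 1


def min_degree_order(adj):
    """Greedy ordering: repeatedly eliminate a vertex of minimum current degree."""
    work = {v: set(ns) for v, ns in adj.items()}
    order = []
    while work:
        v = min(work, key=lambda x: (len(work[x]), x))  # ties broken by label
        ns = work.pop(v)
        for a in ns:
            work[a].discard(v)
            work[a] |= ns - {a}  # connect the remaining neighbors of v
        order.append(v)
    return order


# Example: the 5-cycle 0-1-2-3-4-0.
cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
order = min_degree_order(cycle)
bags, fill, w = eliminate(cycle, order)
print(order, fill, w)  # [0, 1, 2, 3, 4] 2 2
```

On the 5-cycle the ordering adds two fill edges and yields bags of size at most 3, so the width is 2.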
Branch Decompositions
• A branch decomposition of a graph G is a pair (T, φ), where T is a ternary tree (every node of T has degree 1 or 3) and φ is a bijection between the edges of G and the leaves of T.
• The width of an edge in T is the number of vertices of G shared by leaves on both sides, i.e. vertices incident with edges of G mapped to leaves on each side of that tree edge. The width of the branch decomposition is the max width of its edges.
• The branchwidth (bw) of G is the minimum width over all branch decompositions of G.
• Robertson and Seymour proved that tw is bounded between bw − 1 and 3·bw/2 − 1.
The Petersen graph (left) and a branch decomposition (center) are shown at the top. The red
edge partitions the graph edges into the sets colored green and blue (right), and thus has width
4, the number of vertices incident with edges of both colors (shaded in red).
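The edge-width definition can be made concrete with a short sketch. The function name `branch_decomposition_width` and the node labels are hypothetical, and the example uses a 4-cycle rather than the Petersen graph. Removing a tree edge splits the leaves into two sides; the middle set is the set of graph vertices touched by leaves on both sides.

```python
def branch_decomposition_width(tree_edges, leaf_to_edge):
    """Return (width, per-tree-edge widths) of a branch decomposition.

    tree_edges: list of (node, node) pairs forming the ternary tree T.
    leaf_to_edge: maps each leaf of T to the graph edge (u, v) it represents.
    """
    neighbors = {}
    for a, b in tree_edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    def vertices_on_side(start, blocked):
        # Graph vertices of all leaves reachable from `start`
        # without crossing the tree edge start-blocked.
        seen, stack, found = {start, blocked}, [start], set()
        while stack:
            t = stack.pop()
            if t in leaf_to_edge:
                found.update(leaf_to_edge[t])
            for n in neighbors[t]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        return found

    widths = {}
    for a, b in tree_edges:
        middle_set = vertices_on_side(a, b) & vertices_on_side(b, a)
        widths[(a, b)] = len(middle_set)
    return max(widths.values()), widths


# Example: the 4-cycle 0-1-2-3-0, with two internal nodes x and y.
tree_edges = [("x", "l0"), ("x", "l1"), ("x", "y"), ("y", "l2"), ("y", "l3")]
leaf_to_edge = {"l0": (0, 1), "l1": (1, 2), "l2": (2, 3), "l3": (3, 0)}
total, per_edge = branch_decomposition_width(tree_edges, leaf_to_edge)
print(total, per_edge[("x", "y")])  # 2 2
```

Here the tree edge x-y separates graph edges {(0,1), (1,2)} from {(2,3), (3,0)}, so its middle set is {0, 2}.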
Forming a Branch Decomposition
Common algorithms refine a tree with heuristics until it is ternary, trying to keep middle sets small. Techniques include searching for 2- and 3-separations, and using eigenvectors and network flows.
Figure panels: Initial Star, Intermediate Tree, Intermediate Tree, Final Ternary Tree
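The refinement idea (start from a star and split until the tree is ternary, keeping middle sets small) can be illustrated with a deliberately simple greedy rule. This sketch does not use separations, eigenvectors, or network flows; it swaps in a naive pairing heuristic, and the name `greedy_branch_decomposition` is hypothetical. At each step, the two subtrees hanging off the star's center whose union has the smallest middle set are joined under a new internal node.

```python
from itertools import combinations


def greedy_branch_decomposition(edges):
    """Refine an initial star into a ternary tree by greedy pairing.

    Returns (tree, width). Leaves are graph edges (tuples); each internal
    node is a 2-element list of subtrees; the returned top-level list holds
    the three subtrees attached to the former star center.
    """
    edges = [tuple(e) for e in edges]
    if len(edges) < 3:
        raise ValueError("this sketch needs at least three graph edges")

    def middle_size(ids):
        # Vertices touched by edges inside `ids` and by edges outside it.
        inside = {v for i in ids for v in edges[i]}
        outside = {v for i, e in enumerate(edges) if i not in ids for v in e}
        return len(inside & outside)

    # Initial star: one leaf per graph edge, all attached to the center.
    clusters = [(edges[i], frozenset([i])) for i in range(len(edges))]
    width = max(middle_size(ids) for _, ids in clusters)
    while len(clusters) > 3:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: middle_size(clusters[p[0]][1] | clusters[p[1]][1]),
        )
        merged_ids = clusters[a][1] | clusters[b][1]
        merged = ([clusters[a][0], clusters[b][0]], merged_ids)
        width = max(width, middle_size(merged_ids))
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return [tree for tree, _ in clusters], width


tree, w = greedy_branch_decomposition([(0, 1), (1, 2), (2, 3), (3, 0)])
print(tree, w)
```

The pair search recomputes middle sets from scratch, so this is only suitable for small illustrative graphs.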
Dynamic Programming
• Tree and branch decompositions provide a framework for a variety of dynamic programming algorithms for NP-hard decision and optimization problems on graphs.
• The general strategy is to root the tree and then work “up” from the leaves, solving sub-problems and storing partial solutions along the way.
• Solving the sub-problems requires information about only a small part of the original graph, represented by the child nodes lower in the tree.
• The complexity of processing a specific node can be exponential in its bag size (tree decomposition) or in the size of the middle sets of its neighboring edges (branch decomposition).
In a tree decomposition, computing the dynamic programming table at node c requires information about the vertices in the bag V_c and the children’s tables, T_a and T_b. The complexity of this computation can be exponential in |V_c|.
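As one concrete instance of this bottom-up scheme, the sketch below computes the maximum weight of an independent set over a rooted tree decomposition. It is an illustrative formulation, not the presenters' implementation: the function names and the dictionary-based inputs are assumptions. Each table maps an independent subset of the bag to the best total weight achievable in the subtree below, so a table can have up to 2^|bag| entries.

```python
from itertools import combinations


def independent_subsets(bag, adj):
    """Yield every subset of `bag` that contains no graph edge."""
    members = sorted(bag)
    for r in range(len(members) + 1):
        for combo in combinations(members, r):
            if all(v not in adj[u] for u, v in combinations(combo, 2)):
                yield frozenset(combo)


def max_weight_independent_set(adj, weight, bags, children, root):
    """Bottom-up dynamic programming over a rooted tree decomposition."""

    def solve(t):
        # Children are solved first; only their bags and tables are needed.
        child_tables = [(bags[c], solve(c)) for c in children.get(t, [])]
        table = {}
        for s in independent_subsets(bags[t], adj):
            total = sum(weight[v] for v in s)
            for child_bag, child_table in child_tables:
                shared = s & child_bag
                # A child choice is compatible if it agrees with s on the
                # vertices the two bags have in common.
                best = max(
                    value
                    for s_child, value in child_table.items()
                    if s_child & bags[t] == shared
                )
                # Shared vertices were counted in the child; do not count twice.
                total += best - sum(weight[v] for v in shared)
            table[s] = total
        return table

    return max(solve(root).values())


# Example: the 5-cycle with unit weights and a path-shaped decomposition.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
weight = {v: 1 for v in adj}
bags = {0: {0, 1, 2}, 1: {0, 2, 3}, 2: {0, 3, 4}}
children = {0: [1], 1: [2]}
print(max_weight_independent_set(adj, weight, bags, children, root=0))  # 2
```

The recursion is fine for small examples; a deep decomposition would need an explicit stack or a post-order traversal.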
Parallelization of Dynamic Programming
• Individual sub-trees in a tree or branch decomposition can be processed independently.
• Processing each sub-tree requires information about only a small part of the graph.
• As more and more sub-trees are processed, we move closer to the root.
• This will require non-traditional parallelization techniques because of difficulties such as estimating the computational workload for each sub-tree and load balancing.
A rooted branch decomposition for a TSP graph is colored to illustrate the independent sub-trees that can be processed in parallel during dynamic programming.
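The simplest way to exploit this independence is a level-synchronous schedule: group tree nodes by height so that every node in a group depends only on nodes in earlier groups, then process each group concurrently. The sketch below is an assumption-laden illustration, not the non-traditional scheme the slide calls for; it ignores load balancing entirely, and `bottom_up_waves`, `process_in_waves`, and `combine` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def bottom_up_waves(children, root):
    """Group tree nodes by height; leaves form wave 0, the root is last."""
    height = {}

    def visit(t):
        # A leaf has height 0; otherwise one more than the tallest child.
        height[t] = 1 + max((visit(c) for c in children.get(t, [])), default=-1)
        return height[t]

    visit(root)
    waves = [[] for _ in range(height[root] + 1)]
    for t, h in height.items():
        waves[h].append(t)
    return waves


def process_in_waves(children, root, combine, max_workers=4):
    """Apply combine(node, child_results) to every node, wave by wave."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in bottom_up_waves(children, root):

            def work(t):
                return combine(t, [results[c] for c in children.get(t, [])])

            # Nodes within one wave never depend on each other.
            values = list(pool.map(work, wave))
            results.update(zip(wave, values))
    return results[root]


# Example: count the nodes of the tree using a trivial `combine`.
children = {"r": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
print(bottom_up_waves(children, "r"))  # [['a1', 'a2', 'b1'], ['a', 'b'], ['r']]
print(process_in_waves(children, "r", lambda t, kids: 1 + sum(kids)))  # 6
```

Threads are used only to keep the sketch short; for CPU-bound table computations in CPython, a process pool or a distributed runtime would be the more realistic choice.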
Parallel-friendly Decompositions
• Decompositions need to have several attributes to allow efficient parallel computation:
1. They should be far from being “path-like” to allow multiple sub-trees to be processed simultaneously.
2. The distribution of bag sizes/middle set cardinalities should be controlled to allow better load balancing.
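Both attributes can be inspected with simple statistics on a rooted tree decomposition. The sketch below is a hypothetical diagnostic, not a metric from the presentation: `decomposition_profile` is an invented name, the leaf count stands in for how far the tree is from a path, and 2^|bag| is an assumed per-node cost proxy (two states per vertex) used only to gauge imbalance.

```python
from collections import Counter


def decomposition_profile(bags, children):
    """Summarize shape and bag-size distribution of a rooted decomposition."""
    # Few leaves relative to nodes suggests a path-like tree.
    leaves = [t for t in bags if not children.get(t)]
    histogram = Counter(len(bag) for bag in bags.values())
    # Assumed cost proxy: table size 2^|bag| at each node.
    work = [2 ** len(bag) for bag in bags.values()]
    return {
        "nodes": len(bags),
        "leaves": len(leaves),
        "bag_size_histogram": dict(histogram),
        "max_over_mean_work": max(work) / (sum(work) / len(work)),
    }


# Example: a path-shaped decomposition of the 5-cycle.
bags = {0: {0, 1, 2}, 1: {0, 2, 3}, 2: {0, 3, 4}}
children = {0: [1], 1: [2]}
print(decomposition_profile(bags, children))
```

A single leaf marks this example as a path, so it offers no sub-tree parallelism even though its bag sizes are perfectly balanced.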
Sample Applications & Future Work
• Cai et al. have used tree decompositions to solve the NP-hard problem of maximum weight independent set for secondary structure prediction in RNA on low-width stem graphs.
• Key combinatorial scientific computing kernels use NP-hard problems such as graph coloring, for which tree and branch decompositions may offer faster (lower complexity) solutions.
• How will graph decomposition based approaches scale when applied to massive graphs?
Contacts
Blair D. Sullivan
Complex Systems Group
Computer Science & Mathematics Division
Oak Ridge National Laboratory
sullivanb@ornl.gov
Chris Groër
Computational Mathematics Group
Computer Science & Mathematics Division
Oak Ridge National Laboratory
groercs@ornl.gov