Computational Complexity, Physical Mapping III + Perl

Dec 13, 2013

CIS 667 March 4, 2004

Computational Complexity - An Overview

We are primarily interested in efficient algorithms

Efficient means that the running time of the algorithm is bounded by some polynomial function $p(n)$

The size of the problem is measured by $n$

We use big-oh notation, e.g. $O(n^2)$, in which lower order terms are ignored


Thus for small problem sizes, an $O(n^2)$ algorithm may run faster than an $O(n)$ one, because big-oh notation also hides constant factors
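Because big-oh notation hides constants and lower order terms, which algorithm wins on a small input depends on those hidden constants. The sketch below uses two made-up step-count functions (the constant 100 and both function names are assumptions chosen only for illustration) to show where the ranking changes.

```python
def linear_cost(n, c=100):
    """Hypothetical step count of an O(n) algorithm with a large constant factor."""
    return c * n


def quadratic_cost(n):
    """Hypothetical step count of an O(n^2) algorithm with constant factor 1."""
    return n * n


# Compare the two step counts for a few problem sizes.
for n in (10, 50, 200, 1000):
    cheaper = "quadratic" if quadratic_cost(n) < linear_cost(n) else "linear"
    print(n, linear_cost(n), quadratic_cost(n), cheaper)
```

With these assumed constants the quadratic algorithm does fewer steps for every n below 100, and the linear one does fewer for every n above 100.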

Computational Complexity - An Overview

This means that we are talking about asymptotic behavior

An inefficient algorithm is one whose asymptotic running time is exponential, e.g. $O(2^n)$

Problems for which efficient algorithms exist belong to the class P

Many problems for which no efficient algorithm is known belong to the class NP

NP-complete Problems

An important subset of these problems is called NP-complete

The solutions to problems in NP, once found, can be checked in polynomial time

NP includes the class P as a subset

Any instance of an NP-complete problem can be transformed in polynomial time into an instance of any other NP-complete problem

So all NP-complete problems are equivalent under polynomial transformation
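To make "checked in polynomial time" concrete, here is a minimal sketch of a verifier for the decision form of the Traveling Salesman problem: given a proposed tour, it confirms that the tour visits every city once and costs at most a given bound. The function name, the distance-matrix representation and the sample data are assumptions for illustration only.

```python
def verify_tour(dist, tour, bound):
    """Check a proposed tour in time polynomial in the number of cities.

    dist  -- square matrix, dist[a][b] is the distance from city a to city b
    tour  -- proposed visiting order, a list of city indices
    bound -- the tour (returning to its start) must cost at most this much
    """
    n = len(dist)
    # The tour must visit each city exactly once.
    if sorted(tour) != list(range(n)):
        return False
    # Sum the n legs, including the leg back to the starting city.
    cost = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
    return cost <= bound


# Hypothetical 4-city instance.
DIST = [
    [0, 1, 4, 2],
    [1, 0, 1, 5],
    [4, 1, 0, 1],
    [2, 5, 1, 0],
]
print(verify_tour(DIST, [0, 1, 2, 3], 5))
```

Checking a given tour is cheap; the hard part, for which no polynomial time algorithm is known, is finding such a tour.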

NP-complete Problems

So, if a polynomial time algorithm is found for one NP-complete problem, there are polynomial time algorithms for all NP-complete problems

If so, then $P = NP$

Most researchers believe that $P \neq NP$

The model of computation that is used in defining NP-complete problems is the Nondeterministic Turing Machine

NP-hard Problems

Classes P and NP include only decision problems - the answer is yes or no

An NP-hard problem is one which is at least as hard as NP-complete problems

If an NP-hard problem can be solved in polynomial time, then so can all NP-complete problems

An NP-hard problem is not necessarily a decision problem

NP-hard Problems

NP-complete $\subset$ NP-hard

Example: the decision question "does there exist a solution to the Traveling Salesman problem?" is both NP-hard and NP-complete

Finding a solution to the Traveling Salesman problem is NP-hard, but not NP-complete (it is not in decision form)

But if we have a polynomial solution for the second, we can use it to solve the first (and hence all NP-complete problems)
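The last point can be sketched in code: any solver for the "find a tour" problem answers the yes/no question with one extra comparison. The brute-force `best_tour` below is only a stand-in for a hypothetical polynomial time solver (it is exponential), and all names and data are assumptions for illustration.

```python
from itertools import permutations


def tour_cost(dist, tour):
    """Total cost of a tour, including the leg back to the start."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def best_tour(dist):
    """Stand-in optimizer: tries every tour that starts at city 0 (exponential time)."""
    n = len(dist)
    candidates = ([0, *rest] for rest in permutations(range(1, n)))
    return min(candidates, key=lambda t: tour_cost(dist, t))


def tour_exists_within(dist, bound):
    """Decision form, answered by calling the optimizer once."""
    return tour_cost(dist, best_tour(dist)) <= bound


# Hypothetical 4-city instance.
DIST = [
    [0, 1, 4, 2],
    [1, 0, 1, 5],
    [4, 1, 0, 1],
    [2, 5, 1, 0],
]
print(tour_exists_within(DIST, 5), tour_exists_within(DIST, 4))
```

If `best_tour` ran in polynomial time, `tour_exists_within` would too, which is why a fast algorithm for the search version would settle the decision version.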

NP-completeness

Initially, several hard problems were shown to be solvable in polynomial time on a nondeterministic TM

Polynomial time reductions between the problems were also shown

Nowadays, to show that a problem is NP-complete:

Verify that the problem is in NP (a solution can be verified in polynomial time)

Show a polynomial time reduction of any known NP-complete problem to your problem

NP-completeness

So when faced with an NP-complete or NP-hard problem - what to do?

See if a meaningful restriction of the problem can be solved in polynomial time

See if the size of the problem in practice is always small

Devise a polynomial time approximation algorithm - guaranteed to find a near optimal solution

Devise heuristics
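As one example of the last option, the sketch below shows the nearest-neighbor heuristic for the Traveling Salesman problem. It is a common textbook heuristic chosen here for illustration, not something the slides prescribe. It runs in polynomial time but carries no guarantee of a near optimal tour. The function name and sample data are assumptions.

```python
def nearest_neighbor_tour(dist, start=0):
    """Greedy heuristic: always travel to the closest city not yet visited."""
    n = len(dist)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        here = tour[-1]
        # Break distance ties by the smaller city index to stay deterministic.
        nxt = min(unvisited, key=lambda city: (dist[here][city], city))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


# Hypothetical 4-city instance.
DIST = [
    [0, 1, 4, 2],
    [1, 0, 1, 5],
    [4, 1, 0, 1],
    [2, 5, 1, 0],
]
print(nearest_neighbor_tour(DIST))
```

The loop does a linear scan per city, so the total work is quadratic in the number of cities.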

Algorithmic Implications


We are trying to solve a real-life problem


The models we use may give us many
solutions, but we want to find the one solution
which corresponds to the real ordering of the
clones in the target DNA


Use the algorithmic results in an iterative
fashion with the experimental biologist

Algorithmic Implications


A mapping algorithm should


Work better with more data, assuming a
constant error rate


Give a solution which makes it clear how it
was obtained and tell which parts of the
solution are good and which bad


Give all candidate solutions

An algorithm for C1P

This algorithm determines whether an $n \times m$ matrix has the C1P for rows

Assume:

All rows are different

No row is all zero

Let $S_i$ be the set of columns in which row $i$ has the value 1. Then for any two rows $i$ and $j$ we can have one of three cases:

1. $S_i \cap S_j = \emptyset$

2. $S_i \subseteq S_j$ or $S_j \subseteq S_i$

3. $S_i \cap S_j \neq \emptyset$ and neither of them is a subset of the other
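The three cases can be checked directly with set operations. The sketch below is illustrative only: the function names and the labels "disjoint", "nested" and "overlap" are assumptions, not terminology from the slides.

```python
def ones(row):
    """Set of 1-based column indices in which the row holds a 1."""
    return {col for col, value in enumerate(row, start=1) if value == 1}


def relation(s_i, s_j):
    """Classify two column sets into one of the three cases."""
    if not (s_i & s_j):
        return "disjoint"   # empty intersection
    if s_i <= s_j or s_j <= s_i:
        return "nested"     # one set is a subset of the other
    return "overlap"        # they intersect, and neither contains the other


print(relation(ones([0, 0, 1, 0, 0, 0, 0, 1, 0]),
               ones([0, 0, 1, 0, 0, 1, 0, 0, 0])))
```

Since no row is all zero, the empty-set corner case (which would be both disjoint and nested) cannot arise, so each pair falls into exactly one case.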

An algorithm for C1P


In the first case, we don’t need to consider
the two rows together, so we separate
them into two
components


Deal with them separately


For non-empty intersection:

Suppose there is a row that is either a subset of, or has an empty intersection with, every row in the component - move it out of the component

An algorithm for C1P


To see if two rows belong to the same
component


Build a graph $G_c$ using $M$

Each vertex of $G_c$ will be a row from $M$

There will be an undirected edge between $i$ and $j$ if $S_i \cap S_j \neq \emptyset$ and neither of them is a subset of the other

So the components we want are the connected components of $G_c$


Basic Algorithm


The algorithm will have the following
phases


Separate rows into components according to
above rules


Permute the columns of each component to
achieve C1P for component


Join components together
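For comparison with the phased algorithm, the property itself can be stated as a brute-force check: a matrix has the C1P for rows if some ordering of its columns makes the 1s in every row consecutive. The sketch below tries every column ordering, so it is exponential and only usable on tiny matrices. It is not the algorithm from the slides, and the function names and test matrices are assumptions.

```python
from itertools import permutations


def ones_are_consecutive(row):
    """True if the 1s in the row form a single unbroken run (or there are none)."""
    positions = [k for k, value in enumerate(row) if value == 1]
    return not positions or positions[-1] - positions[0] + 1 == len(positions)


def c1p_orders(matrix):
    """All column orderings (0-based) under which every row has consecutive 1s."""
    m = len(matrix[0])
    return [
        order
        for order in permutations(range(m))
        if all(ones_are_consecutive([row[c] for c in order]) for row in matrix)
    ]


HAS_C1P = [[1, 0, 1],
           [0, 1, 1]]
NO_C1P = [[1, 1, 0],
          [0, 1, 1],
          [1, 0, 1]]
print(len(c1p_orders(HAS_C1P)), len(c1p_orders(NO_C1P)))
```

Returning every valid ordering rather than the first one mirrors the requirement that a mapping algorithm give all candidate solutions.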

Example Matrix

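|       | $c_1$ | $c_2$ | $c_3$ | $c_4$ | $c_5$ | $c_6$ | $c_7$ | $c_8$ | $c_9$ |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| $l_1$ | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 |
| $l_2$ | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| $l_3$ | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 |
| $l_4$ | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 |
| $l_5$ | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| $l_6$ | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 |
| $l_7$ | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| $l_8$ | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 |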

Example Graph

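[Figure: the graph $G_c$ for the example matrix, with vertices $l_1$ through $l_8$ grouped into four connected components labeled $\alpha$, $\beta$, $\gamma$, $\delta$]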
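The components can be recomputed from the 8 x 9 example matrix by applying the edge rule (non-empty intersection, neither set a subset of the other) and collecting connected components. The sketch below is illustrative: the rows are transcribed from the Example Matrix slide as bit strings, and the function names are assumptions.

```python
# Rows l1..l8 of the example matrix, one character per column c1..c9.
MATRIX = {
    "l1": "110110101",
    "l2": "011111111",
    "l3": "010110101",
    "l4": "001000010",
    "l5": "001001000",
    "l6": "000100100",
    "l7": "010000100",
    "l8": "000110001",
}


def column_sets(matrix):
    """Map each row name to the set of 1-based columns holding a 1."""
    return {name: {c for c, bit in enumerate(bits, start=1) if bit == "1"}
            for name, bits in matrix.items()}


def overlaps(a, b):
    """Edge rule: the sets intersect and neither is a subset of the other."""
    return bool(a & b) and not a <= b and not b <= a


def connected_components(sets):
    """Connected components of the overlap graph, found by depth-first search."""
    names, seen, result = list(sets), set(), []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            row = stack.pop()
            component.append(row)
            for other in names:
                if other not in seen and overlaps(sets[row], sets[other]):
                    seen.add(other)
                    stack.append(other)
        result.append(sorted(component))
    return result


print(connected_components(column_sets(MATRIX)))
```

On this matrix the rule yields four components: {l1, l2}, {l3}, {l4, l5} and {l6, l7, l8}, matching the count of four labeled components in the figure.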

Placing Rows in a Component

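|       | $c_1$ | $c_2$ | $c_3$ | $c_4$ | $c_5$ | $c_6$ | $c_7$ | $c_8$ |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| $l_1$ | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 |
| $l_2$ | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
| $l_3$ | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 |

How can the first row (by itself) be arranged? (Keep track of all possibilities)

| candidate columns |     | {2, 7, 8} | {2, 7, 8} | {2, 7, 8} |     |
|-------------------|-----|-----------|-----------|-----------|-----|
| $l_1$             | … 0 | 1         | 1         | 1         | 0 … |

Now add the second row - it can go to the right or left of the first

| candidate columns |     | {5} | {2, 7} | {2, 7} | {8} |     |
|-------------------|-----|-----|--------|--------|-----|-----|
| $l_1$             | … 0 | 0   | 1      | 1      | 1   | 0 … |
| $l_2$             | … 0 | 1   | 1      | 1      | 0   | 0 … |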

Placing Rows in a Component


How do we place the third row?


In the graph, there are edges for both rows already
placed. Let’s place the third with respect to the
second


Does it go to the right or to the left?


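If $|l_1 \cap l_3| < \min(|l_1 \cap l_2|, |l_2 \cap l_3|)$, place the third row in the same direction as the second was placed w.r.t. the first; else place it in the opposite direction

In our case $|l_1 \cap l_3| = 2$, $|l_1 \cap l_2| = 2$ and $|l_2 \cap l_3| = 1$, so the condition fails and we have to place the third row in the opposite direction (to the right), as shown on the next slide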
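The direction rule can be written as a small function and checked against the three rows of this example. This is a sketch under the assumption that rows are represented by their column sets and directions by the strings "left" and "right"; the function name is illustrative.

```python
def direction_of_third(s1, s2, s3, second_dir):
    """Where row 3 goes relative to row 2.

    s1, s2, s3 -- column sets of the first, second and third rows
    second_dir -- "left" or "right": where row 2 was placed relative to row 1
    """
    if len(s1 & s3) < min(len(s1 & s2), len(s2 & s3)):
        return second_dir                       # same direction
    return "right" if second_dir == "left" else "left"   # opposite direction


L1 = {2, 7, 8}
L2 = {2, 5, 7}
L3 = {1, 4, 7, 8}

# Row 2 was placed to the left of row 1 in the example.
print(direction_of_third(L1, L2, L3, "left"))
```

Here the intersections have sizes 2, 2 and 1, so the test `2 < min(2, 1)` is false and the third row goes to the right.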


Placing Rows in a Component

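| candidate columns |     | {5} | {2} | {7} | {8} | {1, 4} | {1, 4} |     |
|-------------------|-----|-----|-----|-----|-----|--------|--------|-----|
| $l_1$             | … 0 | 0   | 1   | 1   | 1   | 0      | 0      | 0 … |
| $l_2$             | … 0 | 1   | 1   | 1   | 0   | 0      | 0      | 0 … |
| $l_3$             | … 0 | 0   | 0   | 1   | 1   | 1      | 1      | 0 … |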
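The arrangement reached for the three rows can be checked mechanically: reorder the columns as 5, 2, 7, 8, 1, 4 (one of the two choices the slide leaves open for columns 1 and 4) and confirm that each row's 1s are consecutive. The sketch below is illustrative; the names are assumptions, and columns 3 and 6, which are all zero in these rows, are left out.

```python
ROWS = {
    "l1": {2, 7, 8},
    "l2": {2, 5, 7},
    "l3": {1, 4, 7, 8},
}
ORDER = [5, 2, 7, 8, 1, 4]   # columns 1 and 4 could equally be swapped


def arranged(row_set, order):
    """The row's 0/1 values read off in the given column order."""
    return [1 if col in row_set else 0 for col in order]


def ones_consecutive(bits):
    """True if, after trimming outer zeros, no zero remains between the 1s."""
    trimmed = "".join(str(b) for b in bits).strip("0")
    return "0" not in trimmed


for name, row_set in ROWS.items():
    bits = arranged(row_set, ORDER)
    print(name, bits, ones_consecutive(bits))
```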

Placing Rows in a Component


All of the other rows in the component are placed in the same way, using two previously placed rows:

One which has an edge in the graph to the row to be placed

A second which has an edge in the graph to the previous row

Joining Components Together


For the next part of the solution, we use a graph $G_M$ which tells us how the components fit together

Each component of the original matrix will be a vertex in $G_M$

A directed edge is added from $\alpha$ to $\beta$ if the sets $S_i$ for all $i$ in $\beta$ are contained in at least one set $S_j$ of component $\alpha$
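The containment rule can be sketched as follows, reading it as "some single row set of the first component contains every row set of the second" (this reading is an assumption). The components are the four groups obtained by applying the edge rule to the example matrix. The names A to D are placeholders and are not meant to match the Greek labels on the slides.

```python
COMPONENTS = {
    "A": [{1, 2, 4, 5, 7, 9}, {2, 3, 4, 5, 6, 7, 8, 9}],   # rows l1, l2
    "B": [{2, 4, 5, 7, 9}],                                 # row l3
    "C": [{3, 8}, {3, 6}],                                  # rows l4, l5
    "D": [{4, 7}, {2, 7}, {4, 5, 9}],                       # rows l6, l7, l8
}


def contains(outer, inner):
    """True if one row set of `outer` contains every row set of `inner`."""
    return any(all(s <= big for s in inner) for big in outer)


def containment_edges(components):
    """Directed edges (a, b) meaning component a contains component b."""
    return {
        (a, b)
        for a in components
        for b in components
        if a != b and contains(components[a], components[b])
    }


print(sorted(containment_edges(COMPONENTS)))
```

Taken literally, the rule also produces edges implied by transitivity (here A contains D both directly and through B); a drawn graph may omit those.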

Example Graph

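[Figure: the graph $G_M$ for the example, with one vertex for each of the components $\alpha$, $\beta$, $\gamma$, $\delta$]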

Joining Components Together


We process components not contained in
any other component first


So process the components in the topological
order of the graph


We may come up with multiple solutions if
one or more columns is not constrained to
one value


The algorithm runs in polynomial time
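Processing components in topological order can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The component names A to D and the containment relation below are hypothetical, chosen only to illustrate the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical containment relation: each component maps to the set of
# components that contain it, and which must therefore be processed first.
CONTAINED_IN = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"A", "B"},
}


def processing_order(contained_in):
    """One order in which every component comes after all components containing it."""
    return list(TopologicalSorter(contained_in).static_order())


print(processing_order(CONTAINED_IN))
```

A component that nothing else contains has no predecessors, so it is always available to be processed first. Several valid orders may exist when components are unrelated.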