Ming Li Talk about Bioinformatics


2 Oct 2013


Lecture 22 More NPC problems


Today, we prove three problems to be NP-complete


3CNF-SAT


Clique problem


Vertex cover

Dick Karp

3-CNF SAT is NP-complete


A boolean formula is in 3-conjunctive normal form
(3-CNF) if it consists of clauses connected by ANDs,
and each clause is the OR of exactly three literals.
(A literal is a variable or its negation.)




Example:

(x_1 OR x_2 OR ~x_3) AND (~x_3 OR x_4 OR ~x_5)


Theorem. 3CNF-SAT is NPC

Proof.



Step 1. 3-CNF-SAT is clearly in NP: given an
assignment, we can verify in polynomial time that
every clause is satisfied.



Step 2. We choose SAT to reduce to 3CNF-SAT.

Proof continues …


Step 3. To show SAT ≤_P 3-CNF-SAT, we design the
mapping function in three stages.


In the first stage,


we add parentheses so that no connective (AND, OR)
has more than two arguments.


This formula gives a tree in which each node has
out-degree at most 2. The leaves are the literals, and the
internal nodes represent the connectives.


Then apply the same transformation as in
CIRCUIT-SAT ≤_P SAT. This gives a formula with at most three
variables per clause. That is:



(a OR b) becomes (c IFF (a OR b)), where c is a new variable standing for the output of that node.


Proof continues …

In stage 2, we take this new formula and treat each
clause separately. Since each clause C has at most
three variables, we can make a truth table for the
function represented by the clause. Now look at ~C,
and rewrite it in disjunctive normal form, that is, as
the OR of AND-clauses. For example, we can rewrite


C = (y_1 IFF (y_2 AND ~x_2)) as


~C = (y_1 AND y_2 AND x_2) OR (y_1 AND ~y_2 AND x_2)
OR (y_1 AND ~y_2 AND ~x_2) OR (~y_1 AND y_2 AND ~x_2).

By de Morgan's laws, we get


C = (~y_1 OR ~y_2 OR ~x_2) AND (~y_1 OR y_2 OR ~x_2)
AND (~y_1 OR y_2 OR x_2) AND (y_1 OR ~y_2 OR x_2).
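Stage 2 can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names clause_to_cnf and show, and the encoding of a literal as a (name, is_positive) pair, are hypothetical and not part of the lecture. Each falsifying row of the truth table is one AND-term of ~C, and negating it by de Morgan's laws gives one OR-clause of C.

```python
from itertools import product

def clause_to_cnf(variables, fn):
    """Return CNF clauses equivalent to fn, one clause per falsifying row.

    Each clause is a list of (name, is_positive) literals.
    """
    cnf = []
    for values in product([True, False], repeat=len(variables)):
        if not fn(*values):
            # This row is one AND-term of the DNF of ~C. Negating it
            # (de Morgan) flips every literal and turns the AND into an OR.
            cnf.append([(name, not value) for name, value in zip(variables, values)])
    return cnf

def show(cnf):
    """Render clauses in the OR/AND/~ notation used on the slides."""
    def lit(name, positive):
        return name if positive else "~" + name
    return " AND ".join(
        "(" + " OR ".join(lit(n, p) for n, p in clause) + ")" for clause in cnf
    )

# C = (y1 IFF (y2 AND ~x2)), the example from the slide.
example = clause_to_cnf(
    ["y1", "y2", "x2"], lambda y1, y2, x2: y1 == (y2 and not x2)
)
print(show(example))
```

A clause over at most three variables has at most 8 truth-table rows, so this step adds at most 8 clauses per original clause.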



Proof continues …

In stage 3, we fix up those clauses having only 1 or 2
literals, as follows: replace the clause
(x_1 OR x_2) by

(x_1 OR x_2 OR p) AND (x_1 OR x_2 OR ~p),

and replace the clause (x_1) by

(x_1 OR p OR q) AND (x_1 OR ~p OR q) AND
(x_1 OR p OR ~q) AND (x_1 OR ~p OR ~q).


Here p and q are new variables. The new formula is
satisfiable iff the old formula is: whatever values p
and q take, the padded clauses are all true exactly
when the original clause is true.
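The padding in stage 3 can be sketched as follows. This is a minimal illustration, not the lecture's own code: literals are assumed to be strings such as "x1" or "~x1", and the helper names pad_clause and holds, as well as the fresh variable names p0, p1, ..., are hypothetical.

```python
from itertools import count, product

def pad_clause(clause, fresh):
    """Pad a clause of 1, 2, or 3 literals to clauses of exactly 3 literals.

    `fresh` is an iterator yielding variable names not used elsewhere.
    """
    clause = list(clause)
    if len(clause) == 3:
        return [clause]
    if len(clause) == 2:
        p = next(fresh)
        return [clause + [p], clause + ["~" + p]]
    if len(clause) == 1:
        p, q = next(fresh), next(fresh)
        return [clause + [a, b] for a in (p, "~" + p) for b in (q, "~" + q)]
    raise ValueError("expected a clause of 1 to 3 literals")

def holds(clauses, assignment):
    """Evaluate a list of clauses under a dict mapping names to booleans."""
    def value(lit):
        return not assignment[lit[1:]] if lit.startswith("~") else assignment[lit]
    return all(any(value(lit) for lit in clause) for clause in clauses)

# Assumed fresh names p0, p1, ...; they must not clash with existing variables.
fresh = (f"p{i}" for i in count())
padded = pad_clause(["x1"], fresh)  # uses p0 and p1
for x1 in (True, False):
    # Try every value of the new variables for this value of x1.
    satisfiable = any(
        holds(padded, {"x1": x1, "p0": p, "p1": q})
        for p, q in product((True, False), repeat=2)
    )
    print(x1, satisfiable)
```

The loop at the end brute-forces the new variables to check that the padded form of (x1) can be satisfied only when x1 itself is true.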

Step 4. We can compute f in polynomial time, say O(n^2)
time, all transformations being local.

Therefore 3-CNF-SAT is NP-complete. QED


Clique problem


Definition: we say a graph G=(V,E) has a clique of
size k if there exists a subset V', with |V'| = k, of the
vertices V such that for all distinct u, v in V', the edge
(u,v) is in E. In other words, the induced subgraph on the
vertices V' is the complete graph on k vertices.


One version of the CLIQUE problem is as follows:
given a graph G, find a clique of maximum size.
(There may be several.) This optimization version is
not in NP, because NP contains only decision problems.


We will use the decision version of the CLIQUE
problem: given a graph G = (V,E), and an integer k,
does it have a clique of size k?


Note that we can solve CLIQUE by brute force in about
(k choose 2)(|V| choose k) steps. But this is not
polynomial time when k is part of the input.
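The brute-force bound above corresponds to trying every k-subset of V and checking all of its pairs. A minimal sketch, assuming an undirected graph given as a vertex list and an edge list; the function name find_clique and the small graph below are hypothetical, since the slide's figure did not survive:

```python
from itertools import combinations

def find_clique(vertices, edges, k):
    """Return some k-clique as a tuple of vertices, or None if there is none."""
    edge_set = {frozenset(e) for e in edges}
    # (|V| choose k) candidate subsets ...
    for subset in combinations(vertices, k):
        # ... each checked with (k choose 2) edge lookups.
        if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
            return subset
    return None

# Hypothetical graph: a triangle 1-2-3 with a tail 3-4-5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(find_clique(vertices, edges, 3))
print(find_clique(vertices, edges, 4))
```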


CLIQUE example


{2,4,5,7} is a clique of size 4, the largest in
this graph.

[Figure: example graph on vertices 1 to 7; the edges are not recoverable from this transcript.]

Theorem. CLIQUE is NP-complete

Proof.


Step 1. Clique is in NP (certificate: the clique)


Step 2. Pick an instance of 3-SAT, Φ, with k clauses


Step 3. Reduction. We construct a graph G:


Make a vertex for each occurrence of a literal in a clause


Connect each vertex to the literals in other
clauses that are not its negation


Any k-clique in G corresponds to a satisfying
assignment (needs a proof)


The reduction is polynomial time. QED
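The construction of G can be sketched as follows. This is an illustration under assumptions: literals are strings such as "x1" or "~x1", a vertex is a (clause index, literal) pair, and the names negate, sat_to_clique and find_clique are hypothetical. The brute-force clique search is included only to try the reduction on tiny formulas; it is not part of the reduction itself.

```python
from itertools import combinations

def negate(lit):
    """Return the negation of a literal written as 'x' or '~x'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def sat_to_clique(clauses):
    """Map a CNF formula to a CLIQUE instance (vertices, edges, k)."""
    # One vertex per occurrence of a literal, tagged with its clause index.
    vertices = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    # Join literals from different clauses unless one negates the other.
    edges = [
        (u, v)
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and u[1] != negate(v[1])
    ]
    return vertices, edges, len(clauses)

def find_clique(vertices, edges, k):
    """Exponential-time search, used here only to test small instances."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(vertices, k):
        if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
            return subset
    return None

# (x1 OR x2 OR ~x3) AND (~x3 OR x4 OR ~x5)
formula = [["x1", "x2", "~x3"], ["~x3", "x4", "~x5"]]
vertices, edges, k = sat_to_clique(formula)
print(find_clique(vertices, edges, k))
```

Building the graph takes time quadratic in the number of literal occurrences, which is where the polynomial bound on the reduction comes from.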


An example of the reduction 3-SAT ≤_P CLIQUE

Claim. Φ is a satisfiable boolean formula if and only if G = (V,E) has
a clique V' of size k.

Proof. Suppose that Φ is satisfiable. Then each clause has at least
one literal that takes on the value 1. From each clause, choose
one such literal, and let V' be the set of vertices corresponding
to these literals. Then I claim that V' is a clique of size k. Since
there is one vertex for each clause, clearly |V'| = k. If two literals
in V' are not connected by an edge, then they are negations of
each other (they cannot lie in the same clause, since we chose one
literal per clause), contradicting the fact that we chose a literal
with the value 1 from each clause.


Suppose that G has a clique V' of size k. Since there are no
edges between vertices whose labels appear in the same
clause, for all i, V' must contain at most one vertex labeled with
a literal in clause C_i. On the other hand, |V'| = k, so V' must
contain exactly one vertex labeled with a literal in clause C_i, for all
i. Assign each literal in V' the value 1, and give any remaining
variables arbitrary values. This is a satisfying assignment, since
every clause contains a true literal. The assignment is consistent
because V' cannot contain both a literal and its negation: such
vertices are never joined by an edge. QED


VERTEX-COVER Problem


We say a graph G = (V,E) has a vertex cover
of size k if there is a subset V' of V such that
for all edges (u,v) in E, either u is in V' or v is
in V' (or both), and |V'| = k. Thus, a vertex cover
is a set of vertices V' such that every
edge in E is incident on a vertex in V'.



The VERTEX-COVER problem is: given a
graph G and an integer k, does it have a
vertex cover V' of size k?
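Checking a proposed vertex cover is a single pass over the edges, which is what puts the problem in NP. A minimal sketch, assuming the graph is given as an edge list; the function name is_vertex_cover and the example graph are hypothetical:

```python
def is_vertex_cover(edges, cover, k):
    """Check that `cover` has exactly k vertices and touches every edge."""
    cover = set(cover)
    return len(cover) == k and all(u in cover or v in cover for u, v in edges)

# Hypothetical graph: a triangle 1-2-3 with a tail 3-4-5.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(is_vertex_cover(edges, [1, 3, 4], 3))
print(is_vertex_cover(edges, [3, 4], 2))
```

The second call fails because the edge (1, 2) has neither endpoint in {3, 4}.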


Vertex-Cover example



V' = {2,3,7} is a vertex cover of size 3.

[Figure: example graph on vertices 1 to 7; the edges are not recoverable from this transcript.]

Theorem. VERTEX-COVER is NP-complete.


Proof.


1. VERTEX-COVER is in NP. A certificate is a list of vertices V'
forming the alleged vertex cover; checking it is trivial.

2. Choose to reduce CLIQUE ≤_P VERTEX-COVER.

3. Lemma: G has a clique of size k iff G complement has a vertex
cover of size |V| - k.


This lemma shows that the map f that transforms an instance of
CLIQUE to an instance of vertex cover is:


(G, k) ----> (G complement, |V| - k)

4. This transformation can be done in polynomial time.



QED


Proof of the lemma


Proof. Suppose V' is a clique of size k in G = (V,E). Then let
(u,v) be an edge of E complement in G complement. Then (u,v)
is not in E. Hence either u is not in V', or v is not in V', for if they
were both in V', (u,v) would be in E. Therefore either u is in V - V'
or v is in V - V'. Thus each edge (u,v) in E complement has at
least one endpoint in V - V', so V - V' is a vertex cover for G
complement, of size |V| - k.


On the other hand, suppose G complement has a vertex cover
V'' of size |V| - k. Then for any edge (u,v) in E complement, either
u is in V'' or v is in V''. Taking the contrapositive, if u is not in V''
and v is not in V'', then (u,v) is not in E complement. In other
words, if u is in V - V'' and v is in V - V'', then (u,v) is in E. In other
words, V - V'' is a clique of size |V| - (|V| - k) = k.
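The lemma can be checked exhaustively on a small graph. This sketch tests the slightly stronger statement used in the proof: a set V' is a clique in G exactly when V - V' is a vertex cover of G complement. The helper names (complement, is_clique, is_cover, lemma_holds) and the example graph are hypothetical.

```python
from itertools import combinations

def complement(vertices, edges):
    """Edges of the complement graph: all vertex pairs not joined in G."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v) for u, v in combinations(vertices, 2)
        if frozenset((u, v)) not in present
    ]

def is_clique(edges, subset):
    """True if every pair of vertices in `subset` is joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(pair) in present for pair in combinations(subset, 2))

def is_cover(edges, subset):
    """True if every edge has at least one endpoint in `subset`."""
    chosen = set(subset)
    return all(u in chosen or v in chosen for u, v in edges)

def lemma_holds(vertices, edges):
    """Check 'V' is a clique in G iff V - V' covers G complement' for every V'."""
    co_edges = complement(vertices, edges)
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            rest = [v for v in vertices if v not in subset]
            if is_clique(edges, subset) != is_cover(co_edges, rest):
                return False
    return True

# Hypothetical graph: a triangle 1-2-3 with a tail 3-4-5.
vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
print(lemma_holds(vertices, edges))
```

Computing the complement takes O(|V|^2) time, which is the polynomial bound needed for the map (G, k) ----> (G complement, |V| - k).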


Proof by Picture

[Figure: the graph G and its complement ~G, each drawn on vertices 1 to 7.]

Summary


To show a problem to be NP-complete, we do
it in 4 steps:


1. Show the problem is in NP.

2. Choose a known NPC problem to reduce from (it is important to
choose the right problem).

3. Do the reduction.

4. Show the reduction is polynomial time.


Of course, not all problems are NPC. How do
we know if a problem may be NPC?


We fail to find a polynomial-time algorithm


We can guess a solution, and check it quickly.