Ming Li Talk about Bioinformatics

Oct 2, 2013

Lecture 21: NP-complete problems


Why do we care about NP-complete problems?


Because if we wish to solve the P=NP problem, we need to
deal with the hardest problems in NP.


Why do we want to solve the P=NP problem?


Because a proof that P = NP would yield polynomial-time algorithms
for the more than 3000 other known NPC problems.


It is one of the fundamental mathematical problems:

Millennium Prize (CMI), $1M for each problem:


P versus NP


The Hodge conjecture


The Poincaré conjecture (solved by Grigori Perelman)


The Riemann hypothesis


Yang-Mills existence and mass gap


Navier-Stokes existence and smoothness


The Birch and Swinnerton-Dyer conjecture


Theorem. If there is a polynomial-time algorithm for any one
NP-complete problem, then there is a polynomial-time algorithm
for every problem in NP.


(Proved in previous lecture.)


Corollary. If no polynomial-time algorithm exists for some
problem in NP, then there is no polynomial-time algorithm
for any NP-complete problem.

NP-completeness

Easy vs Hard


Here are some examples of easy (in P) and hard (NP-hard) problems.



Easy                      Hard

2SAT                      3SAT
Minimum spanning tree     Traveling salesman
Shortest path             Longest path
Linear programming        Integer linear programming
Eulerian cycle            Hamiltonian cycle

"First Natural" NP-hard problem


Circuit Satisfiability problem. An instance of the
problem is a Boolean circuit (using AND, OR, and
NOT gates connected by wires) that has n Boolean
inputs and a single Boolean output. The circuit has
no cycles. The size of the circuit is defined to be the
total number of gates and wires.



CIRCUIT-SAT = the decision problem: given a
Boolean circuit C, is it satisfiable?



CIRCUIT-SAT can be solved by simply trying all
2^n possible assignments, but that takes exponential,
not polynomial, time.
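The brute-force search can be sketched as follows. This is a minimal illustration, not something given in the lecture: the circuit encoding (a list of gates in topological order, with gate i writing wire n + i and the last gate serving as the output) and the names `evaluate` and `brute_force_circuit_sat` are assumptions made for the example. Evaluating one assignment is cheap; the cost comes from the loop over all 2^n assignments.

```python
from itertools import product

def evaluate(n_inputs, gates, assignment):
    """Return the circuit's output bit for one input assignment."""
    wires = list(assignment)              # wires 0..n-1 carry the inputs
    for op, ins in gates:                 # gate i writes wire n_inputs + i
        vals = [wires[j] for j in ins]
        if op == "AND":
            wires.append(int(all(vals)))
        elif op == "OR":
            wires.append(int(any(vals)))
        elif op == "NOT":
            wires.append(1 - vals[0])
        else:
            raise ValueError(f"unknown gate type: {op}")
    return wires[-1]                      # assumed: the last gate is the output

def brute_force_circuit_sat(n_inputs, gates):
    """Try all 2^n assignments; return a satisfying one, or None."""
    for assignment in product((0, 1), repeat=n_inputs):
        if evaluate(n_inputs, gates, assignment) == 1:
            return assignment
    return None

# Hypothetical 3-input circuit: (x1 AND x2) AND (NOT x3)
gates = [("AND", [0, 1]), ("NOT", [2]), ("AND", [3, 4])]
print(brute_force_circuit_sat(3, gates))  # -> (1, 1, 0)
```
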


By S.A. Cook, 1971 STOC


Circuit SAT

[Figure: a Boolean circuit built from AND, OR, and NOT gates over
the inputs x_1, x_2, x_3, x_4, x_5. Task: find a satisfying assignment.]

Thm. CIRCUIT-SAT is NP-complete

Proof by hand-waving. CIRCUIT-SAT is clearly in NP. The "certificate" is a
satisfying assignment. Given this, we can easily verify in polynomial time
that the circuit outputs a 1.




We now give a polynomial-time transformation from every problem L in NP
to CIRCUIT-SAT. If L is in NP, then there is a verifier A(x,y) that runs in
time T(|x|) = O(|x|^k) for some k. We now construct a single boolean circuit
M that maps one "configuration" of a machine that carries out the
computation of A(x,y) (recording such things as the memory state, program
counter, etc.) to the next "configuration".




We now hook together T(|x|) copies of this circuit, with x hard-wired in,
making the inputs to the circuit at the top the bits of y, and the output the
single bit that reflects the value of A(x,y). This big circuit C(x) is
satisfiable (by a value of y) if and only if x was a "yes" instance of L.
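The shape of this construction can be mimicked in ordinary code. The sketch below is only a toy under stated assumptions: it uses Python functions instead of gates, and the verifier (a subset-sum style check), the configuration layout, and the names `step` and `unrolled` are all hypothetical. It shows the one-step map being chained T times, with x fixed and y left as the only free input.

```python
from itertools import product

def step(config, x):
    """One-step map M: advance the toy verifier by one position."""
    pos, acc, y = config
    if y[pos]:                        # certificate bit says "take x[pos]"
        acc += x[pos]
    return (pos + 1, acc, y)

def unrolled(x, target):
    """Return C(x): a function of y alone, built from T = len(x) copies of step."""
    def C(y):
        config = (0, 0, tuple(y))     # initial configuration
        for _ in range(len(x)):       # T(|x|) chained copies of M
            config = step(config, x)
        return int(config[1] == target)   # the single output bit A(x, y)
    return C

# Hypothetical instance: can some subset of (3, 5, 8) sum to 11?
C = unrolled(x=(3, 5, 8), target=11)
print(C((1, 0, 1)))                   # 3 + 8 = 11, so this prints 1
print(any(C(y) for y in product((0, 1), repeat=3)))  # C(x) is satisfiable
```
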




The size of this circuit is polynomial in |x|, and the transformation can be
done in polynomial time. QED


4-step routine for proving NPC

Four-step routine for proving NP-completeness of a
decision problem A:



1. Prove A is in NP by describing the polynomial-time
verifier V that verifies "yes" instances of A. What is the
certificate? How is it verified?


2. Select a problem B that you already know to be NP-complete.


3. Design a function f that maps "yes" instances of B to
"yes" instances of A, and "no" instances of B to "no"
instances of A, and justify that.


4. Show that f can be computed in polynomial time.
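As an illustration of steps 3 and 4, here is a sketch of one reduction, with B = CLIQUE and A = VERTEX-COVER. The slides do not spell this reduction out; the one used here is the standard textbook construction (G has a clique of size k if and only if the complement of G has a vertex cover of size n - k), and the function names and graph encoding are assumptions made for the example. The map f only builds the complement graph, so it runs in O(n^2) time. The two brute-force checkers are exponential and are there only to sanity-check f on a tiny graph.

```python
from itertools import combinations

def reduce_clique_to_vertex_cover(n, edges, k):
    """f: CLIQUE instance (G, k) -> VERTEX-COVER instance (complement of G, n - k)."""
    present = {frozenset(e) for e in edges}
    complement = [(u, v) for u, v in combinations(range(n), 2)
                  if frozenset((u, v)) not in present]
    return n, complement, n - k

def has_clique(n, edges, k):
    """Brute-force check (exponential), used only to sanity-check f."""
    present = {frozenset(e) for e in edges}
    return any(all(frozenset(p) in present for p in combinations(s, 2))
               for s in combinations(range(n), k))

def has_vertex_cover(n, edges, k):
    """Brute-force check (exponential), used only to sanity-check f."""
    return any(all(u in s or v in s for u, v in edges)
               for s in map(set, combinations(range(n), k)))

# Hypothetical 4-vertex graph: a triangle 0-1-2 plus a pendant edge 2-3.
n, edges = 4, [(0, 1), (0, 2), (1, 2), (2, 3)]
for k in range(1, n + 1):
    # "yes" maps to "yes" and "no" maps to "no" for every k
    assert has_clique(n, edges, k) == has_vertex_cover(
        *reduce_clique_to_vertex_cover(n, edges, k))
```
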


Proving other NPC problems


The following is the scheme of reductions we will use
to prove some problems to be NP-complete (there
are over 3000 of them, and actually many more now):


          CIRCUIT-SAT
               |
              SAT
               |
           3-CNF-SAT
            /     \
       CLIQUE     SUBSET-SUM
         /
  VERTEX-COVER

SAT problem.


SAT. A boolean formula consists of boolean (1/0)
variables joined by connectives: AND, OR, IMPLIES,
IFF, NOT. Also, parentheses may be used.


An example:


F = ((x_1 IMPLIES x_2) OR NOT ((NOT x_1 IFF x_3) OR x_4)) AND NOT x_2.


An assignment is a specification of truth values of the
various variables. For example,


(x_1, x_2, x_3, x_4) = (0, 0, 0, 1)


is an assignment that makes F true (equal to 1). Such
an assignment is called a satisfying assignment. If a
formula F has a satisfying assignment, it is called
satisfiable.
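The example formula can be checked mechanically. The sketch below transcribes F into Python (the helper names `implies` and `iff` are assumptions for the example), evaluates it on the assignment given in the text, and then lists every satisfying assignment by trying all 16 combinations.

```python
from itertools import product

def implies(a, b):
    """a IMPLIES b."""
    return (not a) or b

def iff(a, b):
    """a IFF b."""
    return bool(a) == bool(b)

def F(x1, x2, x3, x4):
    """F = ((x1 IMPLIES x2) OR NOT ((NOT x1 IFF x3) OR x4)) AND NOT x2."""
    return int((implies(x1, x2) or not (iff(not x1, x3) or x4)) and not x2)

print(F(0, 0, 0, 1))  # the assignment from the text -> 1

# Exhaustive search over all 2^4 assignments.
satisfying = [a for a in product((0, 1), repeat=4) if F(*a)]
print(satisfying)
```
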


Theorem. SAT is NP-complete
(Given a boolean formula F, is it satisfiable?)

Proof.

First, we need to show that SAT is in NP. Clearly there is a
polynomial-time algorithm A(x,y) to verify that x is a "yes"
instance of SAT, using y as the purported satisfying assignment
for the boolean formula x.


Second, we choose CIRCUIT-SAT as the known NP-complete problem
to reduce to SAT (we have no other choice at this moment).


Third, we design a function that maps circuits to Boolean
formulas.


For each gate, label each wire coming out of a gate with a new
variable name. Thus, for example, if an AND gate has inputs x_6, x_7,
and x_8, and output x_9, introduce a clause


x_9 IFF (x_6 AND x_7 AND x_8)


Add a clause that just consists of x_output, the label for the output wire.


Take the AND of all these clauses.
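A minimal sketch of this transformation follows. It rests on assumptions that are not in the lecture: the circuit is given as a list of gates already in topological order (so one pass stands in for a graph traversal), wire i is named x{i}, gate g writes wire n + g, and the last gate is the output. All function names are hypothetical.

```python
def circuit_to_formula(n_inputs, gates):
    """Map a circuit to a list of clauses whose AND is the formula f(C).

    Each gate yields one clause (out, op, ins) meaning  x_out IFF op(x_ins).
    The final clause ("OUTPUT", out) asserts that the output wire is 1.
    """
    clauses = [(n_inputs + g, op, tuple(ins)) for g, (op, ins) in enumerate(gates)]
    clauses.append(("OUTPUT", n_inputs + len(gates) - 1))
    return clauses

def gate_value(op, vals):
    """Value computed by a gate of type op on input bits vals."""
    if op == "AND":
        return int(all(vals))
    if op == "OR":
        return int(any(vals))
    if op == "NOT":
        return 1 - vals[0]
    raise ValueError(f"unknown gate type: {op}")

def formula_holds(clauses, x):
    """Evaluate the AND of all clauses under assignment x (a tuple of 0/1)."""
    for clause in clauses:
        if clause[0] == "OUTPUT":
            if x[clause[1]] != 1:
                return False
        else:
            out, op, ins = clause
            if x[out] != gate_value(op, [x[j] for j in ins]):
                return False
    return True

def to_text(clauses):
    """Render the clause list as a readable formula string."""
    parts = []
    for clause in clauses:
        if clause[0] == "OUTPUT":
            parts.append(f"x{clause[1]}")
        else:
            out, op, ins = clause
            body = f"NOT x{ins[0]}" if op == "NOT" else f" {op} ".join(f"x{j}" for j in ins)
            parts.append(f"(x{out} IFF ({body}))")
    return " AND ".join(parts)

# Hypothetical circuit on inputs x0, x1, x2: (x0 AND x1) AND (NOT x2)
gates = [("AND", [0, 1]), ("NOT", [2]), ("AND", [3, 4])]
formula = circuit_to_formula(3, gates)
print(to_text(formula))
```

The formula has one variable per wire, so its length grows linearly with the size of the circuit.
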


NPC of SAT, proof continues


Clearly, C is a satisfiable circuit if and only if the formula f(C) is a
satisfiable boolean formula.


If C is satisfiable, then there exists some assignment to the input
wires that results in the output being 1. Therefore, there exists some
assignment to the variables for the input wires, plus the additional
variables representing the gate output wires, that results in the formula
f(C) evaluating to 1.


Conversely, if the formula f(C) is satisfiable, then in any satisfying
assignment each gate's output variable equals the value that gate type
computes from its inputs, and x_output is 1. The values given to the
input variables therefore satisfy the circuit.


Step 4. Clearly, f can be computed in polynomial time. We can
simply do a breadth-first search on the graph representation of
the circuit, outputting a formula for each gate.


This completes our proof that SAT is NP-complete. QED