---
Video Title: Cook-Levin Theorem
Description: Theory of Computation
https://uvatoc.github.io/week11

25.3 Cook-Levin Theorem
- Defining NP-Hard and NP-Complete
- The Cook-Levin Theorem
- NANDSAT
- Proving NANDSAT is NP-Hard
- Implications of P=NP

David Evans and Nathan Brunelle
University of Virginia
---

﻿Our definition of NP-hard,
these are problems that

are as hard as the hardest
problems in NP are harder.

These are problems that every problem in
NP can be reduced to.

If there's some problem that is NP-hard,
every problem in NP can be reduced to it.

G could be outside of NP.

We could pick some problem that we know is
in exponential time.

Or something else, right?

But it says every problem that's within NP
can always be reduced to this problem.

So that's what NP-hard means.

It's a hardness property.

And if you were that hard or harder,
you're within class NP-hard.

NP-completeness says it's NP-hard but also
within class NP.

To be NP-complete, this has to also be
within class NP.

So that means in order to prove this
property, our goal is to prove that every

problem within class NP can be reduced to
3SAT.

Don't want to go through every problem.

There are infinitely many.

So if we can find one
problem that is NP-hard, and

so that that can be reduced
to 3SAT, we're done.

Because the definition
of NP-hard says every

problem within NP can
be reduced to that problem.

So if we have one problem
that's NP-hard, and then we can

show that that problem can
be reduced to 3SAT, we're done.

So here's the problem.

The problem we'll call NANSAT.

This proof is in the book.

We're going to go through what I think is
the key insight part of it.

We're not going to go through all the
technical steps.

There are quite a lot of things to
actually make the proof work.

The problem that we're going to use is
this one.

Its input is a NANSERC program that has N
inputs.

It will output one if there is some string
of length N.

So the string is finite, but there's no
constraint on N.

Such that running the circuit on that
string outputs one.

Otherwise, the output of NANSAT is zero.

The name of it as well
as the ways defined should

make you think, well,
this is kind of like 3SAT.

So 3SAT we said there's some assignment to
variables that makes this formula true.

So that's where the satisfiability name
comes from.

This is, there's some input to this
program that makes it output one.

If this was already an
interesting problem, and we could

show that this was NP-hard,
we'd be pretty close to done.

What we probably should know first is
whether this is in NP.

We're not going to get the
property we want that helps

us separate P and NP if
our problem is not within NP.

So do we think NANSAT is within NP?

Two ways of thinking about class NP.

And the one that's definitely should start
to feel more useful to you, even if it's

maybe not as intuitive, is is there a
witness?

So when the output of this function is
true, is there some polynomial length

witness that can convince you that it's
true?

Yeah.

There's a reason W is the name for that.

It's the witness.

That's a witness, right?

Our goal is to have
something that if the

output is one, convinces
us the output's one.

For any value where Q, Q is the input to
NANSAT, any instance of this problem where.

Q is a circuit that there's some input
that makes it true, we can find a witness

that proves that there is, and that's the
input.

Can we verify this in polynomial time?

So we need some
verification function that will

take Q and the witness
and tell us if it's valid.

What does the verification have to do?

Yeah.

Just evaluate it.

So it's going to evaluate the circuit that
Q represents or the NANSARC problem.

And if the output is one, it's a valid
witness.

If the output is zero, it's not a witness.

So that works as a witness and it's easy
to verify.

Okay.

So it is within NP.

What we want to know is that it's NP-hard.

Between those two, we know it's
NP-complete.

So to know that it's NP-hard, well,
this is what we've got to do, right?

This is our definition of NP-hard.

So we've got to show for all functions
that are within NP, there is some

polynomial time reduction from that
function to NANSAT.

This sounds like this is hard to do.

So you haven't had too
many reductions yet, but you've

seen a few, and they're all
very specific to the function.

Even the one we just talked about where
you can use the generic function,

well, use the function itself to do a
polynomial time reduction between a

polynomial time function, we need to know
what the function was.

So we're going to need to do something
that only uses the definitions here.

We can't make any other assumptions about
F, but we do have one big thing we can do.

It's in class NP.

That means there is
some verifier that runs in

polynomial time on a
polynomial length witness.

So if we can show that that leads us to a
reduction, now we can use that verifier

and use NANSAT to verify any function,
just because we know the verifier exists.

We can't assume anything about it other
than it exists.

And this is what we want to find.

So we want to know that there exists some
reduction such that evaluating NANSAT on

the output of that
reduction on X will give us

the value of F of X for
any F that is in class NP.

We need to formalize our definition a
little more carefully to do this.

Because in order to do
this, we're going to need to

use the verifier and use
the witness to construct R.

We've got to construct R somehow in terms
of the verifier that we know exists.

Let's formalize that a little bit more
carefully.

First, I should make the correction I made
earlier.

So this only works when
the output is one, there should

be a witness, and we
should be able to verify it.

So we know that we've got some verifier.

What do we know about the verifier?

Good, yeah.

So the one thing we know about it,
we know the verifier is NP.

That's a requirement.

That comes from the definition of class
NP.

That the verifier must be within P. And
the verifier is a function from 0,

1 star to 0, or 1.

It's a function that has this property
that for all possible values of X,

there exists some witness.

We don't want the witness to be in 0,
1 star.

Because we have some constraints on the
witness.

What's the constraint we have on the
witness?

Yeah.

So we need a length constraint.

We're going to give X a length.

So X has length N.

The length of the witness can be bigger
than N, but it's got to be polynomial N.

So the witness is in 0, 1, some polynomial
in N.

A is some natural number.

So the witness is polynomial N,
its length is bounded.

The verifier is going to take an input of
some length.

The verifier gets two things.

You can't just get the witness.

It's got to get something else as well.

What else does it get?

You should be awake enough to answer this
question.

Don't trip on the stairs or if you want to
do like jumping jacks.

So what are the inputs to the verifier?

It needs to be able to take
a witness and know that

that witness proves
that the result is correct.

That when the output of the function is
one, like we saw the longest path one.

When the output of
the longest path is one,

that means there is a
path of length at least N.

The witness is that path.

What inputs does the verifier need to do
that?

You need X.

You need the original input and you need
the witness.

Because it doesn't make
sense to verify that the

output is correct if you don't
know what the input was.

The inputs to the verifier are going to be
X and W.

And it's outputting a 0 or a 1.

So that means if we think
of encoding on a Turing

machine, we're just going
to paste those together.

We may need some kind of separator or
something else to deal with it.

But it's going to be an input of length N
plus N of A.

It's a sum bounded length.

I'll introduce a new variable to represent
that.

Some length, which is finite and is
polynomial in the original input length N.

So, how long does the verifier take?

In general, all we know is it takes
polynomial time.

What about the one for NANSAT?

So NANSAT, we said we can verify it by
simulating a circuit.

How long does that take?

The number of lines in the program?

We know that.

The length of Q gives us that size.

But in this case, we've got a verifier for
some function in NP.

All we know is it's polynomial time.

But we do know it's polynomial in the size
of the input.

And what does it mean to be NP?

There's a Turing machine that runs it in
that number of steps.

So can we simulate a Turing machine with a
Boolean circuit?

If you have an arbitrary Turing machine,
can you simulate it with a Boolean circuit?

Which is more powerful, a Turing machine
or a Boolean circuit?

Okay, good.

The circuit only works on a fixed size
input.

So in general, we cannot simulate a Turing
machine with a circuit.

But this is not simulating any Turing
machine.

This is simulating a Turing
machine that we know

runs in some polynomial
in L number of steps.

The number of steps that takes the verifier
to run is L to the C for some constant C.

Can we simulate a
Turing machine that runs

for a bounded number
of steps with a circuit?

Yes, good.

How do we know we can do that?

Remember, we wrote our Turing machine
execution model as a loop.

So we definitely can't do loops and
circuits.

If we know a bound on the number of steps,
what can we do to the loop?

Yeah, we can unroll it, right?

We know the maximum number of steps.

If we know the maximum number of steps in
the loop, we can unroll the whole loop.

It's big.

It's L to the C steps.

But we know that there
is some straight line

program that does what
that Turing machine does.

Because we can just unroll it for that
number of steps.

And now, well, we've got that.

So we've unrolled it.

And we know the verifier exists.

And we know it runs in that number of
steps.

So we can actually simulate the verifier
with a NANSERC program of polynomial length.

And if we can simulate it with a NANSERC
program, this is what R of X needs to do

is transform whatever F is, the verifier
for F, which we know exists.

We don't know what it is.

But we know it exists and we know it can
be simulated by a Turing machine that runs

for some number of steps that is
polynomial in the length of that input.

So we can unroll that and
we can make a circuit that's

going to be polynomial
size in the original input.

And then we can use NANSAT to verify that.

We can run the verifier that we needed on
this input.

And remember what NANSAT does?

NANSAT outputs one if there's some input
that makes that true.

So if there's some input that makes the
verifier true, what does that mean?

It means there's a witness, right?

There's some input that makes it true.

So now we've solved that.

So for cases where F of X is true,
that means that the circuit that

represents that has some input that makes
it true because that's the witness that we

would use to verify the output of fx when
it's one.

Okay, this is really the key step in the
proof.

And one of the things, the way the proof
is in the book, and there's actually code

in the repository, that actually does all
these transformations.

So there's lots of things
you've got to be careful

to get right to make
those transformations.

But the real key idea is that,
because the verifies polynomial time,

we can unroll it, and we can simulate it
with a circuit.

And then NANSAT can
check if there's a witness that

would verify for that
input that there's a witness.

And if there's a witness, then we've got a
result that we know the output is one.

Have we finished proving Cook-Levin
theorem?

Yeah.

Good.

So we haven't done anything about 3SAT.

The Cook-Levin theorem was about 3SAT.

What we've done is shown a problem,
NANSAT, that is NP-hard.

So what we would need
to do to finish this would

be to show that we can
reduce 3SAT to NANSAT.

And the way the book
does this is actually introduce

another problem, but
it does it in two steps.

So it reduces 3SAT to 3NAND, and then
reduces that to NANSAT.

And we've shown that NANSAT is harder than
every problem in NP.

And we can do the reductions to show that
NANSAT and 3SAT are equivalent.

We'd have to do them in both directions,
but we could.

And we only need to do it in one direction
to show this property.

Our goal is to show every function in NP.

If we had a polynomial time-solver for
3SAT, we could use that to solve every

other function in NP, or to compute every
other function in NP.

This means that 3SAT is NP-hard.

It's also NP-complete, because we know
it's within class NP.

So it's as hard as the hardest problem
within class NP.

So now you should understand what it would
mean.

So let's assume for now that P is not
equal to NP.

We don't know if this is true.

But if this is true, that means P is a
subset of NP, proper subset.

So what would NP-complete look like in
this picture?

So it's not drawn there yet.

How should I draw NP-complete to add it to
this picture?

Is it going to be a circle?

Is it going to be the complement of a
circle?

Is it going to be a square?

An octagon?

Octopus?

What's it going to look like?

This is NP.

NP-complete is going to be some subset of
NP.

We could draw it any
way we want, as long as it's

a subset of NP that
doesn't include anything NP.

So it could be an octopus.

What's important is it's entirely contained
with NP, and none of it's with NP.

If we think of it as, as you go further
away, problems get harder.

Thinking of a ring kind of works.

What it means is every problem within NP
can be reduced to one here.

At least from everything we've seen so
far, there's nothing that would tell you

it couldn't actually include all of that
space.

It could be that every
function is either, if it's

within NP, it's either
within P or it's NP-complete.

There's no functions
that are outside of P that

are not as hard as the
hardest functions within NP.

We actually know this is not the case,
that we know that if there is a separation

between P and NP, there are some functions
that are within NP and harder than P,

but not as hard as the ones in NP-complete,
which is actually kind of surprising.

So what about this other universe?

What if they're equal?

So now, P equals NP.

Those are the same set.

What does NP-complete look like?

NP-complete is problems within
NP that are as hard with respect

to polynomial time reductions
as the hardest problems in NP.

So if P equals NP, there is
a polynomial time reduction

between every problem in
NP and every problem in NP.

They're the same class.

It's polynomial time.

Everything within that class now can be
done in a Turing machine in polynomial time.

So there's a trivial polynomial
reduction between those, which

means they can all be mapped
to every one of those problems.

NP-complete is everything with a little
hole in the middle.

What's the little hole in the middle?

It kind of came up a little while ago.

There are two functions that don't work
for these reductions.

The two constant functions are not in
NP-complete.

And what is NP-hard?

It's every function in the universe other
than those two functions.

All of those are as hard or harder.

So everything is NP-hard.

If P equals NP.