--- Video Title: Proving a Problem is in NP Description: Theory of Computation https://uvatoc.github.io/week11 25.1 Proving a Problem is in NP - Recap: Class P and Class NP - How can we prove LongestPath is in NP? - Proof by describing Nondeterministic Turing Machine - Proof by giving a Witness David Evans and Nathan Brunelle University of Virginia --- What we are going to do today is really get at the crux of the story that we started. What is this whole P equals NP question about? And how can we understand it? And what would it mean to answer it? To recap what you should understand already, we've defined these two very big complexity classes. The class P is defined in terms of our model of the standard Turing machine. And it is all of the functions that can be computed in time that's polynomial. Time is number of steps. There is some polynomial that grows with the size and input that we can find a. Turing machine that computes that function. So it's a very precise definition. It's a very large, robust class. C can be anything. That's why it's a robust class. So this means that even if we adjust our definition of the Turing machine in little ways like saying, well, let's have two tapes. Let's have it infinite in both directions. Let's allow it to take four steps every step. None of those things are going to change the functions in this class. We also introduced last class, this class NP, which are functions that can be computed in polynomial time. So the number of steps still has to scale at most with some polynomial in terms of the size of the input. But the machine is now a non-deterministic Turing machine. And our intuitive notion of that is this is a machine that every time it has to make a decision can always magically guess the right decision. It can give rules that allow for different transitions from the same configuration and it always guesses it. Or we can think of it as dividing into two copies every time we have a choice and keeping all those copies. And if we find one path through any of those copies that leads to a halting state, then we halt and we interpret that output on that path. The other way to define this, which we will get more precise about. We can define it in terms of problems where if the function can be verified. And we're going to make this more precise today by some witness. We can use that to verify the output of the function in polynomial time. So the other thing to recap and make sure that everyone understands. So we have this longest path function, which we've now stated as a decision problem. How do we know that longest path is within class NP? How would you convince someone that this longest path function can be computed in class NP? Yeah. Good. Yeah. So you're on the right track. What you're doing is describing a non-deterministic Turing machine. You're saying what the machine is going to do is start from node S. We specified the node that you start from, and what the machine is going to do is going to try a path with every other node. This is going to be every node in V other than S. We can't use S. We're gonna have a bunch of different transitions. All of them are in a non-deterministic machine valid, right? We can have all these different transitions, and one's going to correspond to trying each node first. And then from this state, we have to try every other node. So we're going to try V1 up to the node. And how do we know that's going to finish in polynomial time? How long is the path? Yeah? The path length is actually, in the decision problem, part of the input. And we know the longest path could be going through every vertex. The longest path is the size of the set of vertices. And that path length, if it was just a number, we'd have to be careful. Because the value of the number could be exponential in size of the input. But because we also need the input as the description of the graph, that's gonna mean the input's long enough that the length of the longest path is polynomially in the size of the input. So, we can show that. The other way we could show it is the other way of defining class NP. Designing non-deterministic Turing machines. Well, you've seen how tedious it is to design a regular. Turing machine and really work out all the details. This is a very sketchy argument that if you were skeptical of my proof, you would say, well, show me the actual transition function and prove to me it always works. I really don't want to have to do making the details of a non-deterministic Turing machine. The other way to do it is to use this part of the definition. Now we're gonna have to start to be a little more careful. What would be a witness that would convince you that the answer is correct? First thing we should really think about, what are the two possible answers? The answers could be zero or one. Zero meaning there is no path of that length. One meaning there is some path of that length. So if the answer is zero, what would convince you that that's correct? What would convince you that there is no path of the length you asked for? That's not easy. If that was easy, you would have an easy way to actually answer this. If you could be convinced that there's no path of some length, Some answers in some graph is gonna be easy, but in general that's gonna be hard. So, we can't verify the correctness of any output. What we need to be able to verify is, if the output is one, then there should be a witness that would convince us the output is one. So, for any input to this function, anytime the output of the function is one, there should be a witness, something you can verify. That you don't need to trust anyone, you don't need to trust anything else. You have some algorithm that you can use that can take in that witness and verify the answer is really one. So, what would the witness be? What would convince you there is a path of length n? Yeah. Yeah, good. The witness is the path. Someone gives you the path and says, this is my witness. I'm gonna start at node S. I'm gonna go through these nodes. And I'm gonna add up node T. And this is a path of length 5. And you're gonna check that these are all different. You could very quickly check if that's a valid path. You'd have to check that those edges are in the graph. You'd have to check that it doesn't repeat a vertex. You'd have to check that it starts at S and ends at T. Those are all things we can check in polynomial time pretty easily. So, if the answer is one, there should be a witness that confirms that the answer is one. Any time where the answer is one, there is a valid path of that length. That's your witness that would convince you that there is a valid path of that length. You can provide the path. So, as long as the path exists, there's a way to verify it in polynomial time. So, that's the first clarification to this notion that we can verify the output. It's only for one kind of output. This means that the way we define the function actually matters a lot. Whether we output 0 or 1, you would think, in most cases, we'd say, well, with a Boolean circuit, you flip the meaning of output, you just add an extra NOT gate, and you're done. You can always compute the same functions. Why is that not the case with our Turing machines? It seems kind of surprising how we define the output matters. We've got this notion of our execution model for a non-deterministic Turing machine that says, if there's one path through all these choices that you can make that leads to an accept state or a halt state, that's what it means to execute. Whereas, if we wanted to know that there are no paths that do something, well, if there's no paths, it can run forever. So, there is a big difference. And that's why what's defined as the co-NP class is not the same as the NP class. These are different things, because the witness only works when the output is one.