Lecture 7: P vs NP and Karp's Reducibility - CSCI 381 Goldberg
Administrative Updates
- For the next two weeks of lectures, bring Karp’s paper. We will be discussing the relevant (application-oriented) portions.
- Quiz #1 is on February 25 (a week from today). Monday February 23 will be a review class - come prepared with questions from the notes. The quiz will be available under the quiz tab on Brightspace.
P vs NP: Reduction and NP-Completeness
Context from Previous Lecture: Cook, Karp, and Levin
The first problem shown to be NP-hard was Satisfiability - find a sequence of true/false (i.e. 0/1) values that will cause a Boolean Predicate to be True.
- Cook (1971) proved this and showed that ALL possible NP-hard problems must “reduce” to Satisfiability.
- Cook defines reduction in one direction: from your Problem/Algorithm to Satisfiability.
Karp argued that to show equivalent hardness, one must reduce the two problems to each other. Yet in Karp’s paper he only shows one direction: namely, Satisfiability reduces to other problems (and he continues for a total of 20 other problems). This apparent paradox is readily resolved once you realize that Cook had already shown the other direction, so Karp only needed to supply the second one. But in terms of definitions, Karp held that two problems must reduce to each other in order to prove that they both have similar hardness in complexity.
A third definition of reduction was given by Levin, inspired by the fact that Nondeterminism is relativized computing. Levin said that reduction is about seeking an external piece of information or source that will help you solve the second problem given a first problem. The hardness is then how hard (complexity-wise) the cost of obtaining that missing piece of information is. Although we do not entertain Levin’s reduction definition when discussing the KARP-21, there is an interesting case discussed at the end of these notes which will benefit from Levin’s perspective.
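To make the Satisfiability problem from the top of this section concrete, here is a minimal brute-force sketch (the helper name and the sample predicate are illustrative, not from Karp’s paper): it tries all 2ⁿ assignments, which is exactly the exhaustive search that a nondeterministic machine sidesteps by “guessing”.

```python
from itertools import product

def brute_force_sat(predicate, n):
    """Try all 2**n true/false assignments; return a satisfying one, else None."""
    for bits in product([False, True], repeat=n):
        if predicate(*bits):
            return bits
    return None

# (x OR y) AND (NOT x OR z) -- a small Boolean predicate
phi = lambda x, y, z: (x or y) and ((not x) or z)
print(brute_force_sat(phi, 3))  # -> (False, True, False)
```

Note the asymmetry that makes SAT the root of Karp’s tree: finding an assignment takes up to 2ⁿ tries, but verifying a guessed assignment is a single (polynomial-time) evaluation of the predicate.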
Reduction Notation: B ≤ A
In the literature, if we write B ≤ A, this notation is read as “B reduces to A”. The actual meaning (in Karp’s paper) is:
- B ≤ A.
- B is known to be hard and A is unknown.
- A solves B’s problem (consistent with both cases above).
- B is the parent considered in the reduction tree (cf. Karp, p. 96, Reduction Tree Summary).
- A is the child (of B) considered in the reduction tree.
- The root of the entire tree is SATisfiability (Karp #1).
- In this context, it is the CHILD (A) that solves the PARENT’s (B) problem.
- B reduces TO A.
- A reduces FROM B.
- Conclusion. A is as hard as B. Note: For Karp, this is only valid based on the fact that the other direction was already provided by Cook’s analysis.
Another notation for reduction, consistent with the above, uses the letters “oc” written with no space between them: the symbol ∝ - think of it as an 8 lying sideways with the rightmost tip cut off. To emphasize the need to transform between NP problems within a reasonable amount of time, many put a “p” at the foot of the ∝, giving ∝ₚ.
Example: Chromatic Number and Clique Cover (KARP #12 → #13)
Let us take a pair of problems from Karp’s Reduction Tree Summary (Karp, p. 96) and derive the nine pieces of information as they apply:
- Problem #1 (“B”): Chromatic Number (KARP #12)
- Problem #2 (“A”): Clique Cover (KARP #13)
Then, at the point of the tree where Chromatic Number connects to Clique Cover:
- Chromatic Number ≤ Clique Cover.
- Chromatic Number is known to be hard (at this point), whereas Clique Cover is not yet known.
- Clique Cover is A and Chromatic Number is B (i.e. B reduces to A).
- Chromatic Number is the parent.
- Clique Cover is the child.
- The root of the tree is SAT.
- Clique Cover solves Chromatic Number.
- Chromatic Number reduces TO Clique Cover.
- Clique Cover reduces FROM Chromatic Number.
- Conclusion. Clique Cover is as hard as Chromatic Number. (See note at end of point 9 above.)
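The transformation behind this edge of the tree is the classical one (the details are standard but not spelled out in these notes, so read this as an illustrative sketch): map a graph G to its complement Ḡ. A proper k-coloring of G partitions the vertices into k independent sets, and every independent set of G is a clique of Ḡ, so χ(G) equals the clique cover number of Ḡ. The brute-force helpers below are exponential and exist only to check the correspondence on a small graph:

```python
from itertools import combinations, product

def complement(n, edges):
    """Edges of the complement of an n-vertex graph."""
    e = {frozenset(p) for p in edges}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in e]

def chromatic_number(n, edges):
    """Brute-force chi(G): fewest colors in a proper coloring (exponential)."""
    for k in range(1, n + 1):
        for col in product(range(k), repeat=n):
            if all(col[u] != col[v] for u, v in edges):
                return k

def clique_cover_number(n, edges):
    """Brute-force: fewest groups partitioning the vertices, each a clique."""
    e = {frozenset(p) for p in edges}
    for k in range(1, n + 1):
        for assign in product(range(k), repeat=n):
            groups = [[v for v in range(n) if assign[v] == g] for g in range(k)]
            if all(frozenset(p) in e
                   for grp in groups for p in combinations(grp, 2)):
                return k

# The 5-cycle needs 3 colors; its complement needs exactly 3 covering cliques.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(chromatic_number(5, c5), clique_cover_number(5, complement(5, c5)))  # -> 3 3
```

The reduction itself is only the `complement` step, which is polynomial; the hard work is delegated to whoever can solve Clique Cover, exactly as the parent/child relationship above describes.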
KARP-21
KARP-21 refers to the 21 NP-Complete problems in Karp’s paper (historically: Cook #1; Karp #2-21, i.e. Karp-20).
In order to define each problem as NP-Complete - even if the problem inherently is an optimization problem and would more easily be classified as NP-Hard - Karp adds an extra threshold parameter to augment the underlying problem. This is necessary because optimization problems generally do not have an easy verification process short of an exhaustive search, and therefore can typically only be shown to be Hard and not Complete.
To see why, consider a decision problem like a system of equations: the NP computer guesses the values of the variables, and you can verify the answer in polynomial time by plugging them in - it reduces to a yes/no check. But for an optimization problem (e.g. maximizing a stock portfolio), even if the NP computer tells you exactly which stocks to buy, how do you verify it’s truly optimal? You would have to check every other possible combination - potentially an exponential number of alternatives. Since n! > 2ⁿ (for n ≥ 4), even permutation-based optimizations (like Job Sequencing) are worse than exponential to verify exhaustively. This is why optimization problems cannot be shown to be Complete without modification.
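The growth claim is easy to check numerically:

```python
import math

# n! overtakes 2**n at n = 4 and stays above it from then on.
for n in (4, 8, 16):
    print(n, math.factorial(n), 2 ** n, math.factorial(n) > 2 ** n)
```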
The transformation works as follows: take the underlying NP-Hard optimization problem and form a related decision problem by adding an extra threshold parameter, asking whether a solution exists whose value meets the supplied parameter (at most the threshold for a minimization, at least the threshold for a maximization). This decision problem is then NP-Complete.
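To see the threshold trick in code, take Clique (one of the KARP-21): “find the largest clique” has no short certificate of optimality, but the decision version “is there a clique of size at least k?” can be verified in polynomial time from a guessed certificate. A minimal sketch (function names are my own):

```python
from itertools import combinations

def verify_clique(edges, certificate, k):
    """Polynomial-time check: does `certificate` witness a clique of size >= k?"""
    e = {frozenset(p) for p in edges}
    nodes = set(certificate)
    return len(nodes) >= k and all(frozenset(p) in e
                                   for p in combinations(nodes, 2))

# Triangle 0-1-2 with a pendant vertex 3
g = [(0, 1), (1, 2), (0, 2), (2, 3)]
print(verify_clique(g, [0, 1, 2], 3))  # True
print(verify_clique(g, [1, 2, 3], 3))  # False: 1 and 3 are not adjacent
```

Only the pairwise adjacency of the certificate is checked - O(k²) lookups - never the exponentially many alternative vertex sets, which is exactly what makes the thresholded version NP-Complete rather than merely NP-Hard.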
Why ≤ Was Chosen as Reduction Notation
Reduction involves two problems: one known to be (complexity-class) hard, and another whose complexity class is not yet known. The following two scenarios illustrate why ≤ was chosen as the notation.
Let Alice be a Chemistry Professor and Bob be a Physics Professor. Each is an expert in their own field and also knowledgeable about the other.
Case 1 - Alice’s problem is hard, Bob’s is unknown:
| Alice | Bob |
|---|---|
| hard (complex) | ? (unsure) |
Alice realizes that solving her problem yields a solution to Bob’s problem. What can we assume about the relative complexity of Bob’s problem?
The fact that Alice’s problem solves Bob’s problem does not indicate that Bob’s problem is at least as hard as Alice’s - since Alice’s approach is only one such approach. There may be an easier approach to solving Bob’s problem. (Analogy: one can always use an exhaustive search/brute-force to solve a problem, but that does not mean it is the only way.)
Case 2 - Alice’s problem is unknown, Bob’s is hard:
| Alice | Bob |
|---|---|
| ? (unsure) | hard (complex) |
Alice realizes that solving her problem yields a solution to Bob’s problem. What can we assume about the relative complexity of Alice’s problem?
If it turns out that there exists an easy solution to Alice’s problem, in what manner can Bob’s problem still be considered hard? Alice’s problem yielded an “easy” solution to Bob’s problem. Since Alice’s problem cannot be easier than Bob’s problem in complexity, it must be ≥ (at least as hard as) Bob’s.
Conclusions from both scenarios:
- In both scenarios, Alice’s problem yields a solution to Bob’s problem.
- In both scenarios, Alice’s problem is shown to be ≥ (at least as hard as) Bob’s problem in terms of complexity cost.
This latter conclusion led to the iconic ≤ notation for “reduction” in the literature. Note that mathematicians prefer writing ≤ rather than ≥, so:
Bob’s Problem ≤ Alice’s Problem
This notation is read as “Bob reduces to Alice”: FROM Bob’s problem TO Alice’s problem.
A Caveat: Levin’s Perspective on Case 2 (*)
The argument in Case 2 was based on the following: if Alice’s Problem provides an “easy” (in terms of complexity costs) solution to Bob’s (hard) Problem, in what manner can Bob’s Problem still be considered “hard”?
We left Case 2 with an asterisk because in light of Levin’s perspective, we can provide an (albeit rare) situation of exactly that - Alice’s Problem is easy, it solves Bob’s hard problem, and yet no contradiction occurs.
What is this special situation?
As mentioned earlier, Levin was concerned with the intermediary information necessary to solve the other problem - what we can colloquially call “the cost of doing business.” Here, the cost of doing business can technically be very great (“hard”), because Alice’s Problem is a Chemistry Problem and Bob’s Problem is a Physics Problem, and there is no guarantee that the (chemistry) information Alice’s solution provides can “easily” be used by Bob to solve his (physics) problem. In such a case, the “cost of doing business” can be rather complex (“hard”). Mathematically then:
Cost(solving Bob) ≤ Cost(solving Alice) + Cost(translating Alice’s answer for Bob)
and since nothing yet bounds the translation term, Bob’s problem can remain “hard” even though Alice’s is “easy” - no contradiction.
It is precisely because of this that the literature subsequently added a small p at the foot of the icon for “reduction”: ≤ₚ.
- The p in ≤ₚ stands for Polynomial, which is the basis for “easy” complexities (according to Edmonds).
- This guarantees that the “cost of doing business” is “easy”, and therefore the argument in Case 2 holds true.
- Under this assumption, Alice’s Problem must be “hard” as well.
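A classic example of an “easy” cost of doing business (standard in the literature, not from these notes) makes the ≤ₚ requirement concrete: deciding whether a graph has an independent set of size k reduces to Clique by taking the complement graph, and that transformation costs only O(n²). The Clique “oracle” below is simulated by brute force; for the ≤ₚ argument, only the transformation’s cost matters.

```python
from itertools import combinations

def clique_oracle(n, edges, k):
    """Stand-in for a solver for Clique (brute force, exponential)."""
    e = {frozenset(p) for p in edges}
    return any(all(frozenset(p) in e for p in combinations(sub, 2))
               for sub in combinations(range(n), k))

def has_independent_set(n, edges, k):
    """Independent Set <=_p Clique: the entire 'cost of doing business'
    is building the complement graph, which is polynomial (O(n^2))."""
    e = {frozenset(p) for p in edges}
    comp = [p for p in combinations(range(n), 2) if frozenset(p) not in e]
    return clique_oracle(n, comp, k)

# Path 0-1-2: {0, 2} is an independent set of size 2; size 3 is impossible.
path = [(0, 1), (1, 2)]
print(has_independent_set(3, path, 2))  # True
print(has_independent_set(3, path, 3))  # False
```

If the translation step were itself intractable (Levin’s worry), the reduction would tell us nothing about Independent Set; because it is polynomial, an easy solver for Clique would immediately yield an easy solver for Independent Set, which is exactly the Case 2 argument.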