
CSCI 381 Goldberg Lecture 5: Nondeterminism and Computational Complexity

Papers - Get These If You Haven’t Already

Section titled “Papers - Get These If You Haven’t Already”

Must download/print BOTH papers:

Paper 1, Cook (1971):

  • Full Title: “The Complexity of Theorem-Proving Procedures”
  • Author: Stephen Cook
  • Topic: Introduces the concept of NP-completeness; foundational theoretical work
  • Key Contribution: Defined the class NP rigorously; proved Satisfiability is NP-complete
  • Significance: Showed ALL NP problems reduce to Satisfiability (the core problem)
  • What to Focus On:
    • Definitions of P (deterministic polynomial time) and NP (nondeterministic polynomial time)
    • The concept of “reduction” between problems
    • Why Satisfiability is the “hardest” NP problem
  • Later Reference: Karp’s work built directly on Cook’s foundation

Paper 2, Karp (1972):

  • Full Title: “Reducibility Among Combinatorial Problems”
  • Author: Richard Karp
  • Topic: Demonstrates that 20 further real-world problems are NP-hard/NP-complete
  • Key Contribution: Shows that practical problems (graph coloring, scheduling, knapsack, clique, vertex cover, etc.) are all equivalently hard
  • Significance: Proved that the reduction works in the opposite direction for many real-world problems
  • Course Timeline: We will spend 2-3 weeks discussing this paper in detail
  • What to Focus On:
    • How different problems reduce to each other
    • Real-world applications of complexity theory
    • Why the “Karp-21” problems matter (21 problems shown NP-complete)
    • Practical implications despite the theoretical nature

How the papers fit together:

  • Cook (1971): Established the concept of NP-completeness using Satisfiability as the anchor problem
  • Karp (1972): Extended it by showing that MANY practical problems are NP-complete
  • Together: The foundation for understanding which computational problems are “easy” vs. “hard”
  • Note: Karp proved only one direction (Satisfiability reduces TO his 20 problems); Cook had already proven the other direction (all NP problems reduce TO Satisfiability)
  • This is why the literature says “Karp-21” rather than “Cook-Karp”

Where to find them:

  • Search: author name + year + paper title on Google Scholar or arXiv
  • Available freely through most academic sources
  • Save as PDF for reading/annotation, or print for class
  • Bring to class (a digital or physical copy is fine)
  • Recommendation: have copies ready by the next lecture
Why these papers matter:

  • Karp’s work has significant practical real-world applications despite being “pure theory”
  • Understanding these papers helps solve many computational problems in industry, optimization, and scheduling
  • Published over 50 years ago, they remain as relevant now as then
  • These papers are the foundation of the P vs. NP problem (a Millennium Prize problem)

Course administration:

  • Professor is finalizing all materials
  • Assignment deadlines are still being established
  • “Don’t worry about Brightspace deadlines right now”

Schedule:

  • Some days closed for presidential/federal holidays
  • Monday (Feb 10): Regular class
  • Following week: Some closures (dates TBA)
  • Check the calendar for the exact schedule before planning

Communication:

  • Check all email folders (including spam/junk/clutter/trash) for emails
  • Professor will email updates as materials are finalized
  • Text the word “present” for attendance tracking
  • Notes are emailed the same day after class

Action items:

  1. Download both the Cook and Karp papers immediately
  2. Have copies ready for next lecture
  3. Do NOT worry about Brightspace deadlines
  4. Watch for email updates on schedule changes

Our discussion of nondeterminism will be framed within the context of programming and follow foundational work in this area.

  • E. Dijkstra. (1976). A Discipline of Programming. Prentice-Hall Press.
  • R. Floyd. (1967). “Nondeterministic Algorithms.” Journal of the ACM, Vol. 14(4), pp. 636-644.

Deterministic vs. Nondeterministic Computation

Section titled “Deterministic vs. Nondeterministic Computation”

According to Dijkstra (building on Floyd’s work), we can understand deterministic and nondeterministic computations using standard programming language concepts. Consider Java as a deterministic language, and imagine a modified version called NJava designed for nondeterministic computation. The only difference between Java and NJava is a single added command, Choose, which selects a value from a set of options supplied by the programmer.

All other commands in NJava are identical to Java: loops, methods, classes with inheritance, recursion, etc.

Key Insight: Providing Options to the Machine

Section titled “Key Insight: Providing Options to the Machine”

The programmer’s responsibility is crucial: the programmer must list all values or the range of values that could possibly be correct. In essence, the programmer says to the machine: “I know the answer is one of these options. I don’t know which, but it’s definitely here.” The nondeterministic machine then instantly selects the correct choice from the provided set.
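Since NJava is hypothetical, a minimal Python sketch can illustrate the idea. The function name `choose_run` and its interface are my own invention, not from the lecture: it deterministically simulates Choose by trying every programmer-supplied option and returning the combination a nondeterministic machine would have selected instantly.

```python
from itertools import product

def choose_run(options_per_choice, is_correct):
    """Deterministic simulation of NJava's Choose: try every combination
    of the programmer-supplied options and return the first one the
    checker accepts (the choice a true nondeterministic machine would
    make in a single step)."""
    for picks in product(*options_per_choice):
        if is_correct(picks):
            return picks
    return None  # no correct value was included in the options

# "I know x * y == 12; x is one of 2, 3, 4 and y is one of 3, 4, 5."
result = choose_run([[2, 3, 4], [3, 4, 5]], lambda p: p[0] * p[1] == 12)
print(result)  # (3, 4)
```

If no correct value appears in the supplied sets, the search fails, which is exactly the GIGO caveat discussed next.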

For the Choose command to be meaningful in nondeterministic computation, several key assumptions must hold:

(1) GIGO: Garbage In, Garbage Out

Applied to the Choose command:

  • (a) A correct value must be included in the set
  • (b) Although a nondeterministic programmer may have logical errors in their code, the nondeterministic computer cannot correct them

If no valid option is provided, the computation will not find a correct answer.

(2) The Choice is Always Correct

Assuming no GIGO, the choice selected by the nondeterministic computation is always correct.

(3) Nondeterminism Does Not Increase Computational Power

Both Turing and Kleene established that whatever can be computed nondeterministically can also be computed (simulated) deterministically. Nondeterminism provides no additional functional power beyond what deterministic computation can achieve. So what is the advantage of nondeterminism?

(4) Constant-Time Cost of the Choose Command

Nondeterminism is not about solvability, but about complexity and efficiency. The fundamental assumption in the literature is that the Choose command executes in constant time, O(1): the machine makes the correct choice in a single step.

Oracle Computing and Relativized Computing

Section titled “Oracle Computing and Relativized Computing”

Theoretically, how can a machine be guaranteed to make the right choice instantaneously? Computer theorists established a provocative precedent: Consult a prophet.

More specifically, they invoked the ancient Oracle at Delphi (the prophetess of ancient Greece, famed for foretelling the future). This gave rise to the terminology:

  • Oracle computing: Accessing an external oracle that “knows the future” and provides the answer
  • Relativized computing: Decisions made relative to consulting an external source (not obtainable from the Turing machine itself)

The point is philosophical rather than practical: nondeterminism makes decisions based on external information beyond the machine’s own computational power. This external oracle could even be equipped with knowledge of unsolvable problems (like the halting problem itself).

Nondeterminism and Free Will: A Crucial Distinction

Section titled “Nondeterminism and Free Will: A Crucial Distinction”

Periodically, philosophers argue that nondeterministic computation proves machines possess free will. This is a fundamental misconception, arising from terminology confusion.

Example: Consider the TV show The West Wing (starring Martin Sheen as President Bartlet). The show was so well-received that 70,000+ viewers wrote in the fictional president’s name as a write-in candidate. This represents one expression of free will: the freedom to choose something not on the provided list of options.

However, nondeterminism forbids this:

  1. Predetermined choices: The programmer must enumerate all valid options beforehand—the machine cannot choose something outside the given set
  2. Guaranteed correctness: Nondeterministic machines always select correctly, whereas genuine free will offers no such guarantee

Therefore, while nondeterminism involves choice-making, it fundamentally differs from philosophical free will because the choices are pre-approved and constrained by the programmer.

Turing recognized that nondeterministic computations can be efficiently simulated using parallel processing. When a decision-point arises in the computation:

  1. Assign each possible value to a different processor
  2. Let each processor run independently with its assigned value
  3. Since one value is correct, the first processor to complete reports success
  4. The entire computation halts

Note: In operating systems, we distinguish between heavyweight processes (each with its own memory pool) and lightweight threads (which share a memory pool). However, Turing chose to present only sequential computation in his foundational work.
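The four steps above can be sketched with ordinary Python threads (the name `parallel_choose` is illustrative, not from the lecture): each candidate value gets its own worker, and the first worker whose value checks out ends the computation.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def parallel_choose(candidates, is_correct):
    """Turing-style parallel simulation of a decision point: assign each
    possible value to its own worker, let the workers run independently,
    and report the first value whose check succeeds."""
    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        futures = {pool.submit(is_correct, c): c for c in candidates}
        for done in as_completed(futures):
            if done.result():
                return futures[done]  # first worker to report success
    return None

# Which candidate divides 91 evenly? (91 = 7 * 13)
print(parallel_choose([4, 5, 6, 7, 8], lambda d: 91 % d == 0))  # 7
```

Python threads share one memory pool, so this sketch corresponds to the lightweight-thread model in the note above.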

The Complexity Paradox of the Choose Command

Section titled “The Complexity Paradox of the Choose Command”

Interestingly, there is a theoretical paradox: in the analysis of algorithms, all basic built-in commands (read, print, arithmetic operations) are assumed to run in constant time, O(1). The Choose command, according to nondeterministic theory, also runs in O(1) time.

However, when analyzing the overall complexity of nondeterministic code, the Choose command typically does not affect the final complexity expression because complexity analysis drops constant factors. The overall complexity of nondeterministic algorithms depends on the surrounding loops and non-constant-time operations, not on the constant-time Choose command.

Deterministic vs. Nondeterministic Complexity

Section titled “Deterministic vs. Nondeterministic Complexity”

Complexity measures differ fundamentally:

  • Deterministic complexity: Based on the number of steps or instructions required to solve a problem
  • Nondeterministic complexity: Based on the number of decisions required to solve a problem

Since the primary contribution of nondeterminism lies in fast decision-making, Cook (1971) and Karp (1972) explored problem classes from both deterministic and nondeterministic perspectives.

In the 1960s, Jack Edmonds proposed a crucial distinction:

  • Easy problems: Polynomial-time solvable
  • Hard problems: Exponential-time solvable

Cook (1971) and Karp (1972) adopted this distinction for polynomial-time algorithms in both deterministic and nondeterministic settings. This led to the central question: does P = NP?

In other words: Is the set of problems solvable deterministically in polynomial time identical to the set of problems solvable nondeterministically in polynomial time?

Hartmanis’s Alternative: Practical vs. Theoretical Complexity

Section titled “Hartmanis’s Alternative: Practical vs. Theoretical Complexity”

Computer scientist J. Hartmanis proposed an alternative (though less widely adopted) perspective on what makes problems “easy” vs. “hard”:

  • Edmonds’ view (mainstream): Polynomial = easy, exponential = hard
  • Hartmanis’s view (practical): Consider actual resource availability. An algorithm running in, say, n^100 time is theoretically “easy” by Edmonds’ definition, yet no practical system could support the resources it requires, so Hartmanis calls it “hard.”

While philosophically interesting, Hartmanis’s pragmatic distinction has not been widely adopted in mainstream complexity theory, which remains committed to Edmonds’ polynomial/exponential divide.

Membership Predicates and Characteristic Functions

Section titled “Membership Predicates and Characteristic Functions”

A predicate in discrete mathematics implies availability of both Boolean values (True/False). Thus we can answer both:

  • Who is a member of the set?
  • Who is not a member of the set?

However, computability theory reveals a subtle asymmetry: for some sets, one of these two questions can be answered by testing while the other cannot be answered at all. This asymmetry arises from the unsolvability of the halting problem.

The Halting Problem and Asymmetric Decidability

Section titled “The Halting Problem and Asymmetric Decidability”

Consider a program P and the set S of inputs that cause P to enter an infinite loop. By the unsolvability of the halting problem:

  • We can verify non-membership: Given an input, run P with that input. If P halts, the input is definitely not in S.
  • We cannot verify membership: If P hasn’t halted after some time t, we cannot determine whether it will eventually halt or loop forever.

This asymmetry motivates terminology:

  • Discrete mathematics: Uses “membership predicate” (assumes both decidable)
  • Computability theory: Uses “characteristic function” (acknowledges potential asymmetry)

When a characteristic function encounters an unsolvable case, the program enters an infinite loop by definition. This prevents logical contradictions with the halting problem’s unsolvability.

We now examine an interesting case that exhibits both deterministic and nondeterministic properties. To understand this hybrid categorization, we must first establish the fundamental notion of mappings.

Consider mappings from a set A to a set B. Three distinct types are relevant to understanding computation:

Diagram showing three mapping types: Relations (A→B with unrestricted connections), Partial mappings (A→B with at most one connection per element), and Total mappings (A→B with exactly one connection per element)

(Note: Although A and B are drawn here with the same cardinality, |A| need not equal |B|)

Relations: Any element of A may map to any subset of B (including zero or multiple elements). No restrictions.

Partial mappings: Each element of A maps to at most one element of B (zero or one).

Total mappings: Each element of A maps to exactly one element of B.
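The three mapping types can be made concrete with Python dictionaries (a sketch; the data and the helper name `is_total` are my own):

```python
A = [1, 2, 3]

# Relation: an element of A may map to any subset of B (empty or multiple)
relation = {1: {"a", "b"}, 2: set(), 3: {"c"}}

# Partial mapping: at most one image per element; here 2 has none
partial = {1: "a", 3: "c"}

# Total mapping: exactly one image for every element of A
total = {1: "a", 2: "b", 3: "c"}

def is_total(mapping, domain):
    """Every domain element has exactly one image."""
    return all(x in mapping for x in domain)

print(is_total(total, A), is_total(partial, A))  # True False
```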

Deterministic Computation: Total Mapping Model

Section titled “Deterministic Computation: Total Mapping Model”

Deterministic computation corresponds to total mappings:

  • (a) Every input has exactly one output
  • (b) The code always knows the correct next action at each step
  • (c) The same input always produces the same output

Notation:

f_d : ℕ → ℕ

This reads as: “Function f_d (where the subscript d denotes deterministic) maps from the natural numbers to the natural numbers.” The deterministic property means the mapping is total: well-defined for all inputs, as explained above.

Nondeterministic Computation: Relational Mapping Model

Section titled “Nondeterministic Computation: Relational Mapping Model”

Nondeterministic computation corresponds to relations, because at each decision point the output can be any one of several possible elements (“don’t care state”).

Notation:

f_n : ℕ → P(ℕ)

Here, f_n maps the natural numbers to the power set P(ℕ), the set of all subsets of ℕ.

Partial functions present an interesting ambiguity. On one hand, they resemble nondeterministic functions:

  • The empty set is a member of the powerset
  • An empty output set means the function is undefined for that input

On the other hand, on actual computing devices, partial functions produce predictable error messages:

  • Division by zero always returns the same error
  • This is a deterministic response (“error state”)

From a computational perspective, partial functions behave deterministically since errors are predictable.

Computability theorists encounter this ambiguity when designing incompletely specified automata. Given a state and input character, if the transition table does not specify a next state, should we treat this as:

  • An error condition (deterministic)?
  • A don’t care situation (nondeterministic)?

Authors like Sudkamp concluded that the classification depends on the application. Thus, partial mappings are labeled “partially deterministic” rather than purely deterministic or nondeterministic.

A partial function is a function that is not defined for all inputs in its domain. On computing devices, when a partial function is undefined for an input, the result is typically an error message.

Example: A scientific calculator contains at least 8 partial functions. Try, for example, taking the square root of a negative number, or dividing by zero: these produce error states.

When encountering an undefined input in a partial function:

  • Mathematically: The function “crashes” or is undefined
  • Computer devices: Returns a predictable error message (e.g., “undefined”, “infinity”, “error”)
  • Computer theory: Defines undefined inputs as entering an infinite loop

Why this restriction? To avoid contradicting the unsolvability of the halting problem.
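The predictable-error behavior can be seen directly in Python, where partial functions such as square root and division fail with the same exception every time (the wrapper name `probe` is illustrative):

```python
import math

def probe(f, x):
    """Run a possibly-partial function and report either its value or
    the (deterministic, predictable) error it raises."""
    try:
        return ("defined", f(x))
    except (ValueError, ZeroDivisionError) as err:
        return ("error", type(err).__name__)

print(probe(math.sqrt, 9))        # ('defined', 3.0)
print(probe(math.sqrt, -1))       # ('error', 'ValueError')
print(probe(lambda x: 1 / x, 0))  # ('error', 'ZeroDivisionError')
```

The same undefined input always produces the same error, which is why devices treat partial functions deterministically.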

The Halting Problem and Asymmetric Decidability

Section titled “The Halting Problem and Asymmetric Decidability”

The halting problem states: No algorithm can determine whether an arbitrary program will halt or loop infinitely for a given input.

This creates an asymmetry in characteristic functions:

  • Verifying non-membership (in the looping set): If you run program P with input x and P halts, you have definitively proven that x does not cause an infinite loop. You simply test it and observe the halt; this is an efficient procedure.

  • Verifying membership: If P hasn’t halted after time t, you cannot conclude that it will never halt; it may halt later. This is undecidable: there is no way to know when to stop waiting.

In discrete mathematics, membership predicates are total functions answering both “who is a member?” and “who is not?” But the halting problem breaks this symmetry: we can verify non-membership (via testing) but cannot universally verify membership (via waiting).

Consequence: Computer theorists use the term characteristic function (not membership predicate) to acknowledge this asymmetry.
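The asymmetry can be sketched in Python using generators as stand-in “programs” (all names here are illustrative): halting within a step budget is conclusive evidence, while exhausting the budget proves nothing.

```python
def halts_within(program, steps):
    """Advance the program (a generator) at most `steps` steps.
    True means it definitely halted; False is merely inconclusive."""
    for _ in range(steps):
        try:
            next(program)
        except StopIteration:
            return True   # halted: provably not a looping input
    return False          # may halt later, may loop forever: cannot tell

def counts_to(n):          # a program that halts after n steps
    for i in range(n):
        yield i

def loops_forever():       # a program that never halts
    while True:
        yield

print(halts_within(counts_to(3), 10))      # True
print(halts_within(loops_forever(), 100))  # False
print(halts_within(counts_to(50), 10))     # False, though it would halt
```

The last call shows why waiting never settles membership: the budget ran out even though the program does eventually halt.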

Computing with Partial Functions: The Classification Issue

Section titled “Computing with Partial Functions: The Classification Issue”

Partial functions occupy an ambiguous position:

  • Mathematically similar to nondeterministic computation (empty set in power set, multiple possible outputs)
  • Practically similar to deterministic computation (predictable error states, not random)

Since error responses are deterministic and predictable, some theorists classify partial mappings as “partially deterministic”—sharing properties of both categories depending on application context.

Cook’s Contribution: NP-Hardness of Satisfiability

Section titled “Cook’s Contribution: NP-Hardness of Satisfiability”

The first problem proven NP-hard was Satisfiability (SAT):

Given a Boolean predicate P(x_1, …, x_n), where each x_i ∈ {True, False} (or {0, 1} in Cook’s and Karp’s original formulation):

Find: an assignment of Boolean values to x_1, …, x_n such that P(x_1, …, x_n) = True

Boolean predicates consist of combinations of AND, OR, and NOT operations on inputs.

Cook’s Breakthrough (1971): He proved that every problem in NP reduces to SAT. In other words, at the core of every NP problem lies the satisfiability question: understanding SAT means understanding all of NP.
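A tiny Python sketch makes the contrast concrete (the function name `sat_search` and the sample formula are my own): checking one assignment is cheap, but a deterministic search may examine all 2^n of them, while a nondeterministic machine would Choose the right one outright.

```python
from itertools import product

def sat_search(n, predicate):
    """Deterministic SAT: try all 2**n truth assignments.  The inner
    check is the easy, polynomial-time part; the loop is the
    exponential part that nondeterminism would skip."""
    for bits in product([False, True], repeat=n):
        if predicate(*bits):
            return bits
    return None  # unsatisfiable

# P(x1, x2, x3) = (x1 OR x2) AND (NOT x1 OR x3) AND (NOT x2)
phi = lambda x1, x2, x3: (x1 or x2) and ((not x1) or x3) and (not x2)
print(sat_search(3, phi))  # (True, False, True)
```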

Cook’s work introduced the fundamental concept of reduction in complexity theory. When we say “Problem A reduces to Problem B”, we mean:

  • Problem A’s solution requires solving Problem B
  • An algorithm for B can be used to solve A (via transformation)
  • The transformation is efficient (polynomial-time)

Under Cook’s definition, reduction works in one direction: all NP problems reduce to SAT.

Karp’s Contribution: Bidirectional Reduction (1972)

Section titled “Karp’s Contribution: Bidirectional Reduction (1972)”

Karp argued that to prove two problems are equally hard, one must show reduction in both directions. His paper demonstrates that satisfiability (SAT) reduces to 20 other NP-complete problems:

  • Clique Cover (KARP#13)
  • Chromatic Number (KARP#12)
  • Job Scheduling (KARP#19)
  • Knapsack problem
  • Graph coloring
  • Vertex cover
  • And 14 others…

Notable point: Karp only shows one direction in his paper (SAT ⟹ other problems). But this is sufficient because Cook already proved the opposite direction (other NP problems ⟹ SAT). Together, Cook’s and Karp’s results establish bidirectional reduction.

Popular literature refers to these results as the “Karp-21” problems, but the attribution is misleading:

  • Correct attribution: Cook 1 (Satisfiability) + Karp 20 (additional problems) = 21 problems total
  • Why the confusion: Karp’s 1972 paper is far more famous and cited than Cook’s 1971 paper, despite Cook establishing the foundational concept
  • Historical footnote: Everyone remembers Karp’s work, and Cook’s fundamental contribution is often overlooked, even though Cook proved the harder direction (all NP problems reduce to SAT)

Levin’s Definition: Relativized Reduction

Section titled “Levin’s Definition: Relativized Reduction”

A third perspective on reduction comes from Leonid Levin: reduction is about acquiring the external information needed to solve a problem.

Under Levin’s definition, if problem A reduces to problem B:

  • External knowledge of problem B’s solution helps solve problem A
  • The hardness is the cost of obtaining that missing knowledge
  • This aligns with the “relativized computing” view of nondeterminism

In the literature, the notation:

A ≤_p B

is read as “A reduces to B” and means:

  1. A is known to be hard; B’s status is unknown
  2. B’s algorithm solves A’s problem
  3. A is the parent in the reduction tree
  4. B is the child of A (the child solves the parent’s problem)
  5. The root of any reduction tree is SAT
  6. A reduces to B; B reduces from A
  7. The subscript p emphasizes a polynomial-time transformation

Example: Chromatic Number ⟹ Clique Cover (from Karp’s work)

Section titled “Example: Chromatic Number ⟹ Clique Cover (from Karp’s work)”
  • Chromatic Number (KARP#12): Known hard; parent in the tree
  • Clique Cover (KARP#13): Unknown; child in the tree
  • Reduction: If you solve Clique Cover, you solve Chromatic Number
  • Notation: Chromatic Number ≤_p Clique Cover
  • Direction: Chromatic Number reduces to Clique Cover

Karp proved 21 NP-complete problems (building on Cook’s single problem, Satisfiability). However, many of these are inherently optimization problems, which typically cannot be proven NP-complete (and are only NP-hard).

Why? Optimization problems lack easy verification—checking if a solution is optimal requires exhaustive search.

Karp’s solution: Add a threshold parameter k to convert an optimization problem into a decision problem:

  • Original optimization: Which solution minimizes the cost?
  • Karp’s decision variant: Does there exist a solution with cost ≤ k?

The latter is polynomial-time verifiable: given a candidate solution, simply check whether its cost is ≤ k.

Example: Job Scheduling/Sequencing (KARP#19)

Section titled “Example: Job Scheduling/Sequencing (KARP#19)”

Original problem: Given tasks with deadlines and penalties, find a schedule that minimizes the total penalty incurred by late tasks.

Karp’s variant: Given tasks with deadlines and penalties, and a threshold k, does there exist a permutation of the tasks in which the total penalty is ≤ k?

By introducing the parameter k, an optimization problem becomes a decision problem and can be proven NP-complete (not just NP-hard).
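As a sketch of the decision variant (the function name `penalty_at_most` and the sample data are invented for illustration), a Python brute force can answer the yes/no question. Note that verifying any single ordering is polynomial, even though the search over all n! orderings is not.

```python
from itertools import permutations

def penalty_at_most(durations, deadlines, penalties, k):
    """Decision variant of job sequencing: does some ordering of the
    jobs finish with total lateness penalty <= k?"""
    jobs = range(len(durations))
    for order in permutations(jobs):
        time, total = 0, 0
        for j in order:
            time += durations[j]
            if time > deadlines[j]:   # job j misses its deadline
                total += penalties[j]
        if total <= k:                # a witness ordering exists
            return True
    return False

# Three jobs: run times, deadlines, and penalties for finishing late.
print(penalty_at_most([2, 1, 3], [3, 2, 6], [10, 5, 1], k=0))  # True
```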