Lecture 12 - KARP 21 Problems Continued (Node Cover, Hitting Set, Steiner Tree, 3D Matching) and NP-Hardness - CSCI 381 Goldberg
Administrative Updates
Section titled “Administrative Updates”-
Exam reference material: For exams, students are responsible for Karp’s original definitions of the NP-hard problems. Do not rely on alternative formulations found elsewhere, as variations exist but the course uses Karp’s paper for consistency across exams.
-
Recommended reading (informal homework): The professor encouraged students to read Karp’s original paper on their own, particularly the section on the Steiner Tree problem. This will not be formally checked, but is strongly recommended for solidifying definitions.
The Rest of the Problems (5, 15, 16, 17)
Section titled “The Rest of the Problems (5, 15, 16, 17)”These problems don’t have a single category to unify them. They are as follows:
Problem 5: Node Cover
Section titled “Problem 5: Node Cover”Find a set of nodes (user decides on ) such that each edge of the entire graph has some vertex in common with a vertex found in the node cover. Note: any number of edges are allowed to be connected to the same vertex of the node cover. As such the mapping is surjective and not necessary to be injective.
For example: consider the following Star Graph
V5 alone is a node cover for this graph since all edges touch V5.
Note: This problem can be technically be grouped with any of the graph theory problems listed above, but then the category would be too generic, whereas the above categories are more specific (such as cycles or cliques).
Problem 15: Hitting Set
Section titled “Problem 15: Hitting Set”Consider subsets of a set of numbers (or items). Construct a set such that has only exactly one element in common with each of the subsets.
The user provided a collection of subsets from some Universal set . (Universal set is the set of all possible elements under consideration.) You are tasked to find a representative set (termed “hitting set”) such that
For example, consider the following set of numbers with the following four subsets constructed.
| Subset | Elements | Common with W |
|---|---|---|
| Element 5 in common with chosen below. Only one element in common. | ||
| Element 1 in common with chosen below. Only one element in common. | ||
| Element 5 in common with chosen below. Only one element in common. | ||
| Element 9 in common with chosen below. Only one element in common. |
NOTE: It is quite acceptable that the common element should be shared by other subsets as well: such as, element 5 is in common with both Subsets and .
What is not alright is that 2 or more elements of the hitting set are in common with a given subset (such as subset below).
This is a Hitting Set for subsets -.
is not a Hitting Set for subsets - since 2 and 5 are both present in Subset .
Problem 16: Steiner Tree
Section titled “Problem 16: Steiner Tree”Defined in a worksheet tab “Tweaked P problems” starting at line 38.
(It has a tweaked version “Minimum Spanning Tree” which is in .)
[Note: An example of both of these are in separate tabs.]
Problem 17: 3D Matching
Section titled “Problem 17: 3D Matching”Let with numbers.
where Points.
The user chooses a subset of . is the User’s Universal set.
Problem: Find a 3D Match from the points in such that .
- Here, points will be chosen from the user’s .
- In essence the solution will be a listing of points (obtained from ) such that in each dimension, the values found there are a permutation of the values found in the original .
- Note: The permutation of each dimension will be assumed independently so there could be three different permutations used, one for each dimension.
For example, the following 10 points were chosen from a supposed User’s subset of points, .
These points are a valid 3D Matching:
| Point | X | Y | Z |
|---|---|---|---|
| P1 | 1 | 2 | 3 |
| P2 | 6 | 7 | 8 |
| P3 | 9 | 9 | 9 |
| P4 | 2 | 1 | 4 |
| P5 | 5 | 5 | 1 |
| P6 | 4 | 3 | 2 |
| P7 | 7 | 8 | 7 |
| P8 | 10 | 10 | 10 |
| P9 | 8 | 6 | 5 |
| P10 | 3 | 4 | 6 |
Looking at the X column: — this is a permutation of
Looking at the Y column: — this is a permutation of
Looking at the Z column: — this is a permutation of
Remember that the columns are independent for the permutations.
The above is a “3D-Matching” and ironically because the numbers that appear in each given dimension (1st, 2nd, 3rd positions) do NOT have to match at all the other numbers in that dimension (position).
2D Matching (in P): If we were to limit the points to 2D as with , then in .
Minimum Spanning Tree (MST)
Section titled “Minimum Spanning Tree (MST)”An application of a weighted graph which involves a tree.
Minimum Spanning Tree (MST) — A subgraph that has no cycles. All nodes are present in this subgraph tree.
The cost of a Spanning Tree is the sum of the costs of all the edges involved. Choose the one that has Minimum cost.
Kruskal Greedy Algorithm
Section titled “Kruskal Greedy Algorithm”Weighted Graph where
//R+ means positive real. Weighted does not require positive but most applications do.
Note: Greedy does not always guarantee that the true solution is found. But, in our case Kruskal proves that it works.
Start with a weighted graph and we find a subgraph that involves:
- ALL the original nodes (not all of the original edges): Spanning
- A further restriction is that the subgraph be a TREE.
There can be many such “skeleton” Spanning Trees.
Which do you choose? The one with the smallest (Minimum) total cost where cost of the tree is defined by simply adding together the costs of each edge in that tree.
Kruskal has three general steps:
Section titled “Kruskal has three general steps:”-
- Sort all edges from smallest to largest cost (weight) associated with each edge.
- Of the remaining edges not yet chosen, choose the smallest cost edge available now (that is what makes this greedy) such that when added into the “tree” you are growing, a cycle does not result.
- Stop when all vertices have been reached by the edges of your chosen tree.
Kruskal proves that this Greedy (choose what is best now without consideration of future) approach actually works.
Worked out example of Kruskal’s algorithm.
Section titled “Worked out example of Kruskal’s algorithm.”This weighted graph has nine vertices number V0..V9 and 14 edges.
The edges in sorted order from smallest to largest is as follows:
| FROM | TO | WEIGHTS | Selected | Order Chosen | Notes |
|---|---|---|---|---|---|
| V7 | V6 | 1 | ✅ | 1st | |
| V2 | V8 | 2 | ✅ | 2nd | |
| V6 | V5 | 2 | ✅ | 3rd | |
| V0 | V1 | 4 | ✅ | 4th | |
| V2 | V5 | 4 | ✅ | 5th | |
| V6 | V8 | 6 | ❌ | This would form a cycle: V6-V8-V2-V5-V6 | |
| V2 | V3 | 7 | ✅ | 6th | |
| V7 | V8 | 7 | ❌ | This would form a cycle: V7-V8-V2-V5-V6-V7 | |
| V0 | V7 | 8 | ✅ | 7th | |
| V1 | V2 | 8 | ❌ | This would form a cycle: V1-V2-V5-V6-V7-V1 | |
| V3 | V4 | 9 | ✅ | 8th | Algorithm stops: all vertices reached (spanning) |
| V5 | V4 | 10 | Algorithm stopped: all vertices reached (spanning) | ||
| V1 | V7 | 11 | Algorithm stopped: all vertices reached (spanning) | ||
| V5 | V3 | 14 | Algorithm stopped: all vertices reached (spanning) |
Cost of the MST = 37
Worked out example of Kruskal’s algorithm:
Step 1: Initial weighted graph with all edges:
Step 2: Selection process with valid edges highlighted in red (invalid edges in gray):
Step 3: Final MST (cost = 37):
Steiner/Ghost Points
Section titled “Steiner/Ghost Points”In this example, the user is only interested in the region filled in.
The vertex is outside the region. Within the region, the shortest distance between nodes and has weight = 17. However, if one considering neighboring vertices outside the region such vertex , then a Cheaper weight is obtained for traversing from (indirectly) to . Namely, and ; .
This extra vertex outside the region is called a Steiner/Ghost point.

Understanding Karp’s 21 Problems: Application and Relationships
Section titled “Understanding Karp’s 21 Problems: Application and Relationships”We will be presenting Karp’s 21 Problems from an application point of view. This will make understanding problems easier, but does not necessarily follow Karp’s order. Karp’s order was based on the order of reduction necessary to prove that any given problem is also NP-Hard/Complete.
Historical Context
Section titled “Historical Context”Although the 21 problems found in Karp’s paper comprise what colloquially is called “Karp-21”, the fact is that Karp only introduced 20 of these problems. The first problem was introduced by Stephen Cook a year earlier. Cook (1971); Karp (1972).
The Root of the Reduction Tree
Section titled “The Root of the Reduction Tree”The first problem which is also the root of the entire reduction tree (Karp, p.96) is SATisfiability and all other problems are proved to be NP-Hard or Complete through a chain of reductions from SAT to problems underneath it till you reach a given problem interested in.
Tweaking Parameters and Complexity Classes
Section titled “Tweaking Parameters and Complexity Classes”Karp emphasizes earlier in the paper that modifying details about the 21 problems can cause a NP problem to now be part of .
Note: Tweaking parameters can change into NP problem and vice versa. Tweaking means slight changes to parameters.
cf. Karp, p. 89-90 () as compared to p. 94-97 (NP). Karp calls these problems their “close relatives”.
NP-Hard Problem Variants and Their P Counterparts
Section titled “NP-Hard Problem Variants and Their P Counterparts”The following table shows how various NP-hard problems relate to their polynomial-time solvable variants through parameter modifications:
| From NP | KARP(NP#) P | How tweaked? | NOTES |
|---|---|---|---|
| SAT-3 | 11 | SAT-2 | At most TWO variables per clause | |
| STEINER TREE | 16 | MINIMUM SPANNING TREE (MST) | Limiting the weight of the edges chosen to a specific region | Relaxation |
| MAX-CUT | 21 | MINIMUM CUT | Minimizing vs. Maximizing | |
| NODE COVER | 5 | ARC COVER | Set of Edges to Set of Vertices | |
| CHROMATIC NUMBER | 12 | 2-COLORING | If it can be colored with 2 colors then aka Bipartite Checking the graph is bipartite. Can easily be solved using a BFS. | |
| 3D MATCHING | 17 | BIPARTITE (2D) MATCHING | Lowering the dimensions by one | Matching actual here means NOT ie no duplicates |
| JOB SEQUENCING | 19 | SEQUENCING WITH DEADLINES | Removing the penalty parameters | Later authors rewrite the NP problem in terms of profit, instead of penalty. |
| 0/1 INTEGER PROGRAMMING | 2 | SOLVABILITY OF LINEAR EQUATIONS | The variables are reals instead of Boolean | Gaussian Elimination; Linear Programming is in (1984) |
| KNAPSACK | 18 | FRACTIONAL KNAPSACK | The variables are reals instead of Boolean | Consider sales of spices for example; Karp did not mention this variation. |
| UNDIRECTED HAMILTONIAN CIRCUIT | 10 | SHORTEST PATH | Shortest path determines minimum length, Hamiltonian Circuit tries to find the theoretical maximum. | |
| FEEDBACK ARC SET | 8 | ARC DELETION | Karp uses Arc Deletion to verify the solution to Feedback Arc Set. |
The above problems and their tweaked variants are elaborated on below.
Section titled “The above problems and their tweaked variants are elaborated on below.”SAT-3 (KARP#11) and SAT-2 (P)
Section titled “SAT-3 (KARP#11) and SAT-2 (P)”SAT-3 (KARP#11): Identical to SAT but the Predicate has a “normal form” ie a certain format. Each logic clause (represented by parentheses) can only contain upto 3 variables that are connected disjunctively (OR) and the clauses are connected conjunctively (AND). Intraclause (within) are the OR operators and interclause (between) are the AND operators. NOT (negation) can be applied to any variable without restriction, but cannot be applied to the clause (cf. DeMorgan’s law from discrete mathematics).
Karp showed that any Boolean Predicate can be written in SAT-3 normal form.
SAT-2 (P): Identical to SAT-3 but the number of variables that appear in any clause is at most 2.
STEINER TREE (KARP#16) and MINIMUM SPANNING TREE (MST) (P)
Section titled “STEINER TREE (KARP#16) and MINIMUM SPANNING TREE (MST) (P)”STEINER TREE (KARP#16): Is similar to MST but asks a decision problem of whether the Spanning tree has a sum of weights <= k (threshold provided by user). The key issue is Spanning Tree of which Subgraph. The user chooses a set of vertices for G’ (ie subgraph of G) but the problem allows you to select additional vertices not in the user’s set but in the original graph G. Suppose original graph had . The user has an interest in a subgraph G’ of the original graph G which also include the vertices V.
See next worksheet (“Steiner”) for an example of this. The weighted graph there is referenced to here. The idea is that in the user’s region for example, the cost of edge from Vi to Vj is 17 but there is another vertex Vk which is in G but NOT amongst the vertices chosen by the user such that the cost Vi -> Vk -> Vj is only 15! Steiner wants the program to find THOSE vertices. In modern literature these nodes not selected or specified originally but chosen when constructing the Steiner Tree are called Steiner/Ghost points (vertices). The MST however can ONLY use vertices / edges in the user’s SELECTED region.
MINIMUM SPANNING TREE(P). (MST): Analysis of algorithms provides Dijkstra, Kruskal, Boruvka algorithms to solve this problem.
Graph is the set of Vertices and is the set of Edges
is a subset of the Cartesian Product .
The problem involves a weighted graphs:
Graph as above where function maps to — weight aka cost
Find a subgraph of such that
-
- It is a connected TREE (no cycles).
- ALL vertices are present in this tree. (Spanning; Cover)
- (So, the subgraph is a subgraph of the edges, but not a subgraph of the nodes.)
- The output will be the subgraph that has properties 1 and 2 and in addition, the sum of the costs/weights of all the edges present in the subgraph chosen is minimal when compared to any other such spanning tree.
MAX-CUT (KARP#21) and MIN CUT (P)
Section titled “MAX-CUT (KARP#21) and MIN CUT (P)”MAX-CUT (KARP#21): Consider a weighted Graph where maps from to .
Divide the vertex set into two distinct subsets (termed a cut and no vertices in common between the subsets) such that the sum of the weights of the edges between these two sets (any edge with one vertex end in one of the sets and the other end vertex end in the other vertex set) is at least a user supplied threshold .
MIN CUT (P): Similar to Max-Cut but trying to divide the original vertex set into two disjoint subsets such that the cut (the edges between these disjoint subsets) has a total weight of edges <= to a user supplied threshold .
NODE COVER (KARP#5) and ARC COVER (P)
Section titled “NODE COVER (KARP#5) and ARC COVER (P)”NODE COVER (KARP#5): Find a set of nodes (user decides on ) such that each edge of the entire graph has some vertex in common with a vertex found in the node cover. Note: any number of edges are allowed to be connected to the same vertex of the node cover. As such the mapping is surjective and not necessary to be injective.
ARC COVER (P): Same problem but find a set of edges (arcs) such that each node of the graph is connected to some edge of the arc cover. Again, the user decides a priori a size restriction on the cover set.
CHROMATIC NUMBER (KARP#12) and 2-COLORING (P)
Section titled “CHROMATIC NUMBER (KARP#12) and 2-COLORING (P)”CHROMATIC NUMBER (KARP#12): Given a Graph with vertices. Choose (chromatic) colors and assign each color to a subset of vertices such that only one color is assigned per vertex and no edge has its two endpoints having the same color.
The verification is straightforward once the nodes have been “painted” (assigned a color).
2-COLORING (P): Same problem but specifically. The decision of whether and if so, how this can be done, can simply be accomplished by a breadth-first search. For any node with color , its neighbors will be given color . If a vertex has been colored with color and then it turns out that the neighbor has already been colored also with color , then the Graph cannot be colored as such. (Also known as Bipartite Checking.)
3D MATCHING (KARP#17) and BIPARTITE (2D) MATCHING (P)
Section titled “3D MATCHING (KARP#17) and BIPARTITE (2D) MATCHING (P)”3D MATCHING (KARP#17): Start with a finite set . Consider the search space universe where x is the Cartesian Product. (Simply, consider a lattice structure of points in a 3d space.) The user selects a subset of these 3d points and further selects a subset of this subset such that and amongst these points in , focusing on each dimension separately, no two points have the same number in any of the dimensions.
BIPARTITE (2D) MATCHING (P): Identical to 3D matching, except that the original search space universe is , instead of .
JOB SEQUENCING (KARP#19) and SEQUENCING WITH DEADLINES (P)
Section titled “JOB SEQUENCING (KARP#19) and SEQUENCING WITH DEADLINES (P)”JOB SEQUENCING (KARP#19): Given tasks (jobs) J1,…,Jn each requiring a specific number Ti days respectively to complete and each must be completed by their respective calendric deadline Di. If a Task Tj misses its deadline Dj, then Penalty Pj is deducted from the total payment. The problem is to determine whether a sequence (permuted order) of these jobs can be found so that the total amount of penalty accrued is <= a tolerance threshold.
SEQUENCING WITH DEADLINES (P): The same problem without penalties. The decision problem is to find such a sequence of jobs so that no more than k jobs miss their deadline.
0/1 INTEGER PROGRAMMING (KARP#2) and SOLVABILITY OF LINEAR EQUATIONS (P)
Section titled “0/1 INTEGER PROGRAMMING (KARP#2) and SOLVABILITY OF LINEAR EQUATIONS (P)”0/1 INTEGER PROGRAMMING (KARP#2): Given a system of linear equations consisting variables whose main coefficients appear in a matrix where is the set of integers. is a two dimensional matrix of dimensions consisting of rows and columns. is also the number of variables to be solved for but these variables can only have binary values. This system is fully described by where is as mentioned and , the target output of each of the rows (equations). The problem is to find binary values for such that .
SOLVABILITY OF LINEAR EQUATIONS (P): Same problem but every number can be real numbers. (This is directly solved by Gaussian Elimination algorithm.)
If Karp wrote this paper today, he probably would have said linear programming which Karmarkar(1984) showed is in . The word “programming” was first used by mathematicians (most typically numerical analysts) and simply means a system of equations to be solved. Linear programming is to optimize a function where is a vector of variables subject to a set of linear equations (linear means that the variables in the constraints are not to raised to any power other than 1 (ie the default).
KNAPSACK (KARP#18) and FRACTIONAL KNAPSACK (P)
Section titled “KNAPSACK (KARP#18) and FRACTIONAL KNAPSACK (P)”KNAPSACK (KARP#18): Mathematically, given a sequence of items whose weights are , find binary values for such that .
means packing item; means don’t.
All and are positive integers and are binary values. (For example, pack a suitcase with no more than 50 lbs. on airplane.) In Prose (avoiding mathematical formulas), this can be reworded as the following: Given a set of items whose weights are , select a subset of them such that the total weight of the items selected is . In the literature, this formulation is called the Subset Sum Problem.
FRACTIONAL KNAPSACK (P): The same problem with values . (For example, spices salesperson with a partially filled bottle as a sample.)
UNDIRECTED HAMILTONIAN CIRCUIT (KARP#10) and SHORTEST PATH (P)
Section titled “UNDIRECTED HAMILTONIAN CIRCUIT (KARP#10) and SHORTEST PATH (P)”UNDIRECTED HAMILTONIAN CIRCUIT (KARP#10): Given graph where (directed in #9; undirected in #10), find a cycle that traverses (“goes through”) all nodes of the graph (termed Hamiltonian). The exhaustive search will generate all permutations of the nodes. A cycle is a permutation of the nodes such that each node is connected to the node in front of it. Verification is simply validating that each node is connected to the next node in the proposed permutation and the last node is connected to the first. The solution to the problem will be store in an array of size n.
SHORTEST PATH (P): Given where (each edge has a weight or cost w). Find a path between two vertices in a weighted graph such that the sum of the weights of its constituent edges is minimized. Undirected Hamiltonian Circuit is not a weighted graph per se, but the connection between the two can easily be achieved by simply considering an unweighted graph as if each edge has weight or cost = 1. There is a major distinction however in that the Hamiltonian Circuit of a simply weighted graph is the longest such path (max) whereas shortest path is shortest (min). I assume that Karp considered this a tweak because max vs min cut is P vs NP and is mentioned in this paper (lines 61-68 above). One last note to this discussion is that longest path is actually in P for directed acyclic graphs. In addition, a longest path problem can be converted to a shortest path problem if negative lengths are allowed. (They are dual problems of each other.) cf. Bulterman et al (2002) on trees and others have expanded to directed acyclic graphs based on Topological Sorting (Kahn, 1962).
FEEDBACK ARC SET (KARP#8) and ARC DELETION (P)
Section titled “FEEDBACK ARC SET (KARP#8) and ARC DELETION (P)”FEEDBACK ARC SET (KARP#8): Given a graph and a bound on the size of the Feedback set,
The user selects a subset (cardinality k) of edges such that every cycle C in graph G shares (at least) one member of this feedback set with cycle C. How do we verify the perfect solution? There is no simple algorithm for generating all cycles found in a graph.
ARC DELETION (P): If a group of k arcs break all cycles of a graph, then delete these from the original graph.
If the proposed feedback set is correct, then the subgraph remaining after deleting the feedback set will not contain any cycles, and there polynomial timed algorithms that will verify that the resulting subgraph does not contain any cycles.
Conclusion: Karp’s 21 Problems and Tweaked Variants
Section titled “Conclusion: Karp’s 21 Problems and Tweaked Variants”This concludes the list of NP-Complete problems mentioned in KARP-21 which have a “tweaked” version that is in .
Grouping the KARP-21 Problems
Section titled “Grouping the KARP-21 Problems”The next worksheet groups a number of problems based on some commonality (the commonalities are summarized here.) Some other “groupings” for the KARP-21.
| KARP# | TITLES | In Common |
|---|---|---|
| 1 | Satisfiability | The solution is expressed in terms of finding n boolean (0/1) variables. |
| 2 | 0/1 Integer Programming | |
| 11 | SAT-3 | |
| 18 | Knapsack | |
| cf. line 105 | ||
| 7 | Feedback Node Set | The problem definitions of these all involve cycles. |
| 8 | Feedback Arc Set | |
| 9 | Directed Hamiltonian Circuit | |
| 10 | Undirected Hamiltonian Circuit | |
| 3 | Clique | These problems seek cliques for their solutions. |
| 13 | Clique Cover | |
| 9 | Directed Hamiltonian Circuit | The solutions are finding a permutation of the elements of an original set. |
| 10 | Undirected Hamiltonian Circuit | |
| 19 | Job Scheduling | |
| 12 | Chromatic Number | The solutions are finding a permutation of the elements of an original set. |
| 20 | Partition | |
| 21 | Max Cut | |
| 4 | Set Packing | These are all related to the mathematical (discrete math) notion of a partition. |
| 6 | Set Covering | |
| 14 | Exact Cover | |
| 20 | Partition | |
| 5 | Node Cover | These are all the remaining problems not listed above. |
| 15 | Hitting Set | |
| 16 | Steiner Tree | |
| 17 | 3D Matching |
Now, we concentrate on the actual problem definitions and which problems are related in definition or in problem solving.
See the next worksheet tab for Karp 21 Problem Definitions and the elaboration on the above groupings.
P-Completeness
Section titled “P-Completeness”We have concluded Karp’s paper and have discusses extensively the notion of NP-Completeness and 21 examples of this. Analogously, we can define
P-Completeness: A decision problem is P-complete (complete for the complexity class P) if it is in P and every problem in P can be reduced to it by an appropriate reduction.
Two Classic P-Complete Problems
Section titled “Two Classic P-Complete Problems”(1) Circuit Value Problem
Given a hardware boolean circuit comprised of NOT, AND, OR circuits,
What will the boolean output be of a given boolean circuit for a given provided input.
(2) Step Counter Predicate
Given a Turing Machine TM implementing an algorithm and an input x for that machine, with a number T (as TIME) written in unary,
does that machine halt on that input within the first T steps?
Although beyond the scope of this course, one of the reasons that P-completeness is explored so much is that it helps understanding complexity on parallel processing machines.
cf. P-complete