To start, we're going to continue our discussion of Karp's 21 problems. We got pretty far; we'll be completing the mathematical and practical definitions of these problems today. The group we're up to, the way I grouped them anyway, is the rest of the problems. Last time we did the partition and packing problems. Now, the remaining problems, besides not having a unifying category that puts them together, are individually not so simple, a number of them anyway, so I'm going to elaborate on the discussions regarding them. After those discussions, we're going to talk about the cousins of these 21 problems, where they're obvious. Meaning, the ones where you tweak, T-W-E-A-K, that's one of the tabs: we tweak the problem so that the NP-hard problem all of a sudden lands in P. All right. So we have node cover, Karp number 5; hitting set, Karp number 15; Steiner tree, Karp number 16; and 3-dimensional matching, Karp number 17. Of these four, the Steiner tree is the most used, the most important, I guess, in practice. The others are slightly esoteric; you don't bump into them too much. But anyway, here we go. So, node cover. Here there's going to be a tweaked problem called arc cover, the cousin. Why that's important: above, Karp pointed out the feedback sets, problems 7 and 8, feedback node set and feedback arc set, where we change node to arc but the problem is basically the same. Now, the problem we have here is not related to those.
But the reason I'm connecting it is that here we have node cover, but Karp's list doesn't have arc cover. And Karp points out that arc cover is actually in P. That's an interesting point: with the feedback sets, problems 7 and 8, tweaking from node to arc, feedback node set to feedback arc set, both versions are NP-hard. But when you come to node cover, its cousin arc cover is actually in P. All right, what is node cover? Again, there's a parameter, say m. You have n nodes, and the user chooses m, presumably less than the number of nodes in the entire graph. You must choose m nodes such that every edge touches at least one of the nodes in your node cover. So we want you to make a representative set of nodes, and the parameter m restricts how many nodes we're allowing you to put in that set. Obviously, if you were to take the entire set of nodes, then every edge touches those nodes; that's trivial. The problem only becomes hard when we restrict you: you have a graph of 100 nodes, and we tell you, pick 10. The question is, will every edge touch one of those nodes? Now, an edge can only touch at most two nodes, because an edge is, by definition, defined by two endpoints, two vertices. In discrete mathematics there's a very interesting case for node cover called the star graph. Let me quickly draw that.
So this is our graph. Label the vertices V1 through V9, so there are nine vertices, and this is called a star graph: V5 in the center, with an edge from V5 to each of the other eight vertices. The point is that V5 alone is a node cover, because every edge touches V5. So a node cover can be very minimal: just a few nodes, even one node, can represent, in some sense, all the edges that are incident to them. That's a very good, extreme example of what node cover is trying to accomplish. I'll update the worksheet later and send you the final version when we're all done. So that's the story, and that's called node cover. Node cover is NP-hard. But its tweaked version, called arc cover, is in fact in P. That's very interesting. So what would arc cover be? Arc cover is the opposite: choose a bunch of edges; does every node touch those edges? That's obviously in NP, because you just have to list the nodes that the chosen edges are comprised of, and then you can verify whether all the nodes touch those edges. In the case of choosing nodes there's much more work to verify, because there are typically many more edges than nodes.
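To make the star-graph point concrete, here's a small sketch (my own illustration, not from Karp's paper; the vertex labels follow the drawing above) that checks whether a chosen set of at most m nodes covers every edge:

```python
# Node cover check: does every edge touch at least one chosen node?
# Illustration only -- the nine-vertex star graph with V5 in the center.

def is_node_cover(edges, chosen, m):
    """True if |chosen| <= m and every edge has an endpoint in chosen."""
    return len(chosen) <= m and all(u in chosen or v in chosen
                                    for (u, v) in edges)

# Star graph: V5 in the center, an edge from V5 to each other vertex.
star_edges = [(5, v) for v in [1, 2, 3, 4, 6, 7, 8, 9]]

print(is_node_cover(star_edges, {5}, m=1))     # True: V5 alone covers all
print(is_node_cover(star_edges, {1, 2}, m=2))  # False: edge (5,3) untouched
```

The check itself is easy; what's NP-hard is deciding, for a general graph, whether any cover of size m exists at all.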
Anyway, it turns out, and Karp points this out, that whereas node cover is NP-hard, arc cover, its cousin, is in P. Okay, so that's our first problem. Not too hard. I'm not aware of many applications, though I could see a few. Anyway, we get to hitting set, an independent problem. All of these are interesting in their own right. It's easiest to understand with an example, so let's look at the example on lines 177 through 183 and read the definition through it. We have a universal set U of elements; the letters I tend to borrow from Karp, so that when you read the paper, you can relate to what Karp is saying. The universal set on line 176, U, is {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. Then you have a bunch of subsets of the original set, which exist for whatever reason: U1 is {2, 3, 4, 5}, U2 is {1, 2, 3, 4}, U3 is {5, 6, 7, 8}, and U4 is {9, 10, 11, 12}. What's a hitting set? A hitting set is also a representative set: a set of elements from the original set such that each subset has one and only one element in common with it. The hitting set here, W on line 185, is the subset {1, 5, 9}. So "hitting" here really means touching, but with one and only one element in common. Let's verify that's the case. In subset U1, the 5 is the only element in common with {1, 5, 9}. In subset U2 on line 179, {1, 2, 3, 4}, the 1 is the only element in common. And notice immediately that the element in common is not the same element each time, as opposed to the star graph example, where V5 alone was the representative node for all the edges; here that's not the case.
To be fair, the star graph was an extreme case; a node cover will generally also have a bunch of nodes, and one node does not have to represent everybody. Here, likewise, the hitting set has a bunch of numbers. Anyway: subset U1 has the number 5 in common, and that's it; 1 and 9 are not in it. Subset U2 on line 179 has the 1 in common, and does not contain the 5 and 9. The notes in column E describe this. In line 181, subset U3 has the 5 in common, but not the 1 and the 9. And finally, on line 183, subset U4 has the 9 in common, but not the 1 and the 5. So W = {1, 5, 9} is legitimately a hitting set. Now let me give a counterexample: X = {2, 5, 9}. X is not a hitting set for these subsets. Why not? In subset U2, only the 2 is present from X; that's good. In subset U3, only the 5 is present; that's good. In subset U4, only the 9 is present; that's good. But in subset U1, both the 2 and the 5 are present, two elements of X, and that violates the requirement written mathematically in Karp's paper and on line 175.
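The verification we just did by hand can be sketched in a few lines (my own code, not Karp's; it checks the one-and-only-one-element requirement we've been using):

```python
# Hitting set check: W must intersect each subset in exactly one element.

def is_hitting_set(w, subsets):
    """True if |w ∩ Ui| == 1 for every subset Ui."""
    return all(len(w & u) == 1 for u in subsets)

# U1..U4 from the worksheet example.
subsets = [{2, 3, 4, 5}, {1, 2, 3, 4}, {5, 6, 7, 8}, {9, 10, 11, 12}]

print(is_hitting_set({1, 5, 9}, subsets))  # True: one hit per subset
print(is_hitting_set({2, 5, 9}, subsets))  # False: U1 has both 2 and 5
```

Again, verifying a proposed W is fast; finding one for an arbitrary family of subsets is the NP-hard part.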
The requirement is that the intersection of the hitting set with each subset contains one and only one element. Okay, that's called hitting set. Good. Now, one formatting note: I usually put a dividing line only to separate groups; here I put one, I think, because the next two problems need a lot more description. But it's all still one group, 5, 15, 16, and 17, the miscellaneous leftovers. All right, Steiner tree. Steiner tree is a very interesting and very, very important problem. I should mention that, in general, the problems Karp defined are incredibly applicable; many still have applications today. One point to keep in mind is that as time went on, people added more and more generalities to the problems, making them even harder. A simple example: partition, Karp's problem 20. You have a bunch of elements; can you divide them into two groups such that when you add up the weights of one group, it equals the weights of the other? Karp's version is a two-way partition. But people love that problem, a lot of research has been done on it, and it has a lot of applications, even to parallel processing: tasks and processors, processing loads, cycle times of different processors. It goes by names like scheduling and load balancing. The generalization is K-partition: it generalizes the problem from 2 groups to K groups. Now, for the exam, you have to be careful: you're responsible for Karp's definitions. I understand that people, when they write these problems up, sometimes change them slightly.
Those variations may also be NP-hard, but we have to stick to Karp's paper for consistency of exams. It is true that many variations, very practical variations that are still NP-hard, have been proposed and discussed in practical terms over the years following Karp. What are we now, almost 55 years since Karp published his paper? Wow, that's crazy. Anyway, that's why I'm mentioning it: Steiner tree is one of those problems with tremendous applications to this day, for civil planning and for network planning. Before we get to it, let me add a comment to the hitting set notes that I mentioned but didn't write down. Note: it is quite acceptable for the common element to be shared by other subsets as well. For example, element 5 is in common with both subsets U1 and U3; that's okay. What is not okay is for two or more elements of the hitting set to be in common with a given subset, such as subset X below, the one that had both 2 and 5 in common with U1. Good, that was just a little note. Okay, here we go: Steiner tree. We mentioned it before; it's defined in the worksheet tab Tweaked Problems, starting at line 38.
So we'll go through the definition. Why is it in that tab? Because it has a tweaked version, minimum spanning tree, which is in P, and Karp mentions that on pages 89 and 90 of his paper. I'm not going to load it, but we've been through it before, when we first started. On pages 89 and 90, Karp collects a bunch of problems in P. He mentions the reason only briefly, so you may not catch it: he picks those problems because they're somehow related to the NP-hard problems. If you just looked at the list without realizing why Karp included them, it would look like an esoteric bunch, but they're there because Karp felt they are tweaks of the NP-hard problems. The problems on pages 89 and 90 are in P, whereas the subsequent problems are NP-hard. All right. So let's start with the P problem, because it's easier to relate to, and then we'll ask: what does the Steiner tree do differently? So let's go through minimum spanning tree, MST. It's a problem on a weighted graph that involves finding a tree. In analysis of algorithms, you learn three different algorithms for it: Prim, Borůvka, and Kruskal. Kruskal is the simplest to describe, so that's the one we'll do here. It's a greedy algorithm, and greedy means that you do what's best now without thinking about whether it's going to be best for the future. You don't think about the future. That's not necessarily a good policy, but that's the definition of a greedy algorithm: what's best now. A child eats lots and lots of cake and cookies and candy and chocolate; from the child's perspective, that's what's best now.
And a little later, the child goes to its guardian, its parent, and says, I have a stomach ache. The child is not yet capable of considering the future; hopefully, as adults, we are. But that's called a greedy algorithm. One has to be careful in general when you learn analysis of algorithms, because there's a misimpression, not given purposely, but I sometimes find students have it. There's usually a whole chapter on greedy algorithms in analysis of algorithms, 323, 700, whichever course you're taking, and students sometimes walk away believing that a greedy algorithm always gets the optimal solution. It's not true. A greedy algorithm is in general an approximate solution. Once in a while it actually gets the true solution, but you have to prove that. What the typical analysis-of-algorithms book does is put the best foot forward: it shows you examples where the greedy algorithm works. They collect the places where it works, and you walk away thinking that greedy algorithms solve problems. No: more often than not, greedy does not solve the problem exactly. If you have no other choice, it's a quick method, it usually runs quickly, and it gets you somewhere: an approximate solution. You may even be happy with that approximation, but it does not necessarily get you the true optimum. Okay? That's why I put in line 5: no, greedy does not always guarantee the true solution, but in our case, Kruskal proved that it works. So here, greedy does work. All right. So we start with a weighted graph. A weighted graph is a graph, vertices and edges, but in addition each edge is associated with a weight, a cost, a toll, T-O-L-L. Imagine the edges are bridges or roads, and they have tolls.
You have to pay something to the city that maintains the road. So each edge, let's assume, has a cost. E is the edge set, and W is the weight function that maps each edge to R, the real numbers, typically. It could be integers, but typically real numbers, and typically R+, the positive reals. It doesn't have to be: to be a weighted graph, the weights don't have to be in R+. It's just that most applications use positive weights, so most people don't get involved with the value zero, and certainly not negative values; those mess up many algorithms, so people just avoid them. You'll find people saying a weighted graph has positive weights, but that's not really the definition of a weighted graph; it's just very practical, and most applications do have positive weights. Okay, now, hopefully we remember what a tree is. A tree is the data structure, a math structure, a graph structure, with no cycles. What's a cycle? A cycle means you have a pathway from node to node to node, with an edge from one node to the next, and one last edge from the last node in your sequence back to the first. So you're cycling, you're coming back, you're making a full circle. It doesn't have to touch every node; if it does, it's called a Hamiltonian cycle, named after the person who first analyzed it. In Old English, a cycle was called a circuit. We mentioned that with the circuit courts, because that's what judges used to do: go from town to town on given days, adjudicate, and eventually come back to their home base, making these monthly or yearly cycles, circuits. Anyway, that's the terminology.
So now: a tree has no cycles, and your graph has cycles. What we're interested in is finding a subgraph. For minimum spanning tree, we want all the nodes involved, but not all the edges. (In the case of Steiner, it may not involve all the nodes; we'll talk about that next.) We want a subgraph that has all the nodes, but only some of the edges, such that the remaining subgraph is a tree: no cycles. Spanning is the key word: spanning means involving all nodes. Let me spell out what each word in "minimum spanning tree" refers to. Tree: a subgraph that has no cycles. Spanning: all nodes, all vertices, are present in this subgraph, in this tree. Minimum: the cost of a spanning tree is the sum of the costs of all the edges involved; choose the spanning tree that has minimum cost. There are certainly many subgraphs that are trees, and if you add up the corresponding costs, you get many different totals; choose the one with minimum cost. And MST is just the acronym for minimum spanning tree. Okay, good. So Kruskal came up with an algorithm. Again, there are also Prim and Borůvka, but we're not analysis of algorithms, so we'll do the simplest one. It's a greedy algorithm. So what is Kruskal's greedy algorithm? Start with a weighted graph and find a subgraph that involves all the nodes.
This is a bit of a repeat of what I just said above: of all the spanning trees, which do you choose? The one with the smallest total cost. I'm not upset about repeating it; I'll leave it in, because otherwise it's a little too cryptic. Anyway, Kruskal has three general steps, and we're going to go through an example. One: sort all edges from smallest to largest by the cost associated with each edge (we call them costs; we'll also call them weights). Two: of the remaining edges not yet chosen, choose the smallest-cost edge such that, when it's added to the tree you are growing, a cycle does not result; obviously, you can't ruin the fact that it's a tree. That's what makes it greedy: it makes sense locally. If you have to add one more edge and you don't think about the future, what makes sense right now is to take the smallest edge, without consideration for the future. That may or may not be a good idea in general, and Kruskal shows that in this case it works. Three: stop when all vertices have been reached by the edges of your chosen tree. Kruskal proves that this greedy approach, what's best now without consideration for the future, actually works. Okay, let's do an example. On the left is a graph with nine different vertices, numbered 0 to 8, with a cost associated with each edge. And on the right, I gave you the answer: that's going to end up being the minimum spanning tree after we go through the greedy algorithm. All right, so which is the smallest-cost edge?
The smallest-cost edge is the one with cost 1, between 7 and 6, so that's chosen first; on line 40, we choose it first. Now, what's the next edge? Cost 1 is already taken, so the next cost is 2, and there are two edges with cost 2: 2 to 8, and 6 to 5. You could arbitrarily pick either; those are going to be the next two choices in sequence, and it won't be relevant which you pick first, because neither of them, 2 to 8 or 6 to 5, will cause a cycle. Who's next? After cost 1 and cost 2, cost 4 is next, and there are also two of those: 2 to 5, and 0 to 1. Again, the order is arbitrary; in our case both of them, when added, do not cause a cycle. So we pick, say, 0 to 1, and 2 to 5, the two edges of cost 4. So far we've done cost 1, the cost 2s, and the cost 4s. What's after cost 4? Cost 6, and that edge exists, 6 to 8. But we cannot choose it. Why? It may not be so obvious, but there's a cycle now. We chose the cost 2s and cost 4s: 6 goes to 5 with cost 2, node 5 goes to node 2 with cost 4, 2 goes to 8 with cost 2. If you were to choose 8 to 6, that would complete the cycle. So we cannot use the edge between 6 and 8, because that would form a cycle; cost 6 is out. Now we have cost 7, and there are two of those: 2 to 3, and 7 to 8. Again, which one you consider first is not going to make a difference.
At first glance it looks like neither of them creates a cycle: 2 to 3, 7 to 8. But hold on: 2 to 3 is okay, while 7 to 8 is not. Let's see why. 7 to 8 would also complete a cycle, and we need the long way around, because we didn't use the cost-6 edge: 7 goes to 6 with cost 1, 6 goes to 5 with cost 2, 5 goes to 2 with cost 4, 2 goes to 8 with cost 2. So if we were to choose 7 to 8, we would complete that cycle; 7 to 8 with cost 7 will not be used. Okay, so after costs 1, 2, 4, 6, and 7, now we do cost 8, and there are two of them: 1 to 2, and 0 to 7. One of them is acceptable, and one causes a cycle. 0 to 7 does not cause a cycle, and that's our seventh choice, so we take it. But 1 to 2 would then cause a cycle, because we've already chosen 0 to 7: nodes 1, 2, 5, 6, 7, 0, back to 1 would form a cycle, using the cost 1, 2, and 4 edges we've already taken plus 0 to 7. It's a big cycle, but a cycle nonetheless. In the note column, column F, I give the actual cycle if you want to follow along when you read the notes. Now, the last one. We did costs 1, 2, 4, 6, 7, and 8; we didn't take all of those edges, because some would cause cycles. We're up to cost 9, which is 3 goes to 4, and we take it. And we stop at this point, because all vertices have been reached: spanning. So we're done, and that results in the tree on the right. Now let's add up the cost.
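The walkthrough above can be sketched in code. This is my own small implementation of Kruskal's three steps, with a union-find structure doing the cycle detection we did by eye; the edge list is just the costs we read off the example graph:

```python
# Kruskal's greedy MST on the nine-vertex example above.
# Union-find detects the cycles we rejected by hand (6-8, 7-8, 1-2).

def kruskal(n, edges):
    """edges: list of (cost, u, v). Returns (chosen edges, total cost)."""
    parent = list(range(n))

    def find(x):                      # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    tree, total = [], 0
    for cost, u, v in sorted(edges):  # step 1: sort edges by cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # step 2: skip edges that close a cycle
            parent[ru] = rv           # merge the two components
            tree.append((u, v, cost))
            total += cost
    return tree, total                # step 3: done once all vertices joined

# The edges named in the walkthrough. Sorting tuples puts (8, 0, 7)
# before (8, 1, 2), the same cost-8 tie-break we made above.
edges = [(1, 7, 6), (2, 2, 8), (2, 6, 5), (4, 2, 5), (4, 0, 1),
         (6, 6, 8), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2), (9, 3, 4)]

tree, total = kruskal(9, edges)
print(total)      # 37, matching the tally below
print(len(tree))  # 8 edges for 9 vertices
```
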
Let's read it off on the right. 1 to 0 is 4. 0 to 7 is 8; that's 12. 7 to 6 is 1; that's 13. 6 to 5 with a 2; that's 15. 5 to 2 with a 4; that's 19. 2 to 8 with a 2 is 21. 2 to 3 with a 7 is 28. Finally, 3 to 4 with a 9 is 37. So when you add it up, the cost of this minimum spanning tree is 37. And I merged the two pictures, emphasizing the edges within that are, in fact, the minimum spanning tree. Indeed it's a subgraph, it's a tree, it does not have cycles, and it touches every node. Okay, that is minimum spanning tree. And again, there are variations. For instance, the issue with Prim; just to throw it out, we're not going to really spend time on it. Prim was bothered by the following. In our example, when we first started on line 40, we took 7 to 6 with cost 1. Then say we took 2 to 8 (we could have taken 6 to 5 first, but say we took 2 to 8). So we have 7 to 6 and 2 to 8. What bothered Prim is that if you take 2 to 8, as Kruskal says you may, the pieces 7-6 and 2-8 are not connected to each other; they're disconnected. And Prim was concerned: how can you prove that the disconnected components eventually connect to each other, to become one connected component? So Prim's algorithm, which we won't go through, requires that when you're building the minimum spanning tree, you grow it only from the nodes you've already chosen: only an edge connected to what you've chosen can be added in, in order to guarantee a connected component throughout. And Borůvka is the most interesting to me, for multiple reasons. Prim and Kruskal published in the 1950s in America. Borůvka was in Czechoslovakia, and he did his work in the 1920s, many years earlier.
And a sad effect of what became the Cold War, the Iron Curtain as it were, was that many brilliant mathematicians published very important math papers in languages and journals the rest of the world did not have access to. Prim and Kruskal had no idea of Borůvka's algorithm. Borůvka discusses the minimum spanning tree doing graph theory on its own terms, no computers; describing his work in terms of Prim's and Kruskal's suggestions is an anachronism, because they weren't around yet. And Borůvka's algorithm is in some ways even better, because it can be parallelized. So it's very fascinating and very sad. Then in the late 70s and early 80s, there was a big effort to translate Russian and other Eastern-bloc math journals, because people realized they were a wealth of information. Anyway, interesting enough. So that's Kruskal's algorithm and the minimum spanning tree, and all of that is in P. Now we're going to tweak it into the Steiner tree problem, which is going to sound very similar, and the question is going to become: why is Steiner in fact not in P, but NP-hard? So here's the issue at hand. I've made life a little simpler for you, but the fact is that when you read Karp's paper, Steiner sounds exactly like the minimum spanning tree; that's the problem. When you read Karp's original paper, and please do that for homework, I'm not going to ask if you did, but I think you should try it on your own, the words sound exactly like this: he basically talks about a minimum spanning tree.
The only difference is, he says, you start with a subset of the nodes as your required set. You have a big graph, you take a subset, and now you're looking for a minimum-cost tree containing that subset. So what? You've knocked out some of the nodes, and now there are fewer nodes; it's again a minimum spanning tree. What's the big deal? Why is Steiner NP-hard when the minimum spanning tree is in P? I have to tell you, when I read Karp's paper, I must have read it four or five times concentrating on this, and I couldn't figure out the difference. And then I saw a rerun episode of Matlock. Matlock is a lawyer-detective show: a lawyer who solves the crime in order to prove that his client is innocent. So it's a lawyer show, but in essence a detective show, with private investigators and so on; in the tradition of the Perry Mason show, if you're interested in TV history. Now, what happened? There was a tough case involving organized crime, and he just couldn't figure it out. He knew his client was innocent, but no matter how many times he went over the evidence, it showed nothing, not a bit, on behalf of his client. And then, after thinking it over and over, he realizes he hadn't made a mistake: there was nothing in the evidence that showed his client was innocent. What showed his client was innocent was what was not in the evidence. The evidence was a tape recording of a meeting, and what was not in the meeting was what was relevant.
The point was whose voice was not heard. And that tipped him off as to what really happened. Okay, so how does it apply here? Again, when you read Karp's paper, the Steiner tree reads like a minimum spanning tree. So why is it NP-hard? Let's go to the Steiner tab (I should put MST first) and do the following. Imagine you have a big graph G, and G has lots of vertices. But we're interested in a window. It's not that we're going to ignore the edges and nodes in G; we are not. But our concentration, the network that we're building, is in G prime, a sub-window. Now, in the case of, say, cell towers, you need to know where to put the towers, the poles, so that you can maximize the transmission of data. And that's how variations of Steiner trees are used to this day for planning networks, especially wireless networks. All right, so here's the case. Imagine B goes to C in the graph on line 12. You have a node B. The dots are nodes, and there should be dots elsewhere too; I was debating whether to put them in or not. Technically there are nodes here and nodes here, and I'm wondering whether putting them in makes it easier or harder. Maybe I should, to emphasize the fact that there are nodes everywhere. Give me one more moment. Again, I want to emphasize the point: there are nodes everywhere, but our focus is only in the region that's in G prime. Now let's consider the minimum spanning tree. In it (I'm not going to draw all the edges, but there are edges in the general G as well) imagine that node B goes to node C, and the cost is 17.
And that's the smallest. But it's only the smallest edge considering the smaller window G prime. If you were to venture out to what's not there, what's not there is a node A. A is outside; it's in the complement of G prime, in the greater graph G. But if you were to go through A, B going to A costs 7, and A going to C costs 8. So the cost of getting from B to C via A is cost(B, A) + cost(A, C) = 7 + 8 = 15. Whereas in the window G prime, the smallest cost between B and C is cost(B, C) = 17. But if we relax (and "relax" is the operations research, OR, term for this) the constraint and allow vertices and edges from the greater G, then the cost is in fact 15. Even less. All right, that's the idea. So the point was: it's not what's in there, it's what was not. And that's the Steiner tree problem. The Steiner tree problem says that sometimes venturing out and coming back in will actually lessen the cost, and that's what you're trying to find. So the tree would take B to A and A to C, at cost 7 + 8, instead of B going to C directly. That is the Steiner tree problem, and it is, in fact, NP-hard. A very important problem in practice for designing networks. Good, so that's that problem. Now we have one more, the last of the 21 problems, and then we will talk about the other tabs, such as the tweaked P problems.
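The numbers from the board can be checked mechanically. A minimal sketch in Python, using the lecture's made-up costs (B-C = 17 inside the window, B-A = 7 and A-C = 8 through the outside node A):

```python
# Cost of connecting B and C inside the window G' versus via the outside
# node A (the Steiner point). Weights are the lecture's example values.
cost = {("B", "C"): 17, ("B", "A"): 7, ("A", "C"): 8}

direct = cost[("B", "C")]                     # stay inside G'
via_a = cost[("B", "A")] + cost[("A", "C")]   # venture out through A

print(direct)  # 17
print(via_a)   # 15: relaxing the window lowers the cost
assert via_a < direct
```

The hardness comes from having to decide, over all the outside nodes at once, which ones are worth detouring through; this tiny check only confirms one candidate detour.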
And if we have time, the definition of P complete. All right, so 3D matching. This is a funny one, not hilariously funny, but there's a little irony in it: it's called matching, but as we'll see, the requirement is really that the values do not match. And then just here: page 78? Let's find out. Yeah, 78, that's correct. All right, 3D matching. Here's the formal definition. And there's a variation, 2D matching instead of 3D matching, which actually is in P. But in order to understand it, we need an example. The setup is a little messy, because there are a lot of subsets and sub-pieces, but here it goes. Imagine you have the set T; T is your original set. Again, the letters were chosen by Karp; I just kept them. Let T be 1 through 10, so the cardinality of T is 10. Good. Now let's build the 3D version. In 2D matching it would be T times T, but here it's T cubed: T, Cartesian product, T, Cartesian product, T. Those are the lattice points of a three-dimensional cube, where (1, 1, 1) is in one corner, (10, 10, 10) is in the other corner, and everything in between: 1,000 points. That's the setup. Now, what does the user do? The user chooses a subset of T cubed; not all of T cubed, just some of the points. That's the user's universe, U. Now comes the problem: find a 3D matching W from the points in the user's set U. How many points, though? You choose exactly as many points as there are elements in the original set T, which here happens to be 10. So you're choosing 10 points.
Okay, you're going to choose 10 points from the user's set U. In essence, the solution will be a listing of these points such that, in each dimension, the values found are a permutation of the values in the original T. Note: the permutation in each dimension is considered independently, so there could be three different permutations used, one for each dimension. Enough words; let's do an example, and then I hope you'll understand. Take all those thousand points, and ignore what the rest of U is, because it's really not relevant to us. Imagine the following 10 points were chosen from a supposed user's subset U. So p1 is (1, 2, 3), and I break the coordinates up accordingly: 1, 2, 3, et cetera, et cetera, down to (10, 10, 10) and (3, 4, 6). Those are the points; line them up. Now, the columns are the dimensions, and each point is an (x, y, z) triple. The definition of 3D matching says: go across all the points and consider the numbers in each dimension. So here, for example, in dimension X the values are 1, 6, 9, 2, 5, 4, 7, 10, 8, 3. If we look at all the values in the first dimension, X, it is in fact a permutation: all 10 numbers are there, 1 through 10, and no duplicates. Similarly with column Y: 2, 7, 9, 1, 5, 3, 8, 10, 6, 4. Again, all numbers are there, no duplicates. And finally, likewise in column Z, the third column: 3, 8, 9, etc., 10, 5, 6.
All numbers are there, 1 through 10, and no duplicates. So each column is in fact a permutation, and as such, these points are a 3D matching. Now, a word on why I shaded some of these numbers. I highlighted the 1s only to get you started quickly, in case you're trying to find the permutations. But again, the columns are independent. Remember that: what happens in column 1 is not relevant to what happens in column 2, et cetera. Why did I pick out the 5s and the 10s? Because I wanted to show you that the rows don't affect us. For example, all the 10s happen to be lined up in one row. That's not required; it happened coincidentally, once. And there's a row with two 5s while another 5 is elsewhere; also fine. You can have one of each number in a row, two of a number in a row, or three of a number in a row. The rows are irrelevant; the columns are the issue. That's why I wanted to make these points. So the above is a 3D matching. And here's the irony: the numbers that appear within a given dimension do not match each other; in fact, they must all be different, since a permutation has no duplicates. Now, the cousin, 2D matching, is in fact in P. If you did T squared, that would be in P. All right, now let's go to the last page here, where I list a bunch of problems. Most of these were provided by Karp; some of them I added on my own. In no particular order. 3-SAT: that's the Boolean predicate. You plug in the zeros and ones, and the question is, does the Boolean predicate become true?
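Verifying that a proposed set of triples is a 3D matching is easy, which is what puts the problem in NP; the hard part is finding one. A small checker sketch in Python. The three-triple example at the bottom is my own toy data, not the board's ten points:

```python
def is_3d_matching(points, t):
    """points: list of (x, y, z) triples; t: the ground set T.
    True iff each coordinate column is a permutation of t."""
    if len(points) != len(t):
        return False
    # Check every dimension independently, as in the lecture.
    for dim in range(3):
        column = [p[dim] for p in points]
        if sorted(column) != sorted(t):   # all values present, no duplicates
            return False
    return True

t = [1, 2, 3]
good = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]   # each column is a permutation
bad = [(1, 2, 3), (2, 3, 1), (1, 1, 2)]    # column X repeats 1, misses 3
print(is_3d_matching(good, t))  # True
print(is_3d_matching(bad, t))   # False
```

Note the checker never looks at rows, only columns, matching the lecture's point that the rows are irrelevant.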
3-SAT requires that the clauses, the parentheses, involve only three variables, with ORs inside the parentheses, ANDs outside the parentheses, and NOTs only on individual variables, if you want them. That problem is NP-hard. But 2-SAT, where the clauses contain at most two variables, is in P. Today we learned about the Steiner tree. But Kruskal already showed the minimum spanning tree, in fact a very quick algorithm, and the minimum spanning tree is in P. And we mentioned that the difference in the Steiner tree is relaxation: relaxing the constraint. You are only interested in the inner window, G prime, but it came from a bigger graph, G. Relaxing, allowing you to go out and come back in, can sometimes do even better than the minimum spanning tree within the window. Okay. MAXCUT versus MINCUT is NP-hard versus P. MAXCUT: consider a bipartite subgraph by dividing the nodes of an original weighted graph into two groups. Only consider the edges between the groups, not the edges within the groups. Add up those weights, and that's the cost of your cut. The idea, for vulnerability analysis, was to figure out which two subgroups have the most traffic between them when you add up those edges. MINCUT, the minimization version, turns out to be in P. Today we did node cover: find a representative set of vertices such that every edge touches at least one of those nodes. The cousin, arc cover: take a representative set of edges of a graph, and ask whether every node is touched by some chosen edge. That problem, arc cover, is in P. Chromatic number: in general, K paints, lots and lots of paints. The idea is to literally paint the vertices of your graph such that every edge has two different colors on its endpoints.
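For node cover, too, the verification direction is the easy one. Here is a minimal sketch checking whether a chosen set of at most M nodes covers every edge; the star graph below (which also came up earlier in the lecture) and the numbers are made up for illustration:

```python
def is_node_cover(edges, chosen, m):
    """True iff |chosen| <= m and every edge touches a chosen node."""
    if len(chosen) > m:
        return False
    # An edge is covered when either endpoint is in the chosen set.
    return all(u in chosen or v in chosen for u, v in edges)

# Star graph: center 0 joined to leaves 1..5. The center alone covers it.
star = [(0, leaf) for leaf in range(1, 6)]
print(is_node_cover(star, {0}, 1))     # True: one node covers every edge
print(is_node_cover(star, {1, 2}, 2))  # False: e.g. edge (0, 3) untouched
```

The check runs in time linear in the number of edges; the NP-hardness lives entirely in choosing which M nodes to pick.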
But if we were to limit the colors to two colors: again, you can have a graph where it's not doable, but finding out whether it's doable, and doing it if it is, that's in P. That's called 2-coloring; like partition versus 2-partition. Two-coloring is also known as bipartite checking. 3D matching, we mentioned today, is NP-hard. But if you were to limit it to two dimensions, T squared instead of T cubed, that is T Cartesian product T, with the size of T squared equal to 100 in our example, then that problem is in P. It's also known as two-dimensional matching, and also as bipartite matching. And the oddity, the irony to note, is that a matching means there are no duplicates; nothing actually matches. Okay. Job sequencing: you have a bunch of tasks, and not only is there a deadline for the entire project, there's an agreed-upon deadline in the contract for each of the tasks. More so, there's a penalty. Karp's original formulation uses a penalty, not profit; later authors rewrite this in terms of profit instead of penalty. It doesn't really change how hard the problem is. Let's stay with the original problem. So, job sequencing: you have these tasks, and according to Karp, if you do not make the deadline for a given task, you pay the piper. You pay the price; you don't get paid for that task. And there's a parameter K, because every contractor knows some things go over cost, and if you made a high enough bid, you don't mind the extra cost because you made enough profit. That's called job sequencing. If we drop the penalty (or profit) parameter, so it's sequencing with deadlines without the issue of cost, can you do that? That's basically similar to a topological sort, and that is in P.
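Two-coloring (bipartite checking) really is in P: a single breadth-first pass either produces a valid coloring or finds an odd cycle. A standard sketch in Python, with a made-up graph at the bottom:

```python
from collections import deque

def two_color(n, edges):
    """Return a list of 0/1 colors for nodes 0..n-1, or None if impossible."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    color = [None] * n
    for start in range(n):                    # handle disconnected graphs
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]   # neighbor gets opposite color
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle: not 2-colorable
    return color

print(two_color(4, [(0, 1), (1, 2), (2, 3)]))   # [0, 1, 0, 1]
print(two_color(3, [(0, 1), (1, 2), (2, 0)]))   # None (a triangle)
```

With three or more colors no such local propagation works, which is where chromatic number becomes NP-hard.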
0-1 integer programming: you have a matrix A of coefficients and a bunch of variables x_i which are Boolean, only 0 or 1, hence the name. And you have a system of equations Ax = b for some vector b. The question is, can you satisfy these equations? Not Boolean satisfiability, but a system of equations. In mathematics, solving a system of equations under constraints is called programming. That problem is NP-hard. If you drop the 0-1, the integer part, and just use general values, that's solvability of linear equations, which is Gaussian elimination, and that is in fact in P. A further problem is called linear programming, where you add further constraints and an objective; it's an even higher level, and in fact that's in P as well. Knapsack. Knapsack is when you pack: you have a bunch of items, each item has a weight, and you can't take everything with you. The airlines restrict your suitcase to 50 or 70 pounds, whatever the number is, so you can't take everything. But it's worth it to you to max out. From a travel point of view, it's not bad having an extra shirt, an extra pair of shoes, an extra book, an extra can of food, except for the fact that you're carrying a heavy suitcase. At your home base on the other side, where you're traveling, you have more items to use, which makes things a little easier. So you'd like to max out, but you can't take all the items: it goes overweight, and there's a cost penalty you don't want. That problem is NP-hard. Fractional knapsack would be, for example, a spice salesperson: if you were a few ounces over, you could pour some of the garlic out, and now you're good. You had a pound, you poured out a few ounces, and you still have spices to show.
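The fractional variant falls to a simple greedy: sort by value per pound and "pour out" only from the last item that doesn't fully fit. A sketch in Python with made-up weights and values:

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Returns the max total value when
    items may be taken fractionally (the spice-salesperson variant)."""
    total = 0.0
    # Greedy: best value-per-unit-weight first.
    for value, weight in sorted(items, key=lambda it: it[0] / it[1],
                                reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)      # pour out any excess
        total += value * take / weight
        capacity -= take
    return total

# Made-up items as (value, weight) pairs, capacity 50 pounds.
items = [(60, 10), (100, 20), (120, 30)]
print(fractional_knapsack(items, 50))   # 240.0
```

With 0-1 items the same greedy can fail, because you must take the last item whole or not at all; that indivisibility is exactly what makes ordinary knapsack NP-hard.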
A little less garlic, but you could still go to the show, and you maxed out. My example that actually happened: I was told to remove one shoe, because I was one shoe overweight in my luggage. I just love it. All right. Now, Karp did not mention this variation, fractional knapsack; I just wanted to mention it, with sales of spices as an example. Fractional knapsack is in P. There happens to be another example where it really works out, but that's not for us. Undirected Hamiltonian circuit. Now, and this is my conjecture, Karp mentions the shortest path problem, but he does not tell you which of his problems it's related to. I'm not clear on it. It could be the Steiner tree in some sense; I don't want to get into why. Some people think the minimum spanning tree is a shortest path problem. Not exactly, but it could be. I thought maybe the undirected Hamiltonian circuit: shortest path determines a minimum length, while a Hamiltonian circuit, which must visit every vertex, pushes toward the theoretical maximum, the longest path. So maybe it's max versus min. That's my conjecture; I don't know. And then Karp has a different problem called arc deletion, which we mentioned. Karp uses arc deletion to verify the solution to feedback arc set: checking that a proposed set of deleted arcs destroys all the cycles is in P, whereas finding the feedback arc set itself is not. All right. The above problems and their tweaked variants are elaborated on below, with lots of details that you can read through. That's the summary I alluded to earlier today, a summary of his problems, and there are a bunch here. Okay, so that pretty much concludes our discussion of Karp's problems. There's a definition of P complete. Read it; it's just the definition. We don't have time this semester to deal with it, so I just mention it.
Read it; use it for your definitions. We've concluded these notes. I will update them and send you a final version later today. All right, it's the end of class. If you did not have a... oh, sorry, there's a question. Yes. Okay, Ali, thank you. Could you email me? I have to look into that, and then I'll answer you. Please email me. Good. Whoever didn't have a chance, please type the word "present." Okay, Ali, sure, go ahead. You want to upload all the tabs for the project. Okay, Ali, you may not realize it, but I answered your question. Let me repeat: no problem, please email me, I'll look into it, and I will personally reply to you. I'm not sure; it may be a setting, it may be purposeful. I'm caught off guard. No, don't get me wrong, nothing bad; you are totally correct for asking. But email me at the class-designated email. I have class till 12, so sometime in the early afternoon I will personally email you back. And if it's something I need to tell the class, I'll email the class, but let me respond to you first. Thank you all. All right, if you didn't have a chance, last call: type the word "present." And we will continue on Wednesday. Okay, thank you. Bye all.