---
Video Title: Shortest and Longest Paths
Description: Theory of Computation
https://uvatoc.github.io/week11

22.3 Shortest and Longest Paths
- Graph Review
- Shortest Path Problem
- Efficient Algorithm for Shortest Path
- Longest Path Problem
- Inefficient (Exponential Time) Algorithm for Longest Path

Nathan Brunelle and David Evans
University of Virginia
---

﻿So the next thing we're going to do is I'm
going to talk about a few different

problems, talk about some of the
preliminary things we need to know,

some of the preliminary definitions we
need for these various problems,

and we're going to talk about finding
running times for those problems.

We're going to look at four different
problems today that are kind of in two pairs.

The two in each pair are only very
slightly different from one another,

so they're very similar problems with One
slight change to each.

But we're going to see that that slight
change makes actually a huge difference in

the running time of that algorithm and how
efficiently we can implement a solution to

that question, that problem, that
function.

The first thing we're going to look at is
looking at the shortest path in a graph.

So given a graph and two nodes,
I want to find what is the shortest path

in terms of number of hops between those
two nodes.

And then I'm going to
make a slight modification

where I'm going to look
at the longest path instead.

Same problem, but now I want to find the
longest path from point A to point B

rather than the shortest path from point A
to point B.

The next pair of problems
we're going to be looking

at are what are called
satisfiability problems.

With these satisfiability problems,
the idea for these is that we're given

some Boolean expressions over some
variables.

And I want to know, can I assign true or
false values to these variables in order

to cause the whole expression to evaluate
to true?

I'm going to very quickly go through sort
of a recap on what graphs look like.

A graph is defined by a
pair of sets, where the first set

is going to be a set of what
we call vertices or nodes.

And then the second set is going to be
edges that go between two nodes.

Those can be ordered, those can be
unordered.

If we care about the order that those edges
appear, we call that a directed graph.

If we don't care about the order,
we call that an undirected graph.

So in this case, we can see that the nodes
are A through I.

Edges are going to be potentially ordered
pairs of nodes.

In this case, the way that I drew this,
this was an undirected graph.

So the edge that goes between A and B
should have been the same as the edge that

goes between B and A, because order
doesn't matter.

Typically, if you draw arrows on this,
then now you're saying that order matters.

So this says AB is a different edge from
BA.

So you could be able to, say, traverse from
A to B, but not in the opposite direction.

We can also, if we have something called a
weighted graph, have numbers that are

associated with each of
these edges that tell us what

is the weight or the cost
of that edge in some sense.

And there are several different ways that
we can represent these graphs.

If we wanted to write them
down as data structures,

one way we might do this
is with an adjacency list.

We say that two nodes are adjacent to one
another if they share an edge.

So we can have this adjacency list
representation where we kind of say,

for every single node that's
in our graph, I'm just going

to have a separate list of
everything adjacent to it.

So this says that A is adjacent to B and
C.

So we have edges that go from A to B and A
to C.

B has edges to A and to C and to E and so
forth.

So that is one way
we could represent our

graph, write down the
definition of our graph.

We could also have this adjacency matrix
representation of our graph.

Where this adjacency matrix representation
says, I'm going to have this big table

where there's going to be a row and a
column for every node in this graph.

And I'm going to write a one in every cell
that represents an edge that exists.

So here I have an edge from A to B.

So the cell AB has a one in it.

And then zeros are going every place else.

I left those out just so that it was more
readable.

Since there is no edge from A to I, we're
going to have a zero there from A to I.

This is an adjacency matrix
representation.

Each one has its own advantages,
disadvantages.

For the purposes of what we're talking
about in this class, those advantages and

disadvantages aren't going to make a huge
amount of difference.

So the next thing we can talk about is a
path.

A path is a sequence of nodes where any
two things that are next to each other in

the sequence have to have an edge in
between them.

So here we have this path A, B,
E, G, F, D.

So that is that sequence of nodes A,
B, E, G, F, D.

And in order to be a valid
path, we have to have an edge

from each thing in the path
to the next thing in the path.

So that is we need to be able to follow
edges to traverse that sequence of nodes.

If I can't go directly from one
node to the next one in the

path or in that sequence,
then that is not a valid path.

So for instance, A,
I is not a valid path

because there is no
edge that goes from A to I.

Any such sequence we
call a path, we call it a simple

path if each node that
appears appears at most once.

That is, we're never visiting the same
node twice.

If we never visit the same node twice,
that's a simple path.

If we start and end at the same location,
that's called a cycle.

So we can have paths that are neither
simple nor cycles.

For instance, a path that looks
something like that would be

neither simple nor a cycle
because we visited that node twice.

A cycle is something that has to look like
this.

It has to start and end at the same place.

A simple path is one that never crosses
the same node twice.

Something that is neither
simple nor a cycle is like a path

that has maybe an interior
cycle or something like that.

So the first problem we're
going to look at and try to

figure out the runtime for is
this shortest path problem.

The shortest path problem says if I have a
graph and you gave me a start node,

let's call that S, and
an end node, let's call

that T, how long is the
shortest path from S to T?

So in this case, our input is three
things.

A graph, a start node, and an end node.

And our output is one thing, which is a
number.

That number being the length of that
longest path.

We're not interested in the identity of
that path, just the length of it.

In this case, if I wanted to find the
length of the shortest path from A to E,

that's going to be two, because I can do
this in one, two hops.

There are other paths that go from A to E
as well.

I could go A to C, C to D, D to E.

So that's a path of length three, but that
is longer than this path of length two.

So our answer is two.

So that is the shortest path, the length
of the shortest path.

So what's the runtime of this?

I only care about the number of hops.

I don't care about the weights of the
edges, if that graph has weighted edges.

So instead, I can use a much simpler
algorithm to solve this.

So I can use something called a
breadth-first search, where the idea of a

breadth-first search is we can start with
our start node and sort of ripple outward

in terms of distance until we find our end
node.

So if we wanted to look at
some pseudocode for this,

we could have all of our
nodes in some sort of queue.

And what we're going to do is add our
start node to the queue.

And then over and over and over again,
until I have run out of nodes to visit,

we're going to remove
something from the queue and

then add all of its unvisited
neighbors to the queue.

Over and over and over again.

In this case, the first
thing I'm going to add to the

queue is A, because
that's what I'm starting with.

So I'm going to have A in the queue.

And then I'm going to
remove A from the queue and

then add all of A's neighbors
to the queue, B and C.

So I'm going to remove A from the queue,
and then add B and C to the queue.

And the next thing I do
is I remove B from the

queue, and then I'm going
to add CDE to the queue.

C was already there, so I didn't re-add
it.

But here I'm going to
add CD, and when I add E,

I'm going to say, oh,
that was my destination.

I could get there in two hops.

So every single time we are kind of going
one further distance away, we're going to

keep track of that was a node that was one
farther away.

As soon as we see our
end node, then we knew

our answer for this
breadth first search.

So it's basically how
many ripples we had to do

outward before we found
our destination is our answer.

So this super easy algorithm...

This is basically going to be linear time
by the size of the graph.

It's the number of nodes in the graph plus
the number of edges in the graph.

This is a super easy
algorithm, so this is going to be

roughly a linear time algorithm
in order to solve this one.

So this is something that's
generally considered to be a

pretty efficient algorithm,
or a very efficient algorithm.

Linear time algorithms
are generally understood to

be efficient algorithms for
solving various problems.

So we're going to make now a very minor
change, where instead of talking about the

shortest path through our graph from point
A to point B, I want to talk about the

longest path in our graph from point A to
point B.

So if I'm trying to find the longest path
from, say, A to E, then maybe that's going

to be like A to C to D to F to G to I to H
to E.

So in this case, this would be like the
longest simple path is what we're after.

Let me add that in here.

How long is the longest simple path from
the start node to the end node?

Turns out this one minor change of just
changing it from a minimization problem to

a maximization problem makes it a
substantially more difficult problem to solve.

One way that we could do this is we could
enumerate all of the possible sequences of

the nodes, of which there's going to be n
factorial of them.

Really n minus 1
factorial of them, since we

knew which node was
going to be our start node.

So we could just enumerate all n minus 1
factorial different sequences of nodes,

and then check whether or not each one of
those was a path.

And then once we have
all of those that were valid

paths, we just return the
length of the longest one.

So this would certainly give us our
answer.

But this is very, very slow.

This is like an exponential time
algorithm.

This is more than 2 to the power n.

Turns out we don't really know a way of
doing substantially better than this.

We don't know of any way to do this in
better than an exponential time algorithm.

So just this one minor change
to a very simple problem

makes this a very
challenging problem to solve.

We don't know a way to solve it kind of
faster than this.

It's substantially faster than this sort
of silly brute force method.

This, by the way, is something that we
would call an inefficient algorithm.

It's going to take like years to solve
this for even graphs of like 20 nodes.

So this is an inefficient algorithm.