--- Video Title: Proving FSAs are as Powerful as Regular Expressions (Part 4: Kleene Star) Description: Theory of Computation https://uvatoc.github.io/week7 15.4: Proving FSAs are as Powerful as Regular Expressions Part 4: Kleene Star - Meaning of Kleene Star - NFA Implementation of Kleene Star - Concluding the Equivalence Proof Nathan Brunelle and David Evans University of Virginia --- We have only one more operation that we need to talk about, and that is the Kleene star. So we said if I have some regular expression r, then I can build the regular expression r star, where a string matches r star if we can break it into any number of chunks such that every single chunk matches r. You can think about this as an arbitrary number of concatenations: a string matches r star if there is some number of copies of r, r r r r r and so on, that I could concatenate together that would match that string. That's one way you can think about r star: is there some way I can split the string up into some number of chunks such that every single one of those chunks matches the regular expression r? The operation this performs on the language is the Kleene star, where we write the language raised to the star power. It says: concatenate the language with itself zero times, and take all of those strings and union them with the strings you get by concatenating the language with itself one time. Then take all of those strings and union them with all the strings you get by concatenating the language with itself twice, and then three times, and then four times, and five times, and so forth. The language concatenated with itself zero times is just the language containing the empty string. The language concatenated with itself one time is just the language itself. And then L squared, that's equal to L concatenated with L.
L to the power 3, that's L, L, L, and so forth. So L to the power k says have k Ls in a row, concatenated. For instance, if I star the language containing 00 and 11, so the Kleene star of the language {00, 11}, this will give me any string that is pairs of zeros and ones in any order. I can get the empty string, because that belongs to L to the zero. I can get 00, because that belongs to L to the one. I can get 11, because that also belongs to L to the one. 0011 belongs to L squared. 110011 belongs to L cubed, and so on. How big is the Kleene star of a language going to be? There are actually only two sizes that it can be: either one or infinity. If L equals the set containing the empty string, then L star is also the set containing the empty string, because the star says we can take things from the language and concatenate them together any number of times, but the empty string concatenated with the empty string only ever gives us the empty string. Also, if L is the empty language, the language with no strings (languages are just sets of strings, so the empty set is a language that simply doesn't have any strings in it), then L star is the set containing the empty string. That's because L to the zero is always the set containing the empty string, while L to the one and beyond are the empty set in this case. So if L is the empty set, then L star is the set containing the empty string. In all other cases, L star is going to be infinite. Note that the language of one automaton might be the empty set while the language of another is the set containing the empty string; those are different languages. Okay, so we can do the Kleene star using non-deterministic finite state automata. Something to keep in mind for the Kleene star: we're only doing this on one machine.
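The language powers and the two possible sizes of L star can be sketched in a few lines of Python (this is my own illustration, not from the lecture; the helper names `concat`, `power`, and `star_up_to` are assumptions):

```python
def concat(A, B):
    """Concatenation of two languages: every string of A followed by every string of B."""
    return {a + b for a in A for b in B}

def power(L, k):
    """L^k: L concatenated with itself k times; L^0 is the set containing the empty string."""
    result = {""}
    for _ in range(k):
        result = concat(result, L)
    return result

def star_up_to(L, max_reps):
    """A finite approximation of L*: the union of L^0 through L^max_reps."""
    result = set()
    for k in range(max_reps + 1):
        result |= power(L, k)
    return result

L = {"00", "11"}
print(power(L, 2))              # {'0000', '0011', '1100', '1111'}
print(star_up_to({""}, 5))      # {''}: starring {""} gives just {""}
print(star_up_to(set(), 5))     # {''}: starring the empty language also gives {""}
print(len(star_up_to(L, 5)))    # 63 strings already; the full L* is infinite
```

The last three lines show the size dichotomy from the lecture: the two degenerate languages star to exactly one string, while every other language keeps producing new, longer strings at each power.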
So whereas union, concatenation, that sort of thing, operated on two languages or two machines, with the Kleene star we only have one thing that we're operating on, kind of like complement, which also had only one thing we were operating on. In this case, our goal is that the new machine we're building returns one whenever the input string can be broken into a bunch of chunks such that our original machine returns one for every chunk. That's what it means for a string to belong to the star of the language of that machine: I can break it into chunks such that the machine returns one for every chunk. So every chunk would take us from the start state to a final state. Our strategy is going to be kind of like our strategy for concatenation. Every time we get into a final state, that means here was a chunk that the machine could have returned one on. But it might be problematic later if I chose to stop the chunk there and continue on to the next chunk, versus keep growing that first chunk. So basically, every single time we enter a final state in our machine, one option is to continue, saying I'm still working on the same chunk. The other option is to say I should end this chunk here and start moving on to the next chunk. Those are our two options. Every time I end up in a final state, that says this could have been a chunk, but maybe I didn't want it to be; maybe I wanted to keep working for a little while. So we're going to non-deterministically try both. Every single time we enter a final state in our machine, we're allowed to restart the machine by going back to the start state. Every time I enter, say, this final state, one option is to say: all right, I'm done with that chunk, now let's restart my machine so I can start working on the next chunk.
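The restart-on-final-state idea can be sketched as an epsilon-transition construction (a minimal sketch of my own, not the lecture's code; the dict-based NFA encoding and the state name `q_star` are assumptions):

```python
def star(states, start, finals, delta):
    """Build an NFA for L*: add an epsilon edge from each final state back to the
    start, plus a fresh accepting start state so the empty string is accepted.
    delta maps (state, symbol) to a set of states; the symbol "" marks epsilon."""
    new_start = "q_star"  # assumed not to clash with an existing state name
    new_delta = {k: set(v) for k, v in delta.items()}
    new_delta.setdefault((new_start, ""), set()).add(start)
    for f in finals:
        new_delta.setdefault((f, ""), set()).add(start)  # "restart" after a chunk
    return states | {new_start}, new_start, finals | {new_start}, new_delta

def accepts(nfa, string):
    """Simulate the NFA, following epsilon edges before and after each symbol."""
    states, start, finals, delta = nfa

    def closure(S):  # all states reachable via epsilon edges
        stack, seen = list(S), set(S)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in string:
        current = closure({r for q in current for r in delta.get((q, ch), set())})
    return bool(current & finals)

# An NFA for the language {00, 11}: q0 -0-> a -0-> acc and q0 -1-> b -1-> acc.
pairs = ({"q0", "a", "b", "acc"}, "q0", {"acc"},
         {("q0", "0"): {"a"}, ("a", "0"): {"acc"},
          ("q0", "1"): {"b"}, ("b", "1"): {"acc"}})
pairs_star = star(*pairs)
print(accepts(pairs_star, ""))      # True: the empty string is in L*
print(accepts(pairs_star, "0011"))  # True: two chunks, 00 then 11
print(accepts(pairs_star, "010"))   # False: cannot be split into matching chunks
```

Each epsilon edge from a final state back to the start is exactly the non-deterministic "end this chunk here and start the next one" choice described above.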
And then for the start state, this doesn't really matter, but we could do that as well. And then we have the special case of the empty string: the empty string is always in the star of a language. So we make sure we handle the empty string by adding a special state specifically for it. To show formally what's going on here: from every final state, we transition without consuming input back to the start state, and the start state transitions without consuming input to a state just for the empty string, because we always want to accept that. So what we've shown is that every rule we have for taking regular expressions and producing a more complicated regular expression, we can also apply to finite state automata. Any way we had of combining regular expressions also works for finite state automata. Since we have base case finite state automata for the base case regular expressions, and any way of combining regular expressions into other regular expressions can also be used to combine finite state automata into other finite state automata, any language I could build up with regular expressions, I could build a finite state automaton to do the exact same thing. That's what we've demonstrated, which shows that any language I can express as a regular expression, I can also compute with some non-deterministic finite state automaton. And any language I can compute with a non-deterministic finite state automaton, I can compute with a deterministic finite state automaton as well. Which means that, as computing models, non-deterministic finite state automata are equivalent to regular expressions and deterministic finite state automata. These are all equivalent models of computing.
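The equivalence claim can be sanity-checked on a small example by comparing a real regular-expression engine against the chunk-splitting definition of the star (my own check, not from the lecture; `matches_star` is a hypothetical helper):

```python
import re
from itertools import product

def matches_star(s, chunks):
    """True if s can be split into pieces, each drawn from the set `chunks`."""
    if s == "":
        return True  # zero chunks: the empty string is always in the star
    return any(s.startswith(c) and matches_star(s[len(c):], chunks)
               for c in chunks if c)

# The regex (00|11)* and the chunk definition of {00, 11}* should agree
# on every binary string up to length 6.
for n in range(7):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        assert bool(re.fullmatch(r"(00|11)*", s)) == matches_star(s, {"00", "11"})
print("regex engine and chunk-splitting definition agree up to length 6")
```

This only exercises one language, of course; the proof in the lecture is what establishes the equivalence for every regular expression.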
And by the way, there is a special name we give to the languages that can be computed by this model. We call those regular languages in honor of their equivalence to regular expressions. We most often end up describing these languages using regular expressions. So we call them regular languages. But we most often compute those languages using NFAs.