--- Video Title: Proving FSAs are as Powerful as Regular Expressions (Part 4: Kleene Star) Description: Theory of Computation https://uvatoc.github.io/week7 15.4: Proving FSAs are as Powerful as Regular Expressions Part 4: Kleene Star - Meaning of Kleene Star - NFA Implementation of Kleene Star - Concluding the Equivalence Proof Nathan Brunelle and David Evans University of Virginia --- We have only one more operation that we need to talk about, and that is the Kleene star. So we said if I have some regular expression r, then I can build the regular expression r star, where a string matches r star if we can break it into any number of chunks such that every single chunk matches r. You can think about this as an arbitrary number of concatenations: a string matches r star if there is some number of copies of r, r r r r r and so on, that I could concatenate together that would match that string. That's one way you can think about r star: is there some way I can split the string up into some number of chunks such that every single one of those chunks matches the regular expression r? The operation this performs on the language is the Kleene star, where we write the language raised to the star power. It says: concatenate the language with itself zero times, and take all of those strings and union them with the strings you get by concatenating the language with itself one time. Then take all of those strings and union them with all the strings you get by concatenating the language with itself twice, and then three times, and then four times, and five times, and so forth. The language concatenated with itself zero times is just the language containing the empty string. The language concatenated with itself one time is just the language itself. And then L squared, that's equal to L concatenated with L.
L to the power 3, that's L, L, L, and so forth. So L to the power k says have k Ls in a row, concatenated. For instance, if I star the language containing 00 and 11, so the Kleene star of the language {00, 11}, this will give me any string that is pairs of zeros and ones in any order. I can get the empty string, because that belongs to L to the zero. I can get 00, because that belongs to L to the one. I can get 11, because that also belongs to L to the one. 0011 belongs to L squared. 110011 belongs to L cubed, and so on. How big is the Kleene star of a language going to be? There are actually only two sizes that it can be: either one or infinity. If L equals the set containing the empty string, then L star is also the set containing the empty string, because the star says we can take things from the language and concatenate them together any number of times, but the empty string concatenated with the empty string only ever gives us the empty string. Also, if L is the empty language, the language with no strings (languages are just sets of strings, so the empty set is a language that simply doesn't have any strings in it), then L star is the set containing the empty string. That's because L to the zero is always the set containing the empty string, while L to the one and beyond are the empty set in this case. So if L is the empty set, then L star is the set containing the empty string. In all other cases, L star is going to be infinite. Note that the language of one automaton might be the empty set while the language of another is the set containing the empty string; those are different languages. Okay, so we can do the Kleene star using non-deterministic finite state automata. Something to keep in mind for the Kleene star: we're only doing this on one machine.
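The language powers and the two possible sizes of L star can be sketched in a few lines of Python (this is my own illustration, not from the lecture; the helper names `concat`, `power`, and `star_up_to` are assumptions):

```python
def concat(A, B):
    """Concatenation of two languages: every string of A followed by every string of B."""
    return {a + b for a in A for b in B}

def power(L, k):
    """L^k: L concatenated with itself k times; L^0 is the set containing the empty string."""
    result = {""}
    for _ in range(k):
        result = concat(result, L)
    return result

def star_up_to(L, max_reps):
    """A finite approximation of L*: the union of L^0 through L^max_reps."""
    result = set()
    for k in range(max_reps + 1):
        result |= power(L, k)
    return result

L = {"00", "11"}
print(power(L, 2))              # {'0000', '0011', '1100', '1111'}
print(star_up_to({""}, 5))      # {''}: starring {""} gives just {""}
print(star_up_to(set(), 5))     # {''}: starring the empty language also gives {""}
print(len(star_up_to(L, 5)))    # 63 strings already; the full L* is infinite
```

The last three lines show the size dichotomy from the lecture: the two degenerate languages star to exactly one string, while every other language keeps producing new, longer strings at each power.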
So whereas union, concatenation, that sort of thing, operated on two languages or two machines, with the Kleene star we only have one thing that we're operating on, kind of like complement, which also had only one thing we were operating on. In this case, our goal is that the new machine we're building returns one whenever the input string can be broken into a bunch of chunks such that our original machine returns one for every chunk. That's what it means for a string to belong to the star of the language of that machine: I can break it into chunks such that the machine returns one for every chunk. So every chunk would take us from the start state to a final state. Our strategy is going to be kind of like our strategy for concatenation. Every time we get into a final state, that means here was a chunk that the machine could have returned one on. But it might be problematic later if I chose to stop the chunk there and continue on to the next chunk, versus keep growing that first chunk. So basically, every single time we enter a final state in our machine, one option is to continue, saying I'm still working on the same chunk. The other option is to say I should end this chunk here and start moving on to the next chunk. Those are our two options. Every time I end up in a final state, that says this could have been a chunk, but maybe I didn't want it to be; maybe I wanted to keep working for a little while. So we're going to non-deterministically try both. Every single time we enter a final state in our machine, we're allowed to restart the machine by going back to the start state. Every time I enter, say, this final state, one option is to say: all right, I'm done with that chunk, now let's restart my machine so I can start working on the next chunk.
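The restart-on-final-state idea can be sketched as an epsilon-transition construction (a minimal sketch of my own, not the lecture's code; the dict-based NFA encoding and the state name `q_star` are assumptions):

```python
def star(states, start, finals, delta):
    """Build an NFA for L*: add an epsilon edge from each final state back to the
    start, plus a fresh accepting start state so the empty string is accepted.
    delta maps (state, symbol) to a set of states; the symbol "" marks epsilon."""
    new_start = "q_star"  # assumed not to clash with an existing state name
    new_delta = {k: set(v) for k, v in delta.items()}
    new_delta.setdefault((new_start, ""), set()).add(start)
    for f in finals:
        new_delta.setdefault((f, ""), set()).add(start)  # "restart" after a chunk
    return states | {new_start}, new_start, finals | {new_start}, new_delta

def accepts(nfa, string):
    """Simulate the NFA, following epsilon edges before and after each symbol."""
    states, start, finals, delta = nfa

    def closure(S):  # all states reachable via epsilon edges
        stack, seen = list(S), set(S)
        while stack:
            q = stack.pop()
            for r in delta.get((q, ""), set()):
                if r not in seen:
                    seen.add(r)
                    stack.append(r)
        return seen

    current = closure({start})
    for ch in string:
        current = closure({r for q in current for r in delta.get((q, ch), set())})
    return bool(current & finals)

# An NFA for the language {00, 11}: q0 -0-> a -0-> acc and q0 -1-> b -1-> acc.
pairs = ({"q0", "a", "b", "acc"}, "q0", {"acc"},
         {("q0", "0"): {"a"}, ("a", "0"): {"acc"},
          ("q0", "1"): {"b"}, ("b", "1"): {"acc"}})
pairs_star = star(*pairs)
print(accepts(pairs_star, ""))      # True: the empty string is in L*
print(accepts(pairs_star, "0011"))  # True: two chunks, 00 then 11
print(accepts(pairs_star, "010"))   # False: cannot be split into matching chunks
```

Each epsilon edge from a final state back to the start is exactly the non-deterministic "end this chunk here and start the next one" choice described above.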
And then for the start state, this doesn't really matter, but we could do that as well. And then we have the special case of the empty string: the empty string is always in the star of a language. So we make sure we handle the empty string by adding a special state specifically for it. To show formally what's going on here: from every final state, we transition without consuming input back to the start state, and the start state transitions without consuming input to a state just for the empty string, because we always want to accept that. So what we've shown is that every rule we have for taking regular expressions and producing a more complicated regular expression, we can also apply to finite state automata. Any way we had of combining regular expressions also works for finite state automata. Since we have base case finite state automata for the base case regular expressions, and any way of combining regular expressions into other regular expressions can also be used to combine finite state automata into other finite state automata, any language I could build up with regular expressions, I could build a finite state automaton to do the exact same thing. That's what we've demonstrated, which shows that any language I can express as a regular expression, I can also compute with some non-deterministic finite state automaton. And any language I can compute with a non-deterministic finite state automaton, I can compute with a deterministic finite state automaton as well. Which means that, as computing models, non-deterministic finite state automata are equivalent to regular expressions and deterministic finite state automata. These are all equivalent models of computing.
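The equivalence claim can be sanity-checked on a small example by comparing a real regular-expression engine against the chunk-splitting definition of the star (my own check, not from the lecture; `matches_star` is a hypothetical helper):

```python
import re
from itertools import product

def matches_star(s, chunks):
    """True if s can be split into pieces, each drawn from the set `chunks`."""
    if s == "":
        return True  # zero chunks: the empty string is always in the star
    return any(s.startswith(c) and matches_star(s[len(c):], chunks)
               for c in chunks if c)

# The regex (00|11)* and the chunk definition of {00, 11}* should agree
# on every binary string up to length 6.
for n in range(7):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        assert bool(re.fullmatch(r"(00|11)*", s)) == matches_star(s, {"00", "11"})
print("regex engine and chunk-splitting definition agree up to length 6")
```

This only exercises one language, of course; the proof in the lecture is what establishes the equivalence for every regular expression.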
And by the way, there is a special name we give to the languages that can be computed by this model. We call those regular languages in honor of their equivalence to regular expressions. We most often end up describing these languages using regular expressions. So we call them regular languages. But we most often compute those languages using NFAs.