In
» From Regular Expressions to FSAs we looked at one direction of the equivalence between regular expressions and FSAs. We showed that every language that can be characterized by a regular expression is also accepted by some FSA. Now let's look at the other direction and prove that for every FSA there is a regular expression that characterizes the same language.
Let M=({q1,…,qn},Σ,δ,q1,F) be a deterministic FSA, and L be the language accepted by it. We are going to show that there is also a regular expression for L.
Before we begin, let's define the following notion: Rijk are those strings that M reads when it goes from i to j without passing through a state numbered higher than k (by passing through, we mean entering and leaving again). So i and j may have numbers greater than k. We will use Rijk to "zoom" in and out of our automaton, determining that only a part of it may be used.
Now our induction will be over k. Starting from single states we successively allow ourselves to use larger parts of our automaton. We have to show that (given any choice of i,j≤n):
- Basis
There is a regular expression that characterizes Rij0. This is the language of all strings that take M from i to j without going through any state numbered higher than 0 (that is infact, without going through any state at all). - Induction
There is a regular expression for Rijk, given that there is one for Rlmk−1 for any choice of l,m≤n.
Give proofs for basis and induction! To prove the basis, you can of course simply enumerate as a disjunction all the words that can be read in M in one step from i to j. For the inductive step, think about the following: If a word can be read from i to j in M without going through any state higher than k, it can always be split into a sequence of parts none of which takes M through a state higher than k−1 (in the sense of entering and leaving). How can this be done?
Add a "pretty-print" operation (you can call it
pr) to the operations defined in
» Implementing Operations on FSAs. It should use the backbone of the
scan/1-predicate to crawl along an automaton and print out messages with information about any state it visits and about any edge it passes.
For instance loading
haha.pl and then calling
scan(pr(a1)). should result in output like the following:
automaton a1:
initial state 1
from 1 to 2 over h
etc...
final state 4
yes
Hint:Hint: One way of producing formated output in Prolog is using the format/2 predicate. The first argument of format/2 is a string (i.e. something that's enclosed between double quotes). This string may contain one or more placeholders. The second argument is a list of things to be printed instead of these placeholders.
For instance the call
format("My ~w has ~w legs.~n My ~w has ~w legs.~n", [dog, 4, sister, 2]).
produces the following output:
My dog has 4 legs.
My sister has 2 legs.
yes
The string given as first argument in this call contains the placeholder symbol ~w four times. These four placeholders are then replaced by the items on the list in the order in which they occur. Furthermore the strings contains the special symbol ~n twice. This is printed as a linebreak.
There are lots of other ways of using the format-predicate, for instance with other kinds of placeholders and for file output. You will find them documented with most Prolog implementations. For this exercise, ~w and ~n should do.