Intersection, Complementation
Abstract:
The class of regular languages also has closure properties under intersection and complementation.
But the class of regular languages also has closure properties that are not obvious from the definition of regular expressions. We shall look at closure under intersection and complementation.
If L1 and L2 are regular languages, then so is L1∩L2. We will now construct a FSA M for L1∩L2 from two FSAs M1 and M2 for L1 and L2. The idea is to construct an automaton that simulates going through M1 and M2 synchronously, step by step in parallel. Formally (we use indices 1 and 2 to refer to elements of M1 or M2 respectively, and assume that both automata share the same alphabet):
M=(Q1×Q2,Σ,δ,(s1,s2),F1×F2)
The states of the new automaton will are all pairs of states from M1 with states from M2.
The transitions will be such that the paths through M are the paths through M1 if we only look at the first element of each state-pair, and are the paths through M1 if we only look at second elements:
δ={(((p1,p2),a),(q1,q2))∣p1,q1∈Q1,p2,q2∈Q2,a∈Σ,((p1,a),q1)∈δ1,((p2,a),q2)∈δ2}
But our new automaton will accept a word only if the computation ends on a pair where both elements are final states in their "original automata" (the set of final states is specified as F1×F2). This means that we can reach a final state with a word in M iff we could reach a final state with it in M1and in M2, hence iff the word is in L1∩L2.
If L is a regular language over an alphabet Σ, then so is L¯ (i.e. the language consisting of all strings from Σ∗ that are not in L). We will again show how to construct an automaton M for L¯ from an automaton M1 for L. Yet this time we have to make three additional assumptions:
- M1 is ε-free, i.e. contains no jump arcs.
- M1 is complete, i.e. there is a transition from each state in Q over each symbol in Σ (note that none of the automata we've seen so far has had this property).
- M1 is deterministic.
However, these assumptions are quite harmless. We can get rid of jump arcs by systematically substituting direct transitions over the symbols read before and after each jump. For example, look at the following very simple automaton with one jump arc:
Figure 11: Very Simple Automaton with one Jump Arc. Automaton before transformation. |
To get rid of the jump arc between states 1 and 2, we transform it into the following automaton:
Figure 13: Transformed Simple Automaton without Jump Arc. Automaton after first transformation. |
The two transitions we've added so to speak bypass the former jump arc.
To complete an automaton we do as follows: We add one "dead" state (a new state that has no outgoing transitions to any other states), and let all the "missing" transitions point to it. For instance, here's a completed version of our ε-free automaton from above:
Figure 17: Transformed Complete Simple Automaton without Jump Arc. Automaton after transformation. |
Finally, as we've already mentioned, non-determinism doesn't add anything substantial to FSAs, and every non-deterministic FSA can be transformed into a deterministic one. The algorithm fo this purpose is a bit more involved and we can present an easy example here. The basic idea is to build a new automaton that has as states all sets of states in the old one.
So for the construction of a complement-automaton, let's assume that M1 is a determinstistic ε-free complete FSA for the language L. Then we construct an FSA M for L1¯ by copying M1 and making all and only its non-final states final states of M. M will reach a final state on a word (and thus accept it) if and only if that word would have taken M1 into a non-final state. Hence M will accept all and only those words from Σ∗ that are not accepted by M1.
Notice that once we know that regular languages are closed under union and complementation, closure under intersection follows from the set theoretic fact that L1¯∪L1¯¯=L1∩L2. An automaton for any intersection language can be contructed accordingly by nesting the construction methods for union and complementation. But because to prove closure under complementation, we had to make assumptions that we don't need for the direct construction method for intersection, we've chosen to present this direct method first.