Lecture 18: Nonregular Languages
Date |
---|
- DFAs model computers with finite amount of memory. Thus:
Regular languages correspond to problems that can be solved with finite memory.
- So, if you have a regular language, then it can be solved with some finite memory computer (i.e. the problem is solvable).
On the contrary, we have non-regular languages:
Nonregular languages, in a sense, correspond to problems that cannot be solved with finite memory.
(i.e. problems NOT solvable by a computer).
- A side note: when, in the real, solving problems, think whether something can be modeled as a regular language which can help you determine if a problem is solvable (i.e. worth spending your time) or not.
Finding nonregular languages
For regular languages, it was theoretically easier to prove since it was existential: all we needed to do was find one NFA/DFA/regex.
However for NRL's, it is a universal statement. You might think that we'd need to prove that there exists no NFA or DFA or regex. But actually, we can leverage some intuition and realize that we only need to prove that there exist no DFA which subsequently covers all the other cases.
- Keith defines three "traits" of DFAs which he lists on the board:
- Order
- Number
- Bound
- He then walks through a few examples and shows that some languages simply don't meet all 3 and are thus irregular.
Drawing DFAs and proof routes
- One intuition behind showing the universally-quantified statement that is "nonregular" is to attempt to draw a DFA, then somewhere in the state matching, you reach a contradiction.
In general, there are two proof routes:
- Contradiction would be where an accept and reject string by the language end up in the same state.
- Direct is a little more abstract: it means that, say you have some sequence which should read out the same but the subsequence starts at different places/ends at different places (just see lecture at like
25:00
).- In the context of the problem given, realize that once you reach and , you should be going down different sequences after (per the language constraints). Unfortunately, in that particular example, both and had the same ending sequence which is where the problem is since and are separate states. More definitively, we are saying we have the strings and which means our language is nonregular since they both have the same ending, but different beginnings. Intuition:
Now we will aim to generalize this intuition by formalizing it.
Distinguishability
Formalizing the above intuition essentially.
This setting guideline for what is accepted (exactly one) and not (if both are in the language, that would be impossible given our intuition).
Even more generalizable, we say that any two input strings that are distinguishable, must end up in different states (meaning one is rejected by the language and the other is accepted):
Proof by distinguishability (contradiction)
This is the "gold-standard" proof example for nonregular languages:
Another proof
Note on automaton construction: an underlying question you should keep in your mind (for all machine constructions for that matter) is "what do you need to remember" regarding the language.
Use the distinguishability property as a Lemma:
Then, we get:
Proof template
As you can see, this second proof is nearly identical to the first proof. This means we can generalize this form via the following proof template:
Generalizing this
We can now actually generalize this by introducing the notion of the distinguished sets implication:
Myhill-Nerode Theorem
Since we have introduced the above, we get the following theorem:
This has two meanings:
- The set is infinite in cardinality
- Each string in the set is different (constituting a distinguishing set)
We can now use this tool to prove what we just did earlier, but much more elegantly:
Intution in using the theorem: