A Little Grammar of English
Abstract:
Phrase structure rules and lexical rules.
The above ideas adapt straightforwardly to natural languages. With natural languages, however, it is useful to distinguish between rules that have syntactic categories on the right-hand side and rules that have only words on the right-hand side. These are known as phrase structure rules and lexical rules, respectively.
A Phrase Structure Grammar
Here's a simple so-called phrase structure grammar (PSG) of English. We have phrase structure rules:
S → NP VP
NP → PN
NP → PN Rel
NP → Det NBar
NBar → N
NBar → N Rel
Rel → Wh VP
VP → IV
VP → TV NP
VP → DV NP PP
VP → SV S
PP → P NP
...and lexical rules:
PN → vincent
PN → mia
PN → marsellus
PN → jules
Det → a
Det → the
Det → her
Det → his
N → gun
N → robber
N → man
N → woman
Wh → who
Wh → that
P → to
IV → died
IV → fell
TV → loved
TV → shot
TV → knew
DV → gave
DV → handed
SV → knew
SV → believed
So in this grammar, the terminal symbols are English words. There is a special name for the symbols (such as N, PN, and Det) that occur on the left-hand side of lexical rules: they are called preterminal symbols.
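To make the distinction between the two kinds of rules concrete, here is a minimal Python sketch of the grammar as plain data. The names PHRASE_STRUCTURE_RULES, LEXICAL_RULES, and categories_of are our own choices for illustration, not part of the grammar itself. The terminal and preterminal symbols are then read off the lexical rules directly.

```python
# Phrase structure rules: (left-hand side, right-hand side of categories).
PHRASE_STRUCTURE_RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("PN",)),
    ("NP", ("PN", "Rel")),
    ("NP", ("Det", "NBar")),
    ("NBar", ("N",)),
    ("NBar", ("N", "Rel")),
    ("Rel", ("Wh", "VP")),
    ("VP", ("IV",)),
    ("VP", ("TV", "NP")),
    ("VP", ("DV", "NP", "PP")),
    ("VP", ("SV", "S")),
    ("PP", ("P", "NP")),
]

# Lexical rules: each preterminal maps to the words it can rewrite to.
LEXICAL_RULES = {
    "PN": ["vincent", "mia", "marsellus", "jules"],
    "Det": ["a", "the", "her", "his"],
    "N": ["gun", "robber", "man", "woman"],
    "Wh": ["who", "that"],
    "P": ["to"],
    "IV": ["died", "fell"],
    "TV": ["loved", "shot", "knew"],
    "DV": ["gave", "handed"],
    "SV": ["knew", "believed"],
}

# Preterminals are exactly the left-hand sides of lexical rules.
preterminals = set(LEXICAL_RULES)

# Terminals are the words on the right-hand sides of lexical rules.
terminals = {word for words in LEXICAL_RULES.values() for word in words}

# All non-terminal symbols: phrasal categories plus preterminals.
nonterminals = {lhs for lhs, _ in PHRASE_STRUCTURE_RULES} | preterminals


def categories_of(word):
    """Return the preterminals that can rewrite to the given word."""
    return sorted(cat for cat, words in LEXICAL_RULES.items() if word in words)
```

For example, categories_of("knew") returns both SV and TV, since "knew" appears in two lexical rules.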
This grammar is unambiguous; that is, no string has two distinct parse trees. (Incidentally, this means that it is not a realistic grammar of English: all natural languages are highly ambiguous.) But it does display an interesting (and troublesome) phenomenon called local ambiguity.
Consider the sentence "The robber knew Vincent shot Marsellus". This has a unique parse tree in this grammar. But now look at the first part of it: "The robber knew Vincent". This is also a sentence according to this grammar, but not when considered as a part of "The robber knew Vincent shot Marsellus". This can be a problem for a parser. Locally we have something that looks like a sentence, and a parser may prove that this part really is a sentence according to the grammar. But this information plays no role in analyzing the larger sentence of which it is a part. Keep this in mind: it will become important again when we build a parser using this grammar in Bottom-Up Parsing and Recognition.
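The local ambiguity in this example can be checked mechanically. The following Python sketch counts, for any category and any span of the input, how many parse trees the grammar assigns to that span. It is a simple memoized span counter written for illustration (the names RULES, LEXICON, and span_counter are our own), not the parser developed later in the course.

```python
from functools import lru_cache

# Phrase structure rules, grouped by left-hand side.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("PN",), ("PN", "Rel"), ("Det", "NBar")],
    "NBar": [("N",), ("N", "Rel")],
    "Rel": [("Wh", "VP")],
    "VP": [("IV",), ("TV", "NP"), ("DV", "NP", "PP"), ("SV", "S")],
    "PP": [("P", "NP")],
}

# Lexical rules: preterminal -> set of words.
LEXICON = {
    "PN": {"vincent", "mia", "marsellus", "jules"},
    "Det": {"a", "the", "her", "his"},
    "N": {"gun", "robber", "man", "woman"},
    "Wh": {"who", "that"},
    "P": {"to"},
    "IV": {"died", "fell"},
    "TV": {"loved", "shot", "knew"},
    "DV": {"gave", "handed"},
    "SV": {"knew", "believed"},
}


def span_counter(sentence):
    """Return (words, count), where count(cat, i, j) is the number of
    parse trees of category cat that cover exactly words[i:j]."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def count(cat, i, j):
        total = 0
        # Lexical rule: a preterminal covering a single word.
        if j == i + 1 and words[i] in LEXICON.get(cat, ()):
            total += 1
        # Phrase structure rules with this left-hand side.
        for rhs in RULES.get(cat, ()):
            total += count_seq(rhs, i, j)
        return total

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the category sequence rhs can cover words[i:j].
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        # The first category covers words[i:k]; every remaining
        # category must still get at least one word.
        return sum(
            count(rhs[0], i, k) * count_seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return words, count


words, count = span_counter("The robber knew Vincent shot Marsellus")
full_sentence = count("S", 0, len(words))  # whole string as S: 1 parse
prefix = count("S", 0, 4)  # "the robber knew vincent" as S: 1 parse
embedded = count("S", 3, 6)  # "vincent shot marsellus" as S: 1 parse
```

Both the full string and its four-word prefix are recognized as S. The unique parse of the full string, however, builds its verb phrase from "knew" as SV followed by the embedded sentence over words 3 to 6, so the S found over the prefix is provable but never used.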