A Little Grammar of English

The above ideas adapt straightforwardly to natural languages. But with natural languages it is useful to draw a distinction between rules which have syntactic categories on the right-hand side, and rules which have only words on the right-hand side. These two kinds of rules are known as phrase structure rules and lexical rules, respectively.

A Phrase Structure Grammar

Here's a simple so-called phrase structure grammar (PSG) of English. We have phrase structure rules:

S → NP VP

NP → PN

NP → PN Rel

NP → Det NBar

NBar → N

NBar → N Rel

Rel → Wh VP

VP → IV

VP → TV NP

VP → DV NP PP

VP → SV S

PP → P NP

...and lexical rules:

PN → vincent

PN → mia

PN → marsellus

PN → jules

Det → a

Det → the

Det → her

Det → his

N → gun

N → robber

N → man

N → woman

Wh → who

Wh → that

P → to

IV → died

IV → fell

TV → loved

TV → shot

TV → knew

DV → gave

DV → handed

SV → knew

SV → believed

So in this grammar, the terminal symbols are English words. There is a special name for the symbols (such as N, PN, and Det) which occur on the left-hand side of lexical rules: they are called preterminal symbols.
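To make the split between the two kinds of rules concrete, here is a minimal Python sketch of the grammar above as plain data. The names `PHRASE_STRUCTURE_RULES`, `LEXICAL_RULES`, and `categories_of` are chosen for this illustration only; they are assumptions of the sketch, not part of any library or of the course's own code.

```python
# Phrase structure rules: each left-hand side maps to its possible
# right-hand sides, which contain only syntactic categories.
PHRASE_STRUCTURE_RULES = {
    "S": [["NP", "VP"]],
    "NP": [["PN"], ["PN", "Rel"], ["Det", "NBar"]],
    "NBar": [["N"], ["N", "Rel"]],
    "Rel": [["Wh", "VP"]],
    "VP": [["IV"], ["TV", "NP"], ["DV", "NP", "PP"], ["SV", "S"]],
    "PP": [["P", "NP"]],
}

# Lexical rules: each preterminal maps to the words it can rewrite to.
LEXICAL_RULES = {
    "PN": ["vincent", "mia", "marsellus", "jules"],
    "Det": ["a", "the", "her", "his"],
    "N": ["gun", "robber", "man", "woman"],
    "Wh": ["who", "that"],
    "P": ["to"],
    "IV": ["died", "fell"],
    "TV": ["loved", "shot", "knew"],
    "DV": ["gave", "handed"],
    "SV": ["knew", "believed"],
}

# Preterminals are exactly the left-hand sides of lexical rules;
# terminals are the words on their right-hand sides.
preterminals = set(LEXICAL_RULES)
terminals = {word for words in LEXICAL_RULES.values() for word in words}
nonterminals = set(PHRASE_STRUCTURE_RULES) | preterminals


def categories_of(word):
    """Return the preterminals that can rewrite to `word`, sorted by name."""
    return sorted(cat for cat, words in LEXICAL_RULES.items() if word in words)


print(sorted(preterminals))
print(categories_of("knew"))  # "knew" is listed under both TV and SV
```

Keeping the lexicon in its own table means the vocabulary can grow without touching the phrase structure rules. Note that a word may belong to more than one preterminal, as "knew" does here.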

This grammar is unambiguous. That is, no string has two distinct parse trees. (Incidentally, this means that it is not a realistic grammar of English: all natural languages are highly ambiguous.) But it does display an interesting (and troublesome) phenomenon called local ambiguity.

Local Ambiguity

Consider the sentence "The robber knew Vincent shot Marsellus". This has a unique parse tree in this grammar. But now look at the first part of it: "The robber knew Vincent". This is also a sentence according to this grammar, but not when considered as a part of "The robber knew Vincent shot Marsellus". This can be a problem for a parser. Locally we have something that looks like a sentence, and a parser may prove that this part really is a sentence according to the grammar. But this information does not play a role in analyzing the larger sentence of which it is a part. Keep this in mind: it will become important again when we build a parser using this grammar in Bottom-Up Parsing and Recognition.
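The effect can be observed with a small Python sketch that counts parse trees for every substring of a sentence. This is not the parser built later in the course; it is a simple memoized tree counter written for illustration, and the names `read_rules`, `count_parses`, and `sentence_spans` are hypothetical. Checking a few sentences this way illustrates the claim that the grammar is unambiguous, but does not prove it.

```python
from functools import lru_cache


def read_rules(text):
    """Parse lines such as 'NP -> PN | Det NBar' into {lhs: [rhs tuples]}."""
    rules = {}
    for line in text.strip().splitlines():
        lhs, rhs = line.split("->")
        rules[lhs.strip()] = [tuple(alt.split()) for alt in rhs.split("|")]
    return rules


PS_RULES = read_rules("""
S -> NP VP
NP -> PN | PN Rel | Det NBar
NBar -> N | N Rel
Rel -> Wh VP
VP -> IV | TV NP | DV NP PP | SV S
PP -> P NP
""")

LEXICON = read_rules("""
PN -> vincent | mia | marsellus | jules
Det -> a | the | her | his
N -> gun | robber | man | woman
Wh -> who | that
P -> to
IV -> died | fell
TV -> loved | shot | knew
DV -> gave | handed
SV -> knew | believed
""")


def make_counter(sentence):
    """Return the word tuple and a function counting trees per category and span."""
    words = tuple(sentence.lower().split())

    @lru_cache(maxsize=None)
    def trees(cat, i, j):
        # Number of distinct trees rooted in `cat` that cover words[i:j].
        if cat in LEXICON:
            return int(j == i + 1 and (words[i],) in LEXICON[cat])
        return sum(seq(rhs, i, j) for rhs in PS_RULES[cat])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the category sequence `rhs` can cover words[i:j].
        if not rhs:
            return int(i == j)
        return sum(trees(rhs[0], i, m) * seq(rhs[1:], m, j)
                   for m in range(i + 1, j + 1))

    return words, trees


def count_parses(sentence):
    """Number of parse trees that analyze the whole sentence as an S."""
    words, trees = make_counter(sentence)
    return trees("S", 0, len(words))


def sentence_spans(sentence):
    """All (i, j) such that words[i:j] is an S when taken on its own."""
    words, trees = make_counter(sentence)
    n = len(words)
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)
            if trees("S", i, j)]


full = "The robber knew Vincent shot Marsellus"
print(count_parses(full))                       # 1
print(count_parses("The robber knew Vincent"))  # 1
print(sentence_spans(full))                     # [(0, 4), (0, 6), (3, 6)]
```

The span (0, 4) is "the robber knew vincent": it is an S on its own, yet the single parse of the full sentence uses only (0, 6) and the embedded clause (3, 6), "vincent shot marsellus". Any work spent establishing the S at (0, 4) does not contribute to the final analysis, which is exactly the local ambiguity described above.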