5.3.2 A First DCG

A first DCG for transitive verbs.

Let's now sit down and write a DCG for simple English relative clauses. In this DCG (found in dCG4Gaps.pl) we will only deal with transitive verbs (verbs that take a single NP as argument --- or to use the proper linguistic terminology, verbs that subcategorize for a single NP). Further, to keep things simple, the only relative pronoun we shall deal with is ``who''.

Let's first look at the NP rules:

np(nogap) --> det, n.
np(nogap) --> det, n, rel.
np(nogap) --> pn.
np(nogap) --> pn, rel.
np(gap(np)) --> [].

The first four rules are probably familiar. They say that an English NP can consist of a determiner and a noun (for example: ``a witch''), or a determiner and a noun followed by a relative clause (for example: ``a witch who likes Harry''), or a proper name (for example: ``Harry''), or a proper name followed by a relative clause (for example: ``Harry, who likes a house-elf''). All these NPs are `complete' or `ordinary' NPs. Nothing is missing from them. That is why the extra argument of np contains the value nogap.

What about the fifth rule? This tells us that an NP can also be realized as an empty string --- that is, as nothing at all. Obviously this is a special rule: it's the one that lets us introduce gaps. It says: we are free to use `empty' NPs, but such NPs have to be marked by a feature which says that they are special. Hence in this rule, the value of the extra argument is gap(np). This tells us that we are dealing with a special NP --- one in which the usual NP information is absent.
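To see the gap marking in isolation, here is a minimal sketch that keeps only the non-relative NP rules and the empty-NP rule. The cut-down rule set and the small lexicon (a, the, witch, harry) are assumptions made for illustration; they are not the contents of dCG4Gaps.pl.

```prolog
% Cut-down sketch: the relative-clause NP rules are left out on purpose.
np(nogap)   --> det, n.
np(nogap)   --> pn.
np(gap(np)) --> [].        % the empty NP, marked as a gap

% Assumed lexicon, for illustration only.
det --> [a].
det --> [the].
n   --> [witch].
pn  --> [harry].

% Example queries, using the standard phrase/2:
% ?- phrase(np(G), [a, witch]).   % G = nogap
% ?- phrase(np(G), []).           % G = gap(np)
```

The extra argument is the only thing that distinguishes the two cases: an ordinary NP consumes words and reports nogap, while the empty NP consumes nothing and reports gap(np).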

A Feature-Based Approach

The use of features to keep track of whether or not information is missing, and if so, what kind of information is missing, is the crucial idea underlying grammar-based treatments of all sorts of long distance dependency phenomena, not just relative clauses. Usually such features keep track of a lot more information than our simple nogap versus gap(np) distinction --- but as we are only looking at very simple relative clauses, this is all we'll need for now.

Now for the S and VP rules.

s(Gap) --> np(nogap),vp(Gap).

vp(Gap) --> v(1), np(Gap).

The first rule says that an S consists of an NP and a VP. Note that the NP must have the feature nogap. This simply records the fact that in English the NP in subject position cannot be realized by the empty string (in some languages, for example Italian, this is possible in some circumstances). Moreover, note that the value of the Gap variable carried by the VP (which will be either nogap or gap(np), depending on whether the VP contains empty NPs) is unified with the value of the Gap variable on the S. That is, we have here an example of feature passing: the record of the missing information in the verb phrase (if any) is passed up to the sentential level, so that we have a record of exactly which information is missing in the sentence.

The second rule says that a VP can consist of an ordinary transitive verb together with an NP. Note that instead of using the symbol tv for transitive verbs, we use the symbol v marked with an extra feature (the 1). (In the following section we shall introduce a second type of verb, which we will call v(2) verbs.) Note that this rule also performs feature passing: it passes the value of the Gap variable up from the NP to the VP. So the VP will know whether the NP carries the value nogap or the value gap(np).
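The feature passing described here can be observed by leaving the S argument unbound and letting Prolog report what was passed up. The following is a sketch under simplifying assumptions: the NP rules are reduced to proper names plus the gap rule, and the two-word lexicon is chosen for illustration only.

```prolog
% S and VP rules as given in the text.
s(Gap)  --> np(nogap), vp(Gap).
vp(Gap) --> v(1), np(Gap).

% Cut-down NP rules (assumption: proper names and the gap rule only).
np(nogap)   --> pn.
np(gap(np)) --> [].

% Assumed lexicon, for illustration only.
pn   --> [harry].
v(1) --> [likes].

% Example queries:
% ?- phrase(s(Gap), [harry, likes, harry]).   % Gap = nogap
% ?- phrase(s(Gap), [harry, likes]).          % Gap = gap(np)
% ?- phrase(s(_),   [likes, harry]).          % fails: the subject must be np(nogap)
```

In the second query the object NP is empty, so gap(np) travels from the NP to the VP and on to the S. One consequence worth noting is that a query with an unbound or anonymous argument also accepts such incomplete strings; asking for s(nogap) restricts the query to complete sentences.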

Now for the relativization rules:

rel --> prorel, s(gap(np)).
rel --> prorel, vp(nogap).

The first rule deals with relativization in object position --- for example, the clause ``who Harry likes'' in ``The witch who Harry likes''. The clause ``who Harry likes'' is made up of the relative pronoun ``who'' (that is, a prorel) followed by ``Harry likes''. What is ``Harry likes''? It's a sentence that is missing its object NP --- that is, it is an s(gap(np)), which is precisely what the first relativization rule demands.

Incidentally, this sort of analysis, which is due to Gerald Gazdar, is historically extremely important. Note that the analysis we've just given doesn't talk about moving bits of sentences around. It just talks about missing information, and says which kind of information needs to be shared with the mother category. The link with the movement story should be clear: but the new information-based story is simpler, clearer and more precise.

The second rule deals with relativization in subject position --- for example, the clause ``who likes the witch'' in ``Harry, who likes the witch''. The clause ``who likes the witch'' is made up of the relative pronoun ``who'' (that is, a prorel) followed by ``likes the witch''. What is ``likes the witch''? Just an ordinary VP --- that is to say, a vp(nogap), just as the second relativization rule demands.

And that's basically it. We re-use the lexical rules from above and add some new ones:

n --> [house-elf].
pn --> [harry].
v(1) --> [likes].
v(1) --> [watches].
prorel --> [who].
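For readers who want to try the grammar without the original file, the rules of this section can be assembled into one loadable program roughly as follows. The entries for det and for n --> [witch] are assumptions standing in for the lexical rules re-used from earlier; the actual dCG4Gaps.pl may differ.

```prolog
% NP rules: the extra argument records whether the NP is a gap.
np(nogap)   --> det, n.
np(nogap)   --> det, n, rel.
np(nogap)   --> pn.
np(nogap)   --> pn, rel.
np(gap(np)) --> [].

% S and VP rules: the Gap value is passed up from the object NP.
s(Gap)  --> np(nogap), vp(Gap).
vp(Gap) --> v(1), np(Gap).

% Relativization: object position, then subject position.
rel --> prorel, s(gap(np)).
rel --> prorel, vp(nogap).

% Lexicon. The det entries and n --> [witch] are assumed here.
det    --> [a].
det    --> [the].
n      --> [witch].
n      --> [house-elf].   % read by Prolog as the term -(house, elf)
pn     --> [harry].
v(1)   --> [likes].
v(1)   --> [watches].
prorel --> [who].

% Example queries:
% ?- phrase(np(_), [the, witch, who, harry, likes]).   % succeeds
% ?- phrase(np(_), [harry, who, likes, the, witch]).   % succeeds
% ?- phrase(np(_), [the, witch, who, likes]).          % fails
```

The calls to phrase/2 are equivalent to calling the translated predicates directly with the two extra list arguments, as in np(_, [the,witch,who,harry,likes], []).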

Let's look at some examples. First, let's check that this little DCG handles ordinary sentences:

s(_,[harry,likes,the,witch],[]).

Let's now check that we can build relative clauses. First, object position relativization:

np(_,[the,witch,who,harry,likes],[]).

Now subject position relativization:

np(_,[harry,who,likes,the,witch],[]).

And of course, there's no need to stop there --- we can happily embed such constructions. For example, combining the last two examples we get:

np(_,[the,witch,who,harry,who,likes,the,witch,likes],[]).

And indeed, we really are correctly handling an unbounded construction. For example, we can form:

np(_,[a,witch,who,a,witch,who,harry,likes,likes],[]).

And we can go on to add another level:

np(_,[a,witch,who,a,witch,who,a,witch,who,harry,likes,likes,likes],[]).

But this is getting hard to understand --- so let's simply check that we can make sentences containing relative clauses and then move on:

s(_,[a,house-elf,likes,a,house-elf,who,a,witch,likes],[]).