8.4.2 The Algorithm

The chart parsing algorithm in Prolog.

Now, when wanting to recognize a sentence, the first thing we have to do is to initialize the chart, i.e., to write the appropriate scan-facts for the words into the Prolog database. This is done by the following predicate. It is a recursive predicate that works its way through the list representing the input sentence in the standard fashion of Prolog list processing predicates. For each word in the input it asserts a fact with the functor scan recording the position of that word.

initialize_chart([], _).

initialize_chart([Word|Input], From) :- To is From + 1, %%% assertz -> put new word at the end of the db assertz(scan(From, To, Word)), %%% write(scan(From, To, Word)), nl, initialize_chart(Input, To).

Then, we have to read the first word and add all the arcs that we can build from this first word. When nothing new can be added, we read the next word and add all arcs that we can build for the substring between 0 and 2. When no more can be done for that span, we read the next word and again add all new arcs that can be build for this new word. We continue like this until we reach the end of the sentence.

The main predicate of this part of the program is process_bottomup/0.

process_chart_bottomup :- doall( (scan(From, To, Word), %%% write('read word: '), write(Word),nl, lex(Word, Cat), add_arc(arc(From, To, Cat))) ).

It reads a word from the database (scan(From, To, Word)) and looks up its category in the lexicon (lex(Word, Cat)). The real work is done by calling add_arc/1. This predicate adds the new arc and all arcs that can be built from it to the chart.

Then, doall/1 forces backtracking: process_bottomup/0 backtracks to use other lexical entries for the word under consideration if there are any and to read the next word if there are none. doall/1 implements a failure driven loop. It forces Prolog to backtrack over its argument. doall/1 is implemented as follows:

doall(Goal) :- Goal, fail.

doall(_) :- true.

add_arc/1 takes an arc as argument. If that arc is not yet in the chart, add_arc adds it. It then calls new_arcs/1, which constructs and adds all the arcs that can be build from combining the newly added arc with what is already in the chart.

add_arc(Arc) :- \+ Arc, assertz(Arc), %%% write(Arc),nl, new_arcs(Arc).

new_arcs/1 also uses the failure driven loop predicate doall/1 --- we want to find all the new arcs that can be built. We do this by looking for suitable edges to our left in the chart. As we have explained in Section 8.2.2, the chart is complete to the left: All potential arcs to combine with can only be on our left.

For example if our newly added category is an np, we try to find rules having an NP as rightmost part of the right hand side (remember we are using a bottom-up strategy), e.g. vp ---> [tv,np]. Then we try to find the left part of the right hand side (tv in this example) in the database (and hence in the chart on our left).

new_arcs(arc(J, K, Cat)) :- doall( (LHS ---> RHS, append(Before, [Cat], RHS), path(Before, I, J), add_arc(arc(I, K, LHS))) ).

new_arcs takes an arc arc(J,K,Cat) as argument. To find new arcs, we take a rule from the grammar and check a) whether Cat is the last category on the right hand side of that rule and b) whether there is a sequence of arcs in the chart that spans I,J and is labelled with the categories that come on the right hand side of the rule before Cat. If that is the case, a recursive call of add_arc adds the new arc /the one spanning I,J and labelled with the left hand side of the rule) to the chart and checks whether it can be used to build any further arcs.

Let's go back to our example from Section 8.2.1. Suppose we have just added the vp and called new_arcs(arc(1, 3, vp)).

--------- vp --------- | | | | -- np -- --- np --- | | | | | | | | -- pn -- -- tv -- --- pn --- | | | | | | | | 0 vincent 1 shot 2 marsellus 3

We then find the rule s ---> [np,vp], so Before is instantiated with [np]. Next, path([np],I,1) checks whether there is such a path. Indeed there is: I is instantiated with 0. So we add the arc arc(0,3,s) to the chart (and recursively check whether there is anything else that has to be done).

It only remains to define path/3. It's first argument is a list of categories, and the second and third arguments are nodes. The predicate recursively checks if an arc, or a sequence of arcs, links the two input nodes:

path([], I, I).

path([Cat|Cats], I, K) :- arc(I, J, Cat), J =< K, path(Cats, J, K).

Now, we have defined the predicates that build the chart. What is missing is a predicate the calls them and then checks whether the final chart contains an arc that spans the whole input sentence and is labelled with s. Here it is:

chart_recognize_bottomup(Input) :- cleanup, initialize_chart(Input, 0), process_chart_bottomup, length(Input, N), arc(0, N, s).

The first thing this predicate does is to clean the chart. This is very important. Our recognizer works by asserting stuff into the database. This means we need to get rid of all the stuff that was asserted to the database before we try to recognize a new sentence --- for otherwise all the old stuff will be hanging around, and may interfere with the analysis of the new sentence. In short, we need to retract all the information asserted the last time we used the recognizer before we use it again. And this is exactly what cleanup/0 does for us:

cleanup :- retractall(scan(_,_,_)), retractall(arc(_,_,_)).

After cleaning the database up, chart_recognize_bottomup/1 initializes the chart with the input sequence, and processes the chart. Finally, the only thing to be done is check whether there is an arc labled s that is spanning the whole input.

The whole program can be found in passive_chart_bottomup.pl .