Active Chart Parsing

Abstract:
This lecture consists of two parts. In the first part, we will:
  1. Explain the basic ideas of active chart parsing: active edges, the fundamental rule, and the use of agendas.
  2. Present a simple bottom-up active chart recognition algorithm.
  3. Implement this algorithm in Prolog.
In the second part, we will:
  1. Adapt the general active chart parsing algorithm introduced in the first part to work top-down.
  2. Present an example of the top-down algorithm in action.

Table of Contents

Active Edges
In this lecture we move from passive chart parsing to active chart parsing. Active chart parsing is based on a very simple idea. A passive chart, like the one we worked with in the previous chapter, is used to keep a record of complete constituents that we have found. (For example, on a passive chart we can record the information that the string between nodes 2 and 4 is an NP.) In an active chart we additionally record hypotheses --- that is, we record what it is we are actually looking for, and how much of it we have found so far. Such information is recorded in active edge s (or active arc s).

The Fundamental Rule
Active arcs are called `active' for a very good reason: They "actively" hint at what can be done with them. In particular, we can see at once how to combine them with passive edges to make new edges. The new edges we make may be either passive or active.

Using an Agenda
Active chart parsers usually make use of an agenda . An agenda is a datastructure that keeps track of the things that we still have to do. When new edges are created, we have to remember that we have to look at them to see whether they can be combined with other edges in any way. To not forget this we store them in the agenda. (They are not added directly to the chart as with the passive chart parser.) We will then take one edge at a time from the agenda, add it to the chart, and then use it to build new edges.

A General Algorithm for Active Chart Parsing
A nice thing about using an agenda is that it is straightforward to specify a very general algorithm for active chart parsing. We'll now present this algorithm. With only minor changes --- or to be more accurate, simply by being a bit more precise about how we are going to carry out the individual steps given below --- it is very easy to convert this general algorithm into (say) a top-down or bottom-up active charting parsing algorithm.

Bottom-up Active Chart Parsing
Now that we know about active edges, the fundamental rule, and agendas, it's time to put all these ingredients together and look at a concrete chart parsing algorithm. The bottom-up algorithm we shall use is essentially the general algorithm (???» A General Algorithm for Active Chart Parsing#3) just studied, but with the details filled in to make it work bottom-up. To understand it, let's go through it step by step using an example.

The Code
Code Summary for this Chapter.

Top-Down Active Chart Parsing
In the last sections we presented a general algorithm (???» A General Algorithm for Active Chart Parsing#3) for working with active charts and agendas, and claimed that by making small changes to this algorithm we would be able to make it work either bottom-up or top-down. We have shown how to make it work bottom-up. Now we'll see how to make it work top-down.There are two main changes we have to make. First, and most fundamentally, we have to think about the ways rules are used to make active edges. That is, we need to rethink step 2d of the algorithm. In addition, we need to set up the initial chart and agenda differently. Let's discuss these issues one by one.


Exercise

  1. [This may be your end project]
    Implement a top-down chart recognizer in Prolog.
    As we have said, you only have to adapt steps 1 (the initialization of chart and agenda) and 2d (how new hypotheses are predicted) of the general algorithm (???» A General Algorithm for Active Chart Parsing#3) to arrive at a top-down parser.
    That means that you can reuse most of the code that we wrote for the bottom-up algorithm. In fact, you only have to change the predicates initialize_chart_bottomup/2 and initialize_agenda_bottomup/1 that take care of the initialization, and the predicates make_new_arcs_bottomup/2 and predict_new_arcs_bottomup/2 that implement step 2d.

    initialize chart

    Let's first look at the initialization. For top-down processing, we initially write not only the words and their position into the chart but also passive arcs giving us the categories of the words. So, initialize_chart_topdown/2 has to recurse through the input list, and
    1. write the approriate scan entries to the chart,
    2. retrieve all categories for each word from the lexicon, and
    3. assert a passive arc to the chart for each of these categories.
    For step 2 and 3, you might use a failure driven loop (remember doall/1 from » The Algorithm).

    initialize agenda

    In the initial agenda we want to have all possible hypotheses of how a sentence can be built. Use findall/3 to to find all rules that have s on the left hand side of the arrow and add and active arc from position 0 to position 0.

    new arcs

    Now, let's look at how new arcs are created. As in the bottom-up case, we apply step 2c to all arcs. But unlike there, we will apply step 2d only to active arcs. make_new_arcs_topdown/2 has to make sure that apply_fundamental_rule/2 and predict_new_arcs_topdown/2 are applied appropriately depending on whether an active or a passive arc is coming in.
    The fundamental rule remains the same. But predict_new_arcs_topdown/2 differs from predict_new_arcs_bottomup/2 . The arcs that are coming in now are active: CCFound.CatToFind. predict_new_arcs_topdown/2 has to look for all rules that have the category CatToFind on the left hand side and create new active edges for them. These new active edges don't cover anything of the string, yet, and start in the position where the active edge ends.
    That is basically it. Remember that a proper documentation is an essential part of your Endprojekt. Give a short specification of what you implemented, i.e. the use and functionality of your program. Then describe in detail (!) how you implemented it. I.e. quote parts (or all) of the code and explain how it works. Third, supply examples calls that can be used for testing your program.
    Once again, if you have trouble or see that something doesn't work properly, don't mind. Document this problem and discuss your efforts to solve it. You need not be perfect, but you should show your understanding of the matter.