InterpretedTreeAutomaton (alto 2.2-SNAPSHOT API)

java.lang.Object
- de.up.ling.irtg.InterpretedTreeAutomaton

All Implemented Interfaces:

Serializable

Direct Known Subclasses:

MaximumEntropyIrtg
```
public class InterpretedTreeAutomaton
extends Object
implements Serializable
```
An interpreted regular tree grammar (IRTG). An IRTG consists of a finite tree automaton M, which generates a language of derivation trees, plus an arbitrary number of interpretations, each of which maps derivation trees to objects over some algebra. In this way, the IRTG describes an n-place relation over these algebras.
In this implementation of IRTGs, M is given as an object of class TreeAutomaton, which is passed to the constructor as an argument. Each interpretation is added by name, and is represented as an object of class Interpretation. IRTGs are typically read from input streams (e.g. from files) by using an InputCodec, rather than being constructed programmatically.
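To make the idea of "one derivation tree, several interpretations" concrete, here is a minimal standalone sketch. It does not use the alto classes: `Node`, `interpret`, and the `english`/`german` maps are hypothetical names invented for illustration, and the sketch folds the homomorphism and the algebra evaluation into a single operation per rule label, which is a simplification of what an Interpretation does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class IrtgConceptSketch {
    /** A derivation tree node: a rule label plus ordered children (hypothetical type). */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }
    }

    /**
     * Evaluates a derivation tree bottom-up. Each rule label is mapped to an
     * operation over the already-evaluated children; labels missing from the
     * map cause a NullPointerException in this sketch.
     */
    static String interpret(Node tree, Map<String, Function<List<String>, String>> ops) {
        List<String> args = new ArrayList<>();
        for (Node child : tree.children) {
            args.add(interpret(child, ops));
        }
        return ops.get(tree.label).apply(args);
    }

    /** Hypothetical "english" interpretation over string concatenation. */
    static Map<String, Function<List<String>, String>> english() {
        Map<String, Function<List<String>, String>> ops = new HashMap<>();
        ops.put("r1", a -> a.get(0) + " " + a.get(1));
        ops.put("r2", a -> "john");
        ops.put("r3", a -> "runs");
        return ops;
    }

    /** Hypothetical "german" interpretation of the same rule labels. */
    static Map<String, Function<List<String>, String>> german() {
        Map<String, Function<List<String>, String>> ops = new HashMap<>();
        ops.put("r1", a -> a.get(0) + " " + a.get(1));
        ops.put("r2", a -> "hans");
        ops.put("r3", a -> "rennt");
        return ops;
    }

    /** The derivation tree r1(r2, r3). */
    static Node sampleTree() {
        return new Node("r1", new Node("r2"), new Node("r3"));
    }
}
```

Evaluating `sampleTree()` under both maps yields one pair of the 2-place relation that such a grammar would describe.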

Author:

koller

See Also:

Serialized Form

Constructor Summary

Constructors
Constructor and Description

InterpretedTreeAutomaton(TreeAutomaton<String> automaton)
Constructs a new IRTG with the given derivation tree automaton.

Method Summary

Modifier and Type	Method and Description
`void`	`addAllInterpretations(Map<String,Interpretation> interps)` Adds all interpretations in the map, with their respective names.
`void`	`addInterpretation(String name, Interpretation interp)` Adds an interpretation with a given name.
`void`	`bulkParse(Corpus input, Consumer<Instance> corpusConsumer, ProgressListener listener)` Reads all inputs for this IRTG from a corpus and parses them.
`void`	`bulkParse(Corpus input, Predicate<Instance> filter, Consumer<Instance> corpusConsumer, ProgressListener listener)` Reads inputs for this IRTG from a corpus and parses them.
`Set<Object>`	`decode(String outputInterpretation, Map<String,String> representations)` Decodes a map of input representations to a set of objects of the specified output algebra.
`TreeAutomaton`	`decodeToAutomaton(String outputInterpretation, TreeAutomaton parseChart)` Decodes a parse chart to a term chart over some output algebra.
`boolean`	`equals(Object obj)` Compares the IRTG to another IRTG for equality.
`InterpretedTreeAutomaton`	`filterBinarizedForAppearingConstants(String interpName, Object input)` Creates a new IRTG with many of the rules filtered out.
`InterpretedTreeAutomaton`	`filterForAppearingConstants(String interpName, Object input)` Creates a new IRTG with many of the rules filtered out.
`static InterpretedTreeAutomaton`	`forAlgebras(Map<String,Algebra> algebras)` Creates an empty IRTG for the given algebras.
`static InterpretedTreeAutomaton`	`fromPath(String path)` Helper method that creates a stream from the given path and reads it as with `read` from an IRTG input codec.
`static InterpretedTreeAutomaton`	`fromString(String s)` Helper method that reads an IRTG from a string as with `read` from an IRTG input codec.
`TreeAutomaton<String>`	`getAutomaton()` Returns the derivation tree automaton.
`Interpretation`	`getInterpretation(String interp)` Returns the interpretation with the given name.
`Map<String,Interpretation>`	`getInterpretations()` Returns a map from which the interpretations can be retrieved using their names.
`Map<String,Object>`	`interpret(Tree<String> derivationTree)` Maps a given derivation tree to terms over all interpretations and evaluates them.
`Object`	`interpret(Tree<String> derivationTree, String interpretationName)` Interprets the given derivation tree in the interpretation with the given name, and returns an object of the algebra.
`void`	`normalizeRuleWeights()` Modifies the rule weights of the derivation tree automaton such that the weights for all rules with the same parent state sum to one.
`TreeAutomaton`	`parse(Map<String,String> representations)` Parses a map of input representations to a parse chart.
`TreeAutomaton`	`parseCondensedWithPruning(Map<String,Object> inputs, PruningPolicy pp)`
`TreeAutomaton`	`parseInputObjects(Map<String,Object> inputs)` Parses a map of input objects to a parse chart.
`TreeAutomaton`	`parseSimple(String interpretationName, Object input)` Parses a single input representation to a parse chart without using any optimization in the parsing process.
`Object`	`parseString(String interpretation, String representation)` Resolves the string representation to an object of the given algebra.
`TreeAutomaton`	`parseWithSiblingFinder(String interpretationName, Object input)` Parses a single input representation to a parse chart using a sibling finder in the intersection.
`static InterpretedTreeAutomaton`	`read(InputStream r)` Helper method that reads an IRTG from an input stream as with `read` from an IRTG input codec.
`Corpus`	`readCorpus(Reader reader)` Loads a corpus for this IRTG using the given reader.
`void`	`setDebug(boolean debug)` Switches debugging output on or off.
`String`	`toString()` Returns a string representation of the IRTG.
`void`	`trainEM(Corpus trainingData)` Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus.
`void`	`trainEM(Corpus trainingData, int iterations, double threshold, ProgressListener listener)` Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener.
`void`	`trainEM(Corpus trainingData, ProgressListener listener)` Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener.
`void`	`trainML(Corpus trainingData)` Performs maximum likelihood training of this (weighted) IRTG using the given annotated corpus.
`void`	`trainVB(Corpus trainingData)` Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus.
`void`	`trainVB(Corpus trainingData, int iterations, double threshold, ProgressListener listener)` Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus.
`void`	`trainVB(Corpus trainingData, ProgressListener listener)` Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus.

Methods inherited from class java.lang.Object
getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - InterpretedTreeAutomaton
```
public InterpretedTreeAutomaton(TreeAutomaton<String> automaton)
```
    Constructs a new IRTG with the given derivation tree automaton.
    
    Parameters:
    
    automaton -
- Method Detail
  - addInterpretation
```
public void addInterpretation(String name,
                              Interpretation interp)
```
    Adds an interpretation with a given name.
    
    Parameters:
    
    name -
    
    interp -
  - addAllInterpretations
```
public void addAllInterpretations(Map<String,Interpretation> interps)
```
    Adds all interpretations in the map, with their respective names.
    
    Parameters:
    
    interps -
  - getAutomaton
```
public TreeAutomaton<String> getAutomaton()
```
    Returns the derivation tree automaton.
    
    Returns:
  - getInterpretations
```
public Map<String,Interpretation> getInterpretations()
```
    Returns a map from which the interpretations can be retrieved using their names.
    
    Returns:
  - getInterpretation
```
public Interpretation getInterpretation(String interp)
```
    Returns the interpretation with the given name.
    
    Parameters:
    
    interp -
    
    Returns:
  - interpret
```
public Object interpret(Tree<String> derivationTree,
                        String interpretationName)
```
    Interprets the given derivation tree in the interpretation with the given name, and returns an object of the algebra.
    
    Parameters:
    
    derivationTree -
    
    interpretationName -
    
    Returns:
  - interpret
```
public Map<String,Object> interpret(Tree<String> derivationTree)
```
    Maps a given derivation tree to terms over all interpretations and evaluates them. The method returns a mapping of interpretation names to objects in the respective algebras.
    
    Parameters:
    
    derivationTree -
    
    Returns:
  - parseString
```
public Object parseString(String interpretation,
                          String representation)
                   throws ParserException
```
    Resolves the string representation to an object of the given algebra. This is a helper function that retrieves the algebra for the given interpretation, and then calls Algebra.parseString(java.lang.String) on that algebra.
    
    Parameters:
    
    interpretation -
    
    representation -
    
    Returns:
    
    Throws:
    
    ParserException
  - parse
```
public TreeAutomaton parse(Map<String,String> representations)
                    throws ParserException
```
    Parses a map of input representations to a parse chart. "Representations" is a map that maps interpretation names to string representations of input objects. Each input object is resolved to an object in the respective algebra, and its decomposition automaton is computed. Then the pre-images of all decomposition automata under the homomorphism of the respective interpretation are computed, and all are intersected with the derivation tree automaton of the IRTG. The result is returned as a tree automaton; the language of that automaton is the set of all grammatically correct derivation trees that map to the given input objects.
    The interpretations for which inputs are specified in "representations" may be any subset of the interpretations that this IRTG understands.
    Note that this method makes no guarantees regarding reducedness of the resulting tree automaton. Depending on the way parsing was done, it may still contain states that are unreachable or unproductive.
    
    Parameters:
    
    representations -
    
    Returns:
    
    Throws:
    
    ParserException
  - parseSimple
```
public TreeAutomaton parseSimple(String interpretationName,
                                 Object input)
                          throws ParserException
```
    Parses a single input representation to a parse chart without using any optimization in the parsing process.
    
    Parameters:
    
    interpretationName - name of the interpretation from which the object comes.
    
    input -
    
    Returns:
    
    a tree automaton containing all possible derivation trees that are mapped to the input by the interpretation.
    
    Throws:
    
    ParserException
  - parseWithSiblingFinder
```
public TreeAutomaton parseWithSiblingFinder(String interpretationName,
                                            Object input)
                                     throws ParserException
```
    Parses a single input representation to a parse chart using a sibling finder in the intersection.
    
    Parameters:
    
    interpretationName - name of the interpretation from which the object comes.
    
    input -
    
    Returns:
    
    a tree automaton containing all possible derivation trees that are mapped to the input by the interpretation.
    
    Throws:
    
    ParserException
  - parseCondensedWithPruning
```
public TreeAutomaton parseCondensedWithPruning(Map<String,Object> inputs,
                                               PruningPolicy pp)
```
  - parseInputObjects
```
public TreeAutomaton parseInputObjects(Map<String,Object> inputs)
```
    Parses a map of input objects to a parse chart. The process is as in parse(java.util.Map), except that the "inputs" map is a map of interpretation names to pre-constructed objects of the respective algebras.
    
    Parameters:
    
    inputs -
    
    Returns:
  - decodeToAutomaton
```
public TreeAutomaton decodeToAutomaton(String outputInterpretation,
                                       TreeAutomaton parseChart)
```
    Decodes a parse chart to a term chart over some output algebra. The term chart describes a language of terms over the specified output algebra. This language is the homomorphic image of the parse chart under the homomorphism of the given output interpretation.
    
    Parameters:
    
    outputInterpretation -
    
    parseChart -
    
    Returns:
  - decode
```
public Set<Object> decode(String outputInterpretation,
                          Map<String,String> representations)
                   throws ParserException
```
    Decodes a map of input representations to a set of objects of the specified output algebra. This first computes a parse chart for the input representations, as per parse(java.util.Map). It then decodes the parse chart into an output term chart (see decodeToAutomaton(java.lang.String, de.up.ling.irtg.automata.TreeAutomaton)) and evaluates each term in the language of the term chart to an object in the output algebra. The method returns the set of all of these objects.
    
    Parameters:
    
    outputInterpretation -
    
    representations -
    
    Returns:
    
    Throws:
    
    ParserException
  - trainML
```
public void trainML(Corpus trainingData)
             throws UnsupportedOperationException
```
    Performs maximum likelihood training of this (weighted) IRTG using the given annotated corpus. In the context of an IRTG, "annotated corpus" means that the derivation tree is annotated for each training instance.
    
    Parameters:
    
    trainingData -
    
    Throws:
    
    UnsupportedOperationException
  - trainEM
```
public void trainEM(Corpus trainingData)
```
    Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    The algorithm terminates as soon as the rate at which the likelihood increases drops below 1E-5.
    
    Parameters:
    
    trainingData -
  - trainEM
```
public void trainEM(Corpus trainingData,
                    ProgressListener listener)
```
    Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    The algorithm terminates as soon as the rate at which the likelihood increases drops below 1E-5.
    
    Parameters:
    
    trainingData -
    
    listener -
  - trainEM
```
public void trainEM(Corpus trainingData,
                    int iterations,
                    double threshold,
                    ProgressListener listener)
```
    Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    The algorithm terminates after a given number of iterations or as soon as the rate at which the likelihood increases drops below a given threshold.
    
    Parameters:
    
    trainingData -
    
    iterations - maximum number of iterations allowed
    
    threshold - minimum change in log-likelihood that prevents stopping of the iterations
    
    listener -
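    The interplay of the "iterations" and "threshold" parameters can be sketched as a generic driver loop. This is not the alto implementation: `runUntilConverged` and its `step` callback are hypothetical, and the sketch assumes that the "rate" is measured as the plain difference between the log-likelihoods of consecutive iterations.

```java
import java.util.function.IntToDoubleFunction;

final class StoppingCriterionSketch {
    /**
     * Calls step for iteration indices 0, 1, 2, ... where each call is assumed
     * to perform one training iteration and return the resulting log-likelihood.
     * Stops after maxIterations calls, or as soon as the improvement over the
     * previous iteration is smaller than threshold.
     *
     * @return the number of iterations that were actually run
     */
    static int runUntilConverged(IntToDoubleFunction step, int maxIterations, double threshold) {
        double previous = Double.NEGATIVE_INFINITY;
        int done = 0;
        while (done < maxIterations) {
            double current = step.applyAsDouble(done);
            done++;
            // On the first iteration the difference is +infinity, so we never stop there.
            if (current - previous < threshold) {
                break;
            }
            previous = current;
        }
        return done;
    }
}
```

    With a step function whose improvements shrink over time, the loop ends at whichever limit is hit first, the iteration cap or the threshold.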
  - normalizeRuleWeights
```
public void normalizeRuleWeights()
```
    Modifies the rule weights of the derivation tree automaton such that the weights for all rules with the same parent state sum to one. This calls normalizeWeights on the tree automaton that produces the derivation trees.
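    The effect of this normalization can be illustrated with a small standalone sketch. The `Rule` class and `normalize` method below are hypothetical stand-ins, not the alto types; the sketch also assumes that parent states whose total weight is zero are left unchanged, which the original documentation does not specify.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WeightNormalizationSketch {
    /** A weighted rule, reduced to its parent state, label and weight (hypothetical type). */
    static final class Rule {
        final String parent;
        final String label;
        double weight;

        Rule(String parent, String label, double weight) {
            this.parent = parent;
            this.label = label;
            this.weight = weight;
        }
    }

    /** Rescales weights in place so that rules sharing a parent state sum to one. */
    static void normalize(List<Rule> rules) {
        // First pass: total weight per parent state.
        Map<String, Double> totals = new HashMap<>();
        for (Rule rule : rules) {
            totals.merge(rule.parent, rule.weight, Double::sum);
        }
        // Second pass: divide each weight by the total of its parent state.
        for (Rule rule : rules) {
            double total = totals.get(rule.parent);
            if (total > 0.0) { // assumption: leave all-zero groups untouched
                rule.weight /= total;
            }
        }
    }
}
```

    For example, two rules for state S with weights 2 and 6 end up with weights 0.25 and 0.75, while a single rule for another state ends up with weight 1.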
  - trainVB
```
public void trainVB(Corpus trainingData)
```
    Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012. Iteration will terminate once the change in the ELBO falls below 1E-5.
    
    Parameters:
    
    trainingData - a corpus of parse charts
  - trainVB
```
public void trainVB(Corpus trainingData,
                    ProgressListener listener)
```
    Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012. Iteration will terminate once the change in the ELBO falls below 1E-5.
    
    Parameters:
    
    trainingData - a corpus of parse charts
    
    listener - a progress listener that will be given information about the progress of the optimization.
  - trainVB
```
public void trainVB(Corpus trainingData,
                    int iterations,
                    double threshold,
                    ProgressListener listener)
```
    Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. The corpus may be unannotated; if it contains annotated derivation trees, these are ignored by the algorithm. However, it must contain a parse chart for each instance (see Corpus for details).
    This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012.
    
    Parameters:
    
    trainingData - a corpus of parse charts
    
    iterations - the maximum number of iterations allowed
    
    threshold - the minimum change in the ELBO before iterations are stopped
    
    listener - a progress listener that will be given information about the progress of the optimization.
  - setDebug
```
public void setDebug(boolean debug)
```
    Switches debugging output on or off.
    
    Parameters:
    
    debug -
  - readCorpus
```
public Corpus readCorpus(Reader reader)
                  throws IOException,
                         CorpusReadingException
```
    Loads a corpus for this IRTG using the given reader. The corpus must define a subset of the interpretations which this IRTG defines.
    
    Parameters:
    
    reader -
    
    Returns:
    
    Throws:
    
    IOException
    
    CorpusReadingException
  - bulkParse
```
public void bulkParse(Corpus input,
                      Consumer<Instance> corpusConsumer,
                      ProgressListener listener)
```
    Reads all inputs for this IRTG from a corpus and parses them. This behaves like bulkParse(de.up.ling.irtg.corpus.Corpus, java.util.function.Predicate, java.util.function.Consumer, de.up.ling.irtg.util.ProgressListener) with an instance filter that always returns true.
    
    Parameters:
    
    input -
    
    corpusConsumer -
    
    listener -
  - bulkParse
```
public void bulkParse(Corpus input,
                      Predicate<Instance> filter,
                      Consumer<Instance> corpusConsumer,
                      ProgressListener listener)
```
    Reads inputs for this IRTG from a corpus and parses them. The input corpus must be suitable for this IRTG (i.e., use a subset of the interpretations it defines). If the corpus has charts attached, these will be used; otherwise, each instance for which the "filter" is true is parsed. We then compute the best derivation tree from each chart using Viterbi, and map it to all interpretations of the IRTG. This yields a "completed" Instance (consisting of the derivation tree and values on all interpretations), which we write to the given corpusConsumer (e.g., a CorpusWriter). If a non-null value is passed as the "listener", it is notified after each instance has been written.
    Note that the output corpus may contain fewer instances than the input corpus, if the "filter" returned false on some of the input instances.
    
    Parameters:
    
    input -
    
    filter -
    
    corpusConsumer -
    
    listener -
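    The filter, consumer and listener roles described above follow a common pipeline shape, sketched here with plain `java.util.function` types. Everything in this block is hypothetical: `process` is not an alto method, the `complete` function stands in for "parse, pick the best derivation, and interpret it", and the handling of corpora with attached charts is left out.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

final class FilteredPipelineSketch {
    /**
     * Passes every input instance accepted by the filter through complete,
     * hands the result to the consumer, and notifies the listener (if any)
     * after each written instance.
     *
     * @return the number of instances written to the consumer
     */
    static <I, O> int process(Iterable<I> input,
                              Predicate<I> filter,
                              Function<I, O> complete,
                              Consumer<O> consumer,
                              Runnable listener) {
        int written = 0;
        for (I instance : input) {
            if (!filter.test(instance)) {
                continue; // skipped instances never reach the consumer
            }
            consumer.accept(complete.apply(instance));
            written++;
            if (listener != null) {
                listener.run();
            }
        }
        return written;
    }
}
```

    Because rejected instances are skipped rather than written, the output can be shorter than the input, which matches the note on the output corpus above.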
  - toString
```
public String toString()
```
    Returns a string representation of the IRTG. The IRTG is given in the same format that the IrtgInputCodec understands.
    
    Overrides:
    
    toString in class Object
    
    Returns:
  - equals
```
public boolean equals(Object obj)
```
    Compares the IRTG to another IRTG for equality. Two IRTGs are considered equal if (1) their derivation tree automata are equal; (2) they define the same interpretation names; (3) for each interpretation name, the homomorphisms are equal; (4) for each interpretation name, the interpretations use the same algebra class.
    
    Overrides:
    
    equals in class Object
    
    Parameters:
    
    obj -
    
    Returns:
  - read
```
public static InterpretedTreeAutomaton read(InputStream r)
                                     throws IOException,
                                            CodecParseException
```
    Helper method that reads an IRTG from an input stream as with read from an IRTG input codec.
    
    Parameters:
    
    r -
    
    Returns:
    
    Throws:
    
    IOException
    
    CodecParseException
  - fromString
```
public static InterpretedTreeAutomaton fromString(String s)
                                           throws IOException,
                                                  CodecParseException
```
    Helper method that reads an IRTG from a string as with read from an IRTG input codec.
    
    Parameters:
    
    s -
    
    Returns:
    
    Throws:
    
    IOException
    
    CodecParseException
  - fromPath
```
public static InterpretedTreeAutomaton fromPath(String path)
                                         throws IOException,
                                                CodecParseException
```
    Helper method that creates a stream from the given path and reads it as with read from an IRTG input codec.
    
    Parameters:
    
    path -
    
    Returns:
    
    Throws:
    
    IOException
    
    CodecParseException
  - forAlgebras
```
public static InterpretedTreeAutomaton forAlgebras(Map<String,Algebra> algebras)
```
    Creates an empty IRTG for the given algebras. The IRTG contains a tree automaton with no rules, and one interpretation for each entry of the given map, with the given name and the given algebra.
    
    Parameters:
    
    algebras -
    
    Returns:
  - filterForAppearingConstants
```
public InterpretedTreeAutomaton filterForAppearingConstants(String interpName,
                                                            Object input)
```
    Creates a new IRTG with many of the rules filtered out. The rules are filtered out if they contain a constant in the given interpretation which cannot be used in deriving the given object.
    
    Parameters:
    
    interpName -
    
    input -
    
    Returns:
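    As a rough illustration of filtering by appearing constants, consider a string-valued interpretation in which each lexical rule introduces at most one word. The sketch below is an assumption-laden simplification, not the alto algorithm: `LexRule` and `filter` are hypothetical, and "can be used in deriving the given object" is approximated as "the word occurs in the whitespace-tokenized input sentence".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ConstantFilterSketch {
    /** A rule label with the constant it introduces, or null if it introduces none. */
    static final class LexRule {
        final String label;
        final String constant;

        LexRule(String label, String constant) {
            this.label = label;
            this.constant = constant;
        }
    }

    /** Keeps rules without a constant and rules whose constant occurs in the sentence. */
    static List<LexRule> filter(List<LexRule> rules, String sentence) {
        Set<String> words = new HashSet<>(Arrays.asList(sentence.trim().split("\\s+")));
        List<LexRule> kept = new ArrayList<>();
        for (LexRule rule : rules) {
            if (rule.constant == null || words.contains(rule.constant)) {
                kept.add(rule);
            }
        }
        return kept;
    }
}
```

    The input list is left untouched and a new list is returned, mirroring the way the method above returns a new IRTG rather than modifying the existing one.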
  - filterBinarizedForAppearingConstants
```
public InterpretedTreeAutomaton filterBinarizedForAppearingConstants(String interpName,
                                                                     Object input)
```
    Creates a new IRTG with many of the rules filtered out. A rule is filtered out if it contains a constant in the given interpretation that cannot be used in deriving the given object, or if it is connected by binarization to a rule that has been removed. Note that some binarization techniques, e.g. the "inside" strategy used in InsideRuleFactory, pool rules together after binarization. Removing all rules connected by binarization to rules removed due to constants removes some of these pooled rules that would be necessary for parsing. Thus, for grammars binarized with such strategies, use filterForAppearingConstants(java.lang.String, java.lang.Object) instead.
    
    Parameters:
    
    interpName -
    
    input -
    
    Returns:
