public class InterpretedTreeAutomaton extends Object implements Serializable
In this implementation of IRTGs, the derivation tree automaton M is given as an object of class TreeAutomaton, which is passed to the constructor as an argument. Each interpretation is added by name, and is represented as an object of class Interpretation. IRTGs are typically read from input streams (e.g. from files) by using an InputCodec, rather than being constructed programmatically.
Constructor and Description |
---|
InterpretedTreeAutomaton(TreeAutomaton<String> automaton): Constructs a new IRTG with the given derivation tree automaton. |
Modifier and Type | Method and Description |
---|---|
void | addAllInterpretations(Map<String,Interpretation> interps): Adds all interpretations in the map, with their respective names. |
void | addInterpretation(String name, Interpretation interp): Adds an interpretation with a given name. |
void | bulkParse(Corpus input, Consumer<Instance> corpusConsumer, ProgressListener listener): Reads all inputs for this IRTG from a corpus and parses them. |
void | bulkParse(Corpus input, Predicate<Instance> filter, Consumer<Instance> corpusConsumer, ProgressListener listener): Reads inputs for this IRTG from a corpus and parses them. |
Set<Object> | decode(String outputInterpretation, Map<String,String> representations): Decodes a map of input representations to a set of objects of the specified output algebra. |
TreeAutomaton | decodeToAutomaton(String outputInterpretation, TreeAutomaton parseChart): Decodes a parse chart to a term chart over some output algebra. |
boolean | equals(Object obj): Compares the IRTG to another IRTG for equality. |
InterpretedTreeAutomaton | filterBinarizedForAppearingConstants(String interpName, Object input): Creates a new IRTG with many of the rules filtered out. |
InterpretedTreeAutomaton | filterForAppearingConstants(String interpName, Object input): Creates a new IRTG with many of the rules filtered out. |
static InterpretedTreeAutomaton | forAlgebras(Map<String,Algebra> algebras): Creates an empty IRTG for the given algebras. |
static InterpretedTreeAutomaton | fromPath(String path): Helper method that creates a stream from the given path and reads it as with read from an IRTG input codec. |
static InterpretedTreeAutomaton | fromString(String s): Helper method that reads an IRTG from a string as with read from an IRTG input codec. |
TreeAutomaton<String> | getAutomaton(): Returns the derivation tree automaton. |
Interpretation | getInterpretation(String interp): Returns the interpretation with the given name. |
Map<String,Interpretation> | getInterpretations(): Returns a map from which the interpretations can be retrieved using their names. |
Map<String,Object> | interpret(Tree<String> derivationTree): Maps a given derivation tree to terms over all interpretations and evaluates them. |
Object | interpret(Tree<String> derivationTree, String interpretationName): Interprets the given derivation tree in the interpretation with the given name, and returns an object of the algebra. |
void | normalizeRuleWeights(): Modifies the rule weights of the derivation tree automaton such that the weights for all rules with the same parent state sum to one. |
TreeAutomaton | parse(Map<String,String> representations): Parses a map of input representations to a parse chart. |
TreeAutomaton | parseCondensedWithPruning(Map<String,Object> inputs, PruningPolicy pp) |
TreeAutomaton | parseInputObjects(Map<String,Object> inputs): Parses a map of input objects to a parse chart. |
TreeAutomaton | parseSimple(String interpretationName, Object input): Parses a single input representation to a parse chart without using any optimization in the parsing process. |
Object | parseString(String interpretation, String representation): Resolves the string representation to an object of the given algebra. |
TreeAutomaton | parseWithSiblingFinder(String interpretationName, Object input): Parses a single input representation to a parse chart without using a sibling finder in the intersection. |
static InterpretedTreeAutomaton | read(InputStream r): Helper method that reads an IRTG from an input stream as with read from an IRTG input codec. |
Corpus | readCorpus(Reader reader): Loads a corpus for this IRTG using the given reader. |
void | setDebug(boolean debug): Switches debugging output on or off. |
String | toString(): Returns a string representation of the IRTG. |
void | trainEM(Corpus trainingData): Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus. |
void | trainEM(Corpus trainingData, int iterations, double threshold, ProgressListener listener): Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener. |
void | trainEM(Corpus trainingData, ProgressListener listener): Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus and gives progress information to the passed progress listener. |
void | trainML(Corpus trainingData): Performs maximum likelihood training of this (weighted) IRTG using the given annotated corpus. |
void | trainVB(Corpus trainingData): Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. |
void | trainVB(Corpus trainingData, int iterations, double threshold, ProgressListener listener): Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. |
void | trainVB(Corpus trainingData, ProgressListener listener): Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus. |
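The summary for normalizeRuleWeights() says that weights for all rules with the same parent state are made to sum to one. The following is a minimal plain-Java sketch of that idea only. WeightedRule and RuleWeightNormalizer are hypothetical stand-ins invented for illustration, not classes of this library, and leaving zero-weight groups untouched is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a weighted rule; not the library's rule class.
final class WeightedRule {
    final String parent;  // parent state of the rule
    final String label;   // rule label
    double weight;        // rule weight, rescaled in place

    WeightedRule(String parent, String label, double weight) {
        this.parent = parent;
        this.label = label;
        this.weight = weight;
    }
}

final class RuleWeightNormalizer {
    // Rescales weights so that, per parent state, they sum to one.
    // Assumption: groups whose total weight is zero are left unchanged.
    static void normalize(List<WeightedRule> rules) {
        // First pass: total weight per parent state.
        Map<String, Double> totals = new HashMap<>();
        for (WeightedRule r : rules) {
            totals.merge(r.parent, r.weight, Double::sum);
        }
        // Second pass: divide each weight by its group's total.
        for (WeightedRule r : rules) {
            double total = totals.get(r.parent);
            if (total > 0.0) {
                r.weight /= total;
            }
        }
    }
}
```

With rules S -> r1 (weight 1.0), S -> r2 (weight 3.0) and NP -> r3 (weight 2.0), the sketch rescales the weights to 0.25, 0.75 and 1.0 respectively.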
public InterpretedTreeAutomaton(TreeAutomaton<String> automaton)
Parameters:
automaton
public void addInterpretation(String name, Interpretation interp)
Parameters:
name
interp
public void addAllInterpretations(Map<String,Interpretation> interps)
Parameters:
interps
public TreeAutomaton<String> getAutomaton()
public Map<String,Interpretation> getInterpretations()
public Interpretation getInterpretation(String interp)
Parameters:
interp
public Object interpret(Tree<String> derivationTree, String interpretationName)
Parameters:
derivationTree
interpretationName
public Map<String,Object> interpret(Tree<String> derivationTree)
Parameters:
derivationTree
public Object parseString(String interpretation, String representation) throws ParserException
Resolves the string representation to an object of the given algebra by calling Algebra.parseString(java.lang.String) on that algebra.
Parameters:
interpretation
representation
Throws:
ParserException
public TreeAutomaton parse(Map<String,String> representations) throws ParserException
The interpretations for which inputs are specified in "representations" may be any subset of the interpretations that this IRTG understands.
Note that this method makes no guarantees regarding reducedness of the resulting tree automaton. Depending on the way parsing was done, it may still contain states that are unreachable or unproductive.
Parameters:
representations
Throws:
ParserException
public TreeAutomaton parseSimple(String interpretationName, Object input) throws ParserException
Parameters:
interpretationName - name of the interpretation from which the object comes.
input
Throws:
ParserException
public TreeAutomaton parseWithSiblingFinder(String interpretationName, Object input) throws ParserException
Parameters:
interpretationName - name of the interpretation from which the object comes.
input
Throws:
ParserException
public TreeAutomaton parseCondensedWithPruning(Map<String,Object> inputs, PruningPolicy pp)
public TreeAutomaton parseInputObjects(Map<String,Object> inputs)
Parses a map of input objects to a parse chart. This works like parse(java.util.Map), except that the "inputs" map is a map of interpretation names to pre-constructed objects of the respective algebras.
Parameters:
inputs
public TreeAutomaton decodeToAutomaton(String outputInterpretation, TreeAutomaton parseChart)
Parameters:
outputInterpretation
parseChart
public Set<Object> decode(String outputInterpretation, Map<String,String> representations) throws ParserException
Decodes a map of input representations to a set of objects of the specified output algebra. The method first computes a parse chart for the input representations, as in parse(java.util.Map). It then decodes the parse chart into an output term chart (see decodeToAutomaton(java.lang.String, de.up.ling.irtg.automata.TreeAutomaton)) and evaluates each term in the language of the term chart to an object in the output algebra. The method returns the set of all of these objects.
Parameters:
outputInterpretation
representations
Throws:
ParserException
public void trainML(Corpus trainingData) throws UnsupportedOperationException
Parameters:
trainingData
Throws:
UnsupportedOperationException
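public void trainEM(Corpus trainingData)
Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus (see Corpus for details).
The algorithm terminates as soon as the rate of likelihood increase drops below 1E-5.
Parameters:
trainingData
public void trainEM(Corpus trainingData, ProgressListener listener)
Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus (see Corpus for details) and gives progress information to the passed progress listener.
The algorithm terminates as soon as the rate of likelihood increase drops below 1E-5.
Parameters:
trainingData
listener
public void trainEM(Corpus trainingData, int iterations, double threshold, ProgressListener listener)
Performs expectation maximization (EM) training of this (weighted) IRTG using the given corpus (see Corpus for details) and gives progress information to the passed progress listener.
The algorithm terminates after a given number of iterations or as soon as the rate of likelihood increase drops below a given threshold.
Parameters:
trainingData
iterations - maximum number of iterations allowed
threshold - minimum change in log-likelihood that prevents stopping of the iterations
listener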
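The stopping rule described for trainEM (an iteration cap combined with a minimum improvement in log-likelihood) can be sketched in plain Java as follows. This is a sketch under stated assumptions, not the library's implementation: ThresholdedTraining and its run method are hypothetical names, and the step supplier stands in for one training iteration that returns the current log-likelihood.

```java
import java.util.function.DoubleSupplier;

// Hypothetical helper illustrating the termination criterion only.
final class ThresholdedTraining {
    // Calls `step` until `maxIterations` is reached or the improvement in
    // log-likelihood over the previous iteration falls below `threshold`.
    // Returns the number of iterations that were performed.
    static int run(DoubleSupplier step, int maxIterations, double threshold) {
        double previous = Double.NEGATIVE_INFINITY;
        int iteration = 0;
        while (iteration < maxIterations) {
            double current = step.getAsDouble(); // one training iteration
            iteration++;
            if (current - previous < threshold) {
                break; // improvement too small: stop early
            }
            previous = current;
        }
        return iteration;
    }
}
```

Starting `previous` at negative infinity guarantees that the first iteration never triggers the early stop, so at least one step is performed whenever `maxIterations` is positive.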
public void normalizeRuleWeights()
Modifies the rule weights of the derivation tree automaton such that the weights for all rules with the same parent state sum to one. This uses normalizeWeights on the tree automaton that produces the derivation trees.
public void trainVB(Corpus trainingData)
Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus (see Corpus for details).
This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012. Iteration will terminate once the change in the ELBO falls below 1E-5.
Parameters:
trainingData - a corpus of parse charts
public void trainVB(Corpus trainingData, ProgressListener listener)
Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus (see Corpus for details).
This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012. Iteration will terminate once the change in the ELBO falls below 1E-5.
Parameters:
trainingData - a corpus of parse charts
listener - a progress listener that will be given information about the progress of the optimization.
public void trainVB(Corpus trainingData, int iterations, double threshold, ProgressListener listener)
Performs Variational Bayes (VB) training of this (weighted) IRTG using the given corpus (see Corpus for details).
This method implements the algorithm from Jones et al., "Semantic Parsing with Bayesian Tree Transducers", ACL 2012.
Parameters:
trainingData - a corpus of parse charts
iterations - the maximum number of iterations allowed
threshold - the minimum change in the ELBO before iterations are stopped
listener - a progress listener that will be given information about the progress of the optimization.
public void setDebug(boolean debug)
Parameters:
debug
public Corpus readCorpus(Reader reader) throws IOException, CorpusReadingException
Parameters:
reader
Throws:
IOException
CorpusReadingException
public void bulkParse(Corpus input, Consumer<Instance> corpusConsumer, ProgressListener listener)
Reads all inputs for this IRTG from a corpus and parses them. This is like bulkParse(de.up.ling.irtg.corpus.Corpus, java.util.function.Predicate, java.util.function.Consumer, de.up.ling.irtg.util.ProgressListener) with an instance filter that always returns true.
Parameters:
input
corpusConsumer
listener
public void bulkParse(Corpus input, Predicate<Instance> filter, Consumer<Instance> corpusConsumer, ProgressListener listener)
Reads inputs for this IRTG from a corpus and parses them. Each parsed input yields an Instance (consisting of the derivation tree and values on all interpretations), which we write to the given corpusConsumer (e.g., a CorpusWriter). If a non-null value is passed as the "listener", it is notified after each instance has been written.
Note that the output corpus may contain fewer instances than the input corpus, if the "filter" returned false on some of the input instances.
Parameters:
input
filter
corpusConsumer
listener
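The filter, consumer and listener roles described for bulkParse can be illustrated with a small plain-Java sketch. It is not the library's code: FilteredBulkProcessing, its process methods and the parser function are hypothetical stand-ins, and a Consumer<Integer> receiving the running count is assumed in place of a ProgressListener.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper mirroring the filter / consumer / listener roles.
final class FilteredBulkProcessing {
    // Processes every input accepted by `filter`, hands each result to
    // `consumer`, and notifies `listener` (if non-null) after each write.
    // Returns the number of results written.
    static <I, O> int process(List<I> inputs, Predicate<I> filter,
                              Function<I, O> parser, Consumer<O> consumer,
                              Consumer<Integer> listener) {
        int written = 0;
        for (I input : inputs) {
            if (!filter.test(input)) {
                continue; // rejected inputs produce no output
            }
            consumer.accept(parser.apply(input));
            written++;
            if (listener != null) {
                listener.accept(written);
            }
        }
        return written;
    }

    // Variant without a filter: delegates with a predicate that is always true.
    static <I, O> int process(List<I> inputs, Function<I, O> parser,
                              Consumer<O> consumer, Consumer<Integer> listener) {
        return process(inputs, x -> true, parser, consumer, listener);
    }
}
```

As in the description above, the number of written results can be smaller than the number of inputs whenever the filter rejects some of them.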
public String toString()
public boolean equals(Object obj)
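public static InterpretedTreeAutomaton read(InputStream r) throws IOException, CodecParseException
Helper method that reads an IRTG from an input stream as with read from an IRTG input codec.
Parameters:
r
Throws:
IOException
CodecParseException
public static InterpretedTreeAutomaton fromString(String s) throws IOException, CodecParseException
Helper method that reads an IRTG from a string as with read from an IRTG input codec.
Parameters:
s
Throws:
IOException
CodecParseException
public static InterpretedTreeAutomaton fromPath(String path) throws IOException, CodecParseException
Helper method that creates a stream from the given path and reads it as with read from an IRTG input codec.
Parameters:
path
Throws:
IOException
CodecParseException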
public static InterpretedTreeAutomaton forAlgebras(Map<String,Algebra> algebras)
Parameters:
algebras
public InterpretedTreeAutomaton filterForAppearingConstants(String interpName, Object input)
Parameters:
interpName
input
public InterpretedTreeAutomaton filterBinarizedForAppearingConstants(String interpName, Object input)
Creates a new IRTG with many of the rules filtered out. Some binarization strategies, such as InsideRuleFactory, pool rules together after binarization. Removing all rules connected by binarization to rules removed due to constants removes some of these pooled rules that would be necessary for parsing. Thus, for grammars binarized with such strategies, use filterForAppearingConstants(java.lang.String, java.lang.Object) instead.
Parameters:
interpName
input
Copyright © 2017. All rights reserved.