public class MaximumEntropyIrtg extends InterpretedTreeAutomaton
Beyond an ordinary IRTG (see InterpretedTreeAutomaton), a maxent IRTG allows you to specify a set of feature functions (cf. FeatureFunction). In the grammar file, these are declared using the feature keyword, and come after the interpretation declarations and before the rules of the grammar.
You can use any subclass of FeatureFunction in your grammar, as long as it is on the classpath. You can then add concrete instances of your feature function classes to the grammar using one of the following two forms:

feature f1: de.up.ling.irtg.maxent.ChildOfFeature('VP','PP')
Constructs a new object of class ChildOfFeature, passing the strings "VP" and "PP" as the first and second argument to the constructor of the class.

feature f2: YourClass::staticMethod("a", "b")
Calls the static method YourClass#staticMethod with the given string arguments. The static method is supposed to return an object of a subclass of FeatureFunction, which is then used as the feature function instance with name f2.

Note that different feature instances must have different names (in the example: f1 and f2) so the system can keep their weights apart.
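The two declaration forms correspond to two ordinary ways of obtaining an object in Java: calling a constructor and calling a static factory method. The following is a minimal sketch of that distinction, not the library's code. `SketchFeature`, `SketchChildOf`, `SketchFactories`, and `FeatureDeclarationSketch` are hypothetical stand-ins, the `evaluate` signature and its firing condition are invented for illustration, and rejecting duplicate names is only one assumed way of enforcing the unique-name requirement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a feature function; NOT the library's FeatureFunction. */
interface SketchFeature {
    double evaluate(String parentLabel, String childLabel);
}

/** Illustrative feature: returns 1.0 when the given child label occurs under the given parent label. */
final class SketchChildOf implements SketchFeature {
    private final String parent;
    private final String child;

    SketchChildOf(String parent, String child) {
        this.parent = parent;
        this.child = child;
    }

    @Override
    public double evaluate(String parentLabel, String childLabel) {
        return parent.equals(parentLabel) && child.equals(childLabel) ? 1.0 : 0.0;
    }
}

/** Hypothetical class with a static factory, analogous to YourClass::staticMethod("a", "b"). */
final class SketchFactories {
    static SketchFeature childOf(String parent, String child) {
        return new SketchChildOf(parent, child);
    }
}

final class FeatureDeclarationSketch {
    /** Registers a feature under a name and rejects a name that is already taken. */
    static void put(Map<String, SketchFeature> features, String name, SketchFeature feature) {
        if (features.putIfAbsent(name, feature) != null) {
            throw new IllegalArgumentException("duplicate feature name: " + name);
        }
    }

    /** Builds a name-to-feature map using both forms of instantiation. */
    static Map<String, SketchFeature> declare() {
        Map<String, SketchFeature> features = new LinkedHashMap<>();
        put(features, "f1", new SketchChildOf("VP", "PP"));     // constructor form
        put(features, "f2", SketchFactories.childOf("a", "b")); // static-method form
        return features;
    }

    public static void main(String[] args) {
        System.out.println(declare().keySet());
    }
}
```

Keeping one entry per name is what lets each feature instance carry its own weight.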
Supervised learning of the weights of a maxent IRTG is performed using trainMaxent(de.up.ling.irtg.corpus.Corpus).
Computation of a weighted chart (in which rule weights correspond to log-linear scores of trees) is done with parseInputObjects(java.util.Map).
From this, you can compute the best derivation tree using TreeAutomaton.viterbi() and similar methods.
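To make the phrase "log-linear scores" concrete, here is a minimal sketch of the standard log-linear formulation: a rule is scored as the exponential of the weighted sum of its feature values, and a derivation as the product of its rule scores. This is an illustration under stated assumptions, not the class's actual implementation. `LogLinearSketch` and its methods are hypothetical, and the scores are assumed to be unnormalized.

```java
/** Hypothetical illustration of log-linear scoring; not part of the library. */
final class LogLinearSketch {

    /** Unnormalized score of one rule: exp(sum over i of weights[i] * featureValues[i]). */
    static double ruleScore(double[] weights, double[] featureValues) {
        if (weights.length != featureValues.length) {
            throw new IllegalArgumentException("weights and feature values differ in length");
        }
        double dot = 0.0;
        for (int i = 0; i < weights.length; i++) {
            dot += weights[i] * featureValues[i];
        }
        return Math.exp(dot);
    }

    /** Score of a derivation: the product of the scores of the rules it uses. */
    static double derivationScore(double[] weights, double[][] featureValuesPerRule) {
        double score = 1.0;
        for (double[] featureValues : featureValuesPerRule) {
            score *= ruleScore(weights, featureValues);
        }
        return score;
    }

    public static void main(String[] args) {
        double[] weights = {0.5, -1.0};
        // Two rules: the first fires only feature 0, the second only feature 1.
        double[][] rules = {{1.0, 0.0}, {0.0, 1.0}};
        // exp(0.5) * exp(-1.0), which equals exp(-0.5) up to floating-point rounding
        System.out.println(derivationScore(weights, rules));
    }
}
```

Because the product of exponentials is the exponential of the sum, a best-derivation search over such rule weights picks the tree with the highest total weighted feature sum.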
Constructor and Description |
---|
MaximumEntropyIrtg(TreeAutomaton<String> automaton, Map<String,FeatureFunction> featureMap): Constructs a maxent IRTG from a tree automaton and a map of named feature functions. |
Modifier and Type | Method and Description |
---|---|
FeatureFunction | getFeatureFunction(int index): Returns the feature function at the given index. |
FeatureFunction | getFeatureFunction(String name): Returns the feature function with the given name. |
List<String> | getFeatureNames(): Returns the list of feature function names. |
FeatureFunction[] | getFeatures() |
double | getFeatureWeight(int i): Returns the weight of the feature function at index i. |
double[] | getFeatureWeights(): Returns the array of feature function weights. |
int | getNumFeatures(): Returns the number of features. |
TreeAutomaton | parseInputObjects(Map<String,Object> inputs): Parses the input representations, given as a map from names to representations, and computes a chart for this input. The member variable useIrtgParser selects the parser: if true, the parser of InterpretedTreeAutomaton is used; if false, an implementation of a CKY parser is used. |
void | readWeights(Reader reader): Reads the feature function weights from a reader, e.g., one backed by a string or a file. The data must be formatted as Java properties. |
void | setFeatures(Map<String,FeatureFunction> featureMap): Sets the feature functions. |
void | setFeatureWeight(int index, double weight): Sets the weight of a specific feature function. |
void | setFeatureWeights(double[] weights): Sets the array of feature function weights. |
static void | setLoggingLevel(Level level) |
String | toString(): Returns a string representing the object and its elements. |
boolean | trainMaxent(Corpus corpus) |
boolean | trainMaxent(Corpus corpus, ProgressListener listener): Trains the weights for the rules according to the training data. |
void | writeWeights(Writer writer): Writes the feature function weights to a writer, e.g., one backed by a string or a file. The data will be formatted as Java properties. |
Methods inherited from class InterpretedTreeAutomaton:
addAllInterpretations, addInterpretation, bulkParse, bulkParse, decode, decodeToAutomaton, equals, filterBinarizedForAppearingConstants, filterForAppearingConstants, forAlgebras, fromPath, fromString, getAutomaton, getInterpretation, getInterpretations, interpret, interpret, normalizeRuleWeights, parse, parseCondensedWithPruning, parseSimple, parseString, parseWithSiblingFinder, read, readCorpus, setDebug, trainEM, trainEM, trainEM, trainML, trainVB, trainVB, trainVB
public MaximumEntropyIrtg(TreeAutomaton<String> automaton, Map<String,FeatureFunction> featureMap)
Parameters:
automaton - the TreeAutomaton built from the grammar rules
featureMap - a map of feature functions keyed by their names. These functions are used to calculate probabilities for the RTG

public final void setFeatures(Map<String,FeatureFunction> featureMap)
Parameters:
featureMap - the mapping of names to feature functions

public void setFeatureWeights(double[] weights)
Parameters:
weights - the array of feature weights

public void setFeatureWeight(int index, double weight)
Parameters:
index - the position of the weight in the array
weight - the new weight
Throws:
NoFeaturesException - if no features are present

public double getFeatureWeight(int i)
Parameters:
i - the index of the feature function
Throws:
NoFeaturesException - if no features are present

public double[] getFeatureWeights()
Throws:
NoFeaturesException - if no features are present

public List<String> getFeatureNames()

public FeatureFunction getFeatureFunction(String name)
Parameters:
name - the name of the feature function
Throws:
NoFeaturesException - if no features are present

public FeatureFunction getFeatureFunction(int index)
Parameters:
index - the index of the feature function
Throws:
NoFeaturesException - if no features are present

public int getNumFeatures()

public TreeAutomaton parseInputObjects(Map<String,Object> inputs)
Overrides:
parseInputObjects in class InterpretedTreeAutomaton
Parameters:
inputs - a map from names to input representations

public boolean trainMaxent(Corpus corpus)

public boolean trainMaxent(Corpus corpus, ProgressListener listener)
Parameters:
corpus - the training data containing sentences and their parse trees

public void readWeights(Reader reader) throws IOException
Parameters:
reader - the reader to read the data from
Throws:
IOException - if the reader cannot read the data properly

public void writeWeights(Writer writer) throws IOException
Parameters:
writer - the writer to store the data into
Throws:
IOException - if the writer cannot store the data properly

public String toString()
Overrides:
toString in class InterpretedTreeAutomaton
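
The Java properties format mentioned for readWeights and writeWeights can be illustrated with `java.util.Properties` alone. The sketch below round-trips a name-to-weight map through a writer and a reader. It assumes one `name=weight` entry per feature, which this page does not confirm as the exact layout the class uses, and `WeightPropertiesSketch` is a hypothetical helper rather than part of the library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Hypothetical helper showing weights stored as Java properties; not the library's code. */
final class WeightPropertiesSketch {

    /** Writes one "name=weight" property per feature (assumed layout). */
    static void write(Map<String, Double> weights, Writer writer) throws IOException {
        Properties props = new Properties();
        for (Map.Entry<String, Double> entry : weights.entrySet()) {
            props.setProperty(entry.getKey(), Double.toString(entry.getValue()));
        }
        props.store(writer, null); // null: no custom comment line
    }

    /** Reads the properties back into a map sorted by feature name. */
    static Map<String, Double> read(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        Map<String, Double> weights = new TreeMap<>();
        for (String name : props.stringPropertyNames()) {
            weights.put(name, Double.parseDouble(props.getProperty(name)));
        }
        return weights;
    }

    /** Writes to an in-memory string and reads it back. */
    static Map<String, Double> roundTrip(Map<String, Double> weights) {
        try {
            StringWriter out = new StringWriter();
            write(weights, out);
            return read(new StringReader(out.toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new TreeMap<>();
        weights.put("f1", 0.25);
        weights.put("f2", -1.5);
        System.out.println(roundTrip(weights));
    }
}
```

Any `Reader` or `Writer` works the same way, so swapping the in-memory string for a `FileReader` or `FileWriter` covers the file case.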
public FeatureFunction[] getFeatures()
public static void setLoggingLevel(Level level)
Copyright © 2017. All rights reserved.