Glossary

abstracted over: When forming an abstraction, the variable in the λ-prefix is said to be abstracted over.
active arc: In active chart parsing we record hypotheses about structures to be found, together with what has been found so far, in active arcs.
active edge: Same as active arc.
affix: The smallest units in a word that bear meaning, such as "rabbit" and "s", are called morphemes. Morphemes like "rabbit" that contribute the main meaning of a noun, verb, etc. are also called stems, while the other morphemes are known as affixes.
agenda: In active chart parsing, an agenda is a data structure that keeps track of the tasks that still have to be addressed.
algorithm: A detailed sequence of actions to perform to accomplish some task. Named after the Persian mathematician al-Khwarizmi.
alphabet: The alphabet of a finite state automaton (usually called Σ) is the set of all symbols that can be consumed during its transitions.
ambiguity: In this context, ambiguity describes the situation in which an expression of a (formal) language has more than one analysis.
atomic formula: Atomic formulas are formulas that do not contain any connectives.
bottom-up parsing and recognition: The basic idea of bottom-up parsing and recognition is to begin with the concrete data provided by the input string and to recursively build bigger and bigger pieces of structure.
bound variable: A quantifier is said to bind the variable standing immediately to its right. Within a quantification of the form ∀xϕ or ∃xϕ, all occurrences of x within ϕ are bound by the leading ∀ (resp. ∃) if ϕ itself does not contain any quantifiers. Otherwise, only those occurrences are bound by the leading quantifier that are not bound by any quantifier within ϕ itself. For a more explicit formal definition of bound and free variables, see the main text.
breadth-first: A graph search algorithm which tries all one-step extensions of the current paths before trying larger extensions is called breadth-first.
chart parsing: Chart parsing is based on the idea of keeping a record (a chart) of all the structures found during parsing. It is a very general idea, and it can be used with just about any kind of parsing (or recognition) strategy (top-down, bottom-up, left-corner, depth-first, breadth-first, ...).
concatenation: The concatenation of two strings A and B is the string AB, i.e. the symbols of A immediately followed by those of B in one sequence.
confluence: Property of calculi: a calculus is called confluent if the order of rule application has no influence on the results obtained.
constant symbol: Syntactic entity (in first-order logic) corresponding to a name: a symbol to which one fixed entity from the domain of the model in use is assigned.
context free: Said of a grammar where the syntax of each constituent is independent of the symbols (words) occurring before and after it in a sentence. E.g. the context free rule NP → DET N says that a sequence of words of the types DET and N can be analyzed as an NP irrespective of their neighbouring symbols.
context free grammar: Context free grammars (CFGs) describe a larger class of formal languages than regular languages do. E.g. the language aⁿbⁿ can be described by a CFG. Moreover, many natural language structures can be neatly described with context free grammars. This makes them especially useful for writing grammars for NLP applications. Another reason why CFGs are widely used in the field of NLP is the availability of many efficient tools (compilers, parsers, etc.).
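The language aⁿbⁿ mentioned above can be recognized by a sketch as short as the following (an illustrative Python function, not part of the course material), which directly mirrors the recursive rule S → a S b with S → ε as the base case:

```python
def anbn(s: str) -> bool:
    """Recognize the language a^n b^n (n >= 0) by peeling one 'a' off
    the front and one 'b' off the end, mirroring the rule S -> a S b."""
    if s == "":
        return True          # base case, mirroring S -> epsilon
    return s[0] == "a" and s[-1] == "b" and anbn(s[1:-1])
```

No finite state automaton can do this, since checking that the number of a's equals the number of b's needs unbounded memory.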
context free rule: The rules of a context free grammar (you may also see such rules called rewrite rules, or something similar).
Definite Clause Grammar: A grammar formalism that is built into Prolog. Although DCGs are formally more powerful than CFGs, we use the DCG mechanism to deal with CFGs.
depth-first: Said of a graph search algorithm which extends the current path as far as possible before backtracking (to the last so-called choice point) and trying the next alternative path.
derivation: A sequence of applications of grammar rules (starting with some initial symbols) is called a derivation of the symbols in the last row.
derivation (morphological): The technical term for the process of adding an affix to a word stem, resulting in a word with a class different from that of the stem. Making a noun out of a verb by adding "ation" to the verb is an example: "realize" + "ation" gives us "realization".
dialogue act: Dialogue acts describe what role different utterances play within a dialogue.
dynamic: Prolog puts some restrictions on what can be asserted and retracted. More precisely, predicates are only allowed to assert or retract clauses of predicates that have been declared dynamic. So, if one wants to write a predicate which asserts or retracts clauses of the form happy(mia), one also has to put the following statement into the file: :- dynamic happy/1.
exponential: Computational costs are typically time and space. Exponential problems are such that one of these factors grows exponentially as the input gets more complex.
extraction: If syntactic arguments are moved from their original position, one speaks of extraction. Cf. "John" in "(It is) John, Mary loves."
failure driven loop: The idea of failure driven loops is to force Prolog to backtrack until there are no more possibilities left. This technique is particularly helpful for working with the Prolog database.
feature: Features group similar values within categories. E.g. the feature CASE groups all case information like nominative, genitive, etc. within the category syntax.
final state: Finite state automata by definition contain a list of (zero or more) final states. Only if such a state can be reached (with all input consumed) can an automaton report success for a particular input.
finite state automaton: Same as finite state machine.
finite state generator: A finite state generator is a finite state machine that outputs a sequence of symbols.
finite state machine: An abstract machine consisting of a set of states (including the initial state), an input tape (and possibly an output tape), a state transition function, and some designated terminal states. Finite state recognizers only read from an input tape and finally accept the given input or not. Finite state generators in turn just write to an output tape. Finally, finite state transducers read input and write output in parallel.
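A finite state recognizer can be sketched in a few lines of Python. The automaton below is a made-up example for illustration: over the alphabet {a, b}, it accepts exactly the strings ending in "b".

```python
def fsa_accepts(transitions, start, finals, symbols):
    """Run a deterministic finite state recognizer: accept iff the whole
    input is consumed and a final state is reached."""
    state = start
    for sym in symbols:
        if (state, sym) not in transitions:
            return False            # no transition available: reject
        state = transitions[(state, sym)]
    return state in finals

# Transition function as a dict: (state, symbol) -> next state.
# Example automaton: accepts strings over {a, b} that end in 'b'.
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
```

For instance, `fsa_accepts(delta, 0, {1}, "aab")` succeeds, while `fsa_accepts(delta, 0, {1}, "aaba")` fails.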
finite state parser: A finite state parser is a finite state transducer used for parsing the input.
finite state transducer: A kind of finite state machine that allows one to produce output recording the structure of the input.
first-order language: A first-order language (over some vocabulary) defines how we can use the vocabulary to form complex entities.
formal language: A formal language is a set of strings formally characterized, e.g. by a grammar or an automaton.
free: See free variable.
free variable: Variables (in well formed expressions) that are not bound. For an explicit formal definition of the concepts of bound and free variables, see the main text.
FSA: Abbreviates finite state automaton.
FSM: Abbreviates finite state machine.
functional application: The concatenation of a λ-prefixed expression with its argument (additionally marked by an @-symbol in our notation).
fundamental rule: The fundamental rule for combining a passive edge and an active edge works as follows: suppose the active edge goes from node n1 to node n2 and has category C immediately to the right of the dot. If there is also a passive edge of category C from n2 to some node n3, then we may add a new edge from n1 to n3 in which the dot has moved one position to the right, past C.
gap: Describes syntactic positions where an argument is missing. E.g. in "Who loves Mary?", the second argument position of "loves" is a gap (because the argument "who" was extracted to the front). The assumption is that the arguments of a predicate normally occur to its right, as in: love(X,Y).
inflection: Morphemes can be combined to form words that are legal in some language. Two kinds of processes can be distinguished here, called inflection and derivation. Inflection is usually taken to be the process of adding a grammatical affix to a word stem, forming a word of the same class as the stem. Adding plural "s" to a noun stem, for example, is an inflectional process.
input tape: One part of a finite state machine.
jump arc: Jump arcs let a finite state machine jump from one state to another without emitting or reading a symbol.
Kleene closure: L* denotes the Kleene closure of a formal language L. It contains all strings obtained by concatenating any number of strings from L. In particular it contains the empty string ε as the result of concatenating zero strings from L.
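Since L* is infinite whenever L contains a non-empty string, it can only be enumerated up to some bound. The following illustrative Python sketch (not from the course material) collects all strings built from at most a given number of pieces of L:

```python
from itertools import product

def kleene_up_to(L, max_pieces):
    """All strings obtained by concatenating at most max_pieces strings
    from L. The empty string, from zero pieces, is always included."""
    result = {""}                   # concatenation of zero strings: epsilon
    for n in range(1, max_pieces + 1):
        for pieces in product(L, repeat=n):
            result.add("".join(pieces))
    return result
```

E.g. for L = {"a", "bb"} and at most two pieces we obtain {"", "a", "bb", "aa", "abb", "bba", "bbbb"}.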
language accepted: The language accepted (or recognized) by an FSA is the set of all strings it recognizes when used in recognition mode. The language accepted by a formal grammar is the set of all strings the grammar accepts.
language generated: The language generated by an FSA is the set of all strings it can generate when used in generation mode.
language generated by a context free grammar: The language generated by a context free grammar is the set of sentences (strings of terminal symbols) that can be derived starting from the start symbol.
left corner: The left corner of a rule is the first symbol on its right hand side. For example, NP is the left corner of the rule S → NP VP, and IV is the left corner of the rule VP → IV. Similarly, we can say that vincent is the left corner of the lexical rule PN → vincent.
left-corner parsing: The basic idea in left-corner parsing is to alternate bottom-up and top-down steps.
left-corner table: The left-corner table of a grammar stores which constituents can be at the left corner of which other constituents. This allows one to check, every time a rule is chosen, whether it makes sense to use this rule given the top-down information about the category to be recognized.
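A left-corner table can be computed as a fixpoint: the left corner of a rule's left-hand side includes the first right-hand-side symbol and, transitively, that symbol's own left corners. A small illustrative Python sketch (the grammar is the S → NP VP example used above):

```python
def left_corner_table(rules):
    """rules: list of (lhs, rhs) pairs, rhs a non-empty list of symbols.
    Returns a dict mapping each category to the set of symbols that can
    be its left corner (reflexively including the category itself)."""
    table = {lhs: {lhs} for lhs, _ in rules}
    changed = True
    while changed:                  # iterate to a fixpoint
        changed = False
        for lhs, rhs in rules:
            first = rhs[0]
            # the left corners of the first rhs symbol are left corners of lhs
            new = {first} | table.get(first, {first})
            if not new <= table[lhs]:
                table[lhs] |= new
                changed = True
    return table

rules = [("S", ["NP", "VP"]), ("NP", ["PN"]), ("VP", ["IV"]),
         ("PN", ["vincent"]), ("IV", ["died"])]
table = left_corner_table(rules)
```

Here `table["S"]` contains S, NP, PN and vincent, so a left-corner parser that has just found "vincent" may sensibly try to build an S.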
lexical rule: A grammar rule which has only words on its right hand side.
local ambiguity: Ambiguity within a globally unambiguous constituent. E.g. the phrase "short walk" has two analyses, while the larger sentences "The short walk ended." and "The short walk slowly." do not.
matrix: If ϕ is a wff, then both ∀xϕ and ∃xϕ are wffs. We call ϕ the matrix or scope of such wffs.
meaning construction: The process of constructing semantic representations out of (natural language) strings, usually along the lines of some syntactic analysis.
meaning representation: Meaning as such is a very abstract concept. It's not at all easy to imagine what it is to get to the meaning of a sentence, let alone to construct it from that sentence. To study meaning, and especially to deal with it on a computer, one needs a handle on it, something more concrete. Meaning representations, i.e. strings of a formal language that has technical advantages over natural language (like first-order logic), are often used for this purpose.
morpheme: Minimal meaningful language unit; cannot be divided into smaller meaningful units. Morphemes like "rabbit" that contribute the main meaning of a noun, verb, etc. are also called stems, while the other morphemes are known as affixes.
movement: The explanation, in various (especially early) theories of formal linguistics, of particular syntactic structures. In short, it is assumed there that constituents can move within some kind of underlying syntax tree.
name: FOL: same as constant symbol.
non-deterministic: Said of algorithms that allow for more than one subsequent step ("action") in some configuration.
non-terminal symbol: A symbol of a formal grammar that occurs on the left hand side of some rule's arrow.
Occurs-Check: For reasons of efficiency, Prolog's standard unification mechanism leaves out the so-called occurs check. This plays no role most of the time, but in some cases it does. A canonical example of what goes wrong is the following:

?- A = f(A).

A = f(f(f(f(f(f(f(f(f(f(...)))))))))) 
Yes

If one asks Prolog to unify a variable with a complex term, it does not check whether this variable occurs inside the term. Therefore the variable and the term unify in the example above, resulting in an infinite structure.
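To see what the occurs check actually tests, here is a toy unifier sketched in Python (the term representation is our own invention: variables are capitalized strings, compound terms are tuples). Unlike Prolog's default unification, it performs the check, so A = f(A) fails instead of building an infinite structure:

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings in the substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """The occurs check: does var occur anywhere inside term t?"""
    t = walk(t, subst)
    if t == var:
        return True
    if isinstance(t, tuple):
        return any(occurs(var, arg, subst) for arg in t[1:])
    return False

def unify(a, b, subst):
    """Return an extended substitution, or None on failure."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        if occurs(a, b, subst):
            return None          # occurs check: A = f(A) fails here
        return {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None
```

With this unifier, `unify("A", ("f", "A"), {})` returns None, exactly where Prolog (without the check) would succeed with an infinite term.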
output tape: Part of some kinds of finite state machines.
parser: Parsers give an analysis of their input. In contrast, recognizers just accept it or not.
parse tree: A parse tree is a finite tree all of whose nodes are licensed (admitted) by a grammar rule. The leaf nodes must be licensed by lexical rules, the interior nodes (that is, nodes with daughters) by phrase structure rules.
passive arc: A complete arc in active chart parsing that has found everything it hypothesized. In graphical notation, these are the arcs where the `.' is at the end of the label.
passive chart parsing: A parsing algorithm that uses a chart as a record of all constituents that have been parsed so far.
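A minimal sketch of a passive chart recognizer (illustrative Python, not the course implementation): the chart stores completed constituents as (category, from, to) triples over string positions, and rules are applied bottom-up until no new entries can be added.

```python
def chart_recognize(words, lexicon, rules, start="S"):
    """Passive bottom-up chart recognition. lexicon: word -> set of
    categories; rules: (lhs, rhs) with rhs a tuple of 1 or 2 symbols."""
    n = len(words)
    # seed the chart with lexical edges spanning one word each
    chart = {(c, i, i + 1) for i, w in enumerate(words) for c in lexicon[w]}
    changed = True
    while changed:                  # keep applying rules to a fixpoint
        changed = False
        for lhs, rhs in rules:
            for (c1, i, j) in list(chart):
                if c1 != rhs[0]:
                    continue
                if len(rhs) == 1:
                    candidates = [(lhs, i, j)]
                else:
                    candidates = [(lhs, i, k) for (c2, j2, k) in list(chart)
                                  if j2 == j and c2 == rhs[1]]
                for edge in candidates:
                    if edge not in chart:
                        chart.add(edge)
                        changed = True
    return (start, 0, n) in chart

lexicon = {"vincent": {"PN"}, "died": {"IV"}}
rules = [("S", ("NP", "VP")), ("NP", ("PN",)), ("VP", ("IV",))]
```

With this grammar, `chart_recognize(["vincent", "died"], lexicon, rules)` succeeds: the chart grows from the lexical edges (PN,0,1) and (IV,1,2) to (NP,0,1), (VP,1,2) and finally (S,0,2).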
passive edge: Same as passive arc.
phrase structure grammar: PSGs provide a constituent analysis of a string (in terms of its constituent phrases). PSGs are usually represented as rewrite rules, e.g. phrase structure rules: S → NP VP, NP → PN, VP → IV, ... and lexical rules: PN → volker, IV → died.
phrase structure rule: Rules that are part of a phrase structure grammar.
polynomial: Said of problems where the resources needed to find the solution (see exponential) are related to the input size by a polynomial function.
predicate symbol: A symbol (e.g. in first order logic) with which we can speak about a certain property of, or relation between, individuals (therefore predicate symbols are also called relation symbols). Every predicate symbol has an arity which specifies the number of arguments the predicate takes. E.g. the predicate symbol LOVE is of arity 2 and talks about a two-place relation, and the predicate symbol THERAPIST is of arity 1 and talks about a property.
preterminal symbol: A symbol of a formal grammar that occurs on the left-hand side of the arrow in a lexical rule.
recognizer: See parser.
regular expression: Any description of a pattern composed from combinations of symbols and the three operators: concatenation, disjunction, and closure.
regular language: A language generated by a regular expression.
relation symbol: Same as predicate symbol.
restriction: Determiners in natural language usually specify a set of individuals and attribute some property to them. The two parts specifying the group and the property are called restriction and scope, respectively. E.g. in the sentence "Every wife has a husband.", "wife" is the restriction and "has a husband" the scope of the universal quantifier.
rewrite arrow: The arrow (→) used in writing context free rules.
rewrite interpretation: The simplest interpretation of context free grammars is the rewrite interpretation. This views CFGs as rules which let us rewrite the start symbol to other strings: each rule is interpreted as saying that we can replace an occurrence of the symbol on the left side of the rule by the symbol(s) on the right side.
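The rewrite interpretation can be sketched directly (an illustrative Python helper, with the leftmost-rewriting convention chosen here for definiteness): starting from ["S"], each rule application replaces the leftmost occurrence of the rule's left-hand symbol by its right-hand side.

```python
def rewrite_leftmost(sentential, rule_sequence):
    """Apply a sequence of context free rules, each time rewriting the
    leftmost occurrence of the rule's left-hand side symbol."""
    form = list(sentential)
    for lhs, rhs in rule_sequence:
        i = form.index(lhs)          # leftmost occurrence of lhs
        form[i:i + 1] = rhs          # splice in the right-hand side
    return form
```

For example, applying S → NP VP, NP → PN, PN → vincent, VP → IV, IV → died in that order rewrites ["S"] step by step into ["vincent", "died"], which is exactly a derivation in the sense defined above.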
rewrite rule: Same as context free rule.
scope: Same as matrix.
sentence: A (logical) formula that contains no occurrences of free variables is called a sentence.
sentence symbol: Every context free grammar has one special symbol, the start symbol (or sentence symbol), which is usually written `S'.
speech act: Speech acts describe the use of language to perform some act, e.g. a promise, reply, or request.
spurious analysis: One speaks of a spurious analysis in cases where a parser produces two identical analyses for the same input, e.g. by using the same grammar rules in a different order.
start state: One marked state of a finite state machine, in which processing begins.
start symbol: Same as sentence symbol.
stem: See morpheme.
subcategorize: One of the most important concepts in (formal) linguistics is that of subcategorization. Certain word classes are assumed to require (subcategorize for) other word classes (so-called complements) to form complete phrases. E.g. nouns are assumed to subcategorize for a determiner. Probably most important are verbs, which subcategorize for nouns (a subject and possibly objects) to form a sentence.
syntactic structure: The structure of a string of symbols according to some (formal) grammar. E.g. the syntactic structure of the natural language sentence "John walks." is often represented as [S [NP [PN John]] [VP [IV walks]]].
tape: Part of finite state machines.
term: FOL: any constant or any variable is a term. Roughly speaking, terms are the noun phrases of first-order languages.
terminal state: Same as final state.
terminal symbol: A symbol of a formal grammar that may only occur on the right hand side of the arrow.
threading: A technique for explaining/treating so-called long distance phenomena within natural language analysis. The problem is that parts of a sentence that belong together are often distributed over the surface structure. E.g. in the noun phrase "The boy which Mary, who has become a scientist recently, visited last night", "the boy" is an argument of "visited". One explanation of such phenomena uses missing (empty) elements (gaps) and describes how these elements and the so-called fillers are connected through the phrases as if there were a thread connecting them.
top-down parsing/recognition: In top-down parsing/recognition one starts at the most abstract level (the level of sentences) and works down to the most concrete level (the level of words).
transition: Transitions are defined by the transition function of a finite state machine.
tree admissibility rule: Next to the rewrite interpretation of CFGs, there is a more fundamental way of thinking about CFGs, namely as tree admissibility rules. According to this view, CFGs describe sets of well-formed parse trees.
well-formed formula: FOL: formulas that are built from a vocabulary, the logical symbols of FOL, and first-order variables according to the syntax rules of FOL.
α-conversion: The operation of producing an alphabetical variant of a λ-abstraction by replacing all occurrences of the variable abstracted over by another one. Used (for instance) in λ-based semantic construction to avoid accidental variable bindings in the target representation.
α-equivalent: Two λ-expressions are called α-equivalent if they only differ in the names of λ-bound variables. In what follows we often treat α-equivalent expressions as if they were identical. For example, we will sometimes say that the lexical entry for some word is a certain λ-expression, but when we actually work out some semantic construction, we might use an α-equivalent expression instead.
β-reduction: Consider the compound expression (λx.WOMAN(x))@MARY. It consists of the abstraction λx.WOMAN(x) written immediately to the left of the expression MARY, both joined together by @. Such a concatenation is called functional application; the left-hand expression is called the functor, and the right-hand expression the argument. The concatenation is an instruction to discard the λx. prefix of the functor, and to replace every occurrence of x that was bound by this prefix with the argument. We call this substitution process β-reduction (other common names include β-conversion and λ-conversion). Performing the β-reduction demanded in this example yields WOMAN(MARY).
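The substitution step has a direct analogue in any language with first-class functions. In this illustrative Python sketch (WOMAN and MARY are the stand-in names from the example above), applying the function plays the role of @, and Python's own evaluation performs the β-reduction:

```python
def woman(x):
    """Build the atomic formula WOMAN(x) as a simple tuple."""
    return ("WOMAN", x)

functor = lambda x: woman(x)   # the abstraction λx.WOMAN(x)
argument = "MARY"              # the argument MARY
result = functor(argument)     # functional application; β-reduction
# result is ("WOMAN", "MARY"), i.e. the formula WOMAN(MARY)
```

Calling the functor substitutes the argument for every occurrence of the bound variable, just as β-reduction replaces x with MARY in WOMAN(x).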
λ-abstraction: λ-expressions are formed out of ordinary first order formulas using the λ-operator. We can prefix the λ-operator, followed by a variable, to any first order formula or λ-expression. We call expressions with such prefixes λ-abstractions (or, more simply, abstractions). We say that the variable following a λ-operator is abstracted over. And we say that the variable abstracted over is (λ-)bound by its respective λ-operator within an abstraction, just as we say that a quantified variable is bound by its quantifier inside a quantification.
λ-bound: The variable occurring to the right of a λ-operator is λ-bound by this operator within the expression in its scope. If the same variable occurs to the right of another λ within that expression, it is in turn bound by that λ within its scope.
As an example, look at the following two λ-terms: in λx.WOMAN(x), the single occurrence of x is bound by the one λ-operator; in λx.(WOMAN(x) ∧ (λx.MAN(x))@x), the occurrence of x inside MAN(x) is bound by the inner λ, while the remaining occurrences of x are bound by the outer λ.