Glossary

abstracted over: When forming an abstraction, the variable in the λ-prefix is said to be abstracted over.
active arc: In active chart parsing we record hypotheses about structures to be found, together with what has been found so far, in active arcs.
active edge: Same as active arc.
affix: The smallest units in a word that bear meaning, such as "rabbit" and "s", are called morphemes. Morphemes like "rabbit" that contribute the main meaning of a noun, verb, etc. are also called stems, while the other morphemes are known as affixes.
agenda: In active chart parsing, an agenda is a data structure that keeps track of the tasks that still have to be addressed.
algorithm: A detailed sequence of actions to perform to accomplish some task. Named after the Persian mathematician al-Khwarizmi.
alphabet: The alphabet of a finite state automaton (usually called Σ) is the set of all symbols that can be consumed during its transitions.
ambiguity: In this context, ambiguity describes the situation in which an expression of a (formal) language has more than one analysis.
atomic formula: Atomic formulas are formulas that do not contain any connectives.
bottom-up parsing and recognition: The basic idea of bottom-up parsing and recognition is to begin with the concrete data provided by the input string and to recursively build bigger and bigger pieces of structure.
bound variable: A quantifier is said to bind the variable standing immediately to its right. Within a quantification of the form ∀xϕ or ∃xϕ, all occurrences of x within ϕ are bound by the leading ∀ (resp. ∃) if ϕ itself does not contain any quantifiers. Otherwise, only those occurrences are bound by the leading quantifier that are not bound by any quantifier within ϕ itself. For a more explicit formal definition of bound and free variables, see the main text.
breadth-first: A graph search algorithm which tries all one-step extensions of the current paths before trying larger extensions is called breadth-first.
chart parsing: Chart parsing is based on the idea of keeping a record (a chart) of all the structures found during parsing. It is a very general idea, and it can be used with just about any kind of parsing (or recognition) strategy (top-down, bottom-up, left-corner, depth-first, breadth-first, ...).
concatenation: The concatenation of two strings A and B is the string AB, i.e. the symbols of A immediately followed by those of B in one sequence.
confluence: Property of calculi: a calculus is called confluent if the order of rule application has no influence on the results obtained.
constant symbol: Syntactic entity (in first-order logic) corresponding to a name: a symbol to which one fixed entity from the domain of the model in use is assigned.
context free: Said of a grammar where the syntax of each constituent is independent of the symbols (words) occurring before and after it in a sentence. E.g. the context free rule NP → DET N says that a sequence of words of the types DET and N can be analyzed as an NP irrespective of their neighbouring symbols.
context free grammar: Context free grammars (CFGs) describe a larger class of formal languages than regular languages do. E.g. the language aⁿbⁿ can be described by a CFG. Moreover, many natural language structures can be neatly described with context free grammars. This makes them especially useful for writing grammars for NLP applications. Another reason why CFGs are widely used in the field of NLP is the availability of many efficient tools (compilers, parsers, etc.).
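The language aⁿbⁿ mentioned above can be recognized by a sketch as short as the following (an illustrative Python function, not part of the course material), which directly mirrors the recursive rule S → a S b with S → ε as the base case:

```python
def anbn(s: str) -> bool:
    """Recognize the language a^n b^n (n >= 0) by peeling one 'a' off
    the front and one 'b' off the end, mirroring the rule S -> a S b."""
    if s == "":
        return True          # base case, mirroring S -> epsilon
    return s[0] == "a" and s[-1] == "b" and anbn(s[1:-1])
```

No finite state automaton can do this, since checking that the number of a's equals the number of b's needs unbounded memory.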
context free rule: The rules of a context free grammar (you may also see such rules called rewrite rules, or something similar).
Definite Clause Grammar: A grammar formalism that is built into Prolog. Although DCGs are formally more powerful than CFGs, we use the DCG mechanism to deal with CFGs.
depth-first: Said of a graph search algorithm which extends the current path as far as possible before backtracking (to the last so-called choice point) and trying the next alternative path.
derivation: A sequence of applications of grammar rules (starting with some initial symbols) is called a derivation of the symbols in the last row.
derivation (morphological): The technical term for the process of adding an affix to a word stem, resulting in a word with a class different from that of the stem. Making a noun out of a verb by adding "ation" to the verb is an example: "realize" + "ation" gives us "realization".
dialogue act: Dialogue acts describe what role different utterances play within a dialogue.
dynamic: Prolog puts some restrictions on what can be asserted and retracted. More precisely, predicates are only allowed to assert or retract clauses of predicates that have been declared dynamic. So, if one wants to write a predicate which asserts or retracts clauses of the form happy(mia), one also has to put the following statement into the file: :- dynamic happy/1.
exponential: Computational costs are typically time and space. Exponential problems are such that one of these factors grows exponentially as the input gets more complex.
extraction: If syntactic arguments are moved from their original position, one speaks of extraction. Cf. "John" in "(It is) John, Mary loves."
failure driven loop: The idea of failure driven loops is to force Prolog to backtrack until there are no more possibilities left. This technique is particularly helpful for working with the Prolog database.
feature: Features group similar values within categories. E.g. the feature CASE groups all case information like nominative, genitive, etc. within the category syntax.
final state: Finite state automata by definition contain a list of (zero or more) final states. Only if such a state can be reached (with all input consumed) can an automaton report success for a particular input.
finite state automaton: Same as finite state machine.
finite state generator: A finite state generator is a finite state machine that outputs a sequence of symbols.
finite state machine: An abstract machine consisting of a set of states (including the initial state), an input tape (and possibly an output tape), a state transition function, and some designated terminal states. Finite state recognizers only read from an input tape and finally accept the given input or not. Finite state generators in turn just write to an output tape. Finally, finite state transducers read input and write output in parallel.
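A finite state recognizer can be sketched in a few lines of Python. The automaton below is a made-up example for illustration: over the alphabet {a, b}, it accepts exactly the strings ending in "b".

```python
def fsa_accepts(transitions, start, finals, symbols):
    """Run a deterministic finite state recognizer: accept iff the whole
    input is consumed and a final state is reached."""
    state = start
    for sym in symbols:
        if (state, sym) not in transitions:
            return False            # no transition available: reject
        state = transitions[(state, sym)]
    return state in finals

# Transition function as a dict: (state, symbol) -> next state.
# Example automaton: accepts strings over {a, b} that end in 'b'.
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
```

For instance, `fsa_accepts(delta, 0, {1}, "aab")` succeeds, while `fsa_accepts(delta, 0, {1}, "aaba")` fails.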
finite state parser: A finite state parser is a finite state transducer used for parsing the input.
finite state transducer: A kind of finite state machine that allows one to produce output recording the structure of the input.
first-order language: A first-order language (over some vocabulary) defines how we can use the vocabulary to form complex entities.
formal language: A formal language is a set of strings formally characterized, e.g. by a grammar or an automaton.
free: See free variable.
free variable: Variables (in well formed expressions) that are not bound. For an explicit formal definition of the concepts of bound and free variables, see the main text.
FSA: Abbreviates finite state automaton.
FSM: Abbreviates finite state machine.
functional application: The concatenation of a λ-prefixed expression with its argument (additionally marked by an @-symbol in our notation).
fundamental rule: The fundamental rule for combining a passive edge and an active edge works as follows: suppose the active edge goes from node n1 to node n2 and has category C immediately to the right of the dot. If there is also a passive edge of category C from n2 to some node n3, then we may add a new edge from n1 to n3 in which the dot has moved one position to the right, past C.
gap: Describes syntactic positions where an argument is missing. E.g. in "Who loves Mary?", the second argument position of "loves" is a gap (because the argument "who" was extracted to the front). The assumption is that the arguments of a predicate normally occur to its right, as in: love(X,Y).
inflection: Morphemes can be combined to form words that are legal in some language. Two kinds of processes can be distinguished here, called inflection and derivation. Inflection is usually taken to be the process of adding a grammatical affix to a word stem, forming a word of the same class as the stem. Adding plural "s" to a noun stem, for example, is an inflectional process.
input tape: One part of a finite state machine.
jump arc: Jump arcs let a finite state machine jump from one state to another without emitting or reading a symbol.
Kleene closure: L* denotes the Kleene closure of a formal language L. It contains all strings obtained by concatenating any number of strings from L. In particular it contains the empty string ε as the result of concatenating zero strings from L.
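Since L* is infinite whenever L contains a non-empty string, it can only be enumerated up to some bound. The following illustrative Python sketch (not from the course material) collects all strings built from at most a given number of pieces of L:

```python
from itertools import product

def kleene_up_to(L, max_pieces):
    """All strings obtained by concatenating at most max_pieces strings
    from L. The empty string, from zero pieces, is always included."""
    result = {""}                   # concatenation of zero strings: epsilon
    for n in range(1, max_pieces + 1):
        for pieces in product(L, repeat=n):
            result.add("".join(pieces))
    return result
```

E.g. for L = {"a", "bb"} and at most two pieces we obtain {"", "a", "bb", "aa", "abb", "bba", "bbbb"}.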
language accepted: The language accepted (or recognized) by an FSA is the set of all strings it recognizes when used in recognition mode. The language accepted by a formal grammar is the set of all strings the grammar accepts.
language generated: The language generated by an FSA is the set of all strings it can generate when used in generation mode.
language generated by a context free grammar: The language generated by a context free grammar is the set of sentences (strings of terminal symbols) that can be derived starting from the start symbol.
left corner: The left corner of a rule is the first symbol on its right hand side. For example, NP is the left corner of the rule S → NP VP, and IV is the left corner of the rule VP → IV. Similarly, we can say that vincent is the left corner of the lexical rule PN → vincent.
left-corner parsing: The basic idea in left-corner parsing is to alternate bottom-up and top-down steps.
left-corner table: The left-corner table of a grammar stores which constituents can be at the left corner of which other constituents. This allows one to check, every time a rule is chosen, whether it makes sense to use this rule given the top-down information about the category to be recognized.
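A left-corner table can be computed as a fixpoint: the left corner of a rule's left-hand side includes the first right-hand-side symbol and, transitively, that symbol's own left corners. A small illustrative Python sketch (the grammar is the S → NP VP example used above):

```python
def left_corner_table(rules):
    """rules: list of (lhs, rhs) pairs, rhs a non-empty list of symbols.
    Returns a dict mapping each category to the set of symbols that can
    be its left corner (reflexively including the category itself)."""
    table = {lhs: {lhs} for lhs, _ in rules}
    changed = True
    while changed:                  # iterate to a fixpoint
        changed = False
        for lhs, rhs in rules:
            first = rhs[0]
            # the left corners of the first rhs symbol are left corners of lhs
            new = {first} | table.get(first, {first})
            if not new <= table[lhs]:
                table[lhs] |= new
                changed = True
    return table

rules = [("S", ["NP", "VP"]), ("NP", ["PN"]), ("VP", ["IV"]),
         ("PN", ["vincent"]), ("IV", ["died"])]
table = left_corner_table(rules)
```

Here `table["S"]` contains S, NP, PN and vincent, so a left-corner parser that has just found "vincent" may sensibly try to build an S.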
lexical rule: A grammar rule which has only words on its right hand side.
local ambiguity: Ambiguity within a globally unambiguous constituent. E.g. the phrase "short walk" has two analyses, while the larger sentences "The short walk ended." and "The short walk slowly." do not.
matrix: If ϕ is a wff, then both ∀xϕ and ∃xϕ are wffs. We call ϕ the matrix or scope of such wffs.
meaning construction: The process of constructing semantic representations out of (natural language) strings, usually along the lines of some syntactic analysis.
meaning representation: Meaning as such is a very abstract concept. It's not at all easy to imagine what it is to get to the meaning of a sentence, let alone to construct it from that sentence. To study meaning, and especially to deal with it on a computer, one needs a handle on it, something more concrete. Meaning representations, i.e. strings of a formal language that has technical advantages over natural language (like first-order logic), are often used for this purpose.
morpheme: Minimal meaningful language unit; cannot be divided into smaller meaningful units. Morphemes like "rabbit" that contribute the main meaning of a noun, verb, etc. are also called stems, while the other morphemes are known as affixes.
movement: The explanation, in various (especially early) theories of formal linguistics, of particular syntactic structures. In short, it is assumed there that constituents can move within some kind of underlying syntax tree.
name: FOL: same as constant symbol.
non-deterministic: Said of algorithms that allow for more than one subsequent step ("action") in some configuration.
non-terminal symbol: A symbol of a formal grammar that occurs on the left hand side of some rule's arrow.
Occurs-Check: For reasons of efficiency, Prolog's standard unification mechanism leaves out the so-called occurs check. This plays no role most of the time, but in some cases it does. A canonical example of what goes wrong is the following:

?- A = f(A).

A = f(f(f(f(f(f(f(f(f(f(...)))))))))) 
Yes

If one asks Prolog to unify a variable with a complex term, it does not check whether this variable occurs inside the term. Therefore the variable and the term unify in the example above, resulting in an infinite structure.
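To see what the occurs check actually tests, here is a toy unifier sketched in Python (the term representation is our own invention: variables are capitalized strings, compound terms are tuples). Unlike Prolog's default unification, it performs the check, so A = f(A) fails instead of building an infinite structure:

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter."""
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    """Follow variable bindings in the substitution."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def occurs(var, t, subst):
    """The occurs check: does var occur anywhere inside term t?"""
    t = walk(t, subst)
    if t == var:
        return True
    if isinstance(t, tuple):
        return any(occurs(var, arg, subst) for arg in t[1:])
    return False

def unify(a, b, subst):
    """Return an extended substitution, or None on failure."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        if occurs(a, b, subst):
            return None          # occurs check: A = f(A) fails here
        return {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None
```

With this unifier, `unify("A", ("f", "A"), {})` returns None, exactly where Prolog (without the check) would succeed with an infinite term.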
output tape: Part of some kinds of finite state machines.
parser: Parsers give an analysis of their input. In contrast, recognizers just accept it or not.
parse tree: A parse tree is a finite tree all of whose nodes are licensed (admitted) by a grammar rule. The leaf nodes must be licensed by lexical rules, the interior nodes (that is, nodes with daughters) by phrase structure rules.
passive arc: A complete arc in active chart parsing that has found everything it hypothesized. In graphical notation, these are the arcs where the `.' is at the end of the label.
passive chart parsing: A parsing algorithm that uses a chart as a record of all constituents that have been parsed so far.
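A minimal sketch of a passive chart recognizer (illustrative Python, not the course implementation): the chart stores completed constituents as (category, from, to) triples over string positions, and rules are applied bottom-up until no new entries can be added.

```python
def chart_recognize(words, lexicon, rules, start="S"):
    """Passive bottom-up chart recognition. lexicon: word -> set of
    categories; rules: (lhs, rhs) with rhs a tuple of 1 or 2 symbols."""
    n = len(words)
    # seed the chart with lexical edges spanning one word each
    chart = {(c, i, i + 1) for i, w in enumerate(words) for c in lexicon[w]}
    changed = True
    while changed:                  # keep applying rules to a fixpoint
        changed = False
        for lhs, rhs in rules:
            for (c1, i, j) in list(chart):
                if c1 != rhs[0]:
                    continue
                if len(rhs) == 1:
                    candidates = [(lhs, i, j)]
                else:
                    candidates = [(lhs, i, k) for (c2, j2, k) in list(chart)
                                  if j2 == j and c2 == rhs[1]]
                for edge in candidates:
                    if edge not in chart:
                        chart.add(edge)
                        changed = True
    return (start, 0, n) in chart

lexicon = {"vincent": {"PN"}, "died": {"IV"}}
rules = [("S", ("NP", "VP")), ("NP", ("PN",)), ("VP", ("IV",))]
```

With this grammar, `chart_recognize(["vincent", "died"], lexicon, rules)` succeeds: the chart grows from the lexical edges (PN,0,1) and (IV,1,2) to (NP,0,1), (VP,1,2) and finally (S,0,2).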
passive edge: Same as passive arc.
phrase structure grammar: PSGs provide a constituent analysis of a string (in terms of its constituent phrases). PSGs are usually represented as rewrite rules, e.g. phrase structure rules: S → NP VP, NP → PN, VP → IV, ... and lexical rules: PN → volker, IV → died.
phrase structure rule: Rules that are part of a phrase structure grammar.
polynomial: Said of problems where the resources needed to find the solution (see exponential) are related to the input size by a polynomial function.
predicate symbol: A symbol (e.g. in first order logic) with which we can speak about a certain property of, or relation between, individuals (therefore predicate symbols are also called relation symbols). Every predicate symbol has an arity which specifies the number of arguments the predicate takes. E.g. the predicate symbol LOVE is of arity 2 and talks about a two-place relation, and the predicate symbol THERAPIST is of arity 1 and talks about a property.
preterminal symbol: A symbol of a formal grammar that occurs on the left-hand side of the arrow in a lexical rule.
recognizer: See parser.
regular expression: Any description of a pattern composed from combinations of symbols and the three operators: concatenation, disjunction, and closure.
regular language: A language generated by a regular expression.
relation symbol: Same as predicate symbol.
restriction: Determiners in natural language usually specify a set of individuals and attribute some property to them. The two parts specifying the group and the property are called restriction and scope, respectively. E.g. in the sentence "Every wife has a husband.", "wife" is the restriction and "has a husband" the scope of the universal quantifier.
rewrite arrow: The arrow (→) used in writing context free rules.
rewrite interpretation: The simplest interpretation of context free grammars is the rewrite interpretation. This views CFGs as rules which let us rewrite the start symbol to other strings: each rule is interpreted as saying that we can replace an occurrence of the symbol on the left side of the rule by the symbol(s) on the right side.
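The rewrite interpretation can be sketched directly (an illustrative Python helper, with the leftmost-rewriting convention chosen here for definiteness): starting from ["S"], each rule application replaces the leftmost occurrence of the rule's left-hand symbol by its right-hand side.

```python
def rewrite_leftmost(sentential, rule_sequence):
    """Apply a sequence of context free rules, each time rewriting the
    leftmost occurrence of the rule's left-hand side symbol."""
    form = list(sentential)
    for lhs, rhs in rule_sequence:
        i = form.index(lhs)          # leftmost occurrence of lhs
        form[i:i + 1] = rhs          # splice in the right-hand side
    return form
```

For example, applying S → NP VP, NP → PN, PN → vincent, VP → IV, IV → died in that order rewrites ["S"] step by step into ["vincent", "died"], which is exactly a derivation in the sense defined above.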
rewrite rule: Same as context free rule.
scope: Same as matrix.
sentence: A (logical) formula that contains no occurrences of free variables is called a sentence.
sentence symbol: Every context free grammar has one special symbol, the start symbol (or sentence symbol), which is usually written `S'.
speech act: Speech acts describe the use of language to perform some act, e.g. a promise, reply, or request.
spurious analysis: One speaks of a spurious analysis in cases where a parser produces two identical analyses for the same input, e.g. by using the same grammar rules in a different order.
start state: One marked state of a finite state machine, in which processing begins.
start symbol: Same as sentence symbol.
stem: See morpheme.
subcategorize: One of the most important concepts in (formal) linguistics is that of subcategorization. Certain word classes are assumed to require (subcategorize for) other word classes (so-called complements) to form complete phrases. E.g. nouns are assumed to subcategorize for a determiner. Probably most important are verbs, which subcategorize for nouns (a subject and possibly objects) to form a sentence.
syntactic structure: The structure of a string of symbols according to some (formal) grammar. E.g. the syntactic structure of the natural language sentence "John walks." is often represented as [S [NP [PN John]] [VP [IV walks]]].
tape: Part of finite state machines.
term: FOL: any constant or any variable is a term. Roughly speaking, terms are the noun phrases of first-order languages.
terminal state: Same as final state.
terminal symbol: A symbol of a formal grammar that may only occur on the right hand side of the arrow.
threading: A technique for explaining/treating so-called long distance phenomena within natural language analysis. The problem is that parts of a sentence that belong together are often distributed over the surface structure. E.g. in the noun phrase "The boy which Mary, who has become a scientist recently, visited last night", "the boy" is an argument of "visited". One explanation of such phenomena uses missing (empty) elements (gaps) and describes how these elements and the so-called fillers are connected through the phrases as if there were a thread connecting them.
top-down parsing/recognition: In top-down parsing/recognition one starts at the most abstract level (the level of sentences) and works down to the most concrete level (the level of words).
transition: Transitions are defined by the transition function of a finite state machine.
tree admissibility rule: Next to the rewrite interpretation of CFGs, there is a more fundamental way of thinking about CFGs, namely as tree admissibility rules. According to this view, CFGs describe sets of well-formed parse trees.
well-formed formula: FOL: formulas that are built from a vocabulary, the logical symbols of FOL, and first-order variables according to the syntax rules of FOL.
α-conversion: The operation of producing an alphabetical variant of a λ-abstraction by replacing all occurrences of the variable abstracted over by another one. Used (for instance) in λ-based semantic construction to avoid accidental variable bindings in the target representation.
α-equivalent: Two λ-expressions are called α-equivalent if they only differ in the names of λ-bound variables. In what follows we often treat α-equivalent expressions as if they were identical. For example, we will sometimes say that the lexical entry for some word is a certain λ-expression, but when we actually work out some semantic construction, we might use an α-equivalent expression instead.
β-reduction: Consider the compound expression (λx.WOMAN(x))@MARY. It consists of the abstraction λx.WOMAN(x) written immediately to the left of the expression MARY, both joined together by @. Such a concatenation is called functional application; the left-hand expression is called the functor, and the right-hand expression the argument. The concatenation is an instruction to discard the λx. prefix of the functor, and to replace every occurrence of x that was bound by this prefix with the argument. We call this substitution process β-reduction (other common names include β-conversion and λ-conversion). Performing the β-reduction demanded in this example yields WOMAN(MARY).
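The substitution step has a direct analogue in any language with first-class functions. In this illustrative Python sketch (WOMAN and MARY are the stand-in names from the example above), applying the function plays the role of @, and Python's own evaluation performs the β-reduction:

```python
def woman(x):
    """Build the atomic formula WOMAN(x) as a simple tuple."""
    return ("WOMAN", x)

functor = lambda x: woman(x)   # the abstraction λx.WOMAN(x)
argument = "MARY"              # the argument MARY
result = functor(argument)     # functional application; β-reduction
# result is ("WOMAN", "MARY"), i.e. the formula WOMAN(MARY)
```

Calling the functor substitutes the argument for every occurrence of the bound variable, just as β-reduction replaces x with MARY in WOMAN(x).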
λ-abstraction: λ-expressions are formed out of ordinary first order formulas using the λ-operator. We can prefix the λ-operator, followed by a variable, to any first order formula or λ-expression. We call expressions with such prefixes λ-abstractions (or, more simply, abstractions). We say that the variable following a λ-operator is abstracted over. And we say that the variable abstracted over is (λ-)bound by its respective λ-operator within an abstraction, just as we say that a quantified variable is bound by its quantifier inside a quantification.
λ-bound: The variable occurring to the right of a λ-operator is λ-bound by this operator within the expression in its scope. If the same variable occurs to the right of another λ within that expression, it is in turn bound by that λ within its scope.
As an example, look at the following two λ-terms: in λx.WOMAN(x), the single occurrence of x is bound by the one λ-operator; in λx.(WOMAN(x) ∧ (λx.MAN(x))@x), the occurrence of x inside MAN(x) is bound by the inner λ, while the remaining occurrences of x are bound by the outer λ.