EXAM PREPARATION
Machine Translation
1.

* Provide an illustration for the following: Translation is made more difficult by the fact that
  - The translator must add or subtract substantive information from the message that the text conveys.
  - The meanings of many vocabulary items are bound to the culture of the source text and thus have no equivalents in the target language.

* In a rule-based machine-translation system, what role would be played by
  - a morphographemic transducer
  - a morphological analyzer/generator
  - a syntactic parser/generator
  - a transfer component

* Bernard Vauquois used a triangular diagram to discuss various strategies used in machine translation. The following questions relate to such a triangle.
  - Draw the diagram.
  - Why is the diagram narrower at the top?
  - What is the top point in the triangle called, and what is its role in machine translation?
  - What would the diagram look like for a syntax-based transfer (or some other) system?
  - In what sense are higher levels in the triangle more "abstract"?

* Explain briefly:
  - Triangulation
  - Interlingua
  - Vauquois' triangle
  - Robustness
  - Transfer
  - Reference

* Why might the following be difficult to translate into another language?
  - goal keeper
  - alimony
  - nut
  - hot tea
  - the bill passed the house
  - value-added tax
  - the world cup

* Why would a language with one of the following properties be easier/harder for a rule-based/statistical system to translate into/out of English?
  - Rich morphology
  - Genders
  - No determiners
  - Noun-noun compounds
  - VSO, SOV, or free word order
  - Separable verbs or verbal particles (like German "ankommen" or English "to look up")

* The following properties are sometimes claimed as advantages of statistical/rule-based machine translation. Which are generally seen as favoring which side, and why?
  - (Non)determinism
  - Rapid development
  - Broad domain coverage
  - Robustness
  - Ability to handle long-distance dependencies
  - Speed
  - Better resolution of ambiguities
  - Fluent translation
  - Accurate translation

* Advocates of statistical machine translation have criticized earlier rule-based approaches to the problem on a variety of grounds. Defend the rule-based approaches briefly against the following specific criticisms:
  - Linguists are only interested in exotic or fringe phenomena.
  - The time and effort required to write a useful set of rules makes them impracticable.
  - Linguists recognize ambiguities but provide no help in resolving them.
  - The rules that linguists write do not constitute robust systems.

* Explain briefly the main import of Zipf's law and explain its relevance to the construction of linguistic models by machine learning based on large corpora.
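
As a small aid for the Zipf's law question, the following Python sketch counts word frequencies in a toy text and prints rank, word, count, and rank times count for the most frequent words. Zipf's law says that a word's frequency is roughly inversely proportional to its rank, so in a large corpus the product rank × count stays roughly constant and a large share of the word types occur only once. The corpus and the helper names `rank_frequency` and `hapax_share` are invented for this illustration, and a text this small cannot show the law convincingly; it only shows how the quantities would be computed.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, count) triples, most frequent word first."""
    counts = Counter(tokens)
    # Sort by descending count, breaking ties alphabetically for stable output.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def hapax_share(tokens):
    """Fraction of distinct word types that occur exactly once."""
    counts = Counter(tokens)
    return sum(1 for c in counts.values() if c == 1) / len(counts)


# Toy corpus, invented for this sketch; a real study needs millions of tokens.
text = (
    "the cat sat on the mat and the dog sat on the rug "
    "the cat saw the dog and the dog saw a bird"
)
tokens = text.split()
table = rank_frequency(tokens)

for rank, word, count in table[:5]:
    print(rank, word, count, rank * count)
print("share of types seen once:", round(hapax_share(tokens), 2))
```

The share of types seen only once is the figure that matters for corpus-based learning: a model estimated from counts has very little evidence for most of the vocabulary, however large the corpus is.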