Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Computational Linguistics Colloquium

Tuesday, 18 November, 16:15
Conference Room, Building C7 4

Predicting MT Quality Using Textual Entailment

Sebastian Pado
Stanford University

Automatic evaluation plays a crucial role in machine translation for guiding system development. Unfortunately, state-of-the-art MT evaluation metrics like BLEU and related scores tend to focus on shallow properties such as word sequence overlap. Thus, they are likely to underestimate the quality of translations that involve syntactic and semantic reformulations such as diathesis alternations, scrambling, or paraphrases. This problem grows as the quality of MT systems improves.
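
As a rough illustration of the overlap problem, the sketch below computes clipped bigram precision, one ingredient of BLEU-style metrics. It is not full BLEU (there is no brevity penalty and no averaging over n-gram orders), and the function names and example sentences are invented for this illustration; they are not taken from the talk.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as often as it appears
    in the reference (the "clipping" used by BLEU-style metrics).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical example sentences (assumptions for illustration only).
reference = "the committee approved the proposal"
paraphrase = "the proposal was approved by the committee"  # adequate, passive voice
wrong = "the committee approved the budget"                # inadequate, high overlap

print(clipped_precision(paraphrase, reference, 2))
print(clipped_precision(wrong, reference, 2))
```

Under these assumptions, the passive paraphrase shares only 2 of its 6 bigrams with the reference, while the inadequate candidate shares 3 of its 4, so the overlap score ranks the wrong translation above the correct diathesis alternation.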

We propose to predict MT adequacy scores by assessing the quality of textual entailment between translation candidates and reference translations. Textual entailment can be seen as a probabilistic relaxation of logical inference, and thus our procedure corresponds to the intuition that adequate MT output and reference translations must entail one another.

Experimentally, we find that a vanilla system for recognising textual entailment, applied to the MT evaluation task, greatly outperforms individual state-of-the-art MT evaluation metrics, rivals the performance of a committee of these metrics, and does particularly well on examples involving linguistic variation. Furthermore, we show that the information provided by the entailment-based and surface-based approaches is to some extent complementary: Entailment features and traditional scores can be combined into a hybrid model that consistently outperforms either individual approach.

If you would like to meet with the speaker, please contact Caroline Sporleder.