Computational Linguistics & Phonetics Fachrichtung 4.7 Universität des Saarlandes

Distinguished Speakers in Language Science

Thursday, 29 January 2015, 16:15
Conference Room, Building C7.4

A Neurophysiologically-Inspired Statistical Language Model

Jon Dehdari
DFKI Saarbrücken

We describe a statistical language model having components that are inspired by macroscopic electrophysiological activities in the brain. These components correspond to important language-relevant event-related potentials measured using electroencephalography. We relate neural signals involved in local- and long-distance grammatical processing, as well as local- and long-distance lexical processing, to statistical language models that are scalable, cross-linguistic, and incremental. We develop a novel language model component that unifies n-gram, cache, skip, and trigger language models into a generalized model inspired by the long-distance lexical event-related potential (N400). This component also exhibits some structural similarities with Elman network-based language models (commonly referred to as RNNLMs). The model is trained online, allowing for use with streaming text. We show consistent perplexity improvements over 4-gram modified Kneser-Ney language models for large-scale datasets in English, Arabic, Croatian, and Hungarian.

If you would like to meet with the speaker, please contact Jon Dehdari.