Distinguished Speakers in Language Science
Thursday, 29 January 2015, 16:15
Conference Room, Building C7.4
A Neurophysiologically-Inspired Statistical Language Model
Jon Dehdari, DFKI Saarbrücken
We describe a statistical language model whose components are inspired by macroscopic electrophysiological activity in the brain. These components correspond to important language-relevant event-related potentials measured using electroencephalography. We relate neural signals involved in local- and long-distance grammatical processing, as well as local- and long-distance lexical processing, to statistical language models that are scalable, cross-linguistic, and incremental. We develop a novel language model component that unifies n-gram, cache, skip, and trigger language models into a generalized model inspired by the long-distance lexical event-related potential (N400). This component also exhibits some structural similarities with Elman network-based language models (commonly referred to as RNNLMs). The model is trained online, allowing for use with streaming text. We show consistent perplexity improvements over 4-gram modified Kneser-Ney language models on large-scale datasets in English, Arabic, Croatian, and Hungarian.
If you would like to meet with the speaker, please contact Jon Dehdari.