The project INCOMSLAV investigates the relation between
information density, encoding density and
grammaticalisation in a cross-linguistic perspective,
focusing on intercomprehension within the family of Slavic
languages. In the initial funding period (2014-2018), the
project brings together results from the analysis of
parallel corpora and from a variety of experiments with
native speakers of Slavic languages and compares them with
insights of comparative historical linguistics on the
relationship between Slavic languages. A statistical
language model of surprisal is used to measure information
density and to gauge how language users cope with
high degrees of surprisal caused by partial
incomprehensibility. The key idea is that
comprehension of an unknown but related language should
be better when the language model adapted for
understanding the unknown language exhibits relatively low
average surprisal, that is, low information density. In the second funding
period (2018-2022), the research agenda is extended to
spoken language, which allows us to investigate how
information density is balanced between the acoustic and
the text level in successful intercomprehension. At all
levels, from the acoustic signal and its phonetic structure
to the texts generated from speech, we develop similarity
metrics and information density measures related to Slavic
intercomprehension.
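To make the surprisal measure described above concrete, here is a minimal sketch. It is not the project's actual language model: it assumes a toy bigram model with add-one smoothing, and the function names (`train_bigram`, `surprisal`, `mean_surprisal`) and the token sequences are hypothetical, chosen only for illustration. Surprisal of a unit is computed as -log2 P(unit | context), and the average over a sequence serves as a simple stand-in for information density.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count contexts and bigrams over a token sequence (toy model, an assumption)."""
    contexts = Counter(tokens[:-1])          # every token that has a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    return contexts, bigrams


def surprisal(prev, word, contexts, bigrams, vocab_size):
    """Surprisal in bits: -log2 P(word | prev), with add-one smoothing."""
    p = (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return -math.log2(p)


def mean_surprisal(tokens, contexts, bigrams, vocab_size):
    """Average surprisal over all adjacent pairs in the sequence."""
    values = [
        surprisal(prev, word, contexts, bigrams, vocab_size)
        for prev, word in zip(tokens, tokens[1:])
    ]
    return sum(values) / len(values)


# Hypothetical training data; tokens could equally be characters or words.
training = "a b a b a c".split()
vocab_size = len(set(training))
contexts, bigrams = train_bigram(training)

# A sequence that resembles the training data versus one that does not.
familiar = mean_surprisal(["a", "b", "a"], contexts, bigrams, vocab_size)
unfamiliar = mean_surprisal(["c", "c", "b"], contexts, bigrams, vocab_size)
```

Under these assumptions, the sequence that resembles the training data receives the lower average surprisal, which mirrors the expectation that a related language is easier to understand when the adapted model finds it less surprising.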
PhD research staff (phase 1): Andrea Fischer, Klára Jágrová, Irina Stenger
PhD research staff (phase 2): Yu Tracy Chen, Badr Abdullah, Jacek Kudera
Resources
Publications
2019
Jágrová, Avgustinova: Intelligibility of highly predictable
Polish target words in sentences presented to Czech readers.
CICLing 2019. Preprint.
Stenger, Avgustinova, Belousov, Baranov, Erofeeva. 2019.
Interaction of linguistic and socio-cognitive factors in
receptive multilingualism [Vzaimodejstvie lingvističeskich i
sociokognitivnych parametrov pri receptivnom
mul’tilingvisme], 25th International Conference on
Computational Linguistics and Intellectual Technologies
(Dialogue 2019), Proceedings, Moscow, Russia: http://www.dialog-21.ru/digest/2019/online/.
2018
Jágrová: Processing effort of Polish
NPs for Czech readers
– A+N vs. N+A. In:
Guz, Szymanek (eds.): Canonical
and Non-Canonical Structures in Polish. Studies in
Linguistics and Methodology vol. 12. Wydawnictwo KUL,
pp. 123-143. Preprint
Jágrová, Avgustinova, Stenger, Fischer: Language
models, surprisal and fantasy in Slavic
intercomprehension, Computer Speech & Language,
Available online 12 June 2018, ISSN 0885-2308,
https://doi.org/10.1016/j.csl.2018.04.005.
2017
Jágrová, Stenger, Avgustinova: Polski nadal
nieskomplikowany?
Interkomprehensionsexperimente mit Nominalphrasen.
In: Federalny Związek Nauczycieli Języka Polskiego
(ed.). Polski
w Niemczech - Polnisch in Deutschland
5(2017). pp. 20-37
Stenger, Jágrová, Fischer, Avgustinova, Klakow,
& Marti. (2017). Modeling the impact of
orthographic coding on Czech–Polish and
Bulgarian–Russian reading intercomprehension. Nordic
Journal of Linguistics, 40(2),
175-199. doi:10.1017/S0332586517000130
Jágrová, Stenger, Marti,
Avgustinova. (2017). Lexical and Orthographic Distances
between Czech, Polish, Russian, and Bulgarian - a
Comparative Analysis of the Most Frequent Nouns. In:
Language Use and Linguistic
Structure. Olomouc Modern Language Series, Palacký
University Olomouc. pp. 401-416 (online)
Stenger, Avgustinova, Marti. (2017) Levenshtein distance and
word adaptation surprisal as methods of measuring mutual
intelligibility in reading comprehension of Slavic
languages. Computational Linguistics and Intellectual
Technologies: International Conference "Dialogue 2017"
Proceedings. Issue 16 (23), vol. 1, 304–317. (online)
2016
Jágrová, Stenger, Avgustinova, Marti: Polski to język
nieskomplikowany? Theoretische und praktische
Interkomprehension der 100 häufigsten polnischen
Substantive. In: Federalny Związek Nauczycieli Języka
Polskiego (ed.). Polski
w Niemczech - Polnisch in Deutschland 4(2016). pp.
5-19
Fischer, Jágrová, Stenger, Avgustinova, Klakow, Marti.
(2016). Orthographic
and Morphological Correspondences between Related Slavic
Languages as a Base for Modeling of Mutual Intelligibility.
In: Calzolari, Choukri, Declerck, Goggi, Grobelnik,
Maegaard, Mariani, Mazo, Moreno, Odijk, Piperidis (eds.) Language
Resources and Evaluation Conference LREC 2016, pp.
4202-4209, included
linguistic resources, Portorož (Slovenia)
Stenger. (2016) How
Reading Intercomprehension Works among Slavic Languages
with Cyrillic Script. In: Köllner, Ziai (eds.): Proceedings of the ESSLLI 2016 Student
Session: pp. 30-42
2015
Fischer, Jágrová, Stenger, Avgustinova, Klakow, Marti.
(2015). An Orthography Transformation Experiment with
Czech-Polish and Bulgarian-Russian Parallel Word Sets. In:
Sharp, Lubaszewski, Delmonte (eds.) Natural
Language Processing and Cognitive Science 2015 Proceedings.
Ca Foscarina Editrice, Venezia.
Fischer, Jágrová, Stenger, Avgustinova, Klakow, Marti (2015)
Orthography in Language Modelling of Mutual Intelligibility.
REMU
International Conference on Receptive Multilingualism,
University of Eastern Finland. (poster)
Avgustinova, Fischer, Jágrová, Klakow, Marti, Stenger
(2015) The Empirical Basis of Slavic Intercomprehension. REMU
International Conference on Receptive Multilingualism,
University of Eastern Finland. (slides)
2014
Klakow, Avgustinova, Stenger, Fischer, Jágrová: The
INCOMSLAV project. Seminar
in formal linguistics at Charles University, Prague.
November 24, 2014. Video recording, abstract &
presentation: http://lectures.ms.mff.cuni.cz/view.php?rec=238