Computational Linguistics & Phonetics, Fachrichtung 4.7, Universität des Saarlandes
Chiara Gambi - Statistics course

Chiara Gambi - Linear mixed-effects models in R

Summer semester 2014


This course requires some previous knowledge of statistics and R (see my statistics course from the winter semester). If you did not attend that course but have studied statistics and/or used R before, you are welcome to email me to check whether it would make sense for you to attend this course anyway. In any case, please make sure you read through these help slides carefully. They should get you started with R and RStudio. Please go through the notes and familiarize yourself with the information provided, as it will be taken for granted at the start of the course. If you have any problems, please contact me using the email address provided on my university profile.

This course will introduce you to linear mixed-effects models, a family of statistical models that is rapidly becoming the new standard in psycholinguistic research. You will learn how to analyze binary response variables and continuous (normally distributed) outcome variables measured in experiments with within-participants and within-items designs.

The course will combine brief theoretical introductions from the lecturer with hands-on exercises using the free software R. During the first half of the semester, we will work together through a concrete example. You will be given intermediate assignments to complete at home, which will form part of your assessment. Depending on time, we will also look at methods for producing suitable graphs in R, and at Sweave, a tool that lets you integrate R output with LaTeX to easily produce manuscripts for publication. Finally, at the end of the semester you will take part in a "data analysis challenge": you will be given a new dataset and asked to produce an appropriate analysis and to describe the results effectively in a short presentation.


  23/04/14 - Lecture 1 (room 2.11, C 7.2) - Lecture 1 slides
  30/04/14 - Lecture 2 (room 2.11, C 7.2) - Lecture 2 slides
  07/05/14 - Tutorial 1 (room 2.11, C 7.2) - Tutorial 1 code
  25/06/14 - Lecture/Tutorial 3 (room 2.11, C 7.2) - slides on graphs in R
  08/07/14 - Lecture 4 (room 2.11, C 7.2) - slides on Sweave and source .Rnw file

R resources on the web

  - CRAN mirror (R download and documentation for all packages)
  - Rseek (web search engine for help on R-related topics)
  - ling-r-lang-L (mailing list for language researchers using R)
  - RStudio (RStudio download)
  - Sweave


Statistics with R

R. H. Baayen (2008). Analyzing linguistic data: A practical introduction using R. Cambridge University Press (several copies in Coli Library; an online draft is also available).

P. Dalgaard (2008). Introductory statistics with R. Springer (copies in Coli Library).

A. Field (2013). Discovering statistics using R. Sage Publications.

A. Gelman & J. Hill (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.

Journal articles on linear mixed-effects models (available on Google Scholar)

Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.

Cunnings, I. (2012). An overview of mixed-effects statistical models for second language researchers. Second Language Research, 28(3), 369-382.

Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434-446.