Chiara Gambi - Statistical analysis of experimental (and corpus) data with R
Winter semester 2013
Please note that there has been a change to the time of the first lecture (10/02/14); there might be other changes to the schedule, so please keep checking this page (especially if you miss some of the classes).
IMPORTANT INFO - READ THIS
If you want to attend this course, you should download these help slides and read them carefully! They should get you started with R and RStudio. Please go through the notes and familiarize yourself with the information provided, as it will be taken for granted at the start of the course. If you have any problems, please contact me using the email address provided on my university profile.
This course will introduce you to the statistical tools you will need for the analysis of experimental data (e.g., behavioral experiments in psycholinguistics, perception studies in phonetics, acceptability rating studies in experimental linguistics) and corpus data. We will start with basic descriptive statistics (distributions, means, standard deviations) and an introduction to the concept of hypothesis testing in inferential statistics. We will then cover the most common statistical methods used in speech and language research:
- t-test
- chi-square
- ANOVA
- Generalized linear mixed-effects models
The course will combine brief theoretical introductions from the lecturer with hands-on exercises using the free software R (cran.r-project.org). The focus will be on learning how to run the analyses, interpret R output, report the findings in standard APA format, and produce suitable graphs.
SCHEDULE
10/02/14, 14-16 - Lecture (room 2.11, C 7.2) - Lecture 1 slides
11/02/14, 13-15 - Lab tutorial (CIP room, C 7.2) - Lab 1 exercises and solutions
12/02/14, 10-12 - Lecture (room 2.11) - Lecture 2 slides
12/02/14, 13-15 - Lab tutorial (CIP room) - Lab 2 exercises (old version) (note that this version contains a mistake in Section 4), Lab 2 exercises (new version) (this is correct!), Lab 2 data set and solutions
17/02/14, 15-17 - Lecture (room 2.11) - Lecture 3 slides. NOTE: I have spotted an inaccuracy on slide 18 of Lecture 3. The corrected version of these slides can be found here, and here you can find a brief explanation of what changed.
18/02/14, 13-15 - Lab tutorial (CIP room) - Lab 3 exercises and Lab 3 data set and solutions
19/02/14, 10-12 - Lecture (room 2.11) - Lecture 4 slides
20/02/14, 13-15 - Lab tutorial (CIP room) - Lab 4 exercises, cognates data set, and twoway data set; solutions (Section 1); solutions (complete)
21/02/14, 10-12 - Lecture (room 2.11) - Lecture 5 slides
21/02/14, 13-15 - Lab tutorial (CIP room) - Lab 5 exercises, error data set, and solutions
24/02/14, 10-12 - Lecture (room 2.11) - Lecture 6 slides
25/02/14, 13-15 - Lab tutorial (CIP room) - Lab 6 exercises
EXAM DATE: 10th March (CIP room), 10-12 (s.t.) - NOTE: please be there at 10am sharp (no academic quarter!); exam solutions
R resources on the web
- R CRAN mirror: http://www.cran.r-project.org (R download and documentation for all packages)
- R seek: http://www.rseek.org (web search engine for help on R-related topics)
- ling-r-lang-L: mailing list for language researchers using R
- RStudio: http://www.rstudio.com (RStudio download)
References
Statistics
Howell, D.C. (2004). Fundamental statistics for the behavioral sciences. Thomson Brooks (Fifth Edition in Coli Library; later editions in SULB).
Field, A. (2000). Discovering statistics using SPSS. Sage Publications (Second edition in Coli Library; later editions in SULB).
NOTE: despite the title, this book is not just about SPSS. If you're hoping to be gently eased into statistics, this is the one for you!
Howitt, D., & Cramer, D. (2011). Introduction to statistics in psychology. Prentice Hall (Fifth Edition in SULB; NOTE: unfortunately, this edition contains mistakes in several formulae!).
Statistics with R
Books
R. H. Baayen (2008). Analyzing linguistic data: A practical introduction using R. Cambridge University Press (several copies in Coli Library, and an online draft at www.ualberta.ca/~baayen/publications/baayenCUPstats.pdf).
P. Dalgaard (2008). Introductory statistics with R. Springer (copies in Coli Library).
A. Field (2013). Discovering statistics using R. Sage Publications.
A. Gelman, & J. Hill (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
Journal articles on linear mixed-effects models (available on Google Scholar)
Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390-412.
Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255-278.
Cunnings, I. (2012). An overview of mixed-effects statistical models for second language researchers. Second Language Research, 28(3), 369-382.
Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59(4), 434-446.