Text-to-speech synthesis

Summer 2022, Möbius

Advanced Lecture, 2 SWS, 3 LP/ECTS, LSF #134460

MSc LST/LCT; MSc Inf; MSc CuKT

Thu 10:15-11:45 AM
Building E1.3, Room 0.03.1 (Hörsaal III)

Exam inspection is possible on Thu Oct 27, 1-2 PM, C7.2/1.01 (exam office, Cristina Deeg), but only after registration by email.


This course is taught on-site and on-line: lecture in E1.3/0.03.1 and live broadcast via Zoom.

Entrance requirements

Formally speaking, none, but some background in speech science, signal processing, or computational linguistics will be advantageous.

Assessments / Exams

Written exam at the end of the summer semester. Registration via LSF-POS or, if this is not possible, by email to the lecturer.

Content

Speech synthesis is an essential component of any system relying on intuitive human-machine communication. Speech synthesis systems are also used in phonetic research to gain further insight into speech production and the acoustic properties of speech. This advanced course offers an introduction to text-to-speech (TTS) synthesis systems and strategies. Various approaches to speech synthesis are presented, including formant synthesis, concatenative synthesis, state-of-the-art unit selection and statistical (HMM- and DNN-based) synthesis, and recent end-to-end approaches. The linguistic text analysis and natural language processing modules typically found in TTS systems are covered as well.

Contact:
  Prof. Dr. Bernd Möbius
  Email
  C7.2/4.10
  0681/302-4500

Structure

Date | Topic | Slides | Background
21.04. | Introduction: history, methods | pdf / pptx (incl. audio) | Taylor 2009, ch. 1 and 2
28.04. | TTS system components | pdf | Taylor 2009, ch. 7.1-7.2
05.05. | Formant synthesis | pdf / pptx (incl. audio) | Taylor 2009, ch. 13
12.05. | Diphone synthesis | pdf / pptx (incl. audio) | Taylor 2009, ch. 14
19.05. | Unit Selection synthesis | pdf / pptx (incl. audio) | Taylor 2009, ch. 16
26.05. | (no class) | |
02.06. | Duration modeling | pdf | Taylor 2009, ch. 9
09.06. | Intonation modeling | pdf | Taylor 2009, ch. 9
16.06. | (no class) | |
23.06. | Linguistic text analysis I | pdf | Taylor 2009, ch. 7.4, 8.3-8.5; Möbius 2001, ch. 3-6
30.06. | Linguistic text analysis II | (see 23.06.) |
07.07. | Parametrical and NN-based synthesis (taught by S. Le Maguer) | pdf / zip (incl. audio; start with index.html) | Taylor 2009, ch. 15; Dale: Voice synthesis business 2022
14.07. | TTS system evaluation (taught by S. Le Maguer) | | Taylor 2009, ch. 17.2; Wester 2015: read at least first three sections
21.07., 10:15-11:45 | Written exam, E1.3/0.03.1 (Hörsaal III) | | Registration for exam by July 14, 2022, via LSF/HISPOS or else by email to Prof. Möbius
19.10., 14:15-15:45 | Written exam (resit), B3.1/0.14 (Hörsaal I) | | Registration for exam by Oct 12, 2022, via LSF/HISPOS or else by email to Prof. Möbius

TTS systems

Literature

Suggested companion textbook

Further textbooks

The books by Dutoit, Sproat, and Taylor are available in the CS library.

BibTeX entries of all references (books, papers, URLs)


bm 21.10.2022