Introduction to R – rewritten

I completely rewrote my Introduction to R. The premise of the new manuscript is that you want to process a written or spoken linguistic corpus using R in order to perform statistical analyses.

I reused a lot of text from the previous script, especially the static introductions explaining how things work. Now, however, that material is embedded in a narrative: “preprocess a corpus”. The idea is to teach you each concept only at the point where you actually need it.

You can download the current version here.


Phonetik und Phonologie im deutschsprachigen Raum 2016

Last week I visited the PundP in Munich. Felicitas Kleber and Christoph Draxler did a wonderful job organizing this beautiful conference. The great thing about PundP is that it addresses not only phoneticians and phonologists who have been working in the field for a while, but also undergraduate and PhD students. It provides a friendly environment where one can find constructive criticism and great inspiration for one’s work.

I met wonderful friends there, among them Daniel Duran, who presented his revolutionary learning-and-testing environment; Adrian Leemann, who presented his app “Dialäkt Äpp”; Cornelia Heyde, who studies the articulation of stutterers by means of ultrasound; and Mathias Scharinger, who is investigating neural mechanisms of perception.

This year, I presented the Karl-Eberhards-Corpus of spontaneously spoken southern German, which I have been working on together with Denis Arnold for the last three years. The corpus consists of one-hour conversations between two friends. It is annotated and manually corrected at the word level. Segment-level annotations were produced by a forced aligner.

Left: myself; right: Daniel Duran

Text to speech on your phone

There are plenty of books and articles I would like to read but simply don’t have the time for. At the same time, I drive to work by car, thinking: what a waste of time.
I found a solution: a text-to-speech app for the smartphone called “@voice aloud reader”.

It can process PDF files, text files, doc files and much more. It uses the text-to-speech engine already installed on your phone, but you can install many more voices as well as languages.


Three and a half years ago I started working at the Department of Quantitative Linguistics, University of Tübingen. My new field of research was articulography, a method to monitor and record the movement of the tongue and lips during speech production.

Together with my colleagues Denis Arnold and Martijn Wieling, I made a movie to advertise our experiments. Recently, I found the movie again on YouTube.

Word counts and phonetic counts

More and more phoneticians, myself included, rely on measures of experience, such as frequency of occurrence, in their investigations. I have been struggling with these for the last three years and ended up with a fairly easily accessible database of frequencies for word forms, phones and biphones (phonotactic frequencies) for German. It is based on CELEX and the SDEWAC corpus.
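
To make the idea of phone and biphone frequencies concrete, here is a minimal sketch in Python. It is not the procedure used to build the database: the toy transcriptions, their frequencies and the function name are made up for illustration, and weighting each phone and biphone by the frequency of its word form is an assumption.

```python
from collections import Counter


def count_phones_and_biphones(transcriptions):
    """Count phones and biphones (adjacent phone pairs).

    `transcriptions` maps a tuple of phone symbols (one word form)
    to that word form's frequency in the corpus. Each phone and
    biphone is weighted by the word form's frequency (an assumption).
    """
    phones = Counter()
    biphones = Counter()
    for segments, freq in transcriptions.items():
        for phone in segments:
            phones[phone] += freq
        # Pair each phone with its right-hand neighbour.
        for left, right in zip(segments, segments[1:]):
            biphones[(left, right)] += freq
    return phones, biphones


# Hypothetical toy data: two word forms with invented frequencies.
toy = {
    ("t", "a", "k"): 3,
    ("a", "k", "t"): 2,
}

phones, biphones = count_phones_and_biphones(toy)
for pair, freq in biphones.most_common():
    print(pair, freq)
```

In this toy example the biphone ("a", "k") occurs in both word forms, so it receives the summed frequency of both.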

The database can be accessed here, and a description is available here. The description also explains how to cite the database. Thanks for sharing and citing.

R manuscript updated

I updated some passages on scripting in R. In addition, I included a new chapter on writing functions. The chapter also covers how to write your code step by step.
I also included a table of contents with hyperlinks, which allows you to jump directly to the chapter or section you are interested in. Finally, I started an index.

Check it out here! Share it if you like it!