Linguistics 401--Introduction to General Phonetics
University of Illinois at Urbana-Champaign
Chilin Shih

Korean Data


Korean voiceless stops and affricates show a three-way distinction at each place of articulation. For example, there are three different [p]'s, three different [t]'s, and three different [k]'s, typically described as unaspirated, aspirated, and fortis. Furthermore, there are two different [s]'s: plain [s] and fortis [s]. While native speakers seem to have no trouble telling which is which, the task is very challenging for non-native speakers. What are the possible acoustic cues? How do we measure them so that we can describe the differences quantitatively? You can download speech materials from this page and start your own investigation.

Download Speech Files

You can download korean.tar or korean.tar.gz, each of which contains 80 speech files from 5 Korean speakers.

The archive unpacks into a directory "korean" with five subdirectories, "spk1" to "spk5". Each subdirectory holds that speaker's speech files, named "spk1_1.wav", "spk1_2.wav" ... "spk1_16.wav" for speaker 1, and likewise for the other speakers.

On a UNIX system, you can unpack the tar file, or the compressed tar.gz file, with the following commands respectively:

tar -xvf korean.tar
tar -xzvf korean.tar.gz
This will recreate the directory structure.
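
If the tar command is not available on your system, the archive can also be unpacked from Python. The following is a minimal sketch using only the standard library; the function name unpack_archive is hypothetical, and the archive is assumed to sit in the current directory.

```python
import tarfile
from pathlib import Path


def unpack_archive(archive_path, dest_dir="."):
    """Extract a .tar or .tar.gz archive; return the sorted .wav member names."""
    # Mode "r:*" lets tarfile detect gzip compression automatically,
    # so the same call handles korean.tar and korean.tar.gz.
    with tarfile.open(archive_path, "r:*") as tar:
        wav_names = sorted(
            m.name for m in tar.getmembers() if m.name.endswith(".wav")
        )
        # Only extract archives from a source you trust.
        tar.extractall(dest_dir)
    return wav_names


if __name__ == "__main__":
    archive = Path("korean.tar.gz")  # assumed location of the download
    if archive.exists():
        names = unpack_archive(archive)
        print(len(names), "wav files extracted")
```

As with the tar commands, this recreates the "korean" directory and its speaker subdirectories under the destination directory.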

If you have problems with the tar or tar.gz files, you can download the speech files individually:

spk1 spk2 spk3 spk4 spk5 Phrase Category
spk1_1.wav spk2_1.wav spk3_1.wav spk4_1.wav spk5_1.wav That flesh. Plain
spk1_2.wav spk2_2.wav spk3_2.wav spk4_2.wav spk5_2.wav That flesh. Plain
spk1_3.wav spk2_3.wav spk3_3.wav spk4_3.wav spk5_3.wav That daughter. Fortis
spk1_4.wav spk2_4.wav spk3_4.wav spk4_4.wav spk5_4.wav That daughter. Fortis
spk1_5.wav spk2_5.wav spk3_5.wav spk4_5.wav spk5_5.wav That fire. Unaspirated
spk1_6.wav spk2_6.wav spk3_6.wav spk4_6.wav spk5_6.wav That fire. Unaspirated
spk1_7.wav spk2_7.wav spk3_7.wav spk4_7.wav spk5_7.wav That rice. Fortis
spk1_8.wav spk2_8.wav spk3_8.wav spk4_8.wav spk5_8.wav That rice. Fortis
spk1_9.wav spk2_9.wav spk3_9.wav spk4_9.wav spk5_9.wav That mask. Aspirated
spk1_10.wav spk2_10.wav spk3_10.wav spk4_10.wav spk5_10.wav That mask. Aspirated
spk1_11.wav spk2_11.wav spk3_11.wav spk4_11.wav spk5_11.wav That horn. Fortis
spk1_12.wav spk2_12.wav spk3_12.wav spk4_12.wav spk5_12.wav That horn. Fortis
spk1_13.wav spk2_13.wav spk3_13.wav spk4_13.wav spk5_13.wav That moon. Unaspirated
spk1_14.wav spk2_14.wav spk3_14.wav spk4_14.wav spk5_14.wav That moon. Unaspirated
spk1_15.wav spk2_15.wav spk3_15.wav spk4_15.wav spk5_15.wav That grass. Aspirated
spk1_16.wav spk2_16.wav spk3_16.wav spk4_16.wav spk5_16.wav That grass. Aspirated
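
For batch analysis it can help to encode the table above as a lookup from file number to phrase and consonant category. The sketch below does that; the names ITEMS, describe, and all_stimuli are hypothetical, and treating the first file of each pair as repetition 1 is an assumption, not something the table states.

```python
from pathlib import Path

# (keyword gloss, consonant category) for each pair of files,
# in the order given in the table above.
ITEMS = [
    ("flesh", "plain"),
    ("daughter", "fortis"),
    ("fire", "unaspirated"),
    ("rice", "fortis"),
    ("mask", "aspirated"),
    ("horn", "fortis"),
    ("moon", "unaspirated"),
    ("grass", "aspirated"),
]


def describe(file_number):
    """Return (gloss, category, repetition) for a file number from 1 to 16."""
    if not 1 <= file_number <= 16:
        raise ValueError("file number must be between 1 and 16")
    gloss, category = ITEMS[(file_number - 1) // 2]
    # Assumption: the odd-numbered file of a pair is the first repetition.
    repetition = (file_number - 1) % 2 + 1
    return gloss, category, repetition


def all_stimuli(root="korean"):
    """List (path, speaker, gloss, category, repetition) for every file."""
    rows = []
    for spk in range(1, 6):
        for n in range(1, 17):
            path = Path(root) / f"spk{spk}" / f"spk{spk}_{n}.wav"
            rows.append((path, f"spk{spk}") + describe(n))
    return rows
```

For example, describe(10) gives ("mask", "aspirated", 2), and all_stimuli() lists all 80 files.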


We recorded eight Korean words in a simple phrasal context, "That keyword." The eight words comprise three words each in the [p] and [t] stop series and two words in the [s] series.

If we recorded the keywords in isolation, it would be difficult to determine the length of the stop closure, because the closure of an utterance-initial stop cannot be told apart from the silence that precedes the utterance. We therefore place the keywords in a simple sentence frame. The keywords are phrase-final, but the consonants of interest are word-initial, so they occur in the middle of the phrase. This gives us a clear indication of when the stops begin. The frame is as short as it can be, so that you can find the consonants in question without any prior knowledge of Korean.
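
Because the stop closure is flanked by speech on both sides, a rough first estimate of its duration can be read off the short-time energy of the signal. The sketch below is one possible approach, not a prescribed method: it computes RMS energy in 5 ms frames and reports the longest low-energy stretch inside the phrase. The function names, the frame size, and the 5% threshold are all assumptions chosen for illustration, and the result should be checked against the waveform and spectrogram. It does not apply to the [s] series, which has no closure.

```python
import array
import math
import sys
import wave


def frame_rms(path, frame_ms=5.0):
    """Return (rms_values, frame_duration_s) for a 16-bit mono wav file."""
    with wave.open(str(path), "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono audio")
        rate = wf.getframerate()
        samples = array.array("h", wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # wav data is little-endian
    step = max(1, int(rate * frame_ms / 1000))
    rms = []
    for start in range(0, len(samples) - step + 1, step):
        chunk = samples[start:start + step]
        rms.append(math.sqrt(sum(s * s for s in chunk) / step))
    return rms, step / rate


def longest_internal_silence(rms, frame_s, rel_threshold=0.05):
    """Find the longest low-energy stretch flanked by louder frames.

    A frame counts as silent when its RMS is below rel_threshold * max RMS.
    Returns (start_s, duration_s), or None if there is no such stretch.
    """
    if not rms or max(rms) == 0:
        return None
    limit = rel_threshold * max(rms)
    loud = [i for i, v in enumerate(rms) if v >= limit]
    if not loud:
        return None
    best = None
    run_start = None
    # Scan only between the first and last loud frames, so leading and
    # trailing silence in the recording is ignored.
    for i in range(loud[0], loud[-1] + 1):
        if rms[i] < limit:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            length = i - run_start
            if best is None or length > best[1]:
                best = (run_start, length)
            run_start = None
    if best is None:
        return None
    return best[0] * frame_s, best[1] * frame_s
```

A typical call would be rms, frame_s = frame_rms("korean/spk1/spk1_5.wav") followed by longest_internal_silence(rms, frame_s). The estimate is only as precise as the frame size, and a noisy recording may need a different threshold.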


There are 5 Korean speakers, 4 female and 1 male. Each speaker says each of the eight phrases twice, so there are 5 × 8 × 2 = 80 stimuli in total.

Recording Equipment

The recordings were made with a Computerized Speech Lab (CSL) Model 4300B by Kay Elemetrics.

The speech files are 16-bit mono files recorded at a sampling rate of 44100 Hz, in the .wav format.
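
Before analysis, you may want to confirm that a downloaded file matches this format. The following is a minimal sketch using Python's standard wave module; the function name check_format is hypothetical.

```python
import wave


def check_format(path, channels=1, sample_width_bytes=2, rate_hz=44100):
    """Return a list of mismatches between a wav header and the expected format.

    The defaults correspond to 16-bit mono audio sampled at 44100 Hz.
    An empty list means the header matches.
    """
    problems = []
    with wave.open(str(path), "rb") as wf:
        if wf.getnchannels() != channels:
            problems.append(f"channels: {wf.getnchannels()}")
        if wf.getsampwidth() != sample_width_bytes:
            problems.append(f"sample width: {wf.getsampwidth()} bytes")
        if wf.getframerate() != rate_hz:
            problems.append(f"sampling rate: {wf.getframerate()} Hz")
    return problems
```

For example, check_format("korean/spk1/spk1_1.wav") should return an empty list if the file was downloaded intact.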


Special thanks to Yoo Ree Chung, Soyoung Jung, Eun Kyung Lee, Yoonsook Mo, and Young-il Oh for providing the speech recordings.