Academic Year 2010 - 2011
On the production of
aspiration and prevoicing
The effect of training on native speakers of Belgian Dutch
Janey Vanlocke
Supervisor Dr. Ellen Simon
Master thesis submitted in partial fulfilment of the requirements for the degree of Master in English-Italian Literature and Linguistics
TABLE OF CONTENTS
Acknowledgements
1. Introduction
2. Aspiration and prevoicing
2.1 Introduction to both features
2.2 Aspiration
2.2.1 What is aspiration?
2.2.2 Positive Voice Onset Time
2.2.3 The effect of place of articulation
2.2.4 English vs. Dutch
2.3 Prevoicing
2.3.1 What is prevoicing?
2.3.2 Negative Voice Onset Time
2.3.3 Influencing factors
2.3.4 English vs. Dutch
3. The effect of training
3.1 Perception vs. production
3.1.1 Perception
3.1.2 Production
3.1.3 Audiovisual training
3.2 Real-time spectrograms
3.3 Other techniques used in pronunciation training
3.3.1 Feedback
3.3.2 Contrasting with native language
4. Case study
4.1 Hypotheses
4.1.1 General aim of the case study
4.1.2 Specific hypotheses
4.2 Method
4.2.1 Participants
4.2.2 Stimuli and design
4.2.3 Procedure
4.2.4 Analysis
4.3 Results and discussion
4.3.1 Analysis
4.3.2 Pretest
4.3.3 Posttest
4.3.4 Pretest vs. posttest: a comparison
4.3.5 Summary
5. Conclusion
References
Appendices
Appendix A: Questionnaire
Appendix B: Pretest and Posttest
1. List of tokens used in pre- and posttest picture-naming task
2. Slides as presented in pre- and posttest picture-naming task
3. List of tokens used in posttest word-reading task
4. List of words as presented in posttest word-reading task
Appendix C: Training session
1. Slides used in training session on aspiration and prevoicing
2. Handout with tips on aspiration and prevoicing
Appendix D: Copy of recordings
Appendix E: Results of pretest
1. Aspiration
2. Prevoicing
Appendix F: Results of posttest
1. Picture-naming task
2. Word-reading task
Appendix G: Results of pre- and posttest picture-naming task
ACKNOWLEDGEMENTS
First and foremost, I would like to thank my supervisor Dr. Ellen Simon, who guided me through
the writing process and provided me with indispensable advice. I would also like to thank her
for her professional feedback and for taking the time to meet with me to discuss my progress (or
lack thereof).
Thank you to all of the patient and cooperative volunteers who took the time to take part in
this project and who did so with much enthusiasm. Truly, without them this
research paper would not have been possible.
Since completing this Master thesis is the final step before graduating, I would also like to take
this opportunity to thank some very important people who have supported me through my
studies from day one.
I am extremely grateful to my parents for believing in me and for never doubting I would
get there in the end, even when I myself did not. I hope I have made them proud.
A huge thank you to my lovely boyfriend, who never stopped supporting me, always
pushed me to do the best I could and trusted that I would succeed in the end.
My special thanks go to my grandmother, who I remember taught me how to achieve the
correct English pronunciation, even when I was very little.
1. INTRODUCTION
The topic of the present paper is the production of aspiration of voiceless plosives /p, t, k/ and
prevoicing of voiced plosives /b, d/ by Belgian Dutch speakers of English. More specifically, the
paper examines the effect that specific pronunciation training on both of these features has on
non-native speakers of English, in other words whether or not a single training session can
contribute to a learner’s knowledge and practical application of the pronunciation of a foreign
language.
In learning a non-native language, speakers are known to transfer from their native
language (henceforth, L1), i.e. to apply certain features of their L1 in the foreign language
(henceforth, L2). Transfer occurs not only on a grammatical level, but also on a phonetic level
(e.g. Collins & Vandenbergen, 2000). These features, however, might not be in use in the L2, or
not in the same way as in the L1; hence transfer often leads to mistakes or misunderstandings in
the foreign language. Under
the influence of their native language, L2 learners often produce certain pronunciation features
in English the same way they would in their native language. For example, speakers of Dutch
frequently replace English /e/ with /æ/, as in <send for help>, which then turns
into */ˈsænd fə ˈhælp/. In other words, native speakers of Dutch mispronounce features of
English because they extrapolate features from their L1 onto the L2.
The learning of a foreign language has also been widely discussed with regard to the so-called
critical age. Flege (1989; mentioned in Hattori, 2009), for example, draws attention to the fact
that training phonetic contrasts becomes very difficult
when these were not taught and maintained at an early age. However, Flege (1995; mentioned in
Hattori, 2009) also suggests that, even in adulthood, speakers retain the ability to learn
a language. Furthermore, pronunciation training, or more generally learning new information
about a language, has been shown to be effective even at a later stage in life, i.e. past the
critical age (e.g. Hattori, 2009).
The first part of the present paper provides background information on the processes of
aspiration and prevoicing and on training methods. The first section explores the process of
aspiration of the voiceless plosives /p, t, k/. This is a prominent feature in English but not in
Dutch. The differences in realization of voiceless plosives between the two languages, i.e. English
and Belgian Dutch, are pointed out and illustrated by means of stills of spectrograms and
waveforms. In the second section, prevoicing is discussed. In Dutch, initial voiced plosives are
produced with prevoicing. This feature is however absent in English. The final section of the first
part provides a brief overview of several methods or techniques of pronunciation
training, the effectiveness of which has been widely researched.
In the second part, a case study is presented in which an experiment was conducted with
13 Belgian Dutch speakers. The experiment consisted of two parts. First, in the pretest, the
informants were asked to name pictures in English. These words tested their production of the two
features at hand, i.e. aspiration and prevoicing. After the pretest they took part in a short
training session in which the aforementioned features were explained theoretically and
also practiced. The second part of the experiment, i.e. the posttest, consisted of the
same picture-naming task as in the pretest and a word-reading task.
The object of the present study is threefold: (1) to find out if native speakers of Belgian Dutch
produce aspiration in their production of English, (2) to ascertain whether or not native speakers
of Belgian Dutch transfer the production of prevoicing onto English and (3) to establish if a single
training session has an effect on the production of both features, in other words, if after training
the informants produce aspiration and/or omit prevoicing when uttering
English words. A minor additional aim of the experiment was to establish whether the effect of the
pronunciation training can be noticed only in the words which were specifically trained
during the session or whether the participants generalized the information to other, so-called
new words.
It was hypothesized that before training, the speakers would not produce aspiration and would
prevoice heavily. After having completed the pronunciation training, the informants were
expected to improve their pronunciation towards a more target-like one (i.e. with aspiration and
without prevoicing), especially in the specific words which were trained during the one-on-one
instruction.
The results showed that a single short pronunciation training session can influence the
production of both aspiration and prevoicing. Most of the participants showed considerable
improvement of both processes. Overall, an improvement of 18,1% for aspiration and 23,9% for
prevoicing was established through only one training session.
2. ASPIRATION AND PREVOICING
2.1 INTRODUCTION TO BOTH FEATURES
Van Alphen & Smits (2004) explain that in most languages, plosives are divided into
two phonemic categories, i.e. voiced /b, d, g/ and voiceless /p, t, k/¹. They point out that the
phonetic realization of these phonemic classes varies among
languages. Since this study is on the pronunciation of English by native speakers of Belgian
Dutch, a concrete comparison between the two languages is made at the end of both subchapters
to point out differences and/or similarities (see Sections 2.2.4 and 2.3.4).
In a study by Lisker and Abramson (1964), the notion of Voice Onset Time (henceforth,
VOT) is put forward as an important factor in the production of stop consonants. VOT is the
time elapsed between the release of the stop and the onset of vocal fold vibration. After having
studied 11 languages, Lisker & Abramson (1964) were able to conclude that VOT distinguishes
between three categories of plosives: plosives with a negative VOT, with a slightly positive VOT
and with a clearly positive VOT. The first category, termed fully voiced, gives rise to a
negative VOT and is the result of voicing during the closure. This process is also
called prevoicing. In other words, the vocal folds have started vibrating before the release of the
initial plosive. Since the release of the plosive counts as the starting point for VOT measurement,
the VOT recorded in voiced plosives is negative in the case of prevoicing. The process of
prevoicing is discussed in more detail in Section 2.3. The plosives which are produced
with little or no aspiration and thus show only a slightly positive VOT make up the second
category, also labelled voiceless unaspirated². The third category involves those plosives
which show a clearly positive VOT as a result of aspiration, i.e. the process by
which a weak ‘h’-like sound follows the release of the plosive. This last category is otherwise
known as voiceless aspirated. Since the onset of voicing is delayed by the production of
aspiration, which is voiceless, the period of voicelessness is longer, i.e. the VOT is longer
than when no aspiration can be detected. A more detailed account of this process is given in
Section 2.2.
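The three-way categorization can be illustrated with a minimal Python sketch. It computes VOT as the time of voicing onset minus the time of the release and maps the result onto the three categories described above. The function names and, in particular, the 35 ms boundary between a slightly and a clearly positive VOT are assumptions made for illustration only; they are not values proposed by Lisker & Abramson (1964).

```python
def voice_onset_time(release_ms: float, voicing_onset_ms: float) -> float:
    """VOT in ms: onset of vocal fold vibration minus release of the stop."""
    return voicing_onset_ms - release_ms


def vot_category(vot_ms: float, aspiration_boundary_ms: float = 35.0) -> str:
    """Map a VOT value onto the three categories of plosives.

    aspiration_boundary_ms is a hypothetical cut-off chosen for illustration;
    it is not taken from the literature discussed in this paper.
    """
    if vot_ms < 0:
        # Voicing started before the release: prevoicing.
        return "fully voiced (prevoiced)"
    if vot_ms < aspiration_boundary_ms:
        # Little or no aspiration: slightly positive VOT.
        return "voiceless unaspirated"
    # Aspiration delays voicing: clearly positive VOT.
    return "voiceless aspirated"


# Voicing starts 80 ms before the release.
print(vot_category(voice_onset_time(release_ms=100.0, voicing_onset_ms=20.0)))
# Voicing starts 70 ms after the release.
print(vot_category(voice_onset_time(release_ms=100.0, voicing_onset_ms=170.0)))
```

The first call yields the prevoiced category and the second the aspirated one. Any real classification would have to choose the boundary per language and per place of articulation.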
The existence of these three categories of voicing implies that any language could make use
of them. This matter has been studied and has led to the conclusion that most languages in fact
do not employ all three. Thai, for example, uses the three different categories as van Alphen &
Smits (2004) state, but it is only one of the very few languages that does so. A two-way
distinction is the most common among the languages in the world.
¹ Voiced and voiceless plosives are also referred to as lenis and fortis plosives, respectively.
² Keating, Linker & Huffman (1983; mentioned in van Alphen & Smits, 2004) showed that this category is the most common one of the three. Of the 51 languages they studied, almost all of them used the voiceless unaspirated category. For a more detailed report on the study see Keating, P. A., Linker, W. and Huffman, M. 1983. Patterns in allophone distribution for voiced and voiceless stops. Journal of Phonetics. 11. 277-290. (quoted in van Alphen & Smits, 2004).
2.2 ASPIRATION
As explained in Section 2.1, the phonetic realization of the fortis plosives /p, t, k/ differs among
languages. In analogy to what is shown in the introduction (Section 2.1), namely that the
languages of the world can be categorized according to their phonetic realizations of plosives
(i.e. fully voiced, voiceless unaspirated and voiceless aspirated), Collins and Mees (2008) state
that languages are divided into aspirating and non-aspirating languages.
Aspiration is a feature present in most English accents. This situation is different in Dutch,
in which fortis plosives are not aspirated. In other words, English and Dutch are characterized
by a different phonetic realization of /p, t, k/. The phenomenon also does not occur in the
Romance languages (e.g. Italian, French, and Spanish).
2.2.1 WHAT IS ASPIRATION?
Collins & Mees (2008) explain that the process of aspiration (as touched upon in Section 2.1) is
often described as a small puff of air, which occurs after the release of the voiceless stop
consonants /p, t, k/. In phonetic transcription, it is symbolized as [ʰ].
Aspiration is strongest in word-initial stressed position, e.g. in the English word <pie>,
phonetically [pʰaɪ] (e.g. Collins & Mees, 2008; Collins & Vandenbergen, 2000). It is however less
strong in unstressed syllables, as for example in <competitor>, phonetically [kəmˈpʰetɪtə], in
which aspiration is much more evident in the stressed stop consonant /p/ than in the
unstressed initial /k/ or in either /t/ (e.g. Collins & Mees, 2008; Collins & Vandenbergen, 2000).
It is also noteworthy that when a stop or a stop cluster is preceded by the
fricative /s/, no aspiration is produced, for example in <spoon> (e.g. Collins & Mees, 2008;
Collins & Vandenbergen, 2000).
Aspiration manifests itself as a period of voicelessness, which is essentially a delay in the
onset of voicing of the following vowel. This period of voicelessness is expressed in VOT.
2.2.2 POSITIVE VOICE ONSET TIME
VOT, which is measured in milliseconds (henceforth, ms), is the time that elapses between the
release of a stop consonant and the onset of voicing. When aspiration occurs, the onset of
vibration of the vocal folds is delayed. In other words, the period of voicelessness is prolonged,
which in turn yields a longer VOT than when aspiration is absent.
Two examples are given below (Fig. 1 and Fig. 2), which show the difference in VOT
(marked in red) between a stop produced with aspiration and one produced without it. To maximize
the difference, one example is taken from English³ (Fig. 2) and the other from Dutch⁴ (Fig. 1).
Both tokens were produced by native speakers of the respective language.
Figure 1 Waveform and spectrogram of the word <kus> produced by a native speaker of Belgian Dutch. (Praat, Boersma & Weenink, 2011)
Figure 2 Waveform and spectrogram of the word <kite> produced by a native speaker of English. (Praat, Boersma & Weenink, 2011)
³ The sound file for this example was cut from the audio CD included in Collins & Vandenbergen (2000).
⁴ The audio file for this word was taken from my Bachelor Research paper (Vanlocke, 2010) in which native speakers of Belgian Dutch were asked to perform a reading task.
In Fig. 1, the waveform and spectrogram of the Dutch word <kus> (<kiss>) are shown, which
is produced without aspiration. As Fig. 1 shows, voicing starts almost immediately after the
burst (i.e. after 30,3 ms). Fig. 2 shows the waveform and spectrogram of the English word <kite>
produced with aspiration. Vocal fold vibration, associated with the production of the following
vowel (here /aɪ/), starts later (i.e. only after 71,9 ms) than when aspiration is not produced (as in Fig. 1).
Since English is an aspirating language, the VOT of the fortis plosive /k/ in word-initial stressed
position is much longer (71,9 ms) than that in the non-aspirating language Dutch (30,3
ms).
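The size of this difference can be worked out directly from the two measurements. The short Python sketch below uses only the two VOT values reported for Fig. 1 and Fig. 2, written with decimal points; the variable names are illustrative.

```python
# VOT values in ms as reported for Fig. 1 (<kus>) and Fig. 2 (<kite>).
vot_dutch_kus = 30.3
vot_english_kite = 71.9

# Absolute and relative difference between the two tokens.
difference_ms = vot_english_kite - vot_dutch_kus
ratio = vot_english_kite / vot_dutch_kus

print(f"difference: {difference_ms:.1f} ms")  # difference: 41.6 ms
print(f"ratio: {ratio:.2f}")                  # ratio: 2.37
```

In these two tokens the aspirated English /k/ thus has a VOT more than twice as long as the unaspirated Dutch /k/. As the comparison rests on one token per language, it is an illustration rather than a general estimate.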
2.2.3 THE EFFECT OF PLACE OF ARTICULATION
As mentioned before (Section 2.2.1), stress is a factor which can influence VOT (e.g. Collins &
Mees, 2008; Collins & Vandenbergen, 2000), i.e. VOT is longer in stressed syllables than in
unstressed syllables. Other factors that have been reported to have an effect on VOT are
speaking rate and place of articulation. With regard to speaking rate, studies propose that
VOT increases as speaking rate decreases (e.g. Kessinger & Blumstein, 1998; Magloire & Green,
1999). The second factor, i.e. place of articulation, is discussed in greater detail below.
According to Lisker and Abramson (1964), VOTs vary according to the category of plosive
that is being produced. Through their data acquired from eleven different languages, they were
able to ascertain that there is a difference in VOT with regard to the place of articulation
(henceforth, PoA) of the plosives. Their results revealed that VOTs are always longer⁵ in the
realization of velar stops than in that of alveolar or bilabial stops. A further distinction can be
made between the alveolar and bilabial stops, the former having longer VOTs than the latter. In other
words, VOTs of voiceless stop consonants relate to each other in the following manner: bilabial <
alveolar < velar, i.e. p < t < k.
In Figures 2 (see Section 2.2.2) to 4 (presented below), waveforms and spectrograms are
shown for each of the three voiceless plosives, indicating the difference in VOT duration. All
tokens are English words produced by native speakers⁶. For each of the tokens, the VOT is
marked in red. These randomly chosen examples are consistent with the proposed influence of
PoA on VOT, since the ordering p < t < k is respected.
The shortest category, namely the bilabial plosive in the word <peas>, is presented in Fig. 3.
In this case, initial /p/ was produced with a VOT measured at 41,0 ms. Fig. 4 displays the
intermediate alveolar plosive /t/ represented here in the word <time>, with a VOT of 53,0 ms.
Comparing these two results with the VOT for /k/ in <kite> (see Fig. 2, above), it becomes clear
that the velar category yields the longest VOT. Here it was measured at 71,9 ms, considerably
longer than either of the other two examples (71,9 ms > 53,0 ms > 41,0 ms).
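The ordering of the three measurements above can also be checked mechanically. The following Python sketch is purely illustrative: the function name is an assumption, and the data are the three single-token VOT values reported for <peas>, <time> and <kite>, written with decimal points.

```python
from typing import Mapping

# VOT values in ms of the three English tokens <peas>, <time> and <kite>.
vot_by_stop = {"p": 41.0, "t": 53.0, "k": 71.9}


def follows_poa_order(vots: Mapping[str, float]) -> bool:
    """Return True if bilabial < alveolar < velar, i.e. p < t < k."""
    return vots["p"] < vots["t"] < vots["k"]


print(follows_poa_order(vot_by_stop))  # True
```

A set of measurements in which two places of articulation show equal VOTs, for instance p = t < k, would make the strict comparison return False.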
Figure 3 Waveform and spectrogram of the word <peas> produced by a native speaker of English. (Praat, Boersma & Weenink, 2011)
Figure 4 Waveform and spectrogram of the word <time> produced by a native speaker of English. (Praat, Boersma & Weenink, 2011)
⁵ This is true whether or not the stops are aspirated. In other words, the difference will also be noticeable in Dutch.
⁶ All examples were taken from the audio CD included in Collins & Vandenbergen (2000).
An explanation for this phenomenon was not provided by Lisker & Abramson (1964);
however, more recent studies have investigated and verified it, within one language as well
as cross-linguistically⁷. Moreover, the processes taking place in the oral cavity during
speech have now been studied in more detail. The difference in VOT for different places of
articulation is due to complex mechanisms⁸ performed by the speaker. Cho &
Ladefoged (1999) provide a brief summary of physiological and/or aerodynamic characteristics⁹
that have been suggested in the literature as reasons for the effect of PoA on VOT. These
characteristics are the following: (1) the volume of the cavity behind the point of constriction,
(2) the volume of the cavity in front of the point of constriction, (3) movement of articulators, (4)
extent of articulatory contact area, (5) change of glottal opening area (for voiceless aspirated
stops), and finally (6) temporal adjustment between closure duration and VOT (Cho &
Ladefoged, 1999).
⁷ Cho & Ladefoged (1999) studied VOTs of eighteen languages and found evidence that in all but one of these, velar stops showed the longest VOTs.
⁸ Liu, Ng, Wan, Wang & Zhang (2007) mention an array of studies which account for the differences in VOT caused by the PoA. The topics at hand in order to pinpoint the reasons for the different VOTs associated with PoA enumerated by Liu et al. (2007) are: “physiological and aerodynamic characteristics of speech production including the laws of aerodynamics, velocity of the articulators movements, the extent of articulator contact, as well as the temporal adjustment between closure duration and VOT” (Liu et al., 2007).
⁹ For a more detailed account of these specific characteristics, see Cho & Ladefoged (1999).
Cho & Ladefoged (1999) furthermore provide an evaluation of each of these factors. Since
some explanations are more apt for unaspirated stops and others are more fit for aspirated
stops, they propose another explanation, which they claim holds for both aspirated and
unaspirated stops. Since this study is on aspiration, the explanation that was put forward by Cho
& Ladefoged (1999) as the most suitable for aspirated stop consonants, number (5) mentioned
above, is discussed further here. In brief, Stevens¹⁰ (1999;
quoted in Cho & Ladefoged, 1999) attributes the varying VOTs in voiceless aspirated stops to
two main factors, namely the opening of the glottal area in various stages in accordance with the
different PoA, and the way the stiffness of the walls of the vocal tract and of the glottis changes
during the realization of the plosives.
Firstly, Stevens (1999; quoted in Cho & Ladefoged, 1999) explains the
difference in VOTs associated with the PoA by discussing the changes occurring in the glottal
area. For the production of aspirated stops, it has been demonstrated that the glottis is already
open before the release. This opening is created in order to yield aspiration. By contrast, to
allow for voicing, which occurs after the release, the glottal opening must be reduced.
Only when the glottal opening has decreased to approximately 0,12 cm² will the vocal folds be
able to vibrate. The speed at which this decrease occurs is what Stevens (1999; quoted in Cho &
Ladefoged, 1999) proposes as the reason for the difference between the plosive categories. It is
suggested that the reduction in size of the glottal area happens faster for alveolar or labial stops
than for velars because the intraoral pressure present before the release drops more rapidly
in the production of the former than in that of the latter (see Fig. 5).
Figure 5 Schematized representation of the airflow and intraoral pressure in the release phase in voiceless stops (Stevens, 1999; as presented in Cho & Ladefoged, 1999).
¹⁰ It must be noted that Stevens (1999; quoted in Cho & Ladefoged, 1999) did not consider unaspirated stops in his description. Only aspirated stops were taken into account.
Secondly, during the closure phase of the production of the plosive, the walls of the glottis
and the vocal tract tense up, presumably as a compensation for the pressure in the oral cavity.
When the release takes place, the intraoral pressure decreases, and this causes an
inward force to act on the glottal walls. As a consequence, the stiffness also diminishes.
Immediately following the release, however, the stiffness does not disappear completely. This
lingering stiffness causes voicing to be delayed. Because the intraoral pressure reduces
more rapidly for bilabials and alveolars than for velar plosives, the walls of the glottis and the
vocal tract relax more rapidly, which creates the opportunity for the vocal folds to vibrate earlier
after the release.
In short, the release of bilabial and alveolar voiceless aspirated stops gives rise to a faster
drop in intraoral pressure, which in turn causes the glottal opening to narrow
faster and the stiffness to relax sooner. Combined, these factors cause the
onset of voicing to occur sooner for bilabial and alveolar stops than for velar stops.
Consequently, the period of voicelessness and hence the VOT is longer in velar than in bilabial or
alveolar stops.
Even though most research on the effect of PoA on VOT finds that the VOTs of the voiceless
plosives relate to each other as p < t < k, there have also been reports of different results which
point to a less clear-cut distinction between the VOTs of the different PoAs (Whalen, Levitt &
Goldstein, 2007; see Figure 6). Figure 6 shows that for English-speaking adults, most researchers
have found results that support the p < t < k relationship. There are however others who have
suggested p < t = k as a more correct distinction. Looking at English spoken by children from the
age of one up to seven, it can be noted that even p = t < k is put forward as an option. Then again,
when only the top half of the table is taken into consideration, i.e. VOTs in English, it can be
noted that most researchers agree on the relationship between voiceless plosives as p < t < k.
Figure 6 A report on the effect of PoA on VOT in earlier works (as presented in Whalen, Levitt & Goldstein, 2007)
2.2.4 ENGLISH VS. DUTCH
Unlike English, Dutch belongs to the languages characterized by the production of unaspirated
rather than aspirated stops. As a consequence, VOT durations differ considerably between these
two languages. On average, English VOTs measure around 76 ms (Simon, 2010). Their
Dutch equivalents, however, range between a mere 12 and 27 ms (Simon, 2010).
Previous studies have provided mean VOT measurements for both English and Dutch, which
serve as evidence for the proposed difference between the two languages. Lisker & Abramson’s
(1964) test results show that their English-speaking informants produced VOTs of 58, 70 and 80
ms for /p/, /t/ and /k/, respectively¹¹. Docherty (1992) reports VOT values of 45.74 ms for /p/,
66.45 ms for /t/ and 66.09 ms for /k/. Hawkins (1979; quoted in Docherty, 1992)
found that voicing occurred after 47 ms in /p/, after 68 ms in /t/ and after 72 ms in /k/. The
study done by Suomi (1980; mentioned in Docherty, 1992) resulted in the following
measurements for /p, t, k/: 40, 55 and 56 ms. Finally, Simon (2010) found average VOTs of 80
ms in /p/, 73 ms in /t/ and 76 ms in /k/.
11 It should be pointed out that Lisker & Abramson (1964) tested with only four speakers. Hence, these results may not be entirely representative.
All of these results can be found in Table 1. To allow comparison with the results
obtained in the present study, the averages of these previous results were also calculated
and are included in Table 1.
Table 1 Mean VOTs (in ms) for English aspirated stops measured in previous studies.
For their Dutch unaspirated counterparts, Lisker & Abramson (1964) also provided mean
VOTs¹². They found VOTs of 10 ms for /p/, 15 ms for /t/ and 25 ms for /k/. Simon
(2010) reports averages of 12 ms for /p/, 23 ms for /t/ and 29 ms for /k/. These results are
presented in Table 2, along with the average calculated from these measurements.
Table 2 Mean VOTs (in ms) for Dutch unaspirated stops measured in previous studies.
In the table below (Table 3), a comparison can be found between the average English and
Dutch VOTs for the three voiceless stops /p, t, k/.
Table 3 Comparison between the averages for English and Dutch VOTs (in ms).
Table 3 highlights the considerable difference between English and Dutch VOTs, which
results from the presence versus absence of aspiration. Moreover, the effect of PoA as discussed in
Section 2.2.3 is clearly visible, in both English¹³ and Dutch.
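The averaging behind Tables 1 to 3 can be illustrated with a short Python sketch. It uses only the per-study means quoted in the text above; the data layout, the function names and the choice of an unweighted mean across studies are assumptions made for illustration, so the output need not match the tabulated averages exactly.

```python
from statistics import mean

PLOSIVES = ("p", "t", "k")

# Per-study mean VOTs (ms) as quoted in the running text.
ENGLISH = {
    "Lisker & Abramson (1964)": {"p": 58.0, "t": 70.0, "k": 80.0},
    "Docherty (1992)": {"p": 45.74, "t": 66.45, "k": 66.09},
    "Hawkins (1979)": {"p": 47.0, "t": 68.0, "k": 72.0},
    "Suomi (1980)": {"p": 40.0, "t": 55.0, "k": 56.0},
    "Simon (2010)": {"p": 80.0, "t": 73.0, "k": 76.0},
}
DUTCH = {
    "Lisker & Abramson (1964)": {"p": 10.0, "t": 15.0, "k": 25.0},
    "Simon (2010)": {"p": 12.0, "t": 23.0, "k": 29.0},
}


def mean_by_plosive(studies):
    """Unweighted mean across studies for each plosive (assumed weighting)."""
    return {c: mean(study[c] for study in studies.values()) for c in PLOSIVES}


def follows_p_t_k(vots):
    """True if the VOTs show the p < t < k pattern discussed in Section 2.2.3."""
    return vots["p"] < vots["t"] < vots["k"]


english_mean = mean_by_plosive(ENGLISH)
dutch_mean = mean_by_plosive(DUTCH)
# Difference between the two languages per plosive (English minus Dutch).
difference = {c: english_mean[c] - dutch_mean[c] for c in PLOSIVES}

for c in PLOSIVES:
    print(f"/{c}/ English {english_mean[c]:.2f} ms, Dutch {dutch_mean[c]:.2f} ms, "
          f"difference {difference[c]:.2f} ms")
```

With these inputs, `follows_p_t_k` holds for both sets of averages, although not for every individual study.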
12 These results were obtained from a single speaker of Dutch and should therefore be interpreted with caution.
13 Only Docherty (1992; mentioned in van Alphen & Smits, 2004) found /k/ as the plosive with the intermediate VOT, i.e. p < k < t, as can be seen in Table 1. However, /t/ is a mere 0.36 ms longer than /k/.
2.3 PREVOICING
In some languages, such as Dutch, initial voiced plosives are produced with prevoicing (see also
Section 2.1). Other examples of languages belonging to this category are Arabic, Bulgarian,
French, Japanese, Polish, Russian and Spanish (examples provided by van Alphen & Smits, 2004).
These languages make a distinction between voiced and voiceless unaspirated plosives, which
means that the voiceless plosives /p, t, k/ are unaspirated and that the voiced plosives /b, d, g/
are characterized by the production of prevoicing. Dutch has only two voiced plosives in
its native phonology, namely /b/ and /d/. The voiced counterpart of velar /k/, namely /g/, is
only present in loanwords, such as <goal> (example taken from van Alphen & Smits, 2004). Just
as for voiceless plosives, English and Dutch have differing phonetic realizations of the voiced
plosives, which will be discussed in Section 2.3.4.
2.3.1 WHAT IS PREVOICING?
Van Alphen & Smits (2004) explain that prevoicing is produced during the phase in which the
closure of the initial plosive takes place. It is essentially the vibration of the vocal folds which
occurs before the realization of the initial voiced plosive. Van Alphen & Smits (2004) mention a
number of conditions that need to be met in order to create vibration of the vocal folds during
this process. They derive these physiological and aerodynamic conditions from van den
Berg (1958; quoted in van Alphen & Smits, 2004). Two in particular are discussed.
The first of these conditions is that the vocal folds must be “adducted
and tensed” (van Alphen & Smits, 2004: 457). The second condition involves the transglottal
pressure, which must be adjusted so that there is “enough positive airflow through the glottis”
(van Alphen & Smits, 2004) to make the vocal folds vibrate. When the articulators
are brought into position for the production of a plosive, the exit for the airflow is closed
off. The air that passes through the glottis cannot leave the oral cavity, which results in an
accumulation of intraoral pressure. Ohala (1983; quoted in van Alphen & Smits, 2004) states
that, in this way, the pressure built up in the oral cavity comes close to the subglottal
pressure. As the two pressures converge, the airflow through the glottis that voicing requires
diminishes. However, if the supraglottal area is expanded, this process will be slowed down. The
enlargement of the volume of the vocal tract makes voicing during the closure phase of the
plosive easier. This enlargement can be achieved in two ways: actively or passively¹⁴. The first
method – for which van Alphen & Smits (2004) refer back to studies done by Westbury (1983;
quoted in van Alphen & Smits, 2004) and Stevens (1998; quoted in van Alphen & Smits, 2004) –
14 It should be pointed out that van Alphen & Smits (2004) remark that in general it is believed that the processes of both active and passive expansion of the vocal tract volume are used in the production of prevoicing.
involves increasing the size of the area above the glottis. This process relies on a number of
mechanisms: (1) the lowering of the larynx, (2) the raising of the soft palate, (3) the advancing of
the tongue root or, alternatively, the drawing down of the dorsum and blade of the tongue. For the
second method, van Alphen & Smits (2004) state that “the supraglottal volume can also be
expanded passively due to the raised intraoral pressure, provided that the walls of the
supraglottal cavity are lax.”¹⁵ (van Alphen & Smits, 2004: 457).
2.3.2 NEGATIVE VOICE ONSET TIME
As touched upon in Section 2.1, prevoicing is characterized by negative VOT. This means that
voicing can be detected before the release of the voiced stop consonant. Since voicing
occurs before the burst of the plosive, VOT is negative.
In Fig. 7, the waveform and spectrogram of the Dutch word <bal>¹⁶ (<ball>) are presented.
The period of prevoicing is highlighted in red and the burst of the bilabial voiced plosive /b/ is
marked in purple. Since voicing starts before the release of the plosive /b/, the VOT is labelled as
negative, in this case -96.1 ms. In a word produced by a native speaker of English, no prevoicing
is expected to be found, as can be seen in Fig. 8, which shows the waveform and spectrogram of
the English word <bush>¹⁷. Fig. 8 shows no vibration of the vocal folds before the burst, i.e. no
prevoicing was produced. The presence versus absence of prevoicing, in Figures 7 and 8,
respectively, is also visible through the presence (Fig. 7) and absence (Fig. 8) of a voice bar in the
spectrogram (marked in a blue square).
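The sign convention can be made concrete with a minimal Python sketch. VOT is taken here as the time of voicing onset minus the time of the burst, so that voicing which starts before the release yields a negative value. The function names and the time stamps are hypothetical; the time stamps were merely chosen so that the first token reproduces the value reported for Fig. 7.

```python
def vot_ms(voicing_onset_ms, burst_ms):
    """Voice onset time in ms: negative if voicing starts before the burst."""
    return round(voicing_onset_ms - burst_ms, 1)


def has_prevoicing(vot):
    """A token counts as prevoiced when its VOT is negative."""
    return vot < 0


# Hypothetical time stamps (ms from the start of the recording).
dutch_bal = vot_ms(voicing_onset_ms=153.9, burst_ms=250.0)     # voicing leads the burst
english_bush = vot_ms(voicing_onset_ms=260.0, burst_ms=250.0)  # voicing lags the burst

print(dutch_bal, has_prevoicing(dutch_bal))
print(english_bush, has_prevoicing(english_bush))
```

In practice, the two time stamps would be read off the waveform and spectrogram, for instance at the start of the voice bar and at the burst.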
15 For this statement, van Alphen & Smits (2004) draw upon studies by Rothenberg (1968), Stevens (1998) and Svirsky, Stevens, Matthies, Manzella, Perkell & Wilhelms-Tricarico (1997).
16 The sound file for this word was taken from my Bachelor Research paper (Vanlocke, 2010) in which native speakers of Belgian Dutch were asked to perform a reading task.
17 The sound file was cut from the audio CD included in Collins & Vandenbergen (2000).
Figure 7 Waveform and spectrogram of the word <bal> produced by a native speaker of Belgian Dutch.
(Praat, Boersma & Weenink, 2011)
Figure 8 Waveform and spectrogram of the word <bush> produced by a native speaker of English.
(Praat, Boersma & Weenink, 2011)
2.3.3 INFLUENCING FACTORS
Since fewer studies have been devoted to prevoicing than to aspiration, van Alphen & Smits
(2004) tried to obtain more detailed information on prevoicing in Dutch¹⁸. Interestingly, they
found that – despite it being an important auditory indicator in determining whether the plosive
is voiced or voiceless – speakers do not produce prevoicing consistently. Van Alphen & Smits
(2004) tested to what extent the production of prevoicing depends on the following factors: (1)
place of articulation, (2) the speaker’s gender, (3) the following phoneme, (4) lexical status and
(5) competitor environment.
Despite expectations expressed by van Alphen & Smits (2004) that factors (4) and (5)
would have an effect on the production of prevoicing, their study proved otherwise. With
respect to factor (4), lexical status, they had expected that when their informants were tested on
non-words, they would hyper-articulate and produce “more reliable prevoicing” (van Alphen &
Smits, 2004: 465), which turned out not to be the case. Finally, with respect to factor (5), a word
in a competitor environment was believed to be a reason to articulate more carefully in order to
reduce the chance of perceptual confusion, i.e. mistaking a voiced plosive for a voiceless one.
However, the data did not confirm this hypothesis.
2.3.3.1 THE EFFECT OF PLACE OF ARTICULATION
Van Alphen & Smits (2004) argue that, since the production of prevoicing is dependent on the
active or passive expansion of the vocal tract which keeps the transglottal pressure high enough,
a more posterior PoA will restrict this capacity for expansion. This argument is backed by the
research done on children whose native language distinguishes between voiced plosives
(produced with prevoicing) and voiceless unaspirated plosives. These studies show that those
children do not acquire this contrast as fast as children whose language makes a distinction
between voiceless unaspirated and voiceless aspirated stops. Rothman, Koenig & Lucero (2002;
mentioned in van Alphen & Smits, 2004) attribute this later acquisition to the fact that
children’s vocal tracts are smaller than those of adults. Since the vocal tract is smaller, expansion
capacity will automatically also be smaller. Hence, prevoicing will be less easy to produce when
the capacity to expand is smaller. During the production of bilabial plosives more opportunities
are available than for alveolars to both actively and passively create the required enlargements.
18 Based on what e.g. Lisker & Abramson (1964) found, van Alphen & Smits (2004) tested solely on isolated words, since they recognized that when produced in a sentence context, “the phonetic realization of the voicing distinction” (van Alphen & Smits, 2004: 458) may be affected.
Active expansion by means of tongue-body movements is most likely to be easier during the
production of /b/ than during that of /d/. While producing bilabial plosives, the tongue can
move more freely than in alveolar plosives, since the latter require the tongue
to form the necessary closure. As a consequence, the intraoral pressure will rise
more quickly in alveolars than in bilabials. Essentially, this means that prevoicing is easier to
produce when the PoA is more anterior, i.e. bilabial /b/, than when it is more posterior, i.e.
alveolar /d/.
Passively, more tissue surface can take part in the expansion process when bilabials are
produced than when alveolars are produced. Alveolars depend on “the pharyngeal walls and
part of the soft palate” (van Alphen & Smits, 2004: 459) for the passive enlargement. Meanwhile,
bilabials can make use not only of the aforementioned surfaces, but also of “all of the tongue
surface and parts of the cheek”¹⁹ (van Alphen & Smits, 2004: 459).
Although the level of difficulty may well be lower for bilabials than for alveolars, van Alphen & Smits
(2004) did not find an influence of PoA on the duration of VOT. They did find, however, that bilabials,
being easier to prevoice than alveolars, are produced with prevoicing more frequently than
the latter. Smith (1978; mentioned in van Alphen & Smits, 2004), by contrast, found that, in English,
prevoicing duration was also affected by PoA. Van Alphen & Smits (2004) expected this also to
be the case in Dutch, but their study proved otherwise. The current study may nevertheless
show a difference in the duration of VOT according to PoA, since the target tokens are English
words produced by native speakers of Belgian Dutch.
2.3.3.2 THE EFFECT OF GENDER OF SPEAKER
As previous studies have shown (e.g. Stevens, 1998; mentioned in van Alphen & Smits, 2004),
the size of the vocal tract of women is smaller than that of men. As a consequence, the pressure
in the oral cavity rises faster and in turn makes it more difficult for female than for male
speakers to produce prevoicing. Van Alphen & Smits (2004) found that women produced
prevoicing less frequently than men²⁰. These results are in line with an earlier study by Smith
(1978; quoted in van Alphen & Smits, 2004). Van Alphen & Smits (2004) also found a slight
difference in the length of prevoicing, i.e. longer for men (mean: 109 ms) than for women (mean: 89
ms); however, this difference was not significant enough to be taken into account.
19 For these explanations, van Alphen & Smits (2004) drew on the works of Houde (1968; mentioned in van Alphen & Smits, 2004) and Rothenberg (1968; mentioned in van Alphen & Smits, 2004).
20 In the results van Alphen & Smits (2004) obtained from their ten participants, five male and five female, 86% of the tokens of the male speakers were produced with prevoicing. This stands in stark contrast to the 65% produced by the female speakers.
2.3.3.3 THE EFFECT OF THE FOLLOWING PHONEME
Van Alphen & Smits (2004) also examined the effect of the phoneme following the initial plosive
on the production of prevoicing. The target tokens in their study had either a vowel or a
consonant as the second segment. Generally speaking, they found that when initial voiced
plosives are followed by a vowel, prevoicing is produced more often and lasts longer (mean: -118 ms)
than when they are followed by a consonant (mean: -99 ms).
They claim that the reason for this phenomenon cannot merely be the difference in oral
cavity volume, because even though some of the following consonants give rise to a smaller oral
cavity size compared to when the initial plosive is followed by a vowel, this explanation would
not hold true for all the vowels and consonants they tested. In addition, they state that when
they compared the results among the plosives, they did not find a difference in duration of
prevoicing due to vowel height²¹ (which also results in differences in oral cavity size). Even
though they make an additional suggestion, namely that “the degree to which the vocal tract can
be expanded (passively or actively) plays a role” (van Alphen & Smits, 2004: 465), they do not
give a satisfactory explanation for this.
2.3.4 ENGLISH VS. DUTCH
A difference can be found between the phonetic realizations of the voiced stops /b/ and /d/ in
English and in Dutch (see Section 2.1). The latter belongs to the vast group of languages which
are characterized by the use of prevoicing; English, by contrast, does not. Previous studies
have provided averages to which the results of this study will ultimately be compared.
On average, voiced plosives have been reported to have negative VOTs in Dutch of roughly
-100 to -80 ms and positive VOTs in English of between 0 and 10 ms (Simon, 2010). However, it
has been noted that the actual production of prevoicing is largely dependent on the speaker:
some speakers produce it (consistently), while others do not (e.g. Lisker & Abramson, 1964; van
Alphen & Smits, 2004).
Lisker & Abramson (1964) provided average VOTs for voiced plosives, for both English and
Dutch, in isolated words and in connected speech. Words in isolation were produced with an
average VOT of -85 ms for /b/ and -80 ms for /d/ (by one native speaker of Dutch). Words with
/b/ or /d/ in initial position in sentence context were realized with a VOT of -41 ms and -51 ms,
respectively. For the four native speakers of English tested, they measured mean VOTs for /b/
ranging between the extremes of 1 and -101 ms, and for /d/ between 5 and -102 ms.
21 This is contrary to the results in Smith (1978; quoted in van Alphen & Smits, 2004) who found vowel height to be an influencing factor on the duration of prevoicing as well as on the number of tokens that were prevoiced.
Surprisingly, these negative VOTs are longer than the ones recorded in Dutch, which is contrary
to what is generally expected (e.g. Simon, 2010). However, they attribute these results
to the drastic variation among speakers in whether or not they produce
prevoicing. Because one of the speakers tested produced long negative VOTs while
the other three did not, the averages reported by Lisker &
Abramson (1964) are skewed and give a distorted picture of reality. In sentences, the reported
averages range from 7 to -65 ms in /b/ and from 9 to -56 ms in /d/. It must be stressed that, given the
limited number of speakers tested by Lisker & Abramson, these values may well not be
representative. Van Alphen & Smits (2004) obtained means
from 10 native speakers of Dutch. They found averages of -82.80 ms and -71.23 ms for /b/ and
/d/, respectively. Table 4 offers a comparison between the results given by Lisker & Abramson
(1964) and by van Alphen & Smits (2004). Furthermore, it shows the average VOTs calculated²²
from the results of the aforementioned studies.
Table 4 Mean VOTs (in ms) for Dutch prevoiced stops measured in previous studies.
* These results represent the averages measured in sentence context.
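The averaging for Table 4 can be sketched in Python. The snippet below uses the Dutch means quoted in the text and leaves the sentence-context values out of the average, as stated for the table. The record layout, the function name and the unweighted mean are assumptions for illustration, so the result need not coincide with the tabulated averages.

```python
from statistics import mean

# Mean VOTs (ms) for Dutch /b/ and /d/ as quoted in the text.
DUTCH_PREVOICED = [
    {"study": "Lisker & Abramson (1964)", "context": "isolated", "b": -85.0, "d": -80.0},
    {"study": "Lisker & Abramson (1964)", "context": "sentence", "b": -41.0, "d": -51.0},
    {"study": "van Alphen & Smits (2004)", "context": "isolated", "b": -82.80, "d": -71.23},
]


def average_vot(records, plosive, context="isolated"):
    """Unweighted mean VOT over the records produced in the given context."""
    values = [r[plosive] for r in records if r["context"] == context]
    return mean(values)


avg_b = average_vot(DUTCH_PREVOICED, "b")
avg_d = average_vot(DUTCH_PREVOICED, "d")
print(f"/b/ {avg_b:.1f} ms, /d/ {avg_d:.1f} ms")
```

Filtering on the context field keeps the sentence-context measurements available for inspection without letting them affect the average.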
Simon (2010) tested ten native speakers of English on the production of prevoicing. She
found that on average they produced VOTs of -93 ms for /b/ and of -91 ms for /d/. She
furthermore carried out a test with ten native speakers of Dutch, who produced
English words. The averages reported for this group are -113 ms for /b/ and -105 ms for /d/.
22 The table also shows the VOTs found in sentence context. These were, however, not included in the calculation.
3. THE EFFECT OF TRAINING
Over the years, many research papers have been devoted to the topic of pronunciation training
for L2 learners (or learners of a foreign language), and to the question of which method
of teaching pronunciation is best, i.e. which method(s) yield(s) the most improvement. This chapter
will provide a brief overview of the methods which have been applied in search of the best
approach to pronunciation training. This search ranges from the discussion of whether perception
or production training is the technique which gives rise to the most positive result, to the use of
real-time spectrograms which give the learner instant clues on what can and should be
improved.
3.1 PERCEPTION VS. PRODUCTION
It is known that learning both perception and production of a non-native language is difficult,
and is dependent on several factors (e.g. Flege, 1995; Guion, Flege, Akahane-Yamada & Pruitt,
2000; mentioned in Hazan, Sennema, Iba & Faulkner, 2005). Flege (1995; quoted
in Hazan et al., 2005), for example, has pointed out that interference from the native language
into the target language, i.e. the non-native language, can be detected. Interference is most
frequent when the sounds are not present in the learner’s native language or when these sounds
have different phonological realizations in the two languages.
Various researchers have found that perception training affects both perception and
production (e.g. Bradlow, Pisoni, Akahane-Yamada & Tohkura, 1997; Flege, 1989; see Section
3.1.1). Others were keen to find out whether production training also affected both production
and perception (e.g. Hattori & Iverson, 2008; Hattori, 2009; Mildner & Tomić, 2007; see Section
3.1.2). The following sections not only provide information on studies which have explored the
aforementioned phenomena, but they also discuss what effect audiovisual training (see Section
3.1.3) has on learners’ pronunciation.
3.1.1 PERCEPTION
In his study on Chinese students’ perception of the word-final English /t/-/d/ contrast, Flege
(1989) was able to ascertain that the native language has a significant effect on L2 performance.
He found that those students whose native language does not allow for word-final obstruents
benefit less from perceptual training than those whose L1 involves obstruents in word-final
position. In other words, Flege (1989) argues that training can have a positive effect on L2
perception but only if the feature which learners are being trained on is already present in the
L1.
According to Flege (1989), the success of training also depends on the auditory nature of
the feature which is being learned, i.e. contrasts which are not present in the L1 but which are
easily distinguishable are acquired more easily. For example, the contrast between Zulu clicks or
the contrast between Hindi voiceless aspirated and breathy voiced dental stops hardly requires
training because these contrasts are auditorily easy to detect. A feature which is not so easily
recognizable will be harder to learn. Furthermore, Flege (1989) partially attributes the
mispronunciation of certain features to learners’ “inability to perceive L2 phones or phonetic
contrasts in a nativelike manner” (Flege, 1989, p. 1684). Flege (1989) thus stresses the
importance of auditory training rather than production training. Moreover, he stresses the
importance of perception as a positive influence on later production.
In accordance with Flege (1989), Bradlow et al. (1997) also found that teaching perceptual
skills to learners improves both perception and production. Their Japanese informants were
trained on perceiving the difference between English /r/ and /l/. The training not only
improved their perception but also their production of the specific English phonemes /r/ and
/l/.
3.1.2 PRODUCTION
Various researchers have pointed out the beneficial effects of production training on learners’
pronunciation. For example, training sessions – specifically aimed at discrimination and
practice²³ – have been reported to improve learners’ pronunciation (e.g. Gimson, 1980; quoted in
Kendrick, 1997).
Kendrick (1997) argues that learners need to practice talking in order to improve
pronunciation. She tested students on several tasks and found that all of these exercises resulted
in a better pronunciation²⁴, with the greatest improvement to be noticed in segmental features.
Mildner & Tomić (2007) explored whether speech training (in combination with regular
language classes) helped to improve American English and Spanish native speakers’
pronunciation of Croatian vowels. Acoustic analyses and evaluation processes performed by
native speakers of Croatian led Mildner & Tomić (2007) to the conclusion that students benefit
enormously from speech training, more specifically production training. In other words, they
found that the individual training sessions all students received yielded very good results with
regard to the quality of pronunciation of the foreign language, i.e. Croatian. They also showed
that the native speakers of American English studying Croatian as a foreign language improved
23 Practice designed to work on those specific features which the learner struggles with.
24 The students involved in Kendrick’s study all stated that speaking English as much as possible is indispensable for acquiring the correct pronunciation.
their pronunciation more than the Spanish students. Mildner & Tomić (2007) argue that this
difference arises because the vowel system of Spanish is more similar to that of Croatian, i.e. a
5-vowel system (i, e, a, o, u), than the American English vowel system is (the latter being
more elaborate). They claim that, as a result of the similarity between the Spanish
and Croatian vowel systems, American English students will show greater improvements, since
their vowel system initially deviated more from the one they are studying, i.e. Croatian.
Furthermore, they attributed the varying degree of improvement among individual speakers to
extralinguistic factors, such as motivation and attitude.
Previous studies have shown that perception training influences both perception and
production (cf. Section 3.1.1). Hattori & Iverson (2008) examined whether or not production
training has a similar effect on both perception and production, i.e. if production training
positively influences both perception and production. They found that production training was
only effective on the level of production and not on the level of perception. Hattori (2009), who
researched the perception and production of English /r/-/l/ by adult Japanese speakers, found
results which are in line with those of Hattori & Iverson (2008). Even though the speakers’
production of /r/ and /l/ had clearly improved, they had not necessarily improved their
perception of English /r/ and /l/, i.e. the level of accuracy with which they identified English
/r/-/l/ had not improved markedly. Hence, Hattori (2009) concluded that production training
leads to improvements in pronunciation only, not in perception.
3.1.3 AUDIOVISUAL TRAINING
Hazan et al. (2005) researched the technique of audiovisual training. They conducted various
experiments which were designed to find out whether (1) visual gestures can help speakers to
learn certain phonemes, (2) audiovisual training is more effective than perceptual training and
(3) audiovisual training improves both perception and production.
Hazan et al. (2005) concluded from their research that audiovisual training can be effective,
but only if the visual gestures involved in producing a certain phoneme are easily noticeable for
the learners. However, Hazan et al. (2005) showed that learners are not influenced by visual
cues if these do not carry a phonemic contrast in their native language. In other words, speakers
make no use of the visual cues attached to a particular phoneme – even if the articulatory
gestures are clearly visible – if these gestures are not embedded in their L1 phoneme contrast.
Hazan et al.’s (2005) research showed greater improvements, in the contrast between labial and
labiodental consonants, after audiovisual training than after perceptual training. Improvements
on account of audiovisual training could be noticed not only in perception but also in
sensitivity to audiovisual cues of the contrast. Students who were trained only on perception
skills did not improve these aforementioned skills, i.e. positively identifying articulatory
gestures.
Hazan et al. (2005) performed another experiment in which they put a less visually distinct
contrast than that between labial and labiodental consonants to the test, i.e. the contrast
between English /l/ and /r/. However, this test did not provide results supporting the theory
that audiovisual training is supposedly more effective than perceptual training, since the
Japanese participants did not show differing results with regard to the different training
methods.
Still, Hazan et al. (2005) found a surprising result regarding the specific method of
audiovisual training. They noticed a considerable difference in improvement between
informants trained with synthetic faces and those who were trained with natural audiovisual
stimuli. The results were in favour of the latter technique, i.e. training with natural faces led to
greater improvements.
Another factor which Hazan et al. (2005) aimed to examine was whether audiovisual
training has an effect on pronunciation which is similar to that of perceptual training (perceptual
training improves both perception and production). In other words, they were eager to find out
if just like perception, production also benefits from audiovisual training. They discovered that
the articulatory gestures used to produce English /l/ and /r/ also influence the pronunciation,
even without the occurrence of specific pronunciation training.
3.2 REAL-TIME SPECTROGRAMS
Real-time spectrograms²⁵ have been used in several clinical studies (e.g. Chaney, 1988; Huer,
1989; Hagiwara, Fosnot & Alessi, 2002; mentioned in Hattori, 2009) “for perceptual evaluation
of patients’ speech production” (Hattori, 2009: 133). These studies provide proof for the
effectiveness of the use of spectrograms. Chaney (1988; mentioned in Hattori, 2009) used
spectrograms to analyze American children’s correctly or incorrectly produced semivowels (i.e.
/w/, /r/, /l/ and /j/). Huer (1989; quoted in Hattori, 2009) tested a 10-year-old girl, over a
period of 70 days, who substituted /w/ for /r/. By means of acoustic tracking, Huer (1989;
mentioned in Hattori, 2009) was able to determine whether or not the girl’s speech deficit had
improved. Hagiwara, Fosnot & Alessi (2002; mentioned in Hattori, 2009) made an acoustical
analysis, before and after speech therapy, of a 6-year-old who pronounced /r/ wrongly. Using
25 Real-time spectrograms are spectrograms which are displayed simultaneously with the speech production itself. This way, speakers can instantly notice whether or not they have produced a feature correctly whilst the method they used in the production is still fresh in their memories. They can furthermore compare different productions of the same feature and thus ascertain for themselves what the best method is for them to pronounce the feature in the right way.
this method, they were able to determine that, after having received therapy, the child’s
pronunciation of /r/ had improved.
Recently, real-time spectrograms have been used by a number of scholars in L2 learning
studies (e.g. Hattori & Iverson, 2008; Hattori, 2009). In their experiment, Hattori & Iverson
(2008) made use of real-time spectrograms to establish whether native speakers of Japanese
had improved their production of English /r/ and /l/ after training. They found that, after
pronunciation training, the Japanese participants pronounced English /r/ more accurately²⁶
than before. However, Hattori & Iverson (2008) were also able to conclude that the speakers did
not improve the level of accuracy of identifying English /r/ and /l/ nor did they improve their
ability to discriminate between both phonemes.
Following Hattori & Iverson (2008), Hattori (2009) made use of real-time spectrograms in
his study on the perception and production of English /r/ and /l/ by Japanese speakers. During
the sessions, real-time spectrograms were employed when the participants produced English
/r/ and /l/. The study showed that Japanese speakers had benefited from the pronunciation
training, since they produced “more identifiable English /r/ and /l/ syllables after the training”
(Hattori, 2009; p. 152). Through these positive results, Hattori (2009) was able to confirm that
the procedures employed in the training sessions, i.e. specific instructions and feedback (see
Section 3.3.1), were effective. He states furthermore that “L2 learners seem capable of learning
details of non-native segments (e.g., articulatory movements and temporal information) as long
as specialists (e.g., phoneticians, teachers) orient the L2 learner’s attention to specific aspects of
L2 production.” (Hattori, 2009: 177-178). Hattori (2009) also points out that these results could
suggest that if L2 learners were provided with explicit instruction early on in the learning
process, they might not establish incorrect phonetic categories and articulatory movements. This
way, they would be able to improve their L2 phoneme learning more quickly and more easily.
3.3 OTHER TECHNIQUES USED IN PRONUNCIATION TRAINING
There are other factors besides perception, production or audiovisual training which can be
beneficial to a learner’s pronunciation of a non-native language. Two of these are discussed in
the following sections: feedback and contrasting the foreign language to the native language.
26 The greatest improvement could be noticed in those speakers who produced poor English /r/ before training.
3.3.1 FEEDBACK
Various studies on training discussed in the previous sections (e.g. Flege, 1989; Hattori &
Iverson, 2008; Hattori, 2009; see Sections 3.1 and 3.2) have shown a positive effect of feedback
on learners’ pronunciation. The technique of giving immediate feedback to the learner has
proven its effectiveness in training sessions.
In his study on the effect of training on Chinese speakers’ perception of the word-final
English /t/-/d/ contrast, Flege (1989) found a positive effect of feedback training in only two of
his participants. He also addresses the phenomenon of generalization, i.e. generalizing specific
properties – learned in specific tokens – to other words which were not included in the training
session(s). According to Flege (1989), there is a condition which needs to be met in the feedback
training in order to reach generalization. If the feedback training has a beneficial effect on the
trainee’s “tacit knowledge” (Flege, 1989, p. 1691), the learned features are expected to have a
generalizing effect onto untrained words as well as on trained ones. If, however, only the
phonological or phonetic specification of individual words is affected due to training, no
generalization will be found. Furthermore, Flege argues that
“[t]he multiple natural token approach to speech training assumes that exposure
to the acoustic variation between tokens of a single category will induce subjects
to derive a more general representation than they would derive had they been
trained on just a single token.” (Flege, 1989: 1691)
Bradlow et al. (1997) also found that both perceptual and production knowledge was
generalized to novel words.
Hattori (2009) states that the informants who participated in his research were provided
with instant feedback on their production27. The use of this technique ameliorated the
participants’ pronunciation of English /r/ and /l/, not only in words, but also in sentences and
passages. According to Hattori (2009), the extension of knowledge from words to longer speech
utterances proves that the participants generalized /r/ and /l/ productions to continuous
speech.
27 Hattori (2009) mentions an example of what kind of feedback was given to the participants: “if he [the instructor] found that participants’ English /r/ F3 was too high (e.g., 2500 Hz), he told participants to check their tongue potion, tongue shape, and lip shape. If he found that participants were producing good English /r/, he provided positive feedback and encouraged the participants to maintain the articulation and produce the consonant.” (Hattori, 2009: 137).
3.3.2 CONTRASTING WITH NATIVE LANGUAGE
In Collins & Vandenbergen’s Modern English Pronunciation, A practical guide for speakers of
Dutch (2000; henceforth MEP), the predominant technique to teach learners a good
pronunciation of English is to provide learners with practice sentences. These practice sentences
are designed to train learners on specific features, one by one. Other exercises are also included
to instruct features of connected speech, such as intonation and stress. Since this guide, designed
by Collins & Vandenbergen, is aimed specifically at learners from the Dutch-speaking part of
Belgium, on many occasions, a direct comparison is made between the pronunciation of certain
features in English to that in Dutch. Many of those instructions point out a certain feature which
exists in one of the languages but does not in the other. Also, references are made to specific
features of English which differ only slightly from the way they are pronounced in Dutch. These
are then described using a particular word to indicate or illustrate the differences and/or
similarities between English and Dutch. For example “English /e/ (in DRESS) is a little closer
than Dutch /ε/ (in ZET)” (MEP, p. 37) or also “English /æ/ is much more open than Dutch /ε/ -
nearer in quality to a shortened version of Dutch /a:/ (in LA)” (MEP, p. 38). Targeting a specific
language and highlighting differences and similarities between the languages seems to be an apt
instruction technique for learners of a foreign language.
4. CASE STUDY
4.1 HYPOTHESES
4.1.1 GENERAL AIM OF THE CASE STUDY
The general aim of the case study was to determine in what way pronunciation training affects
the participants’ production of aspiration and prevoicing.
Previous studies (e.g. Collins & Mees, 2008; van Alphen & Smits, 2004) have shown that
both voiced and voiceless plosives are realized differently in Dutch and English. In short,
aspiration is a phonetic process which can be found in English but not in Dutch voiceless stops
and prevoicing can be found in Dutch voiced stops but not in English ones. Therefore it can be
expected that these particular features are difficult to acquire or unlearn, respectively, for
Belgian Dutch speakers of English. On the basis of this knowledge, the following hypotheses can
be proposed.
Since the participants did not get any particularly specific pronunciation training on either
of these features, they are expected to transfer the phonetic realizations of their native language,
i.e. Belgian Dutch, into the second language, i.e. English, in the pretest (see Sections 4.2.2 and
4.3). Aspiration will most likely be absent in their realization of English voiceless plosives
through influence of the Belgian Dutch voiceless unaspirated equivalents of /p, t, k/. For the
production of prevoicing, it is expected that the informants will be inclined to produce
prevoicing in their realization of the English voiced plosives /b, d/, again as a result of the input
from their native pronunciation, which is marked by prevoicing. In other words, they are
expected to do the exact opposite of what is expected in English, i.e. omit aspiration in voiceless
stops and produce prevoicing in voiced stops.
In the posttests (see Sections 4.2.2 and 4.3), i.e. after they have received one-on-one training
and feedback, the informants are expected to have improved their pronunciation of both
features, since the techniques which were used in the training session have shown beneficial
effects on learners’ pronunciation (e.g. Flege, 1989; Hattori & Iverson, 2008; Hattori, 2009). A
relatively big improvement is expected to arise in the picture-naming task which is identical to
the task in the pretest. The participants’ target-like pronunciation is believed to spike in the
words that they were trained on during the session (e.g. pig, tent, key, bus, dad).
4.1.2 SPECIFIC HYPOTHESES
A number of specific hypotheses related to the topic can also be suggested. In line with Flege’s
(1989) observation that features of pronunciation which are easy to detect auditorily are easier
to learn, it can be hypothesized that production habits for aspiration will be more easily
adapted than those for prevoicing, since the latter is less easily detected auditorily than the
former. With regard to the voiceless plosives, the way the VOTs of /p, t, k/ relate to each other
is expected to be p < t < k, because the present study was conducted with adults. Most of the
studies on VOT in voiceless plosives which were conducted with adult informants have reported
this relationship between VOT lengths (cf. Section 2.2.3). Concerning the voiced plosives, the
following specific hypothesis can be put forward. Since the pool of informants in the present
study includes men as well as women, the difference in the frequency of prevoicing according to gender (van
Alphen & Smits, 2004; see Section 2.3.3.2) could possibly also be detected. Moreover, the results
of the current study might show a difference in VOT duration which would contradict van
Alphen & Smits’ (2004; see Section 2.3.3.1) findings that no significant difference according to
PoA of voiced plosives can be found.
4.2 METHOD
4.2.1 PARTICIPANTS
All thirteen participants were selected on the basis of their age, i.e. between 21 and 31 (mean
age: 24,4), and their language background, i.e. they were all native speakers of Belgian Dutch,
who had knowledge of English but had not necessarily received specific training in
pronunciation. The informants all took part in the experiment on a voluntary basis. Before the
actual experiment started, they were asked to fill out a questionnaire28 (see Appendix A) which
contained queries on their language-background, their knowledge of English and a self-
evaluation of their pronunciation of English, alongside two meta-linguistic questions on
aspiration and prevoicing.
Of the thirteen volunteers, eight were female and five were male. All participants – and their
parents – were native speakers of Belgian Dutch. While all of them knew other languages (e.g.
French, German, Italian, Spanish and Japanese), most of them claimed that the only language
they used on a daily basis was Dutch. Four of the thirteen participants stated that they used
English on a daily basis. Only three informants claimed never to have had any contact with
28 Each participant was told that they could fill out the questionnaire in Dutch, if they felt they would otherwise be restricted in giving a satisfactory answer. However, seven of them chose to fill it out in English, which indicates that they felt confident enough to use English to express themselves.
native speakers of English. The other ten had come into contact with native speakers of English –
in both spoken and written form – through friends, relatives, customers, business associates or
colleagues. The same ten informants had also spent some time in English speaking countries.
Most of them had spent time in these countries on holiday, but others stated that they had gone there
for work or for an internship associated with a student organisation. On average, they had spent
five days to a week in an English speaking country. One person had been there for only one day,
while another had spent six weeks abroad.
All participants had taken an English course at secondary school, ranging from four to seven
years. While seven of them claimed to have received training in pronunciation, they could not
remember any specific features which were trained on during class. They did however
remember the methods which were used in class to improve their pronunciation, i.e. reading
aloud and repetition tasks. At college or university level, five informants had taken an English
class for an average of 2,2 years. Two informants claimed to have received specific instructions
on pronunciation, covering all possible features of English. One participant reported having received
training in a so-called language lab. In other words, he was asked to read out loud into a
microphone, his readings were recorded and afterwards, the instructor listened to the sound
file. The instructor then pointed out the mistakes which were made. One of the participants who
took English courses at university or college – but did not receive any pronunciation training –
stated that she wrote her dissertation in English.
All participants except one reported having acquired some of their knowledge of
English from the media, i.e. television, radio, newspapers, magazines, novels, video-games, music
or the internet.
Participants 1, 7 and 12 rated their pronunciation as good to very good and informants 3, 9,
10 and 11 marked their pronunciation of English as okay. Participants 2, 4, 5, 8 and 13, however,
labelled their pronunciation as bad, and informant 6 even indicated her pronunciation to be
horrible. The informants explained why they had chosen to self-evaluate their English in that
way. Eight of them stated that their pronunciation clearly gives away that they are not native
speakers, and thus felt their pronunciation to be okay, good or very good. Participant 7 stated
that she is often asked whether she is British or whether she has lived in the UK, which is why
she put down very good. Participant 4 stated that she was used to reading in English but not
speaking and so she felt that her speaking-skills were not satisfactory, so she marked her own
pronunciation as bad. Informant 6 said that she rated her pronunciation as horrible because it
was never trained on in school, so she did not know which features she needed to improve29.
Even though many of them felt as if they had an okay up to a very good pronunciation, all
29 This last claim was also expressed by participant 5.
participants expressed that they would like to improve their pronunciation of English. Reasons
for wanting to improve included: the importance of English as a world language, improving
comprehensibility and achieving a native-like pronunciation. The specific features which the
informants mentioned wanting to improve were the vowels, the
difference between the onset phonemes in that and think, and the difference between final /d/ and
/t/ (as in, for example, bed and bet). One informant (P7) wanted to know the difference
between British and American English pronunciation of certain words and provided the example
of <schedule>.
Informants took part in the experiment voluntarily and were unaware of the purpose before
taking part in the pretest. The questionnaire revealed that only three of the thirteen informants
knew what the process of aspiration entails (and gave a correct example) and just one claimed to
have heard of prevoicing but failed to give an example.
4.2.2 STIMULI AND DESIGN
4.2.2.1 PRE- AND POSTTEST: PICTURE-NAMING TASK
Each participant was asked to perform a picture-naming task30 which consisted of 75 pictures31.
Of these 75 images, 50 were target-tokens and the remaining 25 were fillers. The fillers were
added in order to draw the participants’ attention away from the purpose of the experiment.
Some examples of distractors are flower, lemon and heart. The 50 target-tokens consisted of 10
words with each of the five English plosives (i.e. /p, t, k, b, d/) in the onset, e.g. pig, tent, key, ball, dad.
In the present study, the target stimuli only included plosives in word-initial stressed position.
Plosives in other positions were left out so as not to make the training session too intricate and
not to overload the participants with too much information in a short time-span. Only
monosyllables were chosen as target stimuli because these are known to render the longest
VOTs (e.g. van Alphen & Smits, 2004; Spencer32, 1996). Moreover, the tokens chosen to elicit
prevoicing were all words in which the initial voiced plosive was followed by a vowel. Van
Alphen & Smits (2004) showed that initial voiced plosives followed by a vowel are more often
produced with prevoicing than voiced plosives followed by a consonant. Furthermore, the
duration of prevoicing was also found to be longer in voiced plosives with a vowel as the second
30 A picture-naming task was chosen because it was believed that this way, informants would less easily become aware of which features they were being tested on, and because they are then less influenced in their production. It was thought that by naming pictures, participants would produce the words in a more spontaneous way than they would when reading the orthographic forms of the words.
31 See Appendix B1 for a complete list of the target stimuli and fillers.
32 Spencer (1996) states that aspiration occurs after initial voiceless stops /p, t, k/, at the beginning of any stressed syllable, and furthermore distinguishes between monosyllabic and polysyllabic words.
segment (cf. Section 2.3.3.3). These are the reasons why the present study contained only target-
tokens of words with initial voiced plosives followed by a vowel.
All of the words, which were believed to be fairly easily recognizable, were retrieved from
memory or through the use of dictionaries. The images accompanying them were looked up with
Google Image Search. The pictures were then set into a PowerPoint presentation33. Some of the
images were complemented by short hints or an explanatory sentence34, in order to heighten the
likelihood of the target words being produced, e.g. This is a ... (cup) of coffee. The images were
projected onto a computer-screen randomly and individually for a period of 7 seconds (i.e. the
participants did not need to press any keys; the slides were programmed to proceed
automatically), which was believed to be long enough to identify them. The participants were
however told that if they had not had enough time to name the picture they could click back and
take their time to name it. Furthermore, they were informed that if they thought they had not
produced the right word, they could correct themselves and name the picture again. This did not
affect the results in any way since naturally, the incorrect production of the target-stimuli was
not taken into account in the analysis.
The same picture-naming task was used in the posttest. It took the informants at most 10
minutes to perform the picture-naming task.
4.2.2.2 TRAINING SESSION
After having performed the pretest, the participants were given an individual training session of
approximately 25 minutes by a phonetically-trained native speaker of Dutch with a high
proficiency in English. This session was conducted in Dutch rather than in English in order to
ensure that no information was lost on the informants. This way they would also not feel
restricted if they wanted to ask any questions. The training session consisted of two main parts:
a theoretical explanation of aspiration and prevoicing, and a practical production task. During
the session, two laptops were used: one that showed the PowerPoint presentation35 and
one which was used during the exercises.
First, the participants were provided with some theoretical background information on the
process of aspiration (e.g. what is aspiration, what is positive VOT, etc.). This theoretical part only
33 For a complete rendering of the picture-naming task the way it was presented to the volunteers, see Appendix B2.
34 After having analyzed the sound files of three participants (P1, P2 and P4), it was noticed that some of the tokens were named differently than was intended. Moreover, participants themselves expressed doubt about the correct naming of some of the pictures. It was then decided to change some pictures and/or to add a hint to make them more obvious to the rest of the participants so they would produce the target-token as they were meant to be produced (e.g. the token dive was named as swim, so the picture was changed and the comment He likes to scuba … (dive) was added).
35 See Appendix C1 for the entire PowerPoint presentation as it was used during the training session.
included information on aspiration of voiceless plosives in word-initial stressed position. No
information on s + stop clusters nor on unstressed syllables (cf. Section 2.2.1) was provided, in
order to keep it simple enough to understand and remember after only a single short session
and because the stimuli did not contain any of these structures. With regard to the process of
prevoicing, likewise only basic information was provided on the lack of prevoicing in voiced plosives
in English (e.g. what is prevoicing, what is negative VOT, etc.) so as not to overburden the
informants with too much detailed information. The differences between English and Dutch
production of both of these phenomena were explained and demonstrated by means of listening
fragments36 and of stills of some waveforms and spectrograms (cf. Fig. 1, Fig. 2, Fig. 6 and Fig. 7).
Flege (1989) partially attributes the mispronunciation of certain features to the “inability to
perceive L2 phones or phonetic contrasts in a nativelike manner” (Flege, 1989, p. 1684). Hence,
the participants in the current study could benefit from listening to a native speaker producing
aspiration and prevoicing in a target-like manner. After having noticed the contrast with their native
language, i.e. Belgian Dutch, they might be taught the correct pronunciation more easily.
Aside from the purely theoretical background, the informants were given the opportunity to test
how to produce aspiration and how not to produce prevoicing.
Secondly, the final part of the training session contained a few exercises on words which
had occurred in the pretest (e.g. pen, tea, cat, bus, dog) and which would also be tested again in
the posttest37. The participants could record themselves and listen back to the recording, and they could watch
real-time spectrograms (SFS/RTGram, Version 1.3) of their own voices, on which they received
feedback. They were also given a handout38 which contained a summary of the useful tips
that were explained to them during the theoretical part of the training. This way, if they had not
pronounced aspiration or if they had produced prevoicing, they could refresh their memories on
the articulatory gestures39 involved in both processes.
4.2.2.3 EXPANSION TO THE POSTTEST: WORDS
The posttest contained not only an identical version of the picture-naming pretest, but also a
short reading test40. This was given to the participants in printed form. Using orthographic forms of
words was expected to have an influence on the participants’ pronunciation. Since they could
36 These soundfiles can also be found on the cd-rom included in Appendix D.
37 This was done in order to see if the words that had specifically been trained on showed a larger improvement than those which were not, or if the participants had generalized the acquired information to so-called ‘new words’ which were not trained on.
38 For an example of the handout, see Appendix C2.
39 These articulatory gestures are different from those in their L1.
40 See Appendix B3 for a complete list of the target-stimuli and the fillers.
already see the phoneme with which the words began, they could consider more quickly how to
produce it correctly. The word-reading task took on average 2 minutes to complete.
The list41 contained 30 words, of which an equal number were distractors (e.g. film, chair,
map) and target stimuli (e.g. pay, text, coast, bed, dust). Each plosive occurred three times.
Furthermore, five words – one for each plosive – that had been trained on, i.e. pig, tent, cat, bus,
dad, were also included. This could reveal if these showed more improvements because they had
been trained on. The informants were asked to read the words out loud at their own pace.
4.2.3 PROCEDURE
Before taking part in the experiment, the informants were asked to complete a questionnaire
which gave insights into their respective language backgrounds, knowledge of English, etc. All
thirteen participants were then asked to perform the pretest, i.e. the picture-naming task. They
were seated in front of a laptop, in a quiet room, with the recording device lying on the table
between them and the computer. The recordings were all made on a Philips Digital Voicetracer
7675.
Before testing began, participants were given oral instructions in Dutch on what was
required of them. The same information was repeated in English in written form on the first
slide of the PowerPoint presentation. The 15 seconds during which these instructions
were shown on screen gave the participants the opportunity to settle themselves and to
prepare to switch to English. The instructor left the room for the duration of the test so they
would feel more at ease and so they would not look for confirmation from the instructor which
would lead to hesitation in pronunciation. They were asked to name out loud the pictures which
appeared on their screen one by one42. The pretest took about 15 minutes per person, including
the instructions and any questions from the volunteers.
Immediately after the first task, the informants were given the individual training session.
They were told that they could interrupt and ask questions at any time. During this session, a
second laptop was used, on which a programme was installed that allowed
the informants not only to record themselves and re-listen to their own pronunciation but also
to watch real-time spectrograms (SFS/RTGram, Version 1.3). The instructor was seated next to
them and provided them with instant feedback on the words they trained on.
The posttest was conducted the day after the pretest and the training session. For the
picture-naming task, which was identical to the one in the pretest, the same instructions were
41 See Appendix B4 for the complete list as it was given to the informants.
42 After it was noticed that a few of the participants produced an article before the target-word (e.g. a bear), the request not to do this was added in the instructions to the remaining participants of the pretest, and to all participants before the posttest.
repeated. Again, the volunteers were seated at a desk in front of a computer screen, with the
recorder placed between them and the laptop. The test on written words was held afterwards.
The participants were told to read the words out loud at their own pace. The recording device
was placed on the desk next to the paper version of the test. After that, the instructor left the
room for the duration of the tests.
4.2.4 ANALYSIS
Once the sound files were acquired, they were saved on a computer43. Using the software
included with the Philips Digital Voicetracer 7675, the files were converted from .zva to .wav format for
analysis. The recordings were analysed in Praat (Boersma & Weenink, 2011). In Praat, the VOTs
were measured (in ms) for both voiced and voiceless plosives. This way it could be determined
whether or not the participants had produced aspiration or prevoicing. To determine production
of prevoicing, the researcher relied on van Alphen & Smits (2004) who stated that:
“The beginning of the prevoicing was defined as the point in time at which evidence
of vocal fold vibration could be detected. Any clearly visible detectable period, no
matter how small in amplitude, was accepted as part of voicing. The end of the
prevoicing was defined as the point in time at which the noise of the release burst
started, visible as a sudden peak in the waveform.”
(van Alphen & Smits, 2004: 461-462)
Aspiration was measured from the onset of the burst up till the onset of voicing for the
following vowel, i.e. up till the moment the waveform became periodic.
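The two measurement conventions described above can be summarised in a short sketch. It is only an illustration and not the procedure that was carried out in Praat: the class name, the field names and the time points are hypothetical, and it is assumed that the landmarks have already been annotated by hand in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopLandmarks:
    """Hand-annotated time points (in seconds) for one word-initial plosive.

    All names are hypothetical; they only mirror the landmarks described in the text.
    """
    burst_onset: float                         # start of the release burst
    vowel_voicing_onset: float                 # point where the waveform becomes periodic
    prevoicing_onset: Optional[float] = None   # first vocal fold vibration before the burst


def vot_ms(landmarks: StopLandmarks) -> float:
    """Return VOT in ms: negative for prevoiced stops, positive voicing lag otherwise."""
    if landmarks.prevoicing_onset is not None:
        # Prevoicing: from the first evidence of vocal fold vibration up to the burst.
        return (landmarks.prevoicing_onset - landmarks.burst_onset) * 1000.0
    # Aspiration: from the onset of the burst up to the onset of voicing for the vowel.
    return (landmarks.vowel_voicing_onset - landmarks.burst_onset) * 1000.0


# Hypothetical time points, for illustration only.
aspirated = StopLandmarks(burst_onset=0.100, vowel_voicing_onset=0.165)
prevoiced = StopLandmarks(burst_onset=0.300, vowel_voicing_onset=0.310,
                          prevoicing_onset=0.220)

print(round(vot_ms(aspirated), 1))
print(round(vot_ms(prevoiced), 1))
```

In this sketch, a stop with prevoicing always yields a negative value and a stop without prevoicing a positive one, which corresponds to the distinction between negative and positive VOT made in Section 2.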
4.3 RESULTS AND DISCUSSION
4.3.1 ANALYSIS
In English, words with the voiceless stops /p, t, k/ in the onset are produced with aspiration.
This leads to a longer VOT than in Dutch, since the process of aspiration is absent in the latter. If
participants produce the words in the pretest with VOTs typical for Dutch, it can be concluded
that their native language, i.e. Belgian Dutch, interfered in the pronunciation of English tokens.
Words with voiced stops /b, d/ in the onset are – in Dutch – typically produced with prevoicing,
which renders negative VOT. In English, this process is not present. In case the informants
43 A copy of the acquired recordings and the analysis of the VOTs done in Praat is provided in Appendix D.
uttered the words with prevoicing in the pretest, this can be attributed to the influence of
Belgian Dutch.
VOTs for both voiced and voiceless stops were measured again after training. If these
results turned out to be better, i.e. longer positive VOTs for voiceless and shorter negative VOTs
for voiced plosives, it can be concluded that training learners has a positive effect on their
pronunciation of English.
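The criterion for a "better" result can be made explicit in a minimal sketch. The function names and the example values are hypothetical. It is assumed that VOT is expressed in ms, with negative values standing for prevoicing, and that the duration of prevoicing is taken to be zero whenever the VOT is not negative.

```python
def prevoicing_duration_ms(vot_ms: float) -> float:
    """Length of prevoicing in ms; zero when the VOT is not negative."""
    return max(0.0, -vot_ms)


def moved_towards_target(pre_vot_ms: float, post_vot_ms: float, voiced: bool) -> bool:
    """Apply the criterion from the text to one token measured before and after training."""
    if voiced:
        # Voiced plosives: less prevoicing (a shorter negative VOT) counts as improvement.
        return prevoicing_duration_ms(post_vot_ms) < prevoicing_duration_ms(pre_vot_ms)
    # Voiceless plosives: a longer positive VOT counts as improvement.
    return post_vot_ms > pre_vot_ms


# Hypothetical pre- and posttest values, for illustration only.
print(moved_towards_target(25.0, 60.0, voiced=False))    # voiceless stop, longer VOT
print(moved_towards_target(-90.0, -30.0, voiced=True))   # voiced stop, less prevoicing
print(moved_towards_target(-90.0, -110.0, voiced=True))  # voiced stop, more prevoicing
```

The sketch compares single tokens. The comparison reported in the following sections is based on averages, but the direction of the criterion is the same.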
The VOTs for all target tokens, for each of the participants and for each of the voiced and
voiceless stops separately, are presented in Appendices E and F. Averages of these results were
calculated per informant and per target-token, for each of the plosives separately. Results for the pretest
(see Appendix E) and the posttest (see Appendix F) were first kept separate, and were then put together
(Appendix G) in the order of the picture-naming tasks as presented to the informants. The
results were then processed in Excel, and put into tables and graphs, in order to highlight any
possible progress participants could have made.
First, an analysis was made of the results of the pretest (cf. Section 4.3.2; Appendix E). The
results of this test were helpful in determining the beginning level of VOTs. This way, the results
of the pre- and post-training tests could be compared to each other in order to establish whether
the informants had improved their pronunciation of English. The results for voiced and voiceless
stops were analyzed separately.
Secondly, the results of the posttest (cf. Section 4.3.3; Appendix F) for both voiced and
voiceless stops were analyzed individually. The results of the picture-naming posttest were then
compared to the results from the pretest, to determine whether training had had an effect on the
participants’ pronunciation habits, and whether the words that were specifically trained on
during the pronunciation session showed greater improvement than so-called new words, i.e.
words which did not receive attention during training. The results of the word-reading posttest
were also analyzed. Those words which had already appeared in both picture-naming tasks
were compared to the VOTs from the word-reading task.
4.3.2 PRETEST
As described in Section 4.2.2, all participants performed a picture-naming task which included
words with both voiced and voiceless stops in the onset. The pretest was designed to establish a
level to compare the posttest results to. This way a possible evolution, as a result of the training
session, could become apparent. In the following sections, voiceless and voiced stops will be
discussed separately.
4.3.2.1 ASPIRATION
During the analysis of these results, two trends surfaced: (1) the effect of PoA (discussed in
Section 2.2.3) on the VOTs of the target plosives, i.e. /p, t, k/, manifested itself and (2) a large
number of the VOTs produced in the pretest were already target-like44 for each of the plosives.
As described in the literature review (cf. Section 2.2.3), PoA has an effect on the VOT of the
voiceless stops /p, t, k/. In Graph 1⁴⁵, it can be noted that eight of the thirteen participants
demonstrated this effect. Their average VOTs show an increase in length from /p/ over /t/ to
/k/, i.e. VOT of p < t < k. The mean VOT value which was calculated for all thirteen informants
shows that the mean VOT for /p/ (48,2 ms) is 15,7 ms shorter than for /t/ (63,9 ms). The mean
VOT for /k/ (67,6 ms), in turn, is 3,7 ms longer than for /t/. These findings confirm that PoA, i.e.
bilabial, alveolar or velar, affects the duration of VOT: in the group means, the VOTs of the
voiceless plosives relate to each other as p < t < k.
The results presented in Graph 1 also show that – in this pretest – many participants
already produced a lot of target-like VOTs. Each of the plosives will be discussed separately.
Even though VOTs ranged widely from participant to participant, most informants produced
a relatively large amount of VOTs – for /p/ and /t/ as well as for /k/ – which are target-like.
Participants 1 and 11 (henceforth, P#) showed the highest VOT values for all of the plosives, i.e.
all above 70,7 ms. The minimum average among all informants for /p/ was 18,1 ms, with an
individual minimum of 6,8 ms in the word <pig>, pronounced by P6. The maximum average found
for /p/ was 88,6 ms; the individual maximum of 122,2 ms was found in the word <pan>,
uttered by P11. Of the 130 tokens with /p/ in the onset, 22 were named differently than was
intended or were not uttered at all.
Of the remaining 108 target-tokens, 46 were pronounced with a target-like VOT, i.e. of
54,15 ms or more. This means that in the pretest, 42,6% of the tokens with the bilabial
voiceless plosive in the onset were already produced with aspiration.
For /t/, the means among all thirteen participants ranged from 34,4 ms to 94,9 ms.
Individual mean VOTs were measured between 15,9 ms (<tape> produced by P8) and 152,7 ms
(<tea> uttered by P4). In the case of /t/, of the 130 tokens, 26 tokens were not named as was
intended or were not produced. Of the remaining 104 tokens, 53 had been
produced with a target-like VOT, i.e. of 66,49 ms or more. This leads to the conclusion that for
/t/ a striking 51,0% of the tokens in the pretest were pronounced with aspiration.
44 The averages – which were used to compare to the ones obtained in this experiment – are the ones calculated from means reported in previous studies (Table 1). These are considered as the target-like VOTs. All the results from this experiment were labeled as target-like if the VOT for the voiceless stops was anything from the number mentioned in Table 1 upwards.
45 For the exact numbers, see Appendix E1.
[Graph 1: y-axis Positive VOT (ms); x-axis Informants; series /p/, /t/, /k/]
Among all informants, average VOTs for /k/ were recorded between a minimum of 32,1 ms
and a maximum of 35,1 ms. Individual averages ranged from as little as 2,4 ms in the word <key>
(produced by P6) to 125,4 ms in the word <cup> (uttered by P13). Ten of the 130 tokens with
/k/ in the onset were named incorrectly, i.e. not in the way intended by the researcher, or
were not produced. The remaining 120 tokens contained 58 tokens produced with a target-like
VOT, i.e. of 70,02 ms or more. In other words, 48,3% of all tokens with the velar plosive /k/ in the
onset were pronounced with aspiration.
Taking all tokens and all informants into consideration, the average values for each of the
plosives are 48,2 ms for /p/, 63,9 ms for /t/ and 67,6 ms for /k/. These means come very close to
the means found in previous studies (cf. Table 1): the mean for /p/ is only 5,95 ms shorter, that
for /t/ only 2,59 ms shorter and that for /k/ only 2,42 ms shorter. Of all three voiceless stops
together, the informants produced an impressive 47,3% with aspiration.
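The percentages reported for the pretest follow directly from the token counts and from the target thresholds quoted above. As a minimal sketch (the dictionary layout and the function names `is_target_like` and `share` are illustrative assumptions and not part of the original analysis; decimal points replace the decimal commas used in the text), the classification rule and the resulting shares can be reproduced as follows:

```python
# Target-like thresholds (ms) as quoted in the text (derived from Table 1).
THRESHOLDS_MS = {"p": 54.15, "t": 66.49, "k": 70.02}

# Pretest counts quoted in the text: (correctly named tokens, target-like tokens).
PRETEST_COUNTS = {"p": (108, 46), "t": (104, 53), "k": (120, 58)}


def is_target_like(plosive: str, vot_ms: float) -> bool:
    """A voiceless stop counts as aspirated if its VOT reaches the threshold."""
    return vot_ms >= THRESHOLDS_MS[plosive]


def share(target_like: int, named: int) -> float:
    """Percentage of correctly named tokens that were target-like."""
    return 100.0 * target_like / named


# Share per plosive, rounded to one decimal as in the text.
per_plosive = {p: round(share(hit, n), 1) for p, (n, hit) in PRETEST_COUNTS.items()}

# Pooled share over all three voiceless plosives.
total_named = sum(n for n, _ in PRETEST_COUNTS.values())
total_hits = sum(hit for _, hit in PRETEST_COUNTS.values())
overall = round(share(total_hits, total_named), 1)

print(per_plosive)
print(overall)
```

Pooling the counts, rather than averaging the three percentages, weights each plosive by the number of tokens that were actually named as intended.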
Graph 1 Average VOT results for voiceless plosives in the pretest
Table 5 provides an overview of the number of times (out of a possible 10) each participant
uttered each of the plosives with a target-like VOT. The Table illustrates that P1 uttered tokens
with target-like VOT the most often of all informants, i.e. 93%. P6 and P12 performed the worst,
with only 3% of tokens produced with a target-like VOT. What can also be noticed in the Table is
that the tokens with /k/ in the onset were most often produced with aspiration. On average
among all participants, /k/ was pronounced target-like 4,5 times, while target-like VOTs in /p/
and in /t/ were produced 3,5 and 4,0 times, respectively.
Table 5 Number of times target-like VOT per informant per plosive
All in all, the results of the pretest lead to the conclusion that the informants performed the
test differently than was expected. It was foreseen that more influence from Belgian Dutch would
be noticeable, i.e. that VOTs would be shorter since no aspiration is produced in Dutch. Not only
did the participants produce longer VOTs than was anticipated, they also uttered the voiceless
plosives with target-like VOTs more often than was expected.
4.3.2.2 PREVOICING
The analysis of the results obtained in the pretest of the voiced plosives /b/ and /d/ revealed
five things: (1) the frequency of target-like VOT is not significantly influenced by PoA, (2) the
effect of gender on the production of prevoicing did not manifest itself, (3) the height of the
vowel did not seem to influence the production of prevoicing, (4) prevoicing is a process which
some speakers have the tendency to produce (more consistently) while others do not and (5)
those speakers who did prevoice did it less heavily than was expected.
What the results of the pretest definitely demonstrate is that prevoicing is a process which
some speakers have the tendency to produce while others do not, or not as frequently. Hence,
the two extremes can be found, i.e. from 0 ms to over -200 ms. Each of the plosives is discussed
separately.
The average minimum negative VOT for /b/ was measured at -38,3 ms. The maximum average
that was recorded is -143,5 ms. An individual maximum can be found in P13 with a VOT of
-228,7 ms in the word <bus>. The minimum VOT – aside from 0 ms – found in individual results
is -20,6 ms by P3. Of the 130 tokens with the bilabial voiced plosive in the onset, 6 tokens were
not named or were named wrongly. Of the 124 tokens which were named as intended,
31 tokens, that is 25,0%, had a VOT of 0 ms, i.e. were target-like.
For /d/, mean negative VOTs ranged from -114,8 ms to -24,6 ms. The longest VOT was
measured in the word <dead>, uttered by P12 (-192,2 ms). P9 produced the shortest VOT,
namely -31,4 ms in <dance>. Ten of the total of 130 tokens were not named as
intended or were not uttered at all. The remaining 120 tokens gave 37 target-like VOTs. This
means that 30,8% of the tokens with /d/ in the onset had a VOT of 0 ms.
Taking both voiced plosives into account, 27,9% of all tokens were produced without
prevoicing. This means that the informants prevoice less heavily than was expected. Apart
from the 27,9% of tokens produced with a VOT of 0 ms, the mean VOTs reported in Graph 2⁴⁶ are
in line with (or are a little shorter than) the results obtained by Simon (2010), who also tested
prevoicing in English spoken by native speakers of Belgian Dutch (cf. Section 2.3.4).
Even though no clear effect of PoA could be observed, the
difference in percentages between /b/ and /d/, 25,0% and 30,8% respectively, does show a
slight tendency of the bilabial plosive to be produced more often with prevoicing than its
alveolar counterpart.
46 For the exact numbers see Appendix E2.
Graph 2 Average VOT results for voiced plosives in the pretest
An overview of the number of times (out of a possible 10) each participant uttered each of
the plosives with a target-like VOT is presented in Table 6. The Table shows that P1 consistently
produced prevoicing in the pretest, which is equivalent to 0% target-like VOTs of 0 ms. The best
results are found in P7 and P11, who uttered 55,0% of all tokens target-like. Only a slight
difference can be seen between both voiced plosives: alveolar /d/ was produced with a VOT of
0 ms 2,8 times on average, a mere 0,4 times more often than bilabial /b/ (2,4 times). Both the
bilabial voiced plosive /b/ and the alveolar voiced plosive /d/ were thus produced with a VOT of
0 ms only roughly 2,5 times out of 10, i.e. a fourth of the time.
The anticipated effect of PoA on VOT in voiced plosives did not manifest itself dramatically.
The informants omitted prevoicing only 6 times more often in /d/ compared to /b/ (31 times 0
ms for /b/ and 37 times 0 ms for /d/). The effect of gender is also not noticeably present, nor
is the supposed influence of the height of the vowel (cf. note 21).
[Graph 2: y-axis Negative VOT (ms); x-axis Informants; series /b/, /d/]
Table 6 Number of times target-like VOT per informant per plosive
To sum up, the participants produced a relatively large number of tokens with a target-like
VOT (27,9%). The average VOTs of those tokens which were pronounced with prevoicing were a
little shorter than anticipated when compared with results obtained in previous studies.
This led to the conclusion that the informants not only prevoiced less often than expected but
also less heavily than was foreseen.
4.3.3 POSTTEST
The posttest was conducted after each participant had taken part in a one-on-one training
session on the processes of aspiration and prevoicing. The test consisted of two parts, namely a
picture-naming task (identical to the one performed in the pretest) and a word-reading task.
Both tasks will be discussed separately.
4.3.3.1 PICTURE-NAMING TASK
4.3.3.1.1 ASPIRATION
Three things became apparent through the analysis of the results of the posttest picture-naming
task: (1) the effect of PoA shows itself in the mean results, (2) the informants produced many
tokens with aspiration, i.e. VOTs were target-like on many occasions and (3) the results obtained
for those tokens which were trained on during the session did not differ significantly from new
words.
First, the mean VOTs presented in Graph 3⁴⁷ show gradually increasing VOTs starting from
/p/ going to /t/ and ending at /k/, for seven of the thirteen participants. One of the remaining
six who does not show this evolution is P4, who diverges the most obviously. In the alveolar
plosive /t/, P4 consistently produced VOTs of over 120 ms. As a consequence, the mean
calculated for all informants together shows p < k < t. If P4 is taken out of the equation, the
average VOT for /t/ is 83,0 ms (instead of 87,6 ms). This is 18,9 ms longer than the average
reported for /p/ (64,1 ms) and 1,9 ms shorter than the mean VOT for /k/ (84,9 ms), i.e. VOT p <
t < k.
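The effect of removing a single informant from a group mean can be checked with simple arithmetic. The sketch below is only an illustration (the function name `implied_excluded_mean` is an assumption, and it relies solely on the rounded means quoted above, so the outcome is approximate): it backs out the mean that the excluded informant must have had for /t/ and verifies the ordering of the quoted means.

```python
def implied_excluded_mean(mean_all: float, mean_rest: float, n_all: int) -> float:
    """Mean of the one excluded informant, given the group mean with and without them.

    The sum over all informants is n_all * mean_all; the sum over the rest is
    (n_all - 1) * mean_rest; the difference is the excluded informant's mean.
    """
    return n_all * mean_all - (n_all - 1) * mean_rest


# Rounded means quoted in the text (ms), decimal points instead of commas.
MEAN_T_ALL = 87.6      # /t/, all thirteen informants
MEAN_T_WITHOUT = 83.0  # /t/, P4 left out
MEAN_P = 64.1
MEAN_K = 84.9

p4_mean_t = implied_excluded_mean(MEAN_T_ALL, MEAN_T_WITHOUT, n_all=13)
ordering_holds = MEAN_P < MEAN_T_WITHOUT < MEAN_K

print(round(p4_mean_t, 1))
print(ordering_holds)
```

With the rounded figures from the text, the implied mean for P4 lies around 142,8 ms, which is in line with the observation that P4 consistently produced VOTs of over 120 ms in /t/.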
What can also be retained from looking at the results reported in Graph 3 is that many of
the VOT durations are target-like. The averages for each of the voiceless stops will be discussed
individually.
Starting with /p/, the lowest average duration of 14,8 ms was recorded while the highest
average was 120,1 ms. P2 provided an individual minimum VOT value of 5,2 ms produced in the
onset of the word <pea>. The target-token <pool> produced by P10 contained the individual
maximum VOT of 209,8 ms. The 130 tokens consisted of 14 tokens which were not produced or
which were named differently than was intended. Of the remaining 116 correctly named tokens,
66 tokens were produced with target-like VOTs, i.e. 54,15 ms or more. This means that 56,9% of
all tokens with /p/ in the onset were produced with aspiration.
The average production of VOT in /t/ ranged from 37,5 ms to 143,2 ms. Of all the words
with /t/ in the onset, P2 produced the lowest VOT of 14,9 ms in the word <two>. P4, though,
produced a VOT of 190,6 ms in the word <tea>. Twenty-three out of a possible 130 tokens were
not uttered or were named in a different way than the researcher had intended. The 107
correctly named target-tokens contained 79 tokens which were produced with a VOT of 66,49
ms or more. In other words, 79 tokens – which is 73,8% – were produced with a target-like VOT,
i.e. with aspiration.
The velar voiceless plosive /k/ gave rise to averages for all participants between 49,4 ms
and 135,5 ms. The informants who produced the individual lowest and highest VOTs are P10
and P8, with a VOT of 22,3 ms in <cat> and 154,9 ms in <curl>, respectively. Only 6 of the 130
tokens starting with /k/ were not included in the calculations because they were either not
produced or they were not named as was intended. The remaining 124 tokens consisted of 82
tokens which were pronounced with aspiration, i.e. with a VOT of 70,02 ms or more. This is
66,1%.
In all three voiceless plosives /p/, /t/ and /k/, the maximum and minimum values
described above lie far apart. The largest difference is to be noticed in the
alveolar plosive /t/, i.e. 175,7 ms difference between the shortest and longest VOT (14,9 ms as
opposed to 190,6 ms). Of all 390 tokens taken together from all plosives, 43 tokens
had to be excluded because they were not uttered or because the informants gave them a
different name than was intended by the researcher. Of the remaining 347 tokens, 227, i.e.
65,4%, were produced with aspiration.
47 For the exact numbers, see Appendix F1.
[Graph 3: y-axis Positive VOT (ms); x-axis Informants; series /p/, /t/, /k/]
Graph 3 Average VOT results for voiceless plosives in the posttest picture-naming task
The six words, two for each of the plosives, which were specifically trained on during the
session (i.e. pig, pen, tent, tea, cat and key) did not show significant difference in length of VOT
with the ones which were not trained on. However, the highest individual VOTs could often be
found in the words practiced on in the training session. This was not the case for /p/, but it was
for the other two voiceless plosives. P2, P4 and P13 produced the longest VOTs for /t/ in the
word <tea> and P6 in the target-token <tent>. For /k/, P3, P4, P7, P9, P10 and P12 all produced
the longest VOTs in the word <key>.
Table 7 presents an overview of the number of times (out of a possible 10) each participant
uttered each of the plosives with aspiration. Of all thirteen informants, P8 produced aspiration
the most consistently. She produced target-like VOTs 97% of the time. This stands in stark
contrast to P6 who applied the process of aspiration in the production of initial voiceless
plosives only 10% of the time. On average, the VOTs of the voiceless plosives were produced
target-like 5,7 times out of 10, i.e. more than half of the time. The difference between plosives is
not significant, with a difference of only 0,9 times in favour of /t/ compared to /p/ (5,2 times for
/p/ and 6,1 times for /t/). The difference between /t/ and /k/ is negligible (0,2 times).
Table 7 Number of times target-like VOT per informant per plosive
The results provided in Appendix F1 also show that the highest number of times a word was
produced with a target-like VOT was to be found in one of the words that was trained on. For
/p/, the only word which was produced 9 out of 13 times with a target-like VOT was <pen>. The
word <tea> was produced with a target-like VOT 10 out of 13 times. Eleven out of 13 times, the
word <key> was uttered with a target-like VOT. All three aforementioned words, i.e. pen, tea and
key, were those which were practiced on last in the training session.
All in all, 65,4% of all pronounced tokens were characterized by the production of
aspiration. The words which were used and practiced on during the training session showed
mildly better results than the so-called new words, both in frequency and in duration of VOT.
4.3.3.1.2 PREVOICING
The results of the posttest picture-naming task show that – similar to the pretest – the expected
effects of gender, PoA and the height of the following vowel do not manifest themselves. The
analysis of the results⁴⁸ of the production of the voiced plosives illustrated that more than half of
the tokens were produced with a target-like VOT of 0 ms. Both plosives are discussed separately.
48 The mean results of the picture-naming posttest on prevoicing are presented in Graph 4. The exact numbers of the test can be found in Appendix F1.
[Graph 4: y-axis Negative VOT (ms); x-axis Informants; series /b/, /d/]
The longest reported average VOT value for /b/ is -69,2 ms, while the shortest (apart
from 0 ms) is only -5,8 ms. On an individual level, P10 showed the maximum duration of VOT in
the word <bar> (-190,9 ms). In the word <bird>, P8 prevoiced for only -8,9 ms. Of the 130
tokens with /b/ in the onset, only 2 were named in another way than was intended or were not
produced at all. A VOT of 0 ms was measured in 67 of the remaining 128 tokens, which means
that 52,3% of all tokens with initial /b/ were produced without prevoicing.
In the voiced alveolar plosive /d/, the average VOTs lie between 0 ms and -80,0 ms. P5,
however, produced a VOT of -151,9 ms in the token <dead>. Apart from quite a few target-like
VOTs of 0 ms, the shortest negative VOT was produced by P1 in <dance> (-9,7 ms). For /d/,
out of 130 tokens, 9 were named incorrectly or not at all. In 62 of the remaining 121 tokens, a
VOT of 0 ms was recorded, which is 51,2%.
Taking all 260 tokens of both plosives produced by all participants into account, and leaving
aside the 11 which were not named correctly or were not produced at all, 129 tokens had a
target-like VOT. In other words, 51,8% of the 249 remaining tokens were produced without prevoicing.
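The change in the share of tokens produced without prevoicing can be expressed in percentage points by setting the posttest counts against the pretest counts reported in Section 4.3.2.2. The following is a minimal sketch under the assumption that the token counts quoted in the running text are exact; the names `PRETEST`, `POSTTEST`, `zero_vot_share` and `pooled_share` are introduced here for illustration only.

```python
# (correctly named tokens, tokens with a VOT of 0 ms), as quoted in the text.
PRETEST = {"b": (124, 31), "d": (120, 37)}
POSTTEST = {"b": (128, 67), "d": (121, 62)}


def zero_vot_share(counts: tuple[int, int]) -> float:
    """Percentage of correctly named tokens produced without prevoicing."""
    named, zero = counts
    return 100.0 * zero / named


def pooled_share(task: dict[str, tuple[int, int]]) -> float:
    """Share without prevoicing over both plosives, pooling the raw counts."""
    named = sum(n for n, _ in task.values())
    zero = sum(z for _, z in task.values())
    return 100.0 * zero / named


for plosive in ("b", "d"):
    before = zero_vot_share(PRETEST[plosive])
    after = zero_vot_share(POSTTEST[plosive])
    # The difference of two percentages is a change in percentage points.
    print(f"/{plosive}/: {before:.1f}% -> {after:.1f}% ({after - before:+.1f} points)")

print(f"pooled: {pooled_share(PRETEST):.1f}% -> {pooled_share(POSTTEST):.1f}%")
```

Reporting the change in percentage points avoids the ambiguity of a relative "increase of x%", since the pretest and posttest shares are themselves percentages.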
Graph 4 Average VOT results for voiced plosives in the posttest picture-naming task
Four specific words, two for each of the plosives (i.e. ball, bus, dog and dad), were given as
practice words in the training session. These words did not render significantly shorter VOTs –
apart from the cases in which the VOT was 0 ms – than those which were considered as new
words.
Table 8 shows the number of times (out of 10) each participant uttered each of the two
plosives with a target-like VOT. The overview illustrates that P11 performed the picture-naming
posttest on prevoicing the worst, with only 10% target-like VOTs. P3 pronounced 90% of all
tokens without prevoicing, i.e. 80 percentage points more than P11, and thus performed the
best.
Similar to the pretest, only a minor difference between PoA, i.e. between the bilabial and the
alveolar stops, can be seen in the posttest. Tokens with /b/ in the onset were produced with a
VOT of 0 ms on average 5,2 times, only 0,4 times more often than tokens with /d/ (4,8 times).
Also, no contrast between male and female speakers was found in terms of frequency of
prevoicing. Nor was the height of the following vowel an important factor in the production of
prevoicing.
Table 8 Number of times target-like VOT per informant per plosive
With regard to the trained words, i.e. bus, ball, dog and dad, no drastic difference was found
between those words which were trained and those which were untrained. However, two of the
trained words, one for each of the voiced stops, were produced without prevoicing the most times
of all tokens: <bus> gave rise to 9 times 0 ms and so did <dad>.
To sum up, more than half of all uttered tokens were produced without prevoicing. This
means that the informants were less inclined to prevoice than was expected (through the
influence of their native language). The words which were trained on showed a slight advantage
in frequency of target-like production.
4.3.3.2 WORD READING TASK
The final task which the informants were asked to perform, was a word-reading task. It had been
suggested that the visual clues of orthographically spelled words would simplify the process of
knowing when to aspirate or when not to produce prevoicing. The results for voiceless and
voiced plosives will be discussed one by one.
4.3.3.2.1 ASPIRATION
The word-reading task again showed that VOTs for /p/ are shorter than the ones for /t/, which
are in turn shorter than the VOTs for /k/. Six out of the 13 participants showed this relationship
between the voiceless plosives, and the mean for all participants together confirms it again. The
results⁴⁹ of the final task also show that the participants produced a great number of target-like
VOTs, i.e. they produced aspiration much of the time, and one participant even did so
consistently. Furthermore, the words which the informants trained on did not render target-like VOTs
more frequently than the other words but they did show longer VOTs than the untrained words.
Each plosive will be discussed individually.
For the bilabial plosive /p/, average VOTs produced by all participants ranged from 23,3 ms
to 143,5 ms. The shortest and longest individual VOTs can both be found in the token <pig>
produced by P10 (13,1 ms) and by P6 (156,3 ms), respectively. Of the 39 tokens, 28 tokens were
produced with a target-like VOT for /p/. This is 71,8% of all tokens with /p/ in the onset.
The minimum and maximum average VOT values for /t/ were measured at 35,5 ms and 142,2
ms, respectively. The individual shortest VOT (of 20,7 ms) manifested itself in <tie> produced by
P12. A very long VOT was pronounced by P6 in the word <tent>, namely of 224,2 ms. One of the
39 tokens was produced incorrectly⁵⁰. Of the other 38 words, 27 had a target-like VOT, which is
71,1% of all tokens.
Mean VOTs for /k/ can be found between 42,9 ms and 150,0 ms. Of all tokens with /k/ in
the onset, P4 produced the minimum VOT value (37,7 ms in the target-token <corn>), while P8
provided the maximum of 170,5 ms (in the word <coast>). Of the 39 tokens, 27 were
pronounced with a VOT which was target-like for the velar plosive /k/. This means that
69,2% of tokens with /k/ in the onset were aspirated.
49 The mean results of the word-reading posttest are presented in Graph 5. For the exact numbers of the test, see Appendix F2.
50 P3 said */θent/ instead of [tʰent].
[Graph 5: y-axis Positive VOT (ms); x-axis Informants; series /p/, /t/, /k/]
On average, taking into account all participants and all three plosives, 70,7% of all tokens, i.e.
82 out of 116 tokens, were produced with aspiration.
Graph 5 Average VOT results for voiceless plosives in the posttest word-reading task
No significant difference in VOT duration can be found between trained and untrained
words. Of the three tokens starting with /p/, only P6 produced the longest VOT in <pig>, i.e. the
word with /p/ in the onset which was trained on during the practical part of the training
session. Of the three tokens with /t/ in the onset, 7 participants (P1, P2, P4, P6, P7, P11 and P11)
produced the longest VOT in the trained word, i.e. <tent>. P4 produced a VOT of 218,0 ms in
<tent>, which is a striking 181,7 ms longer than the second highest VOT value (in <tie> with
105,8 ms). Five informants produced the longest VOTs in the trained word with /k/ in the onset
(<cat> produced by P1, P3, P6, P7 and P10). The VOT in <cat> pronounced by P10 is 37,5 ms
longer than the second highest VOT (i.e. 84,8 ms in <corn>).
Table 9 Number of times target-like VOT per informant per plosive
The number of times (out of a possible 3) each participant produced each plosive with a
target-like VOT is provided in Table 9. The Table
illustrates that P9 performed the worst, with only 10% target-like VOTs. P1 and P8, with 100%
of tokens pronounced target-like, had the best results. On average, the participants produced 2
out of the 3 target-tokens with target-like VOTs, i.e. 66,7%.
The words which were practiced on during the training session were not produced more
often with target-like VOTs than the other words which did not receive any attention whilst
training.
In short, the word-reading task gave rise to a great deal of target-like pronounced voiceless
stops, in some cases even 100% and on average an impressive 70,7%. The trained words were
not produced with aspiration more often than the others, but they did show longer VOTs.
4.3.3.2.2 PREVOICING
The posttest word-reading task on prevoicing showed remarkable results⁵¹. Three of the
thirteen informants did not produce prevoicing in any of the tested cases. The results also
showed that the trained words were not particularly more influenced by the training than the
other target-tokens which were tested on in the word-reading task. Each plosive is discussed
individually.
51 Graph 6 represents the mean results obtained in the word-reading task on the production of prevoicing. The exact numbers for this test can be found in Appendix F2.
[Graph 6: y-axis Negative VOT (ms); x-axis Informants; series /b/, /d/]
Mean values for /b/ range from 0 ms to -159,4 ms. P10, however, produced the longest
negative VOT in the token <bus> with -198,3 ms. In 19 of the 39 cases, VOT of 0 ms was
recorded, which is 48,7%. P2 clearly found it harder not to produce prevoicing in /b/ (-124,0
ms) than in /d/ (-35,5 ms) since the difference in average VOT between both plosives in her case
is 88,5 ms. This is an individual instance in which the effect of PoA on prevoicing can be
ascertained. The tokens with voiced stops in the onset produced by P2 were influenced by the
PoA, i.e. /b/ is produced with a longer negative VOT than /d/. However, this individual case
cannot lead us to make any general assumptions on this topic.
The recorded mean VOTs for /d/ in the word-reading task range from 0 ms to -123,3 ms.
The word <dunk> rendered the longest negative VOT, i.e. -153,0 ms, uttered by P5. One of the 39
target-tokens for /d/ was pronounced incorrectly⁵². The other 38 tokens contained 19 tokens
which had a VOT of 0 ms. This means that in 50% of all cases, no prevoicing was produced.
If all the tokens uttered by all thirteen participants are taken into account, 38 of the 77 tokens
were pronounced with a target-like VOT of 0 ms, which is roughly half of the time (49,4%).
Graph 6 Average VOT results for voiced plosives in the posttest word-reading task
52 Surprisingly, P12 replaced initial /d/ in <dunk> by /θ/.
The words which were specifically concentrated on during the training session, i.e. <bus>
and <dad>, did not give rise to significantly shorter VOTs than the other so-called new words,
except in the case of P2.
Table 10 Number of times target-like VOT per informant per plosive
The number of times (out of a possible 3) each of the plosives was produced with a
target-like VOT by each of the participants is presented in Table 10. This Table shows that, on the one
hand, P5, P10 and P11 produced 0% target-like VOTs and, on the other hand, that P1, P4 and P7
produced 100% of the tokens without prevoicing. These two extremes, i.e. from 0% to 100%
target-like production, clearly illustrate the tendency of some speakers to prevoice and others
not to. The Table furthermore shows that /b/ and /d/ were both produced with a VOT of 0
ms in 1,5 of the 3 times, i.e. half of the time. In other words, the frequency with which the
participants produced prevoicing was not influenced by the PoA, since both plosives gave rise to
the same number of target-like VOTs.
The trained word <bus> was produced 6 times without prevoicing, which is as often as or
less often than the other two words (<bed> also 6 times and <boat> 7 times).
The token with /d/ in the onset which was trained on, i.e. <dad>, was produced without
prevoicing 9 out of 13 times. This is 3 times more than for <dust> and 5 times more than for
<dunk>. In other words, <dad> seemed to be a word which the informants found easier to
produce without prevoicing. This could be attributed to the fact that the token <dad> was
practiced on during the one-on-one training session.
To summarize, the words which were specifically trained on did not render significantly
different results from the ones which were not practiced during the session. Nonetheless, the
informants omitted the production of prevoicing approximately half of all times.
4.3.4 PRETEST VS. POSTTEST: A COMPARISON
In this section, the results of the pretest and those of the posttest will be compared⁵³. By
setting these results side by side, we will be able to see whether or not the informants improved from
the pre-training to the post-training tests. In the following sections, the processes of aspiration
and prevoicing will be discussed separately.
4.3.4.1 ASPIRATION
The differences in VOT duration between the pre- and posttest will be discussed first. Looking at
the results obtained from both picture-naming tasks, presented in Graph 7, it becomes clear that
all participants except one (P1)⁵⁴ on average produced longer VOTs for all plosives in the
posttest than in the pretest. Eight out of the thirteen informants produced even longer VOTs in
the posttest word-reading task. On average, an increase of VOT duration is visible from the
pretest over the posttest picture-naming task, to the posttest word-reading task. P6 showed the
greatest improvement among all participants, in the word-reading task. There, she made her
mean VOTs on average 87,2 ms longer than in the pretest and 75,2 ms longer than in the picture-
naming posttest. P8 also made a considerable improvement in VOT length, from 39,7 ms in the
pretest to 125,5 ms and 135,7 ms in the picture-naming and word-reading posttests,
respectively. In the posttest word-reading task, P6 corrected herself after having said <cat> with
a VOT of approximately 23 ms. When she produced the word a second time, she added
aspiration (VOT of 135,7 ms). Self-correction did not always lead to an improvement in VOT
production. P10 uttered the target-token <tongue> a first time in the posttest but corrected
himself because he knew he had not pronounced it correctly, i.e. he had not produced aspiration.
The second time, the VOT was only a little longer but unfortunately it was still not target-like.
It became clear during the training that P12 experienced difficulty in producing aspiration,
especially in initial /p/ and /t/. During practice, the instructor noted that he often uttered /θ/
instead of [tʰ]. He was corrected by the instructor, who explained that this was not the aim. By
the end of the training session, P12 had improved on his pronunciation of all three plosives.
Unfortunately, in the posttest, he produced quite a lot of /θ/ where he should have aspirated.
53 For a list of the results of the pre- and posttest picture-naming task, in the order in which the tests were conducted, see Appendix G.
54 The difference however is not significant (only 2,8 ms).
[Graph 7: y-axis Positive VOT (ms); x-axis Informants; series Pretest, Posttest Picture-naming, Posttest Word-reading]
Once, he even produced /θ/ in a word with /d/ in the onset (namely <dunk> in the word-
reading task). The only other participant who confused aspiration for /θ/ was P3, but only in a
single production (<tent> in the word-reading task).
Graph 7 Comparison between mean VOTs produced in voiceless stops in the pre- and posttest
Secondly, the mean number of times a target-like VOT was produced also increased from
the pre- to the posttest. On average, each of the plosives was produced with a target-like VOT
about two more times (out of 10) in the posttest picture-naming task than in the pretest. The
informant whose number increased the most was P13, with 3,2 times more in the posttest compared
to the pretest.
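The average gain per plosive can be approximated from the total numbers of target-like tokens reported for both picture-naming tasks. The sketch below is an illustration only: it assumes that the totals quoted in Sections 4.3.2.1 and 4.3.3.1.1 are divided evenly over the thirteen informants, and the names `PRETEST_HITS`, `POSTTEST_HITS` and `mean_hits_per_informant` are hypothetical. Because the values in Tables 5 and 7 are rounded, small deviations from the figures in the text are to be expected.

```python
N_INFORMANTS = 13

# Total target-like tokens per plosive in the two picture-naming tasks,
# as quoted in the running text.
PRETEST_HITS = {"p": 46, "t": 53, "k": 58}
POSTTEST_HITS = {"p": 66, "t": 79, "k": 82}


def mean_hits_per_informant(hits: dict[str, int]) -> dict[str, float]:
    """Average number of target-like tokens (out of 10) per informant."""
    return {plosive: count / N_INFORMANTS for plosive, count in hits.items()}


pre = mean_hits_per_informant(PRETEST_HITS)
post = mean_hits_per_informant(POSTTEST_HITS)

# Gain per plosive and the mean gain over the three plosives.
gain = {plosive: post[plosive] - pre[plosive] for plosive in pre}
mean_gain = sum(gain.values()) / len(gain)

for plosive, value in gain.items():
    print(f"/{plosive}/: {pre[plosive]:.1f} -> {post[plosive]:.1f} ({value:+.1f})")
print(f"mean gain: {mean_gain:.1f}")
```

Under these assumptions the gain comes out at somewhat under two additional target-like productions per plosive, which is consistent with the rounded statement above.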
The words which the participants were specifically trained on did not show particularly
greater improvements than those which did not receive any special attention during the training
session. The participants also improved their VOTs in other words. Even in some cases where a
different word than the target-word was uttered – also with a voiceless stop in the onset –
improvement could be noticed. For example, P3 uttered <cake> instead of <pie> in both pre- and
posttest. The second time the word <cake> was pronounced, the informant produced a VOT of
101,3 ms, which is an increase of 50 ms from the VOT of 51,3 ms produced in the pretest. In
other words, P3 produced aspiration in the word <cake> in the posttest. Another example can be
found in P8 who said <coconut> instead of <palm> in both picture-naming tasks. The first time,
the word was uttered with a VOT of 35,6 ms, i.e. not target-like. In the posttest, P8 produced a
clearly target-like VOT of 123,1 ms. These two cases illustrate that improvement in VOT
duration was not limited to those words which the participants practiced during the
training session. The informants generalized the information received in the pronunciation
training session to new words with /p/, /t/ or /k/ in the onset.
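The two cases above can be recomputed as a quick check. The following Python sketch is illustrative only: the helper names `parse_ms` and `vot_increase` and the dictionary layout are assumptions made for this example, while the VOT values are the ones reported in the text.

```python
# VOT values (ms) as reported in the text, written with decimal commas.
# The data layout and the helper names are illustrative assumptions.
reported = {
    ("P3", "cake"): ("51,3", "101,3"),
    ("P8", "coconut"): ("35,6", "123,1"),
}


def parse_ms(value: str) -> float:
    """Convert a decimal-comma string such as '51,3' to a float in ms."""
    return float(value.replace(",", "."))


def vot_increase(pre: str, post: str) -> float:
    """Return the pretest-to-posttest VOT increase in ms, rounded to 0.1 ms."""
    return round(parse_ms(post) - parse_ms(pre), 1)


increases = {key: vot_increase(pre, post) for key, (pre, post) in reported.items()}

for (speaker, word), gain in increases.items():
    print(f"{speaker} <{word}>: +{gain} ms")
```

Rounding to one decimal mirrors the precision of the reported measurements and avoids floating-point noise in the differences.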
4.3.4.2 PREVOICING
With regard to prevoicing, some participants changed their pronunciation
drastically. Three participants (P1, P4 and P7) performed the posttest word-reading task
perfectly, i.e. they did not produce prevoicing in any of the target-tokens with a voiced plosive in
the onset. All but two participants (P10 and P11) produced shorter VOTs in the posttest
picture-naming task than in the pretest. Nine informants produced shorter VOTs in the posttest
word-reading task than in the pretest. Not all informants, however, produced shorter VOTs
in the word-reading task than in the posttest picture-naming task. For seven of them (P2, P3, P5,
P10, P12 and P13) the word-reading task yielded longer VOTs than the picture-naming task in
the posttest. Looking only at the mean results presented in Graph 8, an improvement can be
found from the pretest to both posttests.
P1 was the only participant who prevoiced consistently in both /b/ and /d/ and produced
the longest mean VOTs in the pretest. In the word-reading posttest, he went on to produce
100% of the target-tokens without prevoicing. Surprisingly, P11 did worse on prevoicing in the
posttest picture-naming task than in the pretest. In the posttest she produced only 10% of all
target-tokens with a target-like VOT of 0 ms, while in the pretest this was 55%.
[Graph: y-axis Negative VOT (ms), 0,0 to 160,0; x-axis Informants; series: Pretest, Posttest Picture-naming, Posttest Word-reading]
Graph 8 Comparison between mean VOTs produced in voiced stops in the pre- and posttest
As in the case of aspiration, self-correction did not always lead to better results. P12,
for example, corrected himself a few times in the word <dad> but still did not manage to
produce the token without prevoicing. In other instances in which P12 corrected himself, because
he knew he had still produced prevoicing when he should not have, he succeeded only a few
times (e.g. the VOT of the initial /b/ in <bus> went from -76,3 ms to 0 ms in the word-reading
task). Surprisingly, in the pretest, P11 said <dæns> without prevoicing but when she corrected
herself and said <da:ns>, she produced prevoicing (VOT of -65,6 ms). A possible explanation
could be that, since the second utterance was a correction of the first, she hyper-articulated in
order to make sure that she pronounced the word in the right way the second time around.
The words which the participants were trained on did not render significantly lower VOTs
than those which were not included in the training session. However, the greatest number of
times each plosive was produced without prevoicing is to be found in the practiced words (i.e.
<bus> and <dad>). On average, tokens were produced without prevoicing 2,4 times more in the
posttest picture-naming task than in the pretest. P3 produced the most tokens with a target-like
VOT (9 times) but P1 showed the greatest improvement, i.e. from 0 times in the pretest to 7,5
times in the word-reading posttest. Strikingly, P11 produced tokens without prevoicing 4,5 times
less in the posttest than in the pretest (from 5,5 times in the pretest to only 1 time in the posttest).
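The counts reported for P11 are consistent with the percentages of target-like tokens given earlier (55% in the pretest, 10% in the posttest picture-naming task), provided each plosive had 10 target tokens, as in the picture-naming list in Appendix B. A minimal Python sketch of that conversion follows; the function name and the tokens-per-plosive constant are assumptions made for illustration.

```python
# Assumption: 10 picture-naming target tokens per plosive (see Appendix B).
TOKENS_PER_PLOSIVE = 10


def mean_target_like_count(percent: float, tokens: int = TOKENS_PER_PLOSIVE) -> float:
    """Convert a percentage of target-like tokens into a mean count per plosive."""
    return percent * tokens / 100


pre = mean_target_like_count(55)   # pretest: 55% produced without prevoicing
post = mean_target_like_count(10)  # posttest picture-naming: 10%
change = post - pre                # negative value: fewer target-like tokens

print(pre, post, change)
```

Under this assumption the pretest and posttest counts come out at 5,5 and 1, a drop of 4,5 target-like tokens per plosive.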
4.3.4.3 GENERAL DISCUSSION
Table 11 provides a clear overview of the average VOT durations for all plosives. The table
shows that, on average, the informants produced VOTs in voiceless plosives that were approximately
23 ms longer in the posttest than in the pretest. For the voiced plosives /b/ and /d/ the difference is about 26 ms.
Table 11 Comparison between VOTs in pretest and posttest
The table also illustrates that the VOTs of voiceless plosives become progressively longer
moving from the pretest via the posttest picture-naming task to the posttest word-reading
task. For the voiced plosives, this is not the case. While there is definitely improvement from the
pretest to the posttest in general, the word-reading task did not render VOTs closer to the
target-like VOT of 0 ms than the picture-naming posttest.
Those participants who claimed to have received pronunciation training before taking
part in the experiment did not produce more target-like VOTs in the pretest. The informants
who stated that they had heard of the processes of aspiration and prevoicing also did not perform
better55 than those who had not.
It could be noted that those who had stated that they possessed a relatively good pronunciation
of English performed better in both tests than those who were very uncertain in naming the
words. Some stated not only that they did not possess a good pronunciation of English, but
also that they were not that good at English in general. For those informants (P5 and P6), the
lower level of proficiency prevented them from paying extra attention to their production of either of
the features. However, P6 did show considerable improvement in the production of both
aspiration and prevoicing.
Overall, an 18,1% and 23,9% increase of target-like VOTs from pre- to posttest for
aspiration and prevoicing, respectively, suggests that even a single short session has an impact on
speakers’ pronunciation of voiced and voiceless plosives.
55 Not even those who were able to provide a correct example in the questionnaire, i.e. [kʰæt] or [tʰent].
4.3.5 SUMMARY
Contrary to what was hypothesized, the participants on average already produced a great
number of target-like VOTs in voiceless plosives before taking part in the training session. They did,
however, produce even more target-like VOTs after training. All tests confirmed that PoA
influences VOT, with velar /k/ rendering the longest, alveolar /t/ the intermediate and bilabial
/p/ the shortest VOTs. Since the informants produced even longer VOTs in the posttest
word-reading task, it can be suggested that orthographic clues help learners ascertain when to
produce aspiration.
Prevoicing proved to be a process which certain speakers have a greater tendency to
produce than others, as was suggested in the literature (see Section 2.3). In other words, some
participants already did not prevoice in some instances in the pretest, some improved after
training and some did not succeed as well as others in omitting prevoicing in
English. The informants did not prevoice as heavily or as frequently as was anticipated. The
anticipated differences between PoAs and between genders were not significant. On average, there was no
greater improvement to be found in the word-reading task, which leads us to suggest that the
orthographic forms of words do not aid speakers in omitting prevoicing.
In short, the pronunciation training proved to be successful for both
aspiration and prevoicing.
5. CONCLUSION
Native speakers of Belgian Dutch are influenced in the pronunciation of English by their L1.
Since the majority of the participants in the current study had not received any particular
training in pronunciation, interference from their L1 was definitely expected to occur before
they took part in a specially designed training session on aspiration and prevoicing.
The general aim of the study was to find out whether or not a single training session could
have a positive effect on the production of aspiration and prevoicing by native speakers of
Belgian Dutch. The results showed a clear improvement from the pretest to the posttests. In other
words, the pronunciation training session was effective. Since a single training session
can improve a speaker’s pronunciation, this implies that if more attention were paid to
pronunciation training in classes (at secondary school and/or at university or college), more
non-native speakers of English might possess a more native-like pronunciation. It must be
recognized, though, that with as few participants as were studied here, it is impossible to draw a
general conclusion on this topic. A larger group of informants trained and tested over a
longer period of time could yield a more general conclusion on the effect training has on the
pronunciation of English by non-native speakers.
A possible explanation for the overall positive results could be the training methods which
were used during the session. It could be that the combination of perception, production and the
use of real-time spectrograms provided the informants with sufficient information to
understand the phenomena well enough to not only apply them in the words which were
specifically practiced but also in new words.
More training sessions might render even greater improvements, since a clearly
positive effect of the pronunciation training can already be detected after only one session. The
informants who did not improve significantly would probably benefit the most from more
training sessions. However, I believe that their minor improvement is due rather to the fact that
they do not master the language well enough to pay extra attention to pronunciation features as
specific as aspiration and prevoicing. These participants showed that it might be more beneficial
for them to train the correct pronunciation through written words, since they
showed higher VOTs and more often target-like VOTs in the word-reading task than in the
picture-naming tasks.
A possible suggestion for further research could be to use this combination of pronunciation
training techniques on other features of English. Combining the widely researched methods, i.e.
perception training, production training, audiovisual training and the use of real-time
spectrograms, rather than relying on only one of them, might render the best results when
training a non-native speaker in pronunciation.
APPENDICES
APPENDIX A: QUESTIONNAIRE
QUESTIONNAIRE
Informant No.: ......   Age: ......   Gender: male/female

Language background
- Native language: ......
- Native language of your parents:
  o Mother: ......
  o Father: ......
- Language(s) used on a daily basis: ......
- Besides Dutch, have you studied any other languages? Yes/No
  o Which one(s)? ......
- Have you had any contact with native speakers of English? (e.g. friends, relatives, colleagues, fellow students, etc.) Yes/No ......
  o How? Spoken/written/both
- Have you spent any time in English-speaking countries? Yes/No
  o What is the longest period you have ever spent in an English-speaking country? ......
  o In what context? (e.g. work, vacation, foreign exchange student programme, etc.) ......

Knowledge of English
- Did you take an English course at secondary school? Yes/No (In case your answer is no, skip this question and go ahead to the next one)
  o How many years? ......
  o Did you get any specific training in pronunciation? Yes/No
    Which skills were trained during the course? (e.g. vowels, the difference between that and think, the difference between bet and bed, etc.) ......
    How were these skills trained? (e.g. reading aloud, repetition task, etc.) ......
- Did you take an English course at college or university? Yes/No (In case your answer is no, skip this question and go ahead to the next one)
  o How many years? ......
  o Did you get any specific training in pronunciation? Yes/No
    Which skills were trained during the course? (e.g. vowels, the difference between that and think, the difference between bet and bed, etc.) ......
    How were these skills trained? (e.g. reading aloud, repetition task, etc.) ......
- Did you take an adult education English course? Yes/No (In case your answer is no, skip this question and go ahead to the next one)
  o Why? (e.g. to improve your knowledge of English, to improve your pronunciation, to train in speaking English, etc.) ......
  o How many years? ......
  o Did you get any specific training in pronunciation? Yes/No
    Which skills were trained during the course? (e.g. vowels, the difference between that and think, the difference between bet and bed, etc.) ......
    How were these skills trained? (e.g. reading aloud, repetition task, etc.) ......
- Did you get (part of) your knowledge of English via the media? Yes/No (Please mark which ones)
  o Television
  o Radio
  o Newspaper
  o Magazine
  o Internet
  o Other: ......

Self-evaluation of pronunciation of English
- How would you rate your pronunciation of English? (Please mark one of the possibilities)
  o Horrible
  o Very bad
  o Bad
  o Okay
  o Good
  o Very good
  o Perfect
- Why did you rate yourself in that way? (e.g. because people do not always understand what you are saying, because you do not experience any problems with comprehensibility but your pronunciation clearly gives away that you are not a native speaker, because your pronunciation is native-like, etc.) ......
- Would you like to improve your pronunciation of English? Yes/No (Please give a reason why in any case)
  o Why? (e.g. to improve comprehensibility, to achieve native-like pronunciation, you do not think it is necessary, etc.) ......
  o Which feature(s) would you like to improve? (e.g. vowels, the contrast between that and think, the difference between bet and bed, etc.) (If your answer to the previous question was no, skip this one and go ahead to the next one) ......

Some final questions
- Have you heard of the process of aspiration? Yes/No
  o Could you give an example? ......
- Have you heard of the process of prevoicing? Yes/No
  o Could you give an example? ......

THANK YOU FOR YOUR PARTICIPATION!
APPENDIX B: PRETEST AND POSTTEST
1. LIST OF TOKENS USED IN PRE- AND POSTTEST PICTURE-NAMING TASK
ASPIRATION
     /P/    /T/      /K/
 1.  palm   talk     call
 2.  pan    tall     car
 3.  pea    tape     card
 4.  pear   tea      cat
 5.  pen    tent     cold
 6.  pie    time     cow
 7.  pig    toe      cup
 8.  pill   tongue   curl
 9.  pink   toy      key
10.  pool   two      king
PREVOICING
     /B/    /D/
 1.  back   dad
 2.  ball   dance
 3.  bar    dark
 4.  bear   day
 5.  beer   dead
 6.  bike   deer
 7.  bird   dive
 8.  bomb   dog
 9.  box    door
10.  bus    duck
DISTRACTORS
 1. ambulance
 2. apple
 3. chicken
 4. ear
 5. eat
 6. egg
 7. elephant
 8. fish
 9. flower
10. grass
11. hand
12. heart
13. lemon
14. light
15. milk
16. moon
17. nurse
18. old
19. orange
20. out
21. rainbow
22. sheep
23. spoon
24. volcano
25. world
Note: all tokens were presented to the informants randomly, see Appendix B.2.
2. SLIDES AS PRESENTED IN PRE- AND POSTTEST PICTURE-NAMING TASK
3. LIST OF TOKENS USED IN POSTTEST WORD-READING TASK
ASPIRATION
     /P/    /T/    /K/
 1.  pay    tent   cat
 2.  pig    text   coast
 3.  pin    tie    corn
PREVOICING
     /B/    /D/
 1.  bed    dad
 2.  boat   dunk
 3.  bus    dust
DISTRACTORS
 1. ankle
 2. chair
 3. enter
 4. film
 5. filter
 6. green
 7. hope
 8. instant
 9. link
10. map
11. sheets
12. sun
13. swim
14. under
15. walk
Notes:
- Tokens in bold (<pig>, <tent>, <cat>, <bus> and <dad>) had already occurred in the pre- and posttest picture-naming task, and were trained on during the training session.
- All tokens were presented to the informants randomly, see Appendix B.4.
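The first note can be checked against the lists themselves: the tokens that occur in both tasks are the only candidates for the bold marking. The Python sketch below copies the target tokens from Appendix B.1 and B.3 and computes their overlap; the variable names are assumptions made for this example.

```python
# Target tokens copied from Appendix B.1 (picture-naming task).
picture_naming = {
    "palm", "pan", "pea", "pear", "pen", "pie", "pig", "pill", "pink", "pool",
    "talk", "tall", "tape", "tea", "tent", "time", "toe", "tongue", "toy", "two",
    "call", "car", "card", "cat", "cold", "cow", "cup", "curl", "key", "king",
    "back", "ball", "bar", "bear", "beer", "bike", "bird", "bomb", "box", "bus",
    "dad", "dance", "dark", "day", "dead", "deer", "dive", "dog", "door", "duck",
}

# Target tokens copied from Appendix B.3 (word-reading task).
word_reading = {
    "pay", "pig", "pin", "tent", "text", "tie", "cat", "coast", "corn",
    "bed", "boat", "bus", "dad", "dunk", "dust",
}

# Tokens shared by both tasks, in alphabetical order.
shared = sorted(picture_naming & word_reading)
print(shared)  # ['bus', 'cat', 'dad', 'pig', 'tent']
```

The remaining ten word-reading tokens are new words, which is what allows the word-reading task to test generalization beyond the practiced items.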
4. LIST OF WORDS AS PRESENTED IN POSTTEST WORD-READING TASK
Practice words
1. film
2. coast
3. under
4. pay
5. link
6. tent
7. ankle
8. cat
9. enter
10. bus
11. chair
12. hope
13. tie
14. swim
15. corn
16. walk
17. dad
18. sun
19. pig
20. filter
21. dust
22. sheets
23. pin
24. boat
25. instant
26. text
27. bed
28. green
29. dunk
30. map
APPENDIX C: TRAINING SESSION
1. SLIDES USED IN TRAINING SESSION ON ASPIRATION AND PREVOICING
2. HANDOUT WITH TIPS ON ASPIRATION AND PREVOICING
Aspiration
How to test for aspiration?
Hold a small piece of paper in front of your mouth. Now say the words <pig>, <pen>, <tent>, <tea>, <key> and <cat>. If the piece of paper moved noticeably, you produced aspiration. If it did not move, try again.
How to improve your production of aspiration?
- For /p/:
  o Relax your lips
  o Remove the tension that is present in Dutch
- For /t/:
  o Relax your tongue
  o Use the tip of your tongue
  o Remove the tension that is present in Dutch
- For /k/:
  o Relax your tongue
  o Remove the tension that is present in Dutch

Prevoicing
How to test for prevoicing?
Feel whether your vocal cords vibrate when you pronounce the words <bus>, <ball>, <dad> and <dog>. If you felt your vocal cords vibrate before you pronounced the /b/ or /d/, you produced prevoicing. Try again without letting your vocal cords vibrate.
How to avoid producing prevoicing in English?
Create slightly more tension in your tongue and in your vocal tract before you produce the /b/ or /d/, so that your vocal cords are less quick to start vibrating.
APPENDIX D: COPY OF RECORDINGS
APPENDIX E: RESULTS OF PRETEST
1. ASPIRATION
/P/
/T/
/K/
2. PREVOICING
/B/
/D/
Notes:
- In case a word was not named correctly, the word that was uttered instead was added in the table (e.g. * swim).
- In case no word was uttered, the symbol / was added.
APPENDIX F: RESULTS OF POSTTEST
1. PICTURE-NAMING TASK
ASPIRATION
/P/
/T/
Notes:
- In case a word was not named correctly, the word that was uttered instead was added in the table (e.g. * cake).
- In case no word was uttered, the symbol / was added.
- P12 often mistook aspiration (e.g. <tea>, [tʰi:]) for /θ/ (e.g. <tea>, *[θi:]). In that case, the words in the table were spelled using <th> as /θ/.
/K/
PREVOICING
/B/
/D/
2. WORD-READING TASK
ASPIRATION
/P/ /T/
/K/
PREVOICING
/B/ /D/
Notes:
- P3 mistook aspiration in [tʰent] for /θ/, i.e. *[θent]. In that case, the word in the table was spelled using <th> as /θ/.
- P12 mistook not producing prevoicing in <dunk> for /θ/, i.e. *[θunk]. In that case, the word in the table was spelled using <th> as /θ/.
APPENDIX G: RESULTS OF PRE- AND POSTTEST PICTURE-NAMING TASK
Notes:
- In case a word was not named correctly, the word that was uttered instead was added in the table (e.g. * greet).
- In case no word was uttered, the symbol / was added.
- P12 often mistook aspiration (e.g. <tea>, [tʰi:]) for /θ/ (e.g. <tea>, *[θi:]). In that case, the words in the table were spelled using <th> as /θ/.