
UNSW@ADFA Computer Science School Seminar

Title: Detection of "Foreign Accent" by Native and Non-Native Listeners
Speaker: Dr Kimiko Tsukada, Macquarie University
Date: Thursday, 28th September 2000
Time: 11:10 -- 12:00
Venue: Computer Science - Room 152
Abstract

This study compares how Japanese-speaking listeners (n=7) and native English-speaking listeners (n=19) perceive foreign accent. These two groups of listeners (henceforth J and E, respectively) listened to pairs of monosyllabic English words (/CVt/ and /CVd/), one produced by an Australian talker and the other by a Japanese talker, and identified the native English talker in each pair. The Japanese productions of vowels in the stimuli were carefully matched to those produced by native Australian English talkers, both in the F1-F2 plane (F1 and F2 being the two lowest resonance frequencies of the vocal tract) at the temporal midpoint and in duration. The J listeners found the task very hard and performed more poorly than the E listeners (60% vs 79%). The J listeners made more perception errors when the stimuli included mid-low vowels such as /E/, /A/, /V/ and /O/. This may be related to the acoustic vowel space of Japanese, in which, unlike English, the only low vowel /a/ is not crowded by neighbouring vowels. The large area within which the Japanese /a/ is free to vary may impede the development of the speech perception and production skills needed to differentiate the English vowels. Analyzing the results by talker, it was noted that those non-native talkers who were categorized as native by the E group were also perceived as native by the J group. However, the J group inaccurately judged as native some non-native talkers whom the E group generally classified correctly as non-native. Thus, the two groups of listeners appear to use different criteria in "foreign accent" detection. Furthermore, it was observed that some of the J group had difficulty making native vs non-native judgements even when the stimuli included their own speech. Although none of the English-speaking listeners provided speech data, the results seem to suggest that there are some fundamental differences between the perceptual strategies of the J and E groups, which become more pronounced when the listening conditions are not ideal, for example when the acoustic variability available to listeners is limited.

Target audience: computer speech scientists and technologists, linguists, phoneticians

Potential applications: automatic language identification, multi-lingual speech recognition and synthesis

 

For information on our seminar program, suggestions for seminars, or mailing list updates, please email: seminars@cs.adfa.edu.au or see: http://www.cs.adfa.edu.au/seminars/2003/

 

CRICOS Provider Number: 00100G | Copyright and Disclaimer | Last update: Eri Uchida - 05 March 2003