1,094 Hits in 5.0 sec

Speaker-basis Accent Clustering Using Invariant Structure Analysis and the Speech Accent Archive

Nobuaki Minematsu, Shun Kasahara, Takehiko Makino, Daisuke Saito, Keikichi Hirose
2014 The Speaker and Language Recognition Workshop (Odyssey 2014)   unpublished
the accent distance between any pair of the speakers by using their speech samples only.  ...  Creating the map, i.e., speaker-basis accent clustering, mathematically requires a distance matrix in terms of accents among all the speakers considered, and technically requires a method of predicting  ...  Use of speech structure to cluster simulated learners In [27], we applied the pronunciation structure analysis to cluster simulated Japanese learners of English.  ... 
doi:10.21437/odyssey.2014-25 fatcat:6r4c3lljnfaoboudktqo62hl2q

Automatic pronunciation clustering using a World English archive and pronunciation structure analysis

H.-P. Shen, N. Minematsu, T. Makino, S. H. Weinberger, T. Pongkittiphan, C.-H. Wu
2013 2013 IEEE Workshop on Automatic Speech Recognition and Understanding  
In experiments, the Speech Accent Archive (SAA), which contains speech data of worldwide accented English, is used as training and testing samples.  ...  This paper investigates invariant pronunciation structure analysis and Support Vector Regression (SVR) to predict the inter-speaker pronunciation distances.  ...  The invariant structure analysis was proposed in [8, 9] inspired by Jakobson's structural phonology [10] and it can extract invariant and robust features.  ... 
doi:10.1109/asru.2013.6707733 dblp:conf/asru/ShenMMWPW13 fatcat:pl5l34ttrndblbhlbda5r2ibka

Noise-robust and stress-free visualization of pronunciation diversity of World Englishes using a learner's self-centered viewpoint

Yuichi Sato, Yosuke Kashiwagi, Nobuaki Minematsu, Daisuke Saito, Keikichi Hirose
2015 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE)  
Accent clustering requires a technique to quantify the accent gap between any speaker pair and visualization requires a technique of stress-free plotting of the speakers.  ...  We have developed two techniques of individual-based clustering of the diversity [1, 2] and educationally-effective visualization of the diversity [3].  ...  Training of SVR is done by using all the archive speakers and testing is done by predicting the accent gap between that new speaker and each of the archive speakers.  ... 
doi:10.1109/icsda.2015.7357855 dblp:conf/ococosda/SatoKMSH15 fatcat:pt4ritnbhngalbcklw6sldgv2q

P7. Experimental investigation of the definition of reference accent distance between speakers toward automatic accent clustering of speakers of World Englishes (Summaries of Presentations at the 28th General Meeting)

Tianze Shi, Shun Kasahara, Nobuaki Minematsu, Daisuke Saito, Keikichi Hirose
2014 Journal of the Phonetic Society of Japan  
Saito, and K. Hirose, "Speaker-basis accent clustering using invariant structure analysis and the speech accent archive," Proc.  ...  This research trial was conducted by using the Speech Accent Archive (SAA) [2], where readings of a common paragraph were collected from more than t8K international speakers including many non-native speakers  ... 
doi:10.24467/onseikenkyu.18.3_62_2 fatcat:npcn6bwvubbwflcj2adh7if5cy

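Workshop: "Phonetic Issues of Voiced Geminates: Focusing on Regional Variation and Speech Style" (Summaries of Presentations at the 28th General Meeting (2014) of the Phonetic Society of Japan)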

松浦 年男
2014 Journal of the Phonetic Society of Japan  
For automatic clustering of speakers in terms of accents, it is necessary to measure the accent distance between an arbitrary pair of speakers. For that, in [1], we trained a machine so that  ...  only speech samples but also their IPA narrow transcripts.  ...  Saito, and K. Hirose, "Speaker-basis accent clustering using invariant structure analysis and the speech accent archive," Proc.  ... 
doi:10.24467/onseikenkyu.18.3_63_2 fatcat:ku3sejtqmrchzmb6zbbhr7j6ki

Global Performance Disparities Between English-Language Accents in Automatic Speech Recognition [article]

Alex DiChristofano, Henry Shuster, Shefali Chandra, Neal Patwari
2023 arXiv   pre-print
We audit some of the most popular English language ASR services using a large and global data set of speech from The Speech Accent Archive, which includes over 2,700 speakers of English born in 171 different  ...  Past research has identified discriminatory automatic speech recognition (ASR) performance as a function of the racial group and nationality of the speaker.  ...  ACKNOWLEDGMENTS The authors thank Abigail Lewis for her helpful insights on quantitative analysis, the stargazer R package [19], and The Speech Accent Archive [48].  ... 
arXiv:2208.01157v2 fatcat:zl2fbtb6ircvfaovpoojtnobou

Neural Representations for Modeling Variation in Speech [article]

Martijn Bartelds, Wietse de Vries, Faraz Sanal, Caitlin Richter, Mark Liberman, Martijn Wieling
2022 arXiv   pre-print
We use these representations to compute word-based pronunciation differences between non-native and native speakers of English, and between Norwegian dialect speakers.  ...  Transformers) lead to a better match with human perception than two earlier approaches on the basis of phonetic transcriptions and MFCC-based acoustic features.  ...  Acknowledgments The authors thank Hedwig Sekeres for creating the transcriptions of the Dutch speakers dataset, and Anna Pot for creating the visualization of the acoustic distance measure.  ... 
arXiv:2011.12649v3 fatcat:mifjjs23tbgmfc2bf67hr7mzhu

Diachronic and Synchronic Variability of the English Phoneme /h/

Christelle Exare
2020 Recherches anglaises et nord-américaines  
Trask (2003: 106) writes that, nowadays, "for most English and Welsh speakers, the h in hair and head is just as dead as those in light and loud". The loss of /h/ (H-dropping, or aitch dropping) remains stigmatized  ...  and contrasts with a tendency to hypercorrection, i.e. the insertion of an illicit [h]. This article is a synthesis of the literature on English /h/, with special attention to its diachronic and synchronic  ...  She stresses that these phonemes have all undergone positional and structural weakening in the history of English.  ... 
doi:10.4000/ranam.728 fatcat:kepzy6peqnerrfwgevqlxo3peu

Page 1963 of Linguistics and Language Behavior Abstracts: LLBA Vol. 26, Issue 4 [page]

1992 Linguistics and Language Behavior Abstracts: LLBA  
Sample analyses of archival X-ray utterance data from a speaker of American English are presented.  ...  It is shown that this explicit gestural model of phonetic structure can be used to investigate the contextual variation of phonetic units such as schwa in natural speech.  ... 

Transcription of multi-genre media archives using out-of-domain data

P. J. Bell, M. J. F. Gales, P. Lanchantin, X. Liu, Y. Long, S. Renals, P. Swietojanski, P. C. Woodland
2012 2012 IEEE Spoken Language Technology Workshop (SLT)  
We describe our work on developing a speech recognition system for multi-genre media archives.  ...  The high diversity of the data makes this a challenging recognition task, which may benefit from systems trained on a combination of in-domain and out-of-domain data.  ...  Each show was first segmented and clustered by speaker using the CU RT-04 diarisation system [20].  ... 
doi:10.1109/slt.2012.6424244 dblp:conf/slt/BellGLLLRSW12 fatcat:aim2jg6trfeitnt6l2amqkbje4

Chapter 5. "Organically German"? [chapter]

2021 Studies in Language Variation  
The work that forms the basis for this plenary lecture is published as Szmrecsanyi et al. (2019).  ...  Benedikt Szmrecsanyi's plenary lecture, "Variation squared", aimed to bridge the gap between the intra-speaker approach to variation from comparative sociolinguistics and the inter-speaker focus from quantitative  ...  Acknowledgements We gratefully acknowledge the use of the Speech Accent Archive under the Creative Commons License.  ... 
doi:10.1075/silv.25.05ful fatcat:ymnwueq5zjdrxop2ltrkuty53q

Unsupervised Learning for Expressive Speech Synthesis

Igor Jauk
2018 IberSPEECH 2018  
Once the feature set is defined, it is used for unsupervised clustering of an audiobook, where from each cluster a voice is trained.  ...  Results show that a combination of traditional and i-vector based features performs better in unsupervised clustering of expressive speech than traditional features and even better than large state-of-the-art  ...  Acknowledgements First of all I would like to thank Antonio Bonafonte for his help, lead and patience, and for the opportunity to work and to develop this work in his group.  ... 
doi:10.21437/iberspeech.2018-38 dblp:conf/iberspeech/Jauk18 fatcat:6zogjdy3gjgslfbbgrqirjzsx4

The co-variation of phonology with morphology and syntax: A hopeful history

FRANS PLANK
1998 Linguistic Typology  
Dependencies between sound structure on the one hand and word, phrase, clause, sentence, and discourse structure, or also lexical structure, on the other were something 4  ...  The variables that are allegedly interrelated pertain to segment inventories, the shapes of syllables, morphemes, and words, phonological or morphonological rules, tones and accents, and rhythmic or prosodic  ...  distinctions lost in consonant shifts), and even the structure of verse and music (with poets and singers drawing on what they know and do as speakers).  ... 
doi:10.1515/lity.1998.2.2.195 fatcat:2qifcdnqfzcydckeae3cjsyeyi

The co-variation of phonology with morphology and syntax: A hopeful history

Frans Plank
2017 Linguistic Typology  
The variables that are allegedly interrelated pertain to segment inventories, the shapes of syllables, morphemes, and words, phonological or morphonological rules, tones and accents, and rhythmic or prosodic  ...  patterns on the one hand and to analytic or (poly-)synthetic grammar, separatist or cumulative morphological exponence, the complexity of grammatical units, and their linear order on the other.  ...  distinctions lost in consonant shifts), and even the structure of verse and music (with poets and singers drawing on what they know and do as speakers).  ... 
doi:10.1515/lingty-2017-1007 fatcat:nhq4zcwtzvgk3pwvdsbntbhlha

Computational intelligence in processing of speech acoustics: a survey

Amitoj Singh, Navkiran Kaur, Vinay Kukreja, Virender Kadyan, Munish Kumar
2022 Complex & Intelligent Systems  
This paper presents a comprehensive survey on the speech recognition techniques for non-Indian and Indian languages, and compiled some of the computational models used for processing speech acoustics.  ...  Combination of MFCC and DNN–HMM classifier is most commonly used system for developing ASR minority languages, whereas in some of the majority languages, researchers are using much advance algorithms of  ...  archives.  ... 
doi:10.1007/s40747-022-00665-1 fatcat:6pu2xccbq5as7bn2y2tav2fdwa