Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
release_y3xq5czyhjbkvhr4ilbqfwhztu
by
Narjes Bozorg, Michael T. Johnson
2020
Abstract
This paper presents Articulatory-WaveNet, a new approach for
acoustic-to-articulatory inversion. The proposed system uses the WaveNet speech
synthesis architecture, in which dilated causal convolutional layers predict
each articulatory sample from previous values of the predicted trajectories,
conditioned on acoustic features. The system was trained and evaluated on the ElectroMagnetic
Articulography corpus of Mandarin Accented English (EMA-MAE), consisting of 39
speakers including both native English speakers and native Mandarin speakers
speaking English. Results show significant improvement in both correlation and
RMSE between the generated and true articulatory trajectories for the new
method, with an average correlation of 0.83, representing a 36% relative
improvement over the 0.61 correlation obtained with a baseline Hidden Markov
Model (HMM)-Gaussian Mixture Model (GMM) inversion framework. To the best of
our knowledge, this paper presents the first application of a point-by-point
waveform synthesis approach to the problem of acoustic-to-articulatory
inversion, and the results show improved performance compared to previous
methods for speaker-dependent acoustic-to-articulatory inversion.
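The abstract reports correlation and RMSE between generated and true articulatory trajectories, plus a relative improvement in correlation over the HMM-GMM baseline. The following is a minimal sketch of how such figures are commonly computed. It assumes Pearson correlation and a plain per-sample RMSE; the abstract does not specify the exact definitions used. The function names and toy trajectories are hypothetical and are not taken from the paper.

```python
import math


def pearson_correlation(predicted, reference):
    """Pearson correlation between two equal-length trajectories (assumed metric)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_p * var_r)


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length trajectories."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def relative_improvement(new_value, baseline_value):
    """Relative gain of new_value over baseline_value, as a fraction."""
    return (new_value - baseline_value) / baseline_value


# Toy data (hypothetical): the prediction is the reference shifted by 0.5.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 1.5, 2.5, 3.5, 4.5]

toy_correlation = pearson_correlation(predicted, reference)
toy_rmse = rmse(predicted, reference)

# Correlations quoted in the abstract: 0.83 (proposed) vs 0.61 (HMM-GMM baseline).
gain = relative_improvement(0.83, 0.61)
gain_percent = round(gain * 100)
```

With the two correlations quoted in the abstract, `gain` evaluates to roughly 0.36, which matches the reported 36% relative improvement. The toy trajectories show why both metrics are reported: a constant offset leaves the correlation at 1 while the RMSE is nonzero.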
Archived Files and Locations
application/pdf 614.5 kB
file_eqz4eb53grcv7fsc2sqyc2vzza
arxiv.org (repository), web.archive.org (webarchive)
2006.12594v1