Fast Gated Recurrent Network for Speech Synthesis
2022
IEICE transactions on information and systems
This research proposes a fast gated recurrent neural network, an RNN-based architecture for speech synthesis built on the minimal gated unit (MGU). ...
The recurrent neural network (RNN) has been used in audio and speech processing, such as language translation and speech recognition. ...
This paper is organized as follows: in Sect. 2, we give the background for speech synthesis and the MGU. In Sect. 3, we propose a fast gated recurrent network. ...
doi:10.1587/transinf.2021edl8032
fatcat:doq5zmojj5hqtayjl4pdpmyu5e
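The minimal gated unit (MGU) named in this abstract collapses the GRU's update and reset gates into a single forget gate. As a rough illustration (a plain NumPy sketch with illustrative names, not the paper's implementation), one recurrence step looks like:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_step(x, h_prev, Wf, Wh, bf, bh):
    """One step of a minimal gated unit (MGU): a single forget gate
    both modulates the candidate state and interpolates the update."""
    xh = np.concatenate([x, h_prev])
    f = sigmoid(Wf @ xh + bf)                   # forget gate
    xh_gated = np.concatenate([x, f * h_prev])
    h_tilde = np.tanh(Wh @ xh_gated + bh)       # candidate state
    return (1.0 - f) * h_prev + f * h_tilde     # new hidden state

# Tiny demo: input dim 3, hidden dim 2
rng = np.random.default_rng(0)
x, h = rng.standard_normal(3), np.zeros(2)
Wf, Wh = rng.standard_normal((2, 5)), rng.standard_normal((2, 5))
h = mgu_step(x, h, Wf, Wh, np.zeros(2), np.zeros(2))
```

With only one gate, an MGU cell carries roughly two-thirds of a GRU's gate parameters, which is what makes MGU-based cells attractive for fast synthesis.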
Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit
[article]
2020
arXiv
pre-print
We adopt a simple recurrent unit (SRU) for the proposed model to achieve a recurrent architecture, in which we can execute fast speech parameter generation by using the high parallelization nature of SRU ...
This paper presents a deep Gaussian process (DGP) model with a recurrent architecture for speech sequence modeling. ...
On the other hand, NN-based speech synthesis studies have shown the effectiveness of utterance-level modeling using recurrent NNs (RNNs) and attention-based networks [7, 8]. ...
arXiv:2004.10823v1
fatcat:dikiv5fuk5ddxjmobcose2pvk4
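The SRU's "high parallelization nature" mentioned above comes from the fact that all of its matrix multiplications depend only on the inputs, never on the previous hidden state, so they can be batched over the whole sequence; only a cheap elementwise recurrence remains. A simplified sketch of the original SRU formulation (illustrative names, not the paper's DGP model; the highway term assumes matching input and hidden sizes):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_layer(X, W, Wf, Wr, bf, br):
    """Simplified SRU over a sequence X of shape (T, d)."""
    U = X @ W.T                   # candidate inputs, parallel over time
    F = sigmoid(X @ Wf.T + bf)    # forget gates, parallel over time
    R = sigmoid(X @ Wr.T + br)    # highway gates, parallel over time
    c = np.zeros(W.shape[0])
    H = np.empty_like(U)
    for t in range(X.shape[0]):   # only elementwise work is sequential
        c = F[t] * c + (1.0 - F[t]) * U[t]
        H[t] = R[t] * np.tanh(c) + (1.0 - R[t]) * X[t]
    return H

T, d = 4, 3
rng = np.random.default_rng(1)
X = rng.standard_normal((T, d))
H = sru_layer(X, rng.standard_normal((d, d)), rng.standard_normal((d, d)),
              rng.standard_normal((d, d)), np.zeros(d), np.zeros(d))
```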
Grow and Prune Compact, Fast, and Accurate LSTMs
[article]
2018
arXiv
pre-print
This approach learns both the weights and the compact architecture of H-LSTM control gates. We have GP-trained H-LSTMs for image captioning and speech recognition applications. ...
Thus, GP-trained H-LSTMs can be seen to be compact, fast, and accurate. ...
Introduction Recurrent neural networks (RNNs) have been ubiquitously employed for sequential data modeling because of their ability to carry information through recurrent cycles. ...
arXiv:1805.11797v2
fatcat:jyd2u6o2kbfwvdtybqied2oe4a
Analysis by Adversarial Synthesis — A Novel Approach for Speech Vocoding
2019
Interspeech 2019
In this work, we introduce a new methodology for neural speech vocoding based on generative adversarial networks (GANs). ...
Classical parametric speech coding techniques provide a compact representation for speech signals. ...
Generative Adversarial Networks (GANs) provide an alternative approach for very fast generation of realistic data samples [8] . ...
doi:10.21437/interspeech.2019-1195
dblp:conf/interspeech/MustafaBBSM19
fatcat:4yeskn5mwbeijkjnzmkd34m7ji
UFANS: U-shaped Fully-Parallel Acoustic Neural Structure For Statistical Parametric Speech Synthesis With 20X Faster
[article]
2018
arXiv
pre-print
Neural networks with Auto-regressive structures, such as Recurrent Neural Networks (RNNs), have become the most appealing structures for acoustic modeling of parametric text to speech synthesis (TTS) in ...
In this paper, we propose a U-shaped Fully-parallel Acoustic Neural Structure (UFANS), which is a deconvolutional alternative of RNNs for Statistical Parametric Speech Synthesis (SPSS). ...
Variants like Long Short-Term Memory (LSTM) [5] , Gated Recurrent Unit (GRU) [6] and other RNN structures are now broadly used in text-to-speech [7] with very good records. ...
arXiv:1811.12208v1
fatcat:vodixdxg7fagtahwitd2c34viu
A deep recurrent approach for acoustic-to-articulatory inversion
2015
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Experimental results indicate that the recurrent model can produce more accurate predictions for acoustic-to-articulatory inversion than a deep neural network with a fixed-length context window. ...
To solve the acoustic-to-articulatory inversion problem, this paper proposes a deep bidirectional long short term memory recurrent neural network and a deep recurrent mixture density network. ...
In speech synthesis, articulatory features can be incorporated into the traditional speech synthesis method to modify the characteristics of the synthesized speech [2] . ...
doi:10.1109/icassp.2015.7178812
dblp:conf/icassp/LiuYWKMC15
fatcat:vwbdhyjeofezjhijwlf2fc4zmi
Noise and acoustic modeling with waveform generator in text-to-speech and neutral speech conversion
2020
Multimedia tools and applications
... gated recurrent unit, and hybrid model). ...
This article focuses on developing a system for high-quality synthesized and converted speech by addressing three fundamental principles. ...
Consequently, the second goal of this paper is to build a deep learning-based acoustic model for speech synthesis using feedforward and recurrent neural networks as an alternative to HMMs. ...
doi:10.1007/s11042-020-09783-9
fatcat:5we3ryq6arb4xdxiblymuqwqlu
Google Duplex - A Big Leap in the Evolution of Artificial Intelligence
2021
International Journal of Computer Applications
... that is fed into a recurrent neural network. ...
Gated Activation and Residual Units: In the non-linearity part of the network structure, Oord et al. applied a gated activation unit similar to the activation in the LSTM. ...
doi:10.5120/ijca2021921019
fatcat:f5e4do6kczfjnpsf4ooocyyvri
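The gated activation unit attributed to Oord et al. combines a tanh "filter" branch with a sigmoid "gate" branch, z = tanh(W_f · x) ⊙ σ(W_g · x), so the sigmoid acts like an LSTM-style gate on the filtered signal. A minimal sketch with dense matrices standing in for WaveNet's dilated convolutions (names are illustrative):

```python
import numpy as np

def gated_activation(x, Wf, Wg):
    """WaveNet-style gated activation: z = tanh(Wf @ x) * sigmoid(Wg @ x)."""
    filt = np.tanh(Wf @ x)                      # filter branch, in (-1, 1)
    gate = 1.0 / (1.0 + np.exp(-(Wg @ x)))      # gate branch, in (0, 1)
    return filt * gate

rng = np.random.default_rng(2)
x = rng.standard_normal(4)
z = gated_activation(x, rng.standard_normal((4, 4)), rng.standard_normal((4, 4)))
```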
Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
[article]
2020
arXiv
pre-print
The proposed system uses the WaveNet speech synthesis architecture, with dilated causal convolutional layers using previous values of the predicted articulatory trajectories conditioned on acoustic features ...
This paper presents Articulatory-WaveNet, a new approach for acoustic-to-articulatory inversion. ...
Fast WaveNet caches previously computed information from the overlapping network states, called recurrent states, to eliminate redundant convolutions. ...
arXiv:2006.12594v1
fatcat:y3xq5czyhjbkvhr4ilbqfwhztu
Quality Evaluation of Reverberant Speech Based on Deep Learning
2020
Menoufia Journal of Electronic Engineering Research
Spectrogram and MFCC are used as features to be classified with a Long Short-Term Memory Recurrent Neural Network (LSTM RNN). Two models are presented and compared. ...
This paper presents an efficient approach for classifying speech signals as reverberant or not. Reverberation is a severe effect encountered in closed rooms. ...
Long Short-Term Memory Recurrent Neural Network: The deep RNN is widely used in speech processing for its ability to label sequences, meaning that each input sequence is assigned to a certain class. ...
doi:10.21608/mjeer.2020.103754
fatcat:mgei345mgrh53b6pz2qwdnxpka
Leveraging Product as an Activation Function in Deep Networks
[article]
2018
arXiv
pre-print
We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks. ...
We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network. ...
LSTM networks, in particular, have proven to be particularly powerful models for speech recognition [23] , language modeling [24] , text-to-speech synthesis [25] , and handwriting recognition and generation ...
arXiv:1810.08578v1
fatcat:rjk55htxnbbxxdqqrxa6gjzw7q
The artificial intelligence renaissance: deep learning and the road to human-Level machine intelligence
2018
APSIPA Transactions on Signal and Information Processing
A number of problems that were considered too challenging just a few years ago can now be solved convincingly by deep neural networks. ...
... to a matter of data collection and labeling, we believe that many insights learned from 'pre-Deep Learning' works still apply and will be more valuable than ever in guiding the design of novel neural network ...
ACKNOWLEDGEMENTS The first author would like to thank Irwin Sobel for pointers on the pioneering work at MIT, and Xiaonan Zhou for her work on many of the deep neural network results shown. ...
doi:10.1017/atsip.2018.6
fatcat:6iftrepekjdmjffcb5ouz42jke
On the quantization of recurrent neural networks
[article]
2021
arXiv
pre-print
In this work, we present an integer-only quantization strategy for Long Short-Term Memory (LSTM) neural network topologies, which themselves are the foundation of many production ML systems. ...
Integer quantization of neural networks can be defined as the approximation of the high precision computation of the canonical neural network formulation, using reduced integer precision. ...
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis. ...
arXiv:2101.05453v1
fatcat:zr5vqtgunjdsvgfgyjepjryl7e
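Integer quantization as described in this abstract is usually realized as an affine mapping q = round(x / scale) + zero_point, clipped to the integer range, with dequantization recovering x ≈ scale · (q − zero_point). A hedged sketch of symmetric int8 quantization (illustrative parameters, not the paper's exact scheme):

```python
import numpy as np

def quantize(x, scale, zero_point, bits=8):
    """Affine quantization: q = clip(round(x/scale) + zero_point)."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Approximate reconstruction: x ~= scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, -0.1, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zp = 1.0 / 127.0, 0        # symmetric int8 range for [-1, 1]
q = quantize(x, scale, zp)
x_hat = dequantize(q, scale, zp)
max_err = float(np.max(np.abs(x - x_hat)))
```

The reconstruction error is bounded by roughly half the scale, which is why the scale is normally chosen from the tensor's observed dynamic range.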
Background Noise Suppression in Audio File using LSTM Network
2022
International Journal for Research in Applied Science and Engineering Technology
As a result of the advent of deep neural networks, several novel ways for audio processing methods based on deep models have been presented. ...
The goal of the project is to use a stacked Dual signal Transformation LSTM Network (DTLN) to combine both analysis and synthesis into one model. ...
Long Short-Term Memory (LSTM) is the neural network that employs these gates. ...
doi:10.22214/ijraset.2022.44109
fatcat:snqgeixzzraudnais2z6hz6hki
An Optimal Feature Parameter Set Based on Gated Recurrent Unit Recurrent Neural Networks for Speech Segment Detection
2020
Applied Sciences
Speech segment detection based on gated recurrent unit (GRU) recurrent neural networks for the Kurdish language was investigated in the present study. ...
Identification of the phoneme boundaries using a GRU recurrent neural network was performed with six different classification algorithms for the C/V/S discrimination. ...
Gated Recurrent Unit Recurrent Neural Networks The gated recurrent unit (GRU) represents a kind of recurrent neural network. ...
doi:10.3390/app10041273
fatcat:rll6wnkklzcxxhgtx2h6337xfy
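For reference, the GRU used throughout this entry wraps a tanh candidate state in an update gate and a reset gate. A minimal NumPy sketch (gate sign conventions vary between references; all names here are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Wr, Wh, bz, br, bh):
    """One GRU step: update gate z, reset gate r, candidate h_tilde."""
    xh = np.concatenate([x, h_prev])
    z = sigmoid(Wz @ xh + bz)   # update gate: how much new state to take
    r = sigmoid(Wr @ xh + br)   # reset gate: how much history to expose
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h_prev]) + bh)
    return (1.0 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(3)
d_in, d_h = 3, 2
x, h = rng.standard_normal(d_in), np.zeros(d_h)
Ws = [rng.standard_normal((d_h, d_in + d_h)) for _ in range(3)]
h = gru_step(x, h, *Ws, np.zeros(d_h), np.zeros(d_h), np.zeros(d_h))
```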
Showing results 1 — 15 out of 3,719 results