3,719 Hits in 4.6 sec

Fast Gated Recurrent Network for Speech Synthesis

Bima PRIHASTO, Tzu-Chiang TAI, Pao-Chi CHANG, Jia-Ching WANG
2022 IEICE Transactions on Information and Systems
This research proposes a fast gated recurrent neural network, a fast RNN-based architecture for speech synthesis based on the minimal gated unit (MGU).  ...  The recurrent neural network (RNN) has been used in audio and speech processing, such as language translation and speech recognition.  ...  This paper is organized as follows: in Sect. 2, we give the background for speech synthesis and the MGU. In Sect. 3, we propose a fast gated recurrent network.  ...
doi:10.1587/transinf.2021edl8032 fatcat:doq5zmojj5hqtayjl4pdpmyu5e
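
Not the paper's code, but for context: the minimal gated unit it builds on collapses the LSTM's several gates into a single forget gate. A minimal numpy sketch of the standard MGU recurrence, with illustrative shapes:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mgu_step(x_t, h_prev, Wf, Uf, bf, Wh, Uh, bh):
    """One step of a minimal gated unit: a single forget gate
    replaces the separate gates of LSTM/GRU."""
    f = sigmoid(Wf @ x_t + Uf @ h_prev + bf)               # forget gate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (f * h_prev) + bh)   # candidate state
    return (1.0 - f) * h_prev + f * h_tilde                # new hidden state

# Illustrative shapes: 16-dim input, 32-dim hidden state.
rng = np.random.default_rng(0)
d_in, d_h = 16, 32
params = [rng.standard_normal(s) * 0.1 for s in
          [(d_h, d_in), (d_h, d_h), (d_h,), (d_h, d_in), (d_h, d_h), (d_h,)]]
h = np.zeros(d_h)
for x in rng.standard_normal((10, d_in)):   # a 10-frame sequence
    h = mgu_step(x, h, *params)
```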

Utterance-level Sequential Modeling For Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit [article]

Tomoki Koriyama, Hiroshi Saruwatari
2020 arXiv   pre-print
We adopt a simple recurrent unit (SRU) for the proposed model to achieve a recurrent architecture, in which we can execute fast speech parameter generation by using the highly parallel nature of SRU  ...  This paper presents a deep Gaussian process (DGP) model with a recurrent architecture for speech sequence modeling.  ...  On the other hand, NN-based speech synthesis studies have shown the effectiveness of utterance-level modeling using recurrent NNs (RNNs) and attention-based networks [7, 8].  ...
arXiv:2004.10823v1 fatcat:dikiv5fuk5ddxjmobcose2pvk4
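
The speed claim rests on SRU's structure: every matrix multiplication depends only on the inputs, so it can be batched across all timesteps, leaving only cheap elementwise updates sequential. A hedged numpy sketch of a simplified SRU (the full formulation adds highway and peephole terms omitted here):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sru_forward(X, W, Wf, bf, Wr, br):
    """Simplified SRU over a (T, d_in) sequence. The three matmuls
    below touch every timestep at once; only the elementwise cell
    update in the loop is sequential, which is what makes SRU fast."""
    U = X @ W.T                   # candidate inputs, all timesteps at once
    F = sigmoid(X @ Wf.T + bf)    # forget gates, in parallel
    R = sigmoid(X @ Wr.T + br)    # output gates, in parallel
    c = np.zeros(W.shape[0])
    H = np.empty_like(U)
    for t in range(X.shape[0]):              # lightweight sequential part
        c = F[t] * c + (1.0 - F[t]) * U[t]   # cell state
        H[t] = R[t] * np.tanh(c)             # hidden output
    return H

rng = np.random.default_rng(0)
T, d_in, d_h = 8, 16, 16
X = rng.standard_normal((T, d_in))
H = sru_forward(X,
                rng.standard_normal((d_h, d_in)) * 0.1,
                rng.standard_normal((d_h, d_in)) * 0.1, np.zeros(d_h),
                rng.standard_normal((d_h, d_in)) * 0.1, np.zeros(d_h))
```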

Grow and Prune Compact, Fast, and Accurate LSTMs [article]

Xiaoliang Dai, Hongxu Yin, Niraj K. Jha
2018 arXiv   pre-print
This learns both the weights and the compact architecture of H-LSTM control gates. We have GP-trained H-LSTMs for image captioning and speech recognition applications.  ...  Thus, GP-trained H-LSTMs can be seen to be compact, fast, and accurate.  ...  Introduction Recurrent neural networks (RNNs) have been ubiquitously employed for sequential data modeling because of their ability to carry information through recurrent cycles.  ... 
arXiv:1805.11797v2 fatcat:jyd2u6o2kbfwvdtybqied2oe4a
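
The paper's grow-and-prune (GP) training alternates connection growth with magnitude-based pruning; the sketch below shows only the generic pruning half, with an illustrative sparsity target, not the authors' actual procedure:

```python
import numpy as np

def magnitude_prune(W, sparsity):
    """Zero out the smallest-magnitude weights so that roughly a
    `sparsity` fraction of entries are removed, returning the pruned
    matrix and a mask that keeps them at zero during further training."""
    k = int(W.size * sparsity)
    if k == 0:
        return W, np.ones_like(W, dtype=bool)
    threshold = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    mask = np.abs(W) > threshold
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_pruned, mask = magnitude_prune(W, 0.9)   # keep ~10% of weights
# After each subsequent gradient step one would reapply: W = W * mask
```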

Analysis by Adversarial Synthesis — A Novel Approach for Speech Vocoding

Ahmed Mustafa, Arijit Biswas, Christian Bergler, Julia Schottenhamml, Andreas Maier
2019 Interspeech
In this work, we introduce a new methodology for neural speech vocoding based on generative adversarial networks (GANs).  ...  Classical parametric speech coding techniques provide a compact representation for speech signals.  ...  Generative Adversarial Networks (GANs) provide an alternative approach for very fast generation of realistic data samples [8].  ...
doi:10.21437/interspeech.2019-1195 dblp:conf/interspeech/MustafaBBSM19 fatcat:4yeskn5mwbeijkjnzmkd34m7ji
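
As background to the adversarial setup: a GAN vocoder trains a generator against a discriminator that scores real versus synthesized speech. A minimal sketch of the common least-squares GAN objectives (an assumption for illustration; the paper's exact losses and networks are not reproduced here):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    """Least-squares discriminator loss: push scores on real speech
    toward 1 and scores on generated speech toward 0."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    """Generator loss: make the discriminator score fakes as real."""
    return np.mean((d_fake - 1.0) ** 2)

# Illustrative discriminator scores on real/generated speech frames.
rng = np.random.default_rng(0)
d_real = rng.uniform(0.6, 1.0, 128)
d_fake = rng.uniform(0.0, 0.4, 128)
print(lsgan_d_loss(d_real, d_fake), lsgan_g_loss(d_fake))
```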

UFANS: U-shaped Fully-Parallel Acoustic Neural Structure For Statistical Parametric Speech Synthesis With 20X Faster [article]

Dabiao Ma, Zhiba Su, Yuhao Lu, Wenxuan Wang, Zhen Li
2018 arXiv   pre-print
Neural networks with Auto-regressive structures, such as Recurrent Neural Networks (RNNs), have become the most appealing structures for acoustic modeling of parametric text to speech synthesis (TTS) in  ...  In this paper, we propose a U-shaped Fully-parallel Acoustic Neural Structure (UFANS), which is a deconvolutional alternative of RNNs for Statistical Parametric Speech Synthesis (SPSS).  ...  Variants like Long Short-Term Memory (LSTM) [5], Gated Recurrent Unit (GRU) [6] and other RNN structures are now broadly used in text-to-speech [7] with very good records.  ...
arXiv:1811.12208v1 fatcat:vodixdxg7fagtahwitd2c34viu
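
For intuition about why a U-shaped structure is fully parallel: every frame flows through downsampling and upsampling paths with skip connections, with no recurrence across time. A toy numpy sketch where average pooling and nearest-neighbor upsampling stand in for the paper's learned (de)convolutions:

```python
import numpy as np

def unet_1d_sketch(x):
    """Toy U-shaped pass over a (T, d) feature sequence: downsample,
    bottleneck, upsample, and add skip connections. Every frame is
    handled in parallel; there is no sequential recurrence."""
    skip1 = x
    d1 = x.reshape(x.shape[0] // 2, 2, -1).mean(axis=1)    # stride-2 downsample
    skip2 = d1
    d2 = d1.reshape(d1.shape[0] // 2, 2, -1).mean(axis=1)  # bottleneck
    u1 = np.repeat(d2, 2, axis=0) + skip2                  # upsample + skip
    u0 = np.repeat(u1, 2, axis=0) + skip1
    return u0

T, d = 64, 8   # T must be divisible by 4 in this toy sketch
y = unet_1d_sketch(np.random.default_rng(0).standard_normal((T, d)))
assert y.shape == (T, d)
```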

A deep recurrent approach for acoustic-to-articulatory inversion

Peng Liu, Quanjie Yu, Zhiyong Wu, Shiyin Kang, Helen Meng, Lianhong Cai
2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Experimental results indicate that the recurrent model can produce more accurate predictions for acoustic-to-articulatory inversion than a deep neural network with a fixed-length context window.  ...  To solve the acoustic-to-articulatory inversion problem, this paper proposes a deep bidirectional long short-term memory recurrent neural network and a deep recurrent mixture density network.  ...  In speech synthesis, articulatory features can be incorporated into the traditional speech synthesis method to modify the characteristics of the synthesized speech [2].  ...
doi:10.1109/icassp.2015.7178812 dblp:conf/icassp/LiuYWKMC15 fatcat:vwbdhyjeofezjhijwlf2fc4zmi
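
A mixture density network head, as used in the paper's deep recurrent MDN, outputs mixture weights, means, and variances and is trained by negative log-likelihood. A hedged sketch with diagonal Gaussians and illustrative shapes:

```python
import numpy as np

def mdn_nll(pi_logits, mu, log_sigma, y):
    """Negative log-likelihood of a target y under a mixture of K
    diagonal Gaussians. pi_logits: (K,), mu/log_sigma: (K, D), y: (D,)."""
    log_pi = pi_logits - np.log(np.sum(np.exp(pi_logits)))  # log softmax
    var = np.exp(2.0 * log_sigma)
    comp = -0.5 * np.sum((y - mu) ** 2 / var + 2.0 * log_sigma
                         + np.log(2.0 * np.pi), axis=1)     # per-component log N
    return -np.log(np.sum(np.exp(log_pi + comp)))           # -logsumexp

rng = np.random.default_rng(0)
K, D = 4, 12   # 4 mixture components, 12 articulatory dimensions
nll = mdn_nll(rng.standard_normal(K), rng.standard_normal((K, D)),
              np.zeros((K, D)), rng.standard_normal(D))
```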

Noise and acoustic modeling with waveform generator in text-to-speech and neutral speech conversion

Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh
2020 Multimedia Tools and Applications
gated recurrent unit, and hybrid model).  ...  This article focuses on developing a system for high-quality synthesized and converted speech by addressing three fundamental principles.  ...  Consequently, the second goal of this paper is to build a deep learning-based acoustic model for speech synthesis using feedforward and recurrent neural network as an alternative to HMMs.  ... 
doi:10.1007/s11042-020-09783-9 fatcat:5we3ryq6arb4xdxiblymuqwqlu
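
As a minimal illustration of the feedforward acoustic-model alternative to HMMs mentioned above: a per-frame mapping from linguistic features to vocoder parameters. The shapes and two-layer structure are assumptions for illustration, not the paper's configuration:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def ff_acoustic_model(ling, W1, b1, W2, b2):
    """Toy feedforward acoustic model: map per-frame linguistic
    features to vocoder parameters frame by frame, with no
    recurrence across frames."""
    return relu(ling @ W1.T + b1) @ W2.T + b2

rng = np.random.default_rng(0)
T, d_ling, d_hid, d_ac = 50, 40, 64, 61   # illustrative shapes
params = (rng.standard_normal((d_hid, d_ling)) * 0.1, np.zeros(d_hid),
          rng.standard_normal((d_ac, d_hid)) * 0.1, np.zeros(d_ac))
acoustic = ff_acoustic_model(rng.standard_normal((T, d_ling)), *params)
```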

Google Duplex - A Big Leap in the Evolution of Artificial Intelligence

Parth Patel, Pratik Kanani
2021 International Journal of Computer Applications  
that is fed into a recurrent neural network.  ...  [3, 7] Gated Activation and Residual Units: In the non-linearity part of the network structure, Oord et al. applied a gated activation unit similar to the activation in LSTM.  ...
doi:10.5120/ijca2021921019 fatcat:f5e4do6kczfjnpsf4ooocyyvri
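
The gated activation unit described in the snippet (from WaveNet) multiplies a tanh filter by a sigmoid gate and adds a residual connection. A hedged numpy sketch with illustrative projection matrices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_unit(x, Wf, Wg, Wr):
    """WaveNet-style gated activation: z = tanh(Wf x) * sigmoid(Wg x),
    followed by a 1x1 projection and a residual connection."""
    z = np.tanh(x @ Wf.T) * sigmoid(x @ Wg.T)   # filter * gate
    return x + z @ Wr.T                         # residual output

rng = np.random.default_rng(0)
d = 16
x = rng.standard_normal((100, d))               # 100 timesteps
y = gated_residual_unit(x, *(rng.standard_normal((d, d)) * 0.1
                             for _ in range(3)))
```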

Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion [article]

Narjes Bozorg, Michael T. Johnson
2020 arXiv   pre-print
The proposed system uses the WaveNet speech synthesis architecture, with dilated causal convolutional layers using previous values of the predicted articulatory trajectories conditioned on acoustic features  ...  This paper presents Articulatory-WaveNet, a new approach for acoustic-to-articulatory inversion.  ...  Fast WaveNet caches previously computed information from the overlapping network states, called recurrent states, to eliminate redundant convolutions.  ...
arXiv:2006.12594v1 fatcat:y3xq5czyhjbkvhr4ilbqfwhztu
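
Dilated causal convolution, the core of the WaveNet architecture referenced above, makes each output depend only on past samples while the dilation widens the receptive field. A minimal numpy sketch (a toy 2-tap filter; real models stack many such layers with doubling dilation):

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """Causal 1-D convolution: output[t] depends only on x[t],
    x[t - dilation], x[t - 2*dilation], ... (left-padded with zeros)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[i] * xp[t + pad - i * dilation] for i in range(k))
                     for t in range(len(x))])

x = np.arange(16, dtype=float)
y = dilated_causal_conv(x, np.array([0.5, 0.5]), dilation=4)
# y[t] = 0.5*x[t] + 0.5*x[t-4]; e.g. y[8] == 0.5*8 + 0.5*4 == 6.0
```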

Quality Evaluation of Reverberant Speech Based on Deep Learning

Samia Abd El-Moneim, Mahmoud Saied, M. A. Nassar, Moawad I. Dessouky, N. Ismail, Adel Saleeb, Adel S. El-Fishawy, Fathi E. Abd El-Samie
2020 Menoufia Journal of Electronic Engineering Research  
Spectrogram and MFCC features are used for classification with a Long Short-Term Memory Recurrent Neural Network (LSTM RNN). Two models are presented and compared.  ...  This paper presents an efficient approach for classification of speech signals as reverberant or not. Reverberation is a severe effect encountered in closed rooms.  ...  Long Short-Term Memory Recurrent Neural Network: Deep RNN has wide use in speech processing for its ability to label sequences, meaning that each input sequence is assigned to a certain class.  ...
doi:10.21608/mjeer.2020.103754 fatcat:mgei345mgrh53b6pz2qwdnxpka
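
A hedged sketch of the classification pattern the abstract describes: run a recurrent cell over per-frame features and read the decision off the final hidden state. A plain tanh-RNN step stands in for the LSTM cell here; all names and shapes are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify_sequence(features, rnn_step, readout_w, readout_b, d_h):
    """Run a recurrent cell over a (T, d) feature sequence (MFCCs or
    spectrogram frames) and classify from the final hidden state:
    p > 0.5 -> reverberant, else clean."""
    h = np.zeros(d_h)
    for f in features:
        h = rnn_step(f, h)
    return sigmoid(readout_w @ h + readout_b)

rng = np.random.default_rng(0)
d_in, d_h = 13, 32                      # e.g. 13 MFCCs per frame
Wx = rng.standard_normal((d_h, d_in)) * 0.1
Wh = rng.standard_normal((d_h, d_h)) * 0.1
step = lambda f, h: np.tanh(Wx @ f + Wh @ h)
p = classify_sequence(rng.standard_normal((80, d_in)), step,
                      rng.standard_normal(d_h) * 0.1, 0.0, d_h)
```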

Leveraging Product as an Activation Function in Deep Networks [article]

Luke B. Godfrey, Michael S. Gashler
2018 arXiv   pre-print
We demonstrate that WPUNNs can also generalize gated units in recurrent neural networks, yielding results comparable to LSTM networks.  ...  We present windowed product unit neural networks (WPUNNs), a simple method of leveraging product as a nonlinearity in a neural network.  ...  LSTM networks, in particular, have proven to be particularly powerful models for speech recognition [23], language modeling [24], text-to-speech synthesis [25], and handwriting recognition and generation  ...
arXiv:1810.08578v1 fatcat:rjk55htxnbbxxdqqrxa6gjzw7q
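
One plausible reading of the windowed product idea, sketched below with illustrative windowing rather than the paper's exact formulation: the product over a small window of inputs serves as the nonlinearity. Since LSTM gating is itself a multiplication of a gate with a signal, product units can emulate it:

```python
import numpy as np

def windowed_product_layer(x, window=3):
    """Toy windowed product unit: slide a window over the input
    vector and emit the product of each window's entries, so
    multiplication itself acts as the nonlinearity."""
    T = len(x) - window + 1
    return np.array([np.prod(x[i:i + window]) for i in range(T)])

x = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
print(windowed_product_layer(x))   # [1.0, 3.0, 2.25]
```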

The artificial intelligence renaissance: deep learning and the road to human-level machine intelligence

Kar-Han Tan, Boon Pang Lim
2018 APSIPA Transactions on Signal and Information Processing  
A number of problems that were considered too challenging just a few years ago can now be solved convincingly by deep neural networks.  ...  to a matter of data collection and labeling, we believe that many insights learned from 'pre-Deep Learning' works still apply and will be more valuable than ever in guiding the design of novel neural network  ...  ACKNOWLEDGEMENTS The first author would like to thank Irwin Sobel for pointers on the pioneering work at MIT, and Xiaonan Zhou for her work on many of the deep neural network results shown.  ... 
doi:10.1017/atsip.2018.6 fatcat:6iftrepekjdmjffcb5ouz42jke

On the quantization of recurrent neural networks [article]

Jian Li, Raziel Alvarez
2021 arXiv   pre-print
In this work, we present an integer-only quantization strategy for Long Short-Term Memory (LSTM) neural network topologies, which themselves are the foundation of many production ML systems.  ...  Integer quantization of neural networks can be defined as the approximation of the high precision computation of the canonical neural network formulation, using reduced integer precision.  ...  Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis.  ... 
arXiv:2101.05453v1 fatcat:zr5vqtgunjdsvgfgyjepjryl7e
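
The basic building block such integer-only strategies rest on is affine quantization of tensors; the sketch below shows symmetric per-tensor int8 quantization only, whereas the paper's full pipeline also covers activations and the LSTM's elementwise ops:

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization to int8: pick a scale so the
    max magnitude maps to 127, then round and clip."""
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 32)).astype(np.float32)
q, s = quantize_int8(W)
err = np.max(np.abs(W - dequantize(q, s)))   # bounded by ~scale/2
```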

Background Noise Suppression in Audio File using LSTM Network

W. Shivani Patnaik
2022 International Journal for Research in Applied Science and Engineering Technology  
As a result of the advent of deep neural networks, several novel audio processing methods based on deep models have been presented.  ...  The goal of the project is to use a stacked Dual-Signal Transformation LSTM Network (DTLN) to combine both analysis and synthesis into one model.  ...  Long Short-Term Memory (LSTM) is the neural network that employs these gates.  ...
doi:10.22214/ijraset.2022.44109 fatcat:snqgeixzzraudnais2z6hz6hki
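
For reference, the gates the snippet alludes to are the standard LSTM input, forget, and output gates. A textbook numpy sketch of one LSTM step (not the DTLN model itself):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W maps [x_t; h_prev] to the four stacked
    pre-activations: input, forget, output, and candidate paths."""
    z = W @ np.concatenate([x_t, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g        # cell state: gated memory update
    h = o * np.tanh(c)            # hidden state: gated read-out
    return h, c

rng = np.random.default_rng(0)
d_in, d_h = 16, 32
W = rng.standard_normal((4 * d_h, d_in + d_h)) * 0.1
h = c = np.zeros(d_h)
for x in rng.standard_normal((10, d_in)):
    h, c = lstm_step(x, h, c, W, np.zeros(4 * d_h))
```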

An Optimal Feature Parameter Set Based on Gated Recurrent Unit Recurrent Neural Networks for Speech Segment Detection

Özlem Batur Dinler, Nizamettin Aydın
2020 Applied Sciences  
Speech segment detection based on gated recurrent unit (GRU) recurrent neural networks for the Kurdish language was investigated in the present study.  ...  Identification of the phoneme boundaries using a GRU recurrent neural network was performed with six different classification algorithms for the C/V/S discrimination.  ...  Gated Recurrent Unit Recurrent Neural Networks The gated recurrent unit (GRU) represents a kind of recurrent neural network.  ... 
doi:10.3390/app10041273 fatcat:rll6wnkklzcxxhgtx2h6337xfy
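
A hedged numpy sketch of the GRU recurrence underlying the entry above, with illustrative shapes (13 coefficients per frame is an assumption, not the paper's feature set):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: the update gate z interpolates between the old
    state and a candidate computed from a reset-gated old state."""
    z = sigmoid(Wz @ x_t + Uz @ h_prev)              # update gate
    r = sigmoid(Wr @ x_t + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x_t + Uh @ (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde

rng = np.random.default_rng(0)
d_in, d_h = 13, 24   # e.g. 13 feature coefficients per frame
Ws = [rng.standard_normal(s) * 0.1
      for s in [(d_h, d_in), (d_h, d_h)] * 3]
h = np.zeros(d_h)
for x in rng.standard_normal((20, d_in)):   # 20 speech frames
    h = gru_step(x, h, *Ws)
```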
Showing results 1 — 15 out of 3,719 results