
Conversion of Sign Language To Text And Speech Using Machine Learning Techniques

Victoria Adewale, Adejoke Olamiti
2018 Journal of Review and Research in Sciences  
Conversion of images to text as well as speech can be of great benefit to the non-hearing impaired and hearing impaired people (the deaf/mute) from circadian interaction with images.  ...  In turn, the best match was converted to text as well as speech. Result: The introduced system achieved a 78% accuracy of unsupervised feature learning.  ...  grouped together to form the desired Region.  ...  Text and Speech synthesis of classified images Text-to-Speech (TTS) refers to the ability of computers to read text  ... 
doi:10.36108/jrrslasu/8102/50(0170) fatcat:ug33zyddkvhhrad3j72koja5ym

Self-supervised Image-to-Text and Text-to-Image Synthesis [chapter]

Anindya Sundar Das, Sriparna Saha
2021 Lecture Notes in Computer Science  
In recent years, most of the works related to Text-to-Image synthesis and Image-to-Text generation, focused on supervised generative deep architectures to solve the problems, where very little interest  ...  In this paper, we propose a novel self-supervised deep learning based approach towards learning the cross-modal embedding spaces; for both image to text and text to image generations.  ...  We showed that our Text-to-Image and Image-to-Text synthesis networks learn to map the semantic space of one modality to the semantic space of the other modality in an unsupervised fashion.  ... 
doi:10.1007/978-3-030-92273-3_34 fatcat:qfps4l6hgngjpkdeb3cw623fti

Robust unsupervised segmentation of degraded document images with topic models

Timothy J. Burns, Jason J. Corso
2009 2009 IEEE Conference on Computer Vision and Pattern Recognition  
We take an analysis-by-synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set.  ...  Our model automatically discovers different regions present in a document image in a completely unsupervised fashion.  ...  Introduction We examine the problem of segmenting document images into text, whitespace, images, and figures through unsupervised learning methods.  ... 
doi:10.1109/cvpr.2009.5206606 dblp:conf/cvpr/BurnsC09 fatcat:svm7ayafhjbb5jodhmqvw4o74y

Robust unsupervised segmentation of degraded document images with topic models

T.J. Burns, J.J. Corso
2009 2009 IEEE Conference on Computer Vision and Pattern Recognition  
We take an analysis-by-synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set.  ...  Our model automatically discovers different regions present in a document image in a completely unsupervised fashion.  ...  Introduction We examine the problem of segmenting document images into text, whitespace, images, and figures through unsupervised learning methods.  ... 
doi:10.1109/cvprw.2009.5206606 fatcat:fcbdvfraavgd5iut3sswrotox4

DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery [article]

Chaofan Ma, Yuhuan Yang, Chen Ju, Fei Zhang, Jinxiang Liu, Yu Wang, Ya Zhang, Yanfeng Wang
2023 arXiv   pre-print
To alleviate data insufficiency, we synthesize abundant images, and propose a novel training-free AttentionCut to obtain masks in the first synthesis stage.  ...  In the second exploitation stage, to bridge the structural gap, we use the inversion technique, to map the given image back to diffusion features.  ...  Extracting Diffusion Knowledge For diffusion models, they are fed with noise and text to output synthesis images; while for object discovery models, they are fed with images to output pixel-level masks  ... 
arXiv:2303.09813v1 fatcat:zjzdnp54fbg6ta22qglmwwhryu

Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition [article]

Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson
2022 arXiv   pre-print
An unsupervised text-to-speech synthesis (TTS) system learns to generate speech waveforms corresponding to any written sentence in a language by observing: 1) a collection of untranscribed speech waveforms  ...  This paper proposes an unsupervised TTS system based on an alignment module that outputs pseudo-text and another synthesis module that uses pseudo-text for training and real text for inference.  ...  We would like to thank one anonymous reviewer for insights on Sec 4.4.  ... 
arXiv:2203.15796v2 fatcat:5qvvvivz6belzdtxoxyfpoefsm

Real-Time Cerebral Vessel Segmentation in Laser Speckle Contrast Image Based on Unsupervised Domain Adaptation

Heping Chen, Yan Shi, Bin Bo, Denghui Zhao, Peng Miao, Shanbao Tong, Chunliang Wang
2021 Frontiers in Neuroscience  
Synthetic LSCI images were obtained with a synthesis network from LSCI images and public labeled dataset of Digital Retinal Images for Vessel Extraction, which were then used to train the segmentation  ...  Using matching strategies to reduce the size discrepancy between retinal images and laser speckle contrast images, we could further significantly improve image synthesis and segmentation performance.  ...  images based on unsupervised domain adaptation and image-to-image translation could be inspiring.  ... 
doi:10.3389/fnins.2021.755198 pmid:34916898 pmcid:PMC8669333 fatcat:qis23rct3nbjzfau6cu5y2yexi

SYRAC: Synthesize, Rank, and Count [article]

Adriano D'Alessandro, Ali Mahdavi-Amiri, Ghassan Hamarneh
2023 arXiv   pre-print
We report state-of-the-art results for unsupervised crowd counting.  ...  To address this, we use latent diffusion models to create two types of synthetic data: one by removing pedestrians from real images, which generates ranked image pairs with a weak but reliable object quantity  ...  We utilize SD to perform text-to-image synthesis, using text to guide the model towards generating synthetic images with a specific number of pedestrians.  ... 
arXiv:2310.01662v3 fatcat:4xe6mkbd45atzn3ululxso7wky

DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks (with Supplementary Materials) [article]

Shuang Ma, Jianlong Fu, Chang Wen Chen, Tao Mei
2018 arXiv   pre-print
Unsupervised image translation, which aims in translating two independent sets of images, is challenging in discovering the correct correspondences without paired data.  ...  To address the above issues, we propose a novel framework for instance-level image translation by Deep Attention GAN (DA-GAN).  ...  Text to Image Synthesis We conduct qualitative and quantitative evaluation on the text-to-image synthesis task.  ... 
arXiv:1802.06454v1 fatcat:lwn3x3ymrzcibcvxk4wauosrai

The Learnable Typewriter: A Generative Approach to Text Line Analysis [article]

Ioannis Siglidis, Nicolas Gonthier, Julien Gaubil, Tom Monnier, Mathieu Aubry
2023 arXiv   pre-print
Our main idea is to build on unsupervised multi-object segmentation methods and in particular those that reconstruct images based on a limited amount of visual elements, called sprites.  ...  We present a generative document-specific approach to character analysis and recognition in text lines.  ...  Acknowledgments We would like to thank Malamatenia Vlachou and Dominique Stutzmann for sharing ideas, insights and data for applying our method in paleography; Vickie Ye and Dmitriy Smirnov for useful  ... 
arXiv:2302.01660v2 fatcat:47gdzbaffrdl7gfarkj2kclj6m

Self-Supervised Text Erasing with Controllable Image Synthesis [article]

Gangwei Jiang, Shiyao Wang, Tiezheng Ge, Yuning Jiang, Ying Wei, Defu Lian
2022 arXiv   pre-print
We first design a style-aware image synthesis function to generate synthetic images with diverse styled texts based on two synthetic mechanisms.  ...  To this end, we study an unsupervised scenario by proposing a novel Self-supervised Text Erasing (STE) framework that jointly learns to synthesize training images with erasure ground-truth and accurately  ...  CONCLUSION In this paper, we propose a novel framework, named Self-supervised Text Erasing (STE), to learn the generated training image pairs in an unsupervised fashion for the text erasing task.  ... 
arXiv:2204.12743v1 fatcat:x7ejkurlcffmhc6ensdjod7xfi

DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks

Shuang Ma, Jianlong Fu, Chang Wen Chen, Tao Mei
2018 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition  
Unsupervised image translation, which aims in translating two independent sets of images, is challenging in discovering the correct correspondences without paired data.  ...  To address the above issues, we propose a novel framework for instance-level image translation by Deep Attention GAN (DA-GAN).  ...  Text to Image Synthesis We conduct qualitative and quantitative evaluation on the text-to-image synthesis task.  ... 
doi:10.1109/cvpr.2018.00593 dblp:conf/cvpr/MaFCM18 fatcat:fzgyadju7vdzxnhfkisk5xp7gu

Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis [article]

Wenda Xu, Yilin Tuan, Yujie Lu, Michael Saxon, Lei Li, William Yang Wang
2022 arXiv   pre-print
SESCORE outperforms all prior unsupervised metrics on multiple diverse NLG tasks including machine translation, image captioning, and WebNLG text generation.  ...  This pipeline applies a series of plausible errors to raw text and assigns severity labels by simulating human judgements with entailment.  ...  To adopt to the text domain of WebNLG and Image captioning, we generate 30k and 40k error synthetic sentences from the text portion of the WebNLG (Gardent et al., 2017b) and image captioning's training  ... 
arXiv:2210.05035v2 fatcat:n6khmrdr5vd33i67yjswzx6n2u

Dual Adversarial Inference for Text-to-Image Synthesis [article]

Qicheng Lao, Mohammad Havaei, Ahmad Pesaranghader, Francis Dutil, Lisa Di Jorio, Thomas Fevens
2019 arXiv   pre-print
We achieve this by augmenting current text-to-image synthesis frameworks with a dual adversarial inference mechanism.  ...  Through extensive experiments, we show that our model learns, in an unsupervised manner, style representations corresponding to certain meaningful information present in the image that are not well described  ...  Related Work Text-to-image synthesis methods Text-to-image synthesis has been made possible by Reed et al.  ... 
arXiv:1908.05324v1 fatcat:xxcisuu6vvbgbhhvn323gyevuu

TUCH: Turning Cross-view Hashing into Single-view Hashing via Generative Adversarial Nets

Xin Zhao, Guiguang Ding, Yuchen Guo, Jungong Han, Yue Gao
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
Cross-view retrieval, which focuses on searching images as response to text queries or vice versa, has received increasing attention recently.  ...  TUCH is a novel deep architecture that integrates a language model network T for text feature extraction, a generator network G to generate fake images from text feature and a hashing network H for learning  ...  Conditional Generative Adversarial Networks for Text to Image Synthesis Conditional Generative Adversarial Networks have been used for text to image synthesis by small modification over GANs [Reed et  ... 
doi:10.24963/ijcai.2017/491 dblp:conf/ijcai/ZhaoDGHG17 fatcat:ajwn5xn4ejhqrpk7q3633eo5dm