Conversion of Sign Language To Text And Speech Using Machine Learning Techniques
2018
Journal of Review and Research in Sciences
Conversion of images to text as well as speech can be of great benefit to the non-hearing impaired and hearing impaired people (the deaf/mute) from circadian interaction with images. ...
In turn, the best match was converted to text as well as speech. Result: The introduced system achieved a 78% accuracy of unsupervised feature learning. ...
grouped together to form the desired Region. ...
Text and Speech synthesis of classified images Text-to-Speech (TTS) refers to the ability of computers to read text ...
doi:10.36108/jrrslasu/8102/50(0170)
fatcat:ug33zyddkvhhrad3j72koja5ym
Self-supervised Image-to-Text and Text-to-Image Synthesis
[chapter]
2021
Lecture Notes in Computer Science
In recent years, most of the works related to Text-to-Image synthesis and Image-to-Text generation, focused on supervised generative deep architectures to solve the problems, where very little interest ...
In this paper, we propose a novel self-supervised deep learning based approach towards learning the cross-modal embedding spaces; for both image to text and text to image generations. ...
We showed that our Text-to-Image and Image-to-Text synthesis networks learn to map the semantic space of one modality to the semantic space of the other modality in an unsupervised fashion. ...
doi:10.1007/978-3-030-92273-3_34
fatcat:qfps4l6hgngjpkdeb3cw623fti
Robust unsupervised segmentation of degraded document images with topic models
2009
2009 IEEE Conference on Computer Vision and Pattern Recognition
We take an analysis-by-synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. ...
Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. ...
Introduction We examine the problem of segmenting document images into text, whitespace, images, and figures through unsupervised learning methods. ...
doi:10.1109/cvpr.2009.5206606
dblp:conf/cvpr/BurnsC09
fatcat:svm7ayafhjbb5jodhmqvw4o74y
Robust unsupervised segmentation of degraded document images with topic models
2009
2009 IEEE Conference on Computer Vision and Pattern Recognition
We take an analysis-by-synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. ...
Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. ...
Introduction We examine the problem of segmenting document images into text, whitespace, images, and figures through unsupervised learning methods. ...
doi:10.1109/cvprw.2009.5206606
fatcat:fcbdvfraavgd5iut3sswrotox4
DiffusionSeg: Adapting Diffusion Towards Unsupervised Object Discovery
[article]
2023
arXiv
pre-print
To alleviate data insufficiency, we synthesize abundant images, and propose a novel training-free AttentionCut to obtain masks in the first synthesis stage. ...
In the second exploitation stage, to bridge the structural gap, we use the inversion technique, to map the given image back to diffusion features. ...
Extracting Diffusion Knowledge For diffusion models, they are fed with noise and text to output synthesis images; while for object discovery models, they are fed with images to output pixel-level masks ...
arXiv:2303.09813v1
fatcat:zjzdnp54fbg6ta22qglmwwhryu
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition
[article]
2022
arXiv
pre-print
An unsupervised text-to-speech synthesis (TTS) system learns to generate speech waveforms corresponding to any written sentence in a language by observing: 1) a collection of untranscribed speech waveforms ...
This paper proposes an unsupervised TTS system based on an alignment module that outputs pseudo-text and another synthesis module that uses pseudo-text for training and real text for inference. ...
We would like to thank one anonymous reviewer for insights on Sec 4.4. ...
arXiv:2203.15796v2
fatcat:5qvvvivz6belzdtxoxyfpoefsm
Real-Time Cerebral Vessel Segmentation in Laser Speckle Contrast Image Based on Unsupervised Domain Adaptation
2021
Frontiers in Neuroscience
Synthetic LSCI images were obtained with a synthesis network from LSCI images and public labeled dataset of Digital Retinal Images for Vessel Extraction, which were then used to train the segmentation ...
Using matching strategies to reduce the size discrepancy between retinal images and laser speckle contrast images, we could further significantly improve image synthesis and segmentation performance. ...
images based on unsupervised domain adaptation and image-to-image translation could be inspiring. ...
doi:10.3389/fnins.2021.755198
pmid:34916898
pmcid:PMC8669333
fatcat:qis23rct3nbjzfau6cu5y2yexi
SYRAC: Synthesize, Rank, and Count
[article]
2023
arXiv
pre-print
We report state-of-the-art results for unsupervised crowd counting. ...
To address this, we use latent diffusion models to create two types of synthetic data: one by removing pedestrians from real images, which generates ranked image pairs with a weak but reliable object quantity ...
We utilize SD to perform text-to-image synthesis, using text to guide the model towards generating synthetic images with a specific number of pedestrians. ...
arXiv:2310.01662v3
fatcat:4xe6mkbd45atzn3ululxso7wky
DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks (with Supplementary Materials)
[article]
2018
arXiv
pre-print
Unsupervised image translation, which aims in translating two independent sets of images, is challenging in discovering the correct correspondences without paired data. ...
To address the above issues, we propose a novel framework for instance-level image translation by Deep Attention GAN (DA-GAN). ...
Text to Image Synthesis We conduct qualitative and quantitative evaluation on the text-to-image synthesis task. ...
arXiv:1802.06454v1
fatcat:lwn3x3ymrzcibcvxk4wauosrai
The Learnable Typewriter: A Generative Approach to Text Line Analysis
[article]
2023
arXiv
pre-print
Our main idea is to build on unsupervised multi-object segmentation methods and in particular those that reconstruct images based on a limited amount of visual elements, called sprites. ...
We present a generative document-specific approach to character analysis and recognition in text lines. ...
Acknowledgments We would like to thank Malamatenia Vlachou and Dominique Stutzmann for sharing ideas, insights and data for applying our method in paleography; Vickie Ye and Dmitriy Smirnov for useful ...
arXiv:2302.01660v2
fatcat:47gdzbaffrdl7gfarkj2kclj6m
Self-Supervised Text Erasing with Controllable Image Synthesis
[article]
2022
arXiv
pre-print
We first design a style-aware image synthesis function to generate synthetic images with diverse styled texts based on two synthetic mechanisms. ...
To this end, we study an unsupervised scenario by proposing a novel Self-supervised Text Erasing (STE) framework that jointly learns to synthesize training images with erasure ground-truth and accurately ...
CONCLUSION In this paper, we propose a novel framework, named Self-supervised Text Erasing (STE), to learn the generated training image pairs in an unsupervised fashion for the text erasing task. ...
arXiv:2204.12743v1
fatcat:x7ejkurlcffmhc6ensdjod7xfi
DA-GAN: Instance-Level Image Translation by Deep Attention Generative Adversarial Networks
2018
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Unsupervised image translation, which aims in translating two independent sets of images, is challenging in discovering the correct correspondences without paired data. ...
To address the above issues, we propose a novel framework for instance-level image translation by Deep Attention GAN (DA-GAN). ...
Text to Image Synthesis We conduct qualitative and quantitative evaluation on the text-to-image synthesis task. ...
doi:10.1109/cvpr.2018.00593
dblp:conf/cvpr/MaFCM18
fatcat:fzgyadju7vdzxnhfkisk5xp7gu
Not All Errors are Equal: Learning Text Generation Metrics using Stratified Error Synthesis
[article]
2022
arXiv
pre-print
SESCORE outperforms all prior unsupervised metrics on multiple diverse NLG tasks including machine translation, image captioning, and WebNLG text generation. ...
This pipeline applies a series of plausible errors to raw text and assigns severity labels by simulating human judgements with entailment. ...
To adapt to the text domain of WebNLG and Image captioning, we generate 30k and 40k error synthetic sentences from the text portion of the WebNLG (Gardent et al., 2017b) and image captioning's training ...
arXiv:2210.05035v2
fatcat:n6khmrdr5vd33i67yjswzx6n2u
Dual Adversarial Inference for Text-to-Image Synthesis
[article]
2019
arXiv
pre-print
We achieve this by augmenting current text-to-image synthesis frameworks with a dual adversarial inference mechanism. ...
Through extensive experiments, we show that our model learns, in an unsupervised manner, style representations corresponding to certain meaningful information present in the image that are not well described ...
Related Work Text-to-image synthesis methods Text-to-image synthesis has been made possible by Reed et al. ...
arXiv:1908.05324v1
fatcat:xxcisuu6vvbgbhhvn323gyevuu
TUCH: Turning Cross-view Hashing into Single-view Hashing via Generative Adversarial Nets
2017
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Cross-view retrieval, which focuses on searching images as response to text queries or vice versa, has received increasing attention recently. ...
TUCH is a novel deep architecture that integrates a language model network T for text feature extraction, a generator network G to generate fake images from text feature and a hashing network H for learning ...
Conditional Generative Adversarial Networks for Text to Image Synthesis Conditional Generative Adversarial Networks have been used for text to image synthesis by small modification over GANs [Reed et ...
doi:10.24963/ijcai.2017/491
dblp:conf/ijcai/ZhaoDGHG17
fatcat:ajwn5xn4ejhqrpk7q3633eo5dm