Unsupervised Image Captioning
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
natural sentences to facilitate the unsupervised image captioning scenario. ...
In this paper, we make the first attempt to train an image captioning model in an unsupervised manner. ...
Unsupervised Image Captioning. Unsupervised image captioning relies on a set of images I = {I_1, ..., I_{N_i}}, a set of sentences Ŝ = {Ŝ_1, ... ...
doi:10.1109/cvpr.2019.00425
dblp:conf/cvpr/Feng00L19a
fatcat:sxtjh2o3svdrnaj4isa4rgw2sm
Unsupervised Image Captioning
[article]
2019
arXiv
pre-print
natural sentences to facilitate the unsupervised image captioning scenario. ...
In this paper, we make the first attempt to train an image captioning model in an unsupervised manner. ...
Unsupervised Image Captioning. Unsupervised image captioning relies on a set of images I = {I_1, ..., I_{N_i}}, a set of sentences Ŝ = {Ŝ_1, ... ...
arXiv:1811.10787v2
fatcat:odrroqn2cnfpxjroupyo5ggxde
Object-Centric Unsupervised Image Captioning
[article]
2022
arXiv
pre-print
In this paper, we explore the task of unsupervised image captioning which utilizes unpaired images and texts to train the model so that the texts can come from different sources than the images. ...
Image captioning is a longstanding problem in the field of computer vision and natural language processing. ...
than English, and really speaks to the importance of making advances in unsupervised image caption. ...
arXiv:2112.00969v2
fatcat:zoqnbuwxwve4jotqmgmfkqkcim
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
[article]
2021
arXiv
pre-print
Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and ...
The focus of the previous work was on the alignment of input images and pseudo-captions at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. ...
That is, every image has completely matched captions in pseudo-captions, which is not the case in unsupervised image captioning. ...
arXiv:2104.13872v2
fatcat:gh4fxuhdljb3bozcd6sa3d6nwi
UNISON: Unpaired Cross-lingual Image Captioning
[article]
2022
arXiv
pre-print
The traditional paradigm of image captioning relies on paired image-caption datasets to train the model in a supervised manner. ...
Image captioning has emerged as an interesting research field in recent years due to its broad application scenarios. ...
Since the vocabulary in the collected MT corpus is quite different from the vocabulary in image caption datasets, we filter the sentences in MT datasets according to an existing caption-style dictionary ...
arXiv:2010.01288v3
fatcat:byutcnokejfrbm4cdcluyjjn7a
Removing Partial Mismatches in Unsupervised Image Captioning
2022
Transactions of the Japanese Society for Artificial Intelligence
Unsupervised image captioning is a task to describe images without the supervision of image-sentence pairs. ...
They focused on aligning the pseudo-captions with input images at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. ...
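*6 The Conceptual Captions data used in [Laina 19] is text collected from the web and then filtered, for example by removing sentences that contain low-frequency words and replacing proper nouns with their hypernyms. The Conceptual Captions vocabulary used for training contained 15,412 words, and ⟨unk⟩ (the special token for unknown words) accounted for about 0.3% of the tokens in the sentences. ...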
doi:10.1527/tjsai.37-2_h-l82
fatcat:5i2indknbvdkbjzem37nqz6oyu
Improving Generalization of Image Captioning with Unsupervised Prompt Learning
[article]
2023
arXiv
pre-print
In this paper, we propose an unsupervised prompt learning method to improve Generalization of Image Captioning (GeneIC), which learns a domain-specific prompt vector for the target domain without requiring ...
Pretrained visual-language models have demonstrated impressive zero-shot abilities in image captioning, when accompanied by hand-crafted prompts. ...
Therefore, we propose an unsupervised prompt learning method to improve generalization of image captioning. ...
arXiv:2308.02862v1
fatcat:vyzcwpuhm5clnphdvau3vy4cpy
Towards Unsupervised Image Captioning with Shared Multimodal Embeddings
[article]
2019
arXiv
pre-print
In this paper, we address image captioning by generating language descriptions of scenes without learning from annotated pairs of images and their captions. ...
Our approach allows to exploit large text corpora outside the annotated distributions of image/caption data. ...
In this work we explore unsupervised captioning, where image and language sources are independent. ...
arXiv:1908.09317v1
fatcat:nzdwgs22cjd4xbeaz5zu4npnfq
Recurrent Relational Memory Network for Unsupervised Image Captioning
[article]
2020
arXiv
pre-print
Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where the existing arts usually adopt GAN (Generative Adversarial Networks) models. ...
R^2M encodes visual context through unsupervised training on images, while enabling the memory to learn from irrelevant textual corpus via supervised fashion. ...
We perform unsupervised captioning through mess occurrences of common visual concepts in disjoint images and sentences. ...
arXiv:2006.13611v1
fatcat:w5uwfq6tknevzin2zoxj5gpgy4
Recurrent Relational Memory Network for Unsupervised Image Captioning
2020
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence
Unsupervised image captioning with no annotations is an emerging challenge in computer vision, where the existing arts usually adopt GAN (Generative Adversarial Networks) models. ...
R2M encodes visual context through unsupervised training on images, while enabling the memory to learn from irrelevant textual corpus via supervised fashion. ...
We perform unsupervised captioning through mess occurrences of common visual concepts in disjoint images and sentences. ...
doi:10.24963/ijcai.2020/128
dblp:conf/ijcai/GuoWSW20
fatcat:ttv3qb2kw5fyjatm3fknc25uk4
Triple Sequence Generative Adversarial Nets for Unsupervised Image Captioning
2021
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Labelling image-sentence is expensive and some unsupervised image captioning methods show promising results on caption generation. ...
In the experiments, we use a large number of unpaired images and sentences to train our model on the unsupervised and unpaired setting. ...
[7] propose three objectives to train the unsupervised image captioning model without any labelled image-sentence pairs. ...
doi:10.1109/icassp39728.2021.9414335
fatcat:76phn3ilwnhmtasabefmtxxzpy
Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions
[article]
2021
arXiv
pre-print
Inspired by unsupervised machine translation, we investigate if a strong V&L representation model can be learned through unsupervised pre-training without image-caption corpora. ...
However, existing models require a large amount of parallel image-caption data for pre-training. Such data are costly to collect and require cumbersome curation. ...
More specifically, we use 3M images from CC and 1M captions from SBU captions (Ordonez et al., 2011). ...
arXiv:2010.12831v2
fatcat:ftyzelmc35dg3fwckci4kh5we4
Semantic-Enhanced Cross-Modal Fusion for Improved Unsupervised Image Captioning
2023
Electronics
Unsupervised image captioning often grapples with challenges such as image–text mismatches and modality gaps, resulting in suboptimal captions. ...
The findings not only contribute to the advancement of image captioning techniques but also open avenues for future research. ...
[3] presented an unsupervised image description method utilizing generative adversarial networks (GAN) to generate diverse image captions. ...
doi:10.3390/electronics12173549
fatcat:r2tfelzxxjbrhe32hpu65wcblm
Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning
2021
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
unpublished
Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and ...
The focus of the previous work was on the alignment of input images and pseudo-captions at the sentence level. However, pseudo-captions contain many words that are irrelevant to a given image. ...
That is, every image has completely matched captions in pseudo-captions, which is not the case in unsupervised image captioning. ...
doi:10.18653/v1/2021.eacl-main.323
fatcat:onivmomwgjaqla4mmdynipi5hi
Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions
[article]
2020
Inspired by unsupervised machine translation, we investigate if a strong V&L representation model can be learned through unsupervised pre-training without image-caption corpora. ...
However, existing models require a large amount of parallel image-caption data for pre-training. Such data are costly to collect and require cumbersome curation. ...
2018) and unsupervised image captioning (Feng et al., 2019). ...
doi:10.48550/arxiv.2010.12831
fatcat:jrc37cyvl5auxiqsizgu7r3gam