A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2022; you can also visit the original URL.
The file type is application/pdf.
Image Captioning with Synergy-Gated Attention and Recurrent Fusion LSTM
2022
KSII Transactions on Internet and Information Systems
Long Short-Term Memory (LSTM) combined with attention mechanism is extensively used to generate semantic sentences of images in image captioning models. ...
First, the Synergy-Gated Attention (SGA) method is proposed, which can process the spatial features and the salient region features of given images simultaneously. ...
Introduction Image captioning is the task of generating a sentence from an image. The sentence should be fluent and semantically consistent with the image. ...
doi:10.3837/tiis.2022.10.010
fatcat:aqspiix37fcwbhcrr57tisd3ea
A Dual-Attention Learning Network with Word and Sentence Embedding for Medical Visual Question Answering
[article]
2022
arXiv
pre-print
In this study, a dual-attention learning network with word and sentence embedding (WSDAN) is proposed. ...
We design a module, transformer with sentence embedding (TSE), to extract a double embedding representation of questions containing keywords and medical information. ...
double embedding of words and sentences, but only uses image-guided attention. (4) WSDAN(NP), with only F(V,Q), has no pretrained model; it uses double embedding but only question-guided attention ...
arXiv:2210.00220v2
fatcat:sjxdelnvwzakleawzdf5tttm4q
ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity
[article]
2022
arXiv
pre-print
An intuitive way to search for images is to use queries composed of an example image and a complementary text. ...
While the first provides rich and implicit context for the search, the latter explicitly calls for new traits, or specifies how some elements of the example image should be changed to retrieve the desired ...
to produce a sentence-level feature. ...
arXiv:2203.08101v2
fatcat:uwqqwuwazvbl7ajowvdj4uknze
A Survey of Natural Language Generation
[article]
2021
arXiv
pre-print
NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; (c) highlight some future emphasis and relatively recent ...
research issues that arise due to the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text and computational creativity. ...
Then the sentence-level and word-level dynamic attentions are formulated as u⊤W₁h and h⊤W₂h, respectively (Eq. 43). Finally, the static and dynamic attentions are combined to reweight the article ...
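The snippet above describes bilinear attention scores: a sentence-level score between a query vector u and each word state h, and a word-level score between pairs of word states. A minimal NumPy sketch, with hypothetical shapes (the survey's exact symbols are truncated in this excerpt):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8   # hidden size (assumed)
n = 5   # number of word states (assumed)

u = rng.standard_normal(d)        # sentence-level query vector
H = rng.standard_normal((n, d))   # word hidden states h_1..h_n
W1 = rng.standard_normal((d, d))  # bilinear weights, sentence-level attention
W2 = rng.standard_normal((d, d))  # bilinear weights, word-level attention

# Sentence-level dynamic score for word i: u^T W1 h_i
sent_scores = H @ W1.T @ u          # shape (n,)
# Word-level dynamic score between words t and i: h_t^T W2 h_i
word_scores = H @ W2 @ H.T          # shape (n, n)

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

sent_attn = softmax(sent_scores)            # one weight per word
word_attn = softmax(word_scores, axis=-1)   # one distribution per query word
```

The resulting weights could then reweight the article representation, as the excerpt goes on to describe.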
arXiv:2112.11739v1
fatcat:ygrpp6f25ja4vfbhcr5ycfpxhy
Vision-and-Language Pretrained Models: A Survey
[article]
2022
arXiv
pre-print
In this paper, we present an overview of the major advances achieved in VLPMs for producing joint representations of vision and language. ...
As preliminaries, we briefly describe the general task definition and generic architecture of VLPMs. ...
such as visual regions/patches and textual words of the aligned image-text pairs. ...
arXiv:2204.07356v5
fatcat:uesrj6kfkffvzeycvx7hredl3e
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
[article]
2018
arXiv
pre-print
recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating ...
This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively ...
Acknowledgements We thank the four reviewers for their detailed and constructive comments. ...
arXiv:1703.09902v4
fatcat:owx2fgo2bjej3b27ve2f3ledoe
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation
2018
The Journal of Artificial Intelligence Research
topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar ...
challenges faced in other areas of NLP, with an emphasis on different evaluation methods and the relationships between them. ...
Acknowledgments We thank the four reviewers for their detailed and constructive comments. ...
doi:10.1613/jair.5477
fatcat:ycuteghjzncn7nx6pzkhzd6mn4
Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining
[article]
2021
arXiv
pre-print
Notably, Product1M contains over 1 million image-caption pairs and consists of two sample types, i.e., single-product and multi-product samples, which encompass a wide variety of cosmetics brands. ...
To promote the study of this challenging task, we contribute Product1M, one of the largest multi-modal cosmetic datasets for real-world instance-level retrieval. ...
For the 'CAPTURE-1Inst' model, we feed the whole image and an image-level bounding box, which is of the same size as the image, to CAPTURE for inference. ...
arXiv:2107.14572v2
fatcat:cemydi2wojbyvcmh44flggtjem
Message in a Bottle: An Advertising Campaign's Appropriation of Obama's Inclusive Rhetoric, and What This Reveals About National Identity
2011
Berkeley undergraduate journal
When Calvin Klein uses "we are one," "one for all" and "for all for ever" combined with the image of the unified group of young people walking arm in arm to sell perfume, these terms and image have moved ...
In the ck one ad, this contextual framing is completed with very few words: The caption at the top of the ad reads "we are one" and at the bottom reads "for all for ever." ...
doi:10.5070/b3242011670
fatcat:j6cwrxmnsjg4ziogrk72wpzxma
FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context
[article]
2022
arXiv
pre-print
of information in sketches and image captions, as well as the potential benefit of combining the two modalities. ...
Using our dataset, we study for the first time the problem of fine-grained image retrieval from freehand scene sketches and sketch captions. ...
[63] is one of the first popular works to use the attention mechanism with an LSTM for image captioning. ...
arXiv:2203.02113v3
fatcat:zpt353655vejzi7j3fpbp3jq5i
A Comprehensive Survey on Automatic Knowledge Graph Construction
[article]
2023
arXiv
pre-print
Thus, there is a demand for a systematic review of paradigms to organize knowledge structures beyond data-level mentions. ...
The survey concludes with discussions on the challenges and possible directions for future exploration. ...
the word-level attention, and Multi-level CNN [169] develops an input attention mechanism with attention-based pooling. ...
arXiv:2302.05019v1
fatcat:7in54wjwyzhfnkx755izrkzr3y
A Roadmap for Big Model
[article]
2022
arXiv
pre-print
At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. ...
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. ...
However, even if the social bias is eliminated at the word level, the sentence-level bias can still exist due to the imbalanced combination of words. ...
arXiv:2203.14101v4
fatcat:rdikzudoezak5b36cf6hhne5u4
Non-Contrastive Learning Meets Language-Image Pre-Training
[article]
2022
arXiv
pre-print
Nonetheless, the loose correlation between images and texts of web-crawled data renders the contrastive objective data inefficient and craving for a large training batch size. ...
The synergy between two objectives lets xCLIP enjoy the best of both worlds: superior performance in both zero-shot transfer and representation learning. ...
For each attention head from the last layer, we extract the attention map with [CLS] token as the query. ...
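The extraction step described in the snippet (taking, per head, the attention row whose query is the [CLS] token) can be sketched with a toy NumPy stand-in for the encoder's real last-layer attention tensor; the head count and the 4×4 patch grid are assumptions:

```python
import numpy as np

# Hypothetical last-layer attention tensor of a ViT-style encoder:
# shape (num_heads, seq_len, seq_len); row t holds the attention
# distribution of query token t over all tokens.
num_heads = 4
seq_len = 1 + 16   # [CLS] token plus a 4x4 grid of patch tokens
rng = np.random.default_rng(1)
logits = rng.standard_normal((num_heads, seq_len, seq_len))
attn = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)  # softmax rows

# For each head: take the [CLS] row (query = token 0), drop the [CLS]
# column, and reshape the remaining patch weights into a spatial map.
cls_maps = attn[:, 0, 1:].reshape(num_heads, 4, 4)
```

Each `cls_maps[h]` is then a per-head spatial map of how strongly [CLS] attends to each image patch.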
arXiv:2210.09304v1
fatcat:tkphueek3be2vas5ygplk2m2zq
Neural Approaches to Conversational AI
[article]
2019
arXiv
pre-print
For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still ...
The present paper surveys neural approaches to conversational AI that have been developed in the last few years. ...
al., 2015a), a sentence pair of different languages in machine translation (Gao et al., 2014a), and an image-text pair in image captioning (Fang et al., 2015), and so on. ...
arXiv:1809.08267v3
fatcat:j57xlm4ogferdnrpfs4f2jporq
Effective writing skills for Public Relations
2008
Zenodo
Writing good English must be one of the most difficult jobs in the world. The tracking of a developing language that is rich, diverse, and constantly evolving in use and meaning is not an easy task. ...
Today's rules and uses quickly become outdated, but this book captures English as it should be used now. ...
For the illustration 18.1, Diabetes UK is the copyright holder for the website and image. Chinese artist Feng Feng provided oriental influence for WPP's Acknowledgements. ...
doi:10.5281/zenodo.5345141
fatcat:hd4nud5fyrg7rk3l36nrzbwrgi
Showing results 1 — 15 out of 502 results