A Proposal for Common Dataset in Neural-Symbolic Reasoning Studies
2016
International Workshop on Neural-Symbolic Learning and Reasoning
We promote and analyze the need for a common publicly available benchmark dataset to be used for neural-symbolic studies of learning and reasoning. ...
Along with the original tasks that were suggested by the Visual Genome creators, we propose neural-symbolic tasks that can be used as challenges to promote research in the field and competition between ...
We would like to thank the reviewers for detailed and very beneficial comments on the paper. ...
dblp:conf/nesy/YilmazGS16
fatcat:qf3grff5nbbjdbszeihh5ugghy
A Neural-Symbolic Approach to Design of CAPTCHA
[article]
2018
arXiv
pre-print
To develop image/visual-captioning-based CAPTCHAs, this paper proposes a new image captioning architecture by exploiting tensor product representations (TPR), a structured neural-symbolic framework developed ...
To address this, this paper promotes image/visual captioning based CAPTCHAs, which are robust against machine-learning-based attacks. ...
And as a by-product, the symbolic character of TPRs makes them amenable to conceptual interpretation in a way that standard learned neural network representations are not. ...
arXiv:1710.11475v2
fatcat:yiknq6uarbe23nmgwbzbq2qzgy
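The core mechanism behind tensor product representations (TPRs) mentioned in this entry is binding "filler" vectors (symbols) to "role" vectors (positions) via outer products, summing the bindings, and later unbinding with a role vector. A minimal numpy sketch of that idea, with invented vectors and names (the paper's captioning architecture learns these representations end-to-end):

```python
import numpy as np

# Minimal TPR sketch: bind fillers (symbols) to roles (positions) via
# outer products, sum into one tensor, then unbind with a role vector.
rng = np.random.default_rng(0)
d = 8

# Orthonormal role vectors (via QR) so unbinding is exact.
roles = np.linalg.qr(rng.normal(size=(d, d)))[0][:, :2]
fillers = {"cat": rng.normal(size=d), "mat": rng.normal(size=d)}

# Bind: T = sum_i filler_i (outer) role_i
T = np.outer(fillers["cat"], roles[:, 0]) + np.outer(fillers["mat"], roles[:, 1])

# Unbind position 0: T @ role_0 recovers the "cat" filler, because
# role_0 . role_0 = 1 and role_1 . role_0 = 0.
recovered = T @ roles[:, 0]
assert np.allclose(recovered, fillers["cat"])
```

With non-orthogonal roles, unbinding is only approximate, which is why structured role bases matter in TPR-style models.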
Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
2020
Neural Information Processing Systems
In this paper, we propose to tackle this challenge by employing neural factor graphs to induce a tighter coupling between concepts in different modalities (e.g. images and text). ...
Our model first creates a multimodal graph, processes it with a graph neural network to induce a factor correspondence matrix, and then outputs a symbolic program to predict answers to questions. ...
Several neural architectures have shown great promise in learning multimodal representations to solve the task [42, 39, 54] . ...
dblp:conf/nips/SaqurN20
fatcat:xqqettxn2reixbcnvg7ldthlfm
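The last stage this entry describes, outputting a symbolic program to predict answers, amounts to deterministic execution of a predicted program over a structured scene. A hypothetical sketch (the scene format, operator names, and program below are invented for illustration, not the paper's actual interface):

```python
# Toy interpreter for the "symbolic program" stage of a neural-symbolic
# VQA pipeline: a neural module predicts the program; answering is then
# deterministic execution over structured scene objects.
scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "red"},
    {"shape": "cube", "color": "blue"},
]

OPS = {
    "filter_color": lambda objs, arg: [o for o in objs if o["color"] == arg],
    "filter_shape": lambda objs, arg: [o for o in objs if o["shape"] == arg],
    "count": lambda objs, arg: len(objs),
}

def execute(program, objs):
    """Run a linear program of (op, arg) steps over the object list."""
    out = objs
    for op, arg in program:
        out = OPS[op](out, arg)
    return out

# "How many red things are there?"
answer = execute([("filter_color", "red"), ("count", None)], scene)
# answer == 2
```

Compositional generalization then reduces to whether the program predictor composes known operators correctly on novel questions.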
Like a Baby: Visually Situated Neural Language Acquisition
[article]
2019
arXiv
pre-print
Thus, language models perform better when they learn like a baby, i.e., in a multi-modal environment. ...
We examine the benefits of visual context in training neural language models to perform next-word prediction. ...
While image-captioning generally focuses on ranking appropriate caption candidates, we intend to use the model to generate sentences using only the image for guidance. ...
arXiv:1805.11546v2
fatcat:m7cbjby4pfcr5af6gll2wglizu
What is not where: the challenge of integrating spatial representations into deep learning architectures
[article]
2018
arXiv
pre-print
This paper examines to what degree current deep learning architectures for image caption generation capture spatial language. ...
Although language models provide useful knowledge for image captions, we argue that deep learning image captioning architectures should also model geometric relations between objects. ...
The research of Dobnik was supported by a grant from the Swedish Research Council (VR project 2014-39) for the establishment of the Centre for Linguistic Theory and Studies in Probability (CLASP) at Department ...
arXiv:1807.08133v1
fatcat:ea34qml7jzbz5peenpwc4bowri
Neuro-Symbolic Learning: Principles and Applications in Ophthalmology
[article]
2022
arXiv
pre-print
Thus, the neuro-symbolic learning (NeSyL) notion emerged, which incorporates aspects of symbolic representation and brings common sense into neural networks. ...
Attempts have been made to overcome the challenges in neural network computing by representing and embedding domain knowledge in terms of symbolic representations. ...
Fig. 8: The combination of neural learning and symbolic learning. The neural network extracts features, which are passed to a symbolic unit for reasoning and inference. ...
arXiv:2208.00374v1
fatcat:pktmnomj3bbwpjyj7lmu37rl7i
Scene Graph based Image Retrieval – A case study on the CLEVR Dataset
[article]
2019
arXiv
pre-print
Motivated by this, we propose a neural-symbolic approach for a one-shot retrieval of images from a large scale catalog, given the caption description. ...
With the proliferation of multimodal interaction in various domains, recently there has been much interest in text based image retrieval in the computer vision community. ...
In this work, we propose a neural symbolic approach for modeling a caption based image retrieval task. ...
arXiv:1911.00850v1
fatcat:j5lktzr65ffbna5x77uu5bnksy
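A common way to realize the caption-based retrieval this entry describes is to extract (subject, relation, object) triplets from the caption and score each catalog image by how many of those triplets its scene graph contains. A simplified sketch under that assumption (the triplets and catalog below are invented; the paper builds its graphs from CLEVR):

```python
# Score catalog images by overlap between caption triplets and each
# image's scene-graph triplets; retrieve the best-scoring image.
def score(caption_triplets, image_triplets):
    """Fraction of caption triplets present in the image's scene graph."""
    if not caption_triplets:
        return 0.0
    hits = sum(1 for t in caption_triplets if t in image_triplets)
    return hits / len(caption_triplets)

catalog = {
    "img1": {("cube", "left_of", "sphere"), ("sphere", "same_color", "cylinder")},
    "img2": {("cube", "behind", "sphere")},
}
query = {("cube", "left_of", "sphere")}

best = max(catalog, key=lambda k: score(query, catalog[k]))
# best == "img1"
```

Exact set matching is brittle; neural-symbolic variants typically soften it with learned embeddings over the triplets.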
Interpretable Detection of Out-of-Context Misinformation with Neural-Symbolic-Enhanced Large Multimodal Model
[article]
2024
arXiv
pre-print
The proposed model first symbolically disassembles the text-modality information to a set of fact queries based on the Abstract Meaning Representation of the caption and then forwards the query-image pairs ...
contents (e.g. mismatched images and captions) to deceive the public and fake news detection systems. ...
Neural-Symbolic Multi-Modal Learning: Existing Neural-Symbolic Multi-Modal Learning methods are usually designed for Vision Question Answering (Yi et al., 2018; Zhu et al., 2022) . ...
arXiv:2304.07633v2
fatcat:26xeo2q5knfxphfe27qfqi2kmq
Keyword Generation for Biomedical Image Retrieval with Recurrent Neural Networks
2017
Conference and Labs of the Evaluation Forum
The images are visually represented using a Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) Show-and-Tell model is adopted for image caption ...
The aim of this presented work is the generation of image keywords, which can be substituted as text representation for classifications tasks and image retrieval purposes. ...
Using image-caption pairs of 164,614 biomedical figures, distributed for training at the ImageCLEF Caption Prediction Task, long short-term memory based Recurrent Neural Network models were trained. ...
dblp:conf/clef/PelkaF17
fatcat:227flkcnwbca5cygipztjvpjn4
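In the Show-and-Tell setup this entry adopts, keyword generation is a greedy decoding loop: a CNN image feature conditions a recurrent step that emits one token per step until an end token. A toy sketch where a hand-written stub stands in for the trained CNN+LSTM (vocabulary and scores are invented):

```python
# Toy greedy decoder in the spirit of Show-and-Tell keyword generation.
VOCAB = ["<end>", "xray", "chest", "fracture"]

def step(image_feat, prev_token, t):
    """Stub for one LSTM step: returns scores over VOCAB (illustrative)."""
    # Pretend the image suggests "chest", then "xray", then stop.
    schedule = [
        [0.0, 0.2, 0.8, 0.0],
        [0.1, 0.8, 0.1, 0.0],
        [0.9, 0.05, 0.05, 0.0],
    ]
    return schedule[min(t, len(schedule) - 1)]

def greedy_keywords(image_feat, max_len=5):
    """Emit the highest-scoring token each step until <end> or max_len."""
    tokens, prev = [], None
    for t in range(max_len):
        scores = step(image_feat, prev, t)
        tok = VOCAB[scores.index(max(scores))]
        if tok == "<end>":
            break
        tokens.append(tok)
        prev = tok
    return tokens

# greedy_keywords(None) -> ["chest", "xray"]
```

In the real model, `step` would run the LSTM cell over learned embeddings; only the decoding loop is faithful here.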
TPsgtR: Neural-Symbolic Tensor Product Scene-Graph-Triplet Representation for Image Captioning
[article]
2019
arXiv
pre-print
These neural-symbolic representations help to better define the neural-symbolic space for neuro-symbolic attention and can be transformed into better captions. ...
In this work, we have introduced a novel technique for caption generation using the neural-symbolic encoding of the scene-graphs, derived from regional visual information of the images and we call it Tensor ...
For the image captioning application, TPsgtR helps in providing several pieces of discrete interaction information through the required graphical-layer-based representation interface, and these can be used as neuro-symbolic ...
arXiv:1911.10115v1
fatcat:jbbw4g2msjcsnf4jfukzmiqtkm
Generating Text from Images in a Smooth Representation Space
2018
Conference and Labs of the Evaluation Forum
Instead of generating textual sequences directly from images, we first learn a smooth, continuous representation space for the captions. ...
A methodology is described for the generation of relevant captions for images of an extensive medical dataset in the ImageCLEF 2018 Caption Prediction competition. ...
Generating captions from images is also a task that requires an understanding of data representations in neural networks. ...
dblp:conf/clef/SpinksM18
fatcat:wiponkzgbzd6vkc4kfh3zu5dmm
KANDINSKYPatterns – An experimental exploration environment for Pattern Analysis and Machine Intelligence
[article]
2021
arXiv
pre-print
This was experimentally proven by Hubel & Wiesel in the 1960s and became the basis for machine learning approaches such as the Neocognitron and the even later Deep Learning. ...
There is still a significant gap between machine-level pattern recognition and human-level concept learning. ...
ACKNOWLEDGEMENTS This work has received funding by the Austrian Science Fund (FWF), Project: P-32554 "A reference model for explainable Artificial Intelligence in the medical domain". ...
arXiv:2103.00519v1
fatcat:d57pwgzf4vhmpaa5hqwm7ls5zq
Unifying Neural Learning and Symbolic Reasoning for Spinal Medical Report Generation
[article]
2020
arXiv
pre-print
In this paper, we propose the neural-symbolic learning (NSL) framework that performs human-like learning by unifying deep neural learning and symbolic logical reasoning for the spinal medical report generation ...
Generally speaking, the NSL framework firstly employs deep neural learning to imitate human visual perception for detecting abnormalities of target spinal structures. ...
Combining neural learning and symbolic reasoning for medical report generation is appropriate and novel. ...
arXiv:2004.13577v1
fatcat:oh5aka5zr5be3ipd7qqnyhikzy
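The two-stage pattern this entry describes, neural perception followed by symbolic reasoning, can be caricatured as a detector that labels spinal structures plus rules that turn labels into report sentences. A hypothetical sketch (the findings, rules, and wording are invented for illustration):

```python
# Sketch of "neural detection -> symbolic rules -> report": the detector
# output is stubbed; rules map each finding to a report sentence.
findings = {"L4-L5": "disc_bulge", "L5-S1": "normal"}  # stub detector output

RULES = {
    "disc_bulge": "Disc bulge observed at {loc}.",
    "normal": "No abnormality at {loc}.",
}

report = " ".join(
    RULES[label].format(loc=loc) for loc, label in findings.items()
)
# report == "Disc bulge observed at L4-L5. No abnormality at L5-S1."
```

The symbolic stage makes the report auditable: each sentence traces back to one detected finding and one named rule.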
Multimodal Machine Learning: Integrating Language, Vision and Speech
2017
Proceedings of ACL 2017, Tutorial Abstracts
With the initial research on audio-visual speech recognition and more recently with language & vision projects such as image and video captioning and visual question answering, this research field brings ...
some unique challenges for multimodal researchers given the heterogeneity of the data and the contingency often found between modalities. ...
- Multimodal applications: image captioning, video description, AVSR
- Core technical challenges: representation learning, translation, alignment, fusion and co-learning
...
doi:10.18653/v1/p17-5002
dblp:conf/acl/MorencyB17
fatcat:m24h75t6mvdyfeedrsjbvjjaom
Describing Semantic Representations of Brain Activity Evoked by Visual Stimuli
[article]
2018
arXiv
pre-print
To apply brain activity to the image-captioning network, we train regression models that learn the relationship between brain activity and deep-layer image features. ...
To effectively use a small amount of available brain activity data, our proposed method employs a pre-trained image-captioning network model using a deep learning framework. ...
Interestingly, a proper caption was generated using brain activity data with the three-layer neural network model compared to the image-captioning model for the second training example. ...
arXiv:1802.02210v1
fatcat:qfbevp2mfjcevo64vaq22kz3nm