Deep Learning for Embodied Vision Navigation: A Survey
[article]
2021
arXiv
pre-print
Building an agent that observes, thinks, and acts is key to real intelligence. ...
The "embodied visual navigation" problem requires an agent to navigate in a 3D environment, relying mainly on its first-person observation. ...
Studying NDH is fundamental for building a real-world dialog navigation robot. ...
arXiv:2108.04097v4
fatcat:46p2p3zlivabbn7dvowkyccufe
A Survey of Embodied AI: From Simulators to Research Tasks
[article]
2022
arXiv
pre-print
There has been an emerging paradigm shift from the era of "internet AI" to "embodied AI", where AI algorithms and agents no longer learn from datasets of images, videos or text curated primarily from the ...
Lastly, this paper surveys the three main research tasks in embodied AI -- visual exploration, visual navigation and embodied question answering (QA), covering the state-of-the-art approaches, evaluation ...
Acknowledgments This research is supported by the Agency for Science, Technology and Research (A*STAR), Singapore under its AME Programmatic Funding Scheme (Award #A18A2b0046) and the National Research ...
arXiv:2103.04918v8
fatcat:2zu4klcchbhnvmjej5ry3emu4u
HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
[article]
2022
arXiv
pre-print
Pre-training has been adopted in a few recent works for Vision-and-Language Navigation (VLN). ...
However, previous pre-training methods for VLN either lack the ability to predict future actions or ignore the trajectory contexts, which are essential for a greedy navigation process. ...
Introduction Vision-and-Language Navigation (VLN) has received broad attention in the computer vision, natural language processing, and robotics communities due to its great importance towards real-world ...
arXiv:2203.11591v1
fatcat:hqptujj4yzdnblsatph2gqe5x4
Core Challenges in Embodied Vision-Language Planning
[article]
2022
arXiv
pre-print
Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment ...
In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language ...
... and we thank the JAIR reviewers for their valuable feedback. ...
arXiv:2106.13948v4
fatcat:esrtfxpun5ae5kaydjnymf3v6u
Core Challenges in Embodied Vision-Language Planning
2022
The Journal of Artificial Intelligence Research
Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment ...
In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language ...
... and we thank the JAIR reviewers for their valuable feedback. ...
doi:10.1613/jair.1.13646
fatcat:rmgy6whqefeuvpvte7ultyyjlq
Visually Grounded Language Learning: a review of language games, datasets, tasks, and models
[article]
2023
arXiv
pre-print
Overall, these represent key requirements for developing grounded meanings in neural models. ...
In the literature, many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality. ...
• Real-world vision: is the model exposed to real-world images?
• Natural Language: is the model exposed to Natural Language (i.e., English content generated by real users)? ...
arXiv:2312.02431v1
fatcat:l2nsv5dbibgpnjoooxncqfx3nq
Learning by Asking for Embodied Visual Navigation and Task Completion
[article]
2023
arXiv
pre-print
We evaluate our model on the TEACH vision-dialog navigation and task completion dataset. ...
The research community has shown increasing interest in designing intelligent embodied agents that can assist humans in accomplishing tasks. ...
Acknowledgments This work is supported by the Amazon-Virginia Tech Initiative for Efficient and Robust Machine Learning. ...
arXiv:2302.04865v1
fatcat:ljjwvuzsbbf4vhakgl2bbom73m
Visually Grounded Language Learning: A Review of Language Games, Datasets, Tasks, and Models
2024
The Journal of Artificial Intelligence Research
Overall, these represent key requirements for developing grounded meanings in neural models. ...
In the literature, many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality. ...
Acknowledgments We would like to thank Arash Eshghi and Raquel Fernandez for their feedback on a preliminary version of this manuscript which was part of the first author's PhD thesis (Suglia et al., ...
doi:10.1613/jair.1.15185
fatcat:4l7lajl2abasznwbuv5hwgt3nm
Transcultural Competence Model: An Inclusive Path for Communication and Interaction
2021
Journal of Transcultural Communication
The study also explores how educators and learners develop cognitive, emotional, and social qualities, engaging in dialog and critical reflection that informs our actions as the catalyst for positive social ...
Embracing transculturalism perspective calls for integration of new concepts and approaches in communication and education that promote active participation, adaptation, and interaction. ...
Empowered with transcultural knowledge, a majority of participants see themselves as active agents for promoting learning to family members, colleagues, and their community as a path for navigating successfully ...
doi:10.1515/jtc-2021-2008
fatcat:yl4yvfve4vdlfim4eetgdcqcua
Dialogic Cosmopolitanism and Global Justice
2009
International Studies Review
Be that as it may, the aspiration of dialogic cosmopolitans to global communality is smoothed by the light definition of community as 'the act of inclusion in the moral world' (Shapcott, 2001:3). Dialogic ...
In other words, for dialogic cosmopolitans, the boundaries of moral community extend beyond those of political community. ...
... people die every year from hunger and hunger-related diseases, according to the World Food Programme (2009). Such a definition leaves aside other possible requirements for inclusion in a community, such ...
doi:10.1111/j.1468-2486.2009.00893.x
fatcat:7pijieazsjemvnzoh2h22a4hfu
A Cognitively Motivated Route-Interface for Mobile Robot Navigation
[chapter]
2009
Cognitive Systems Monographs
We present a spatial language to describe route-based navigation tasks for a mobile robot. ...
This enables both sides to communicate by using shared knowledge. Spatial knowledge can be (re)presented in various ways to increase the interaction between humans and mobile robots. ...
This motivates the research interest of using spatial language for interacting with artificial navigational agents. ...
doi:10.1007/978-3-642-10403-9_8
fatcat:jddkkuf2z5gdparcuk6hcsm5qi
Are nurses being heard? The power of Freirean dialogue to transform the nursing profession
2022
International Health Trends and Perspectives
... creating a lively and interactive learning environment to allow nurses to build their self-confidence to act and become change agents. ...
Certainly, the COVID-19 pandemic, which exacerbated a nursing shortage and led to limited access to services and poor quality of care for all, underscores the urgency of the Freirean dialogic approach ...
This kind of education diminishes students' creativity, self-confidence, and empowerment to become change agents in society. ...
doi:10.32920/ihtp.v2i2.1651
fatcat:rrmzq5exlvcstnsevvazaawhem
Recursive Visual Attention in Visual Dialog
2019
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image. ...
Specifically, our dialog agent browses the dialog history until the agent has sufficient confidence in the visual co-reference resolution, and refines the visual attention recursively. ...
Acknowledgements This work was partially supported by National Natural Science Foundation of China (61573363 and 61832017), the Fundamental Research Funds for the Central Universities and the Research ...
doi:10.1109/cvpr.2019.00684
dblp:conf/cvpr/NiuZZZLW19
fatcat:oej7ojhdzfhurjzoaxdpoowyzy
Recursive Visual Attention in Visual Dialog
[article]
2019
arXiv
pre-print
Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image. ...
Specifically, our dialog agent browses the dialog history until the agent has sufficient confidence in the visual co-reference resolution, and refines the visual attention recursively. ...
... and natural language are still far from being resolved, especially when the AI agent interacts with humans in continuous communication, such as vision-and-language navigation [4] and visual dialog ...
arXiv:1812.02664v2
fatcat:uxtitvjyirehvail3bpaq7nc6e
Multimodal Conversational AI: A Survey of Datasets and Approaches
[article]
2022
arXiv
pre-print
As humans, we experience the world with all our senses or modalities (sound, sight, touch, smell, and taste). ...
This paper motivates, defines, and mathematically formulates the multimodal conversational research objective. ...
Vision-and-Dialog Navigation (Thomason et al., 2019) contains natural dialogues grounded in a simulated environment. ...
arXiv:2205.06907v1
fatcat:u6kehgeeq5aefdlvv5bpbwsvsa