
Deep Learning for Embodied Vision Navigation: A Survey [article]

Fengda Zhu, Yi Zhu, Vincent CS Lee, Xiaodan Liang, Xiaojun Chang
2021 arXiv   pre-print
Building such an agent that observes, thinks, and acts is a key to real intelligence.  ...  The "embodied visual navigation" problem requires an agent to navigate in a 3D environment, relying mainly on its first-person observation.  ...  Studying NDH is fundamental for building a real-world dialog navigation robot.  ...
arXiv:2108.04097v4 fatcat:46p2p3zlivabbn7dvowkyccufe

A Survey of Embodied AI: From Simulators to Research Tasks [article]

Jiafei Duan, Samson Yu, Hui Li Tan, Hongyuan Zhu, Cheston Tan
2022 arXiv   pre-print
There has been an emerging paradigm shift from the era of "internet AI" to "embodied AI", where AI algorithms and agents no longer learn from datasets of images, videos or text curated primarily from the  ...  Lastly, this paper surveys the three main research tasks in embodied AI -- visual exploration, visual navigation and embodied question answering (QA), covering the state-of-the-art approaches, evaluation  ...  Acknowledgments This research is supported by the Agency for Science, Technology and Research (A*STAR), Singapore under its AME Programmatic Funding Scheme (Award #A18A2b0046) and the National Research  ... 
arXiv:2103.04918v8 fatcat:2zu4klcchbhnvmjej5ry3emu4u

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation [article]

Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu
2022 arXiv   pre-print
Pre-training has been adopted in a few recent works on Vision-and-Language Navigation (VLN).  ...  However, previous pre-training methods for VLN either lack the ability to predict future actions or ignore the trajectory contexts, which are essential for a greedy navigation process.  ...  Introduction Vision-and-Language Navigation (VLN) has received considerable attention in the computer vision, natural language processing, and robotics communities due to its great importance towards real-world  ...
arXiv:2203.11591v1 fatcat:hqptujj4yzdnblsatph2gqe5x4

Core Challenges in Embodied Vision-Language Planning [article]

Jonathan Francis, Nariaki Kitamura, Felix Labelle, Xiaopeng Lu, Ingrid Navarro, Jean Oh
2022 arXiv   pre-print
Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment  ...  In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language  ...  , and we thank the JAIR reviewers for their valuable feedback.  ... 
arXiv:2106.13948v4 fatcat:esrtfxpun5ae5kaydjnymf3v6u

Core Challenges in Embodied Vision-Language Planning

Jonathan Francis, Nariaki Kitamura, Felix Labelle, Xiaopeng Lu, Ingrid Navarro, Jean Oh
2022 The Journal of Artificial Intelligence Research  
Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment  ...  In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language  ...  , and we thank the JAIR reviewers for their valuable feedback.  ... 
doi:10.1613/jair.1.13646 fatcat:rmgy6whqefeuvpvte7ultyyjlq

Visually Grounded Language Learning: a review of language games, datasets, tasks, and models [article]

Alessandro Suglia and Ioannis Konstas and Oliver Lemon
2023 arXiv   pre-print
Overall, these represent key requirements for developing grounded meanings in neural models.  ...  In the literature, many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.  ...  • Real-world vision: is the model exposed to real-world images? • Natural Language: is the model exposed to Natural Language (i.e., English content generated by real users)?  ... 
arXiv:2312.02431v1 fatcat:l2nsv5dbibgpnjoooxncqfx3nq

Learning by Asking for Embodied Visual Navigation and Task Completion [article]

Ying Shen, Ismini Lourentzou
2023 arXiv   pre-print
We evaluate our model on the TEACH vision-dialog navigation and task completion dataset.  ...  The research community has shown increasing interest in designing intelligent embodied agents that can assist humans in accomplishing tasks.  ...  Acknowledgments This work is supported by the Amazon -Virginia Tech Initiative for Efficient and Robust Machine Learning.  ... 
arXiv:2302.04865v1 fatcat:ljjwvuzsbbf4vhakgl2bbom73m

Visually Grounded Language Learning: A Review of Language Games, Datasets, Tasks, and Models

Alessandro Suglia, Ioannis Konstas, Oliver Lemon
2024 The Journal of Artificial Intelligence Research  
Overall, these represent key requirements for developing grounded meanings in neural models.  ...  In the literature, many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.  ...  Acknowledgments We would like to thank Arash Eshghi and Raquel Fernandez for their feedback on a preliminary version of this manuscript which was part of the first author's PhD thesis (Suglia et al.,  ... 
doi:10.1613/jair.1.15185 fatcat:4l7lajl2abasznwbuv5hwgt3nm

Transcultural Competence Model: An Inclusive Path for Communication and Interaction

Sinela Jurkova
2021 Journal of Transcultural Communication  
The study also explores how educators and learners develop cognitive, emotional, and social qualities, engaging in dialog and critical reflection that informs our actions as the catalyst for positive social  ...  Embracing a transcultural perspective calls for the integration of new concepts and approaches in communication and education that promote active participation, adaptation, and interaction.  ...  Empowered with transcultural knowledge, a majority of participants see themselves as active agents for promoting learning to family members, colleagues, and their community as a path for navigating successfully  ...
doi:10.1515/jtc-2021-2008 fatcat:yl4yvfve4vdlfim4eetgdcqcua

Dialogic Cosmopolitanism and Global Justice

Eduard Jordaan
2009 International Studies Review  
Be that as it may, the aspiration of dialogic cosmopolitans to global communality is smoothed by the light definition of community as 'the act of inclusion in the moral world' (Shapcott, 2001:3). Dialogic  ...  In other words, for dialogic cosmopolitans, the boundaries of moral community extend beyond those of political community.  ...  people die every year from hunger and hunger-related diseases, according to the World Food Programme (2009). Such a definition leaves aside other possible requirements for inclusion in a community, such  ...
doi:10.1111/j.1468-2486.2009.00893.x fatcat:7pijieazsjemvnzoh2h22a4hfu

A Cognitively Motivated Route-Interface for Mobile Robot Navigation [chapter]

Mohammed Elmogy, Christopher Habel, Jianwei Zhang
2009 Cognitive Systems Monographs  
We present a spatial language to describe route-based navigation tasks for a mobile robot.  ...  This enables both sides to communicate by using shared knowledge. Spatial knowledge can be (re)presented in various ways to increase the interaction between humans and mobile robots.  ...  This motivates the research interest of using spatial language for interacting with artificial navigational agents.  ... 
doi:10.1007/978-3-642-10403-9_8 fatcat:jddkkuf2z5gdparcuk6hcsm5qi

Are nurses being heard? The power of Freirean dialogue to transform the nursing profession

Idevania Costa
2022 International Health Trends and Perspectives  
creating a lively and interactive learning environment to allow nurses to build their self-confidence to act and become change agents.  ...  Certainly, the COVID-19 pandemic, which exacerbated a nursing shortage and led to limited access to services and poor quality of care for all, underscores the urgency of the Freirean dialogic approach  ...  This kind of education diminishes students' creativity, self-confidence, and empowerment to become change agents in society.  ... 
doi:10.32920/ihtp.v2i2.1651 fatcat:rrmzq5exlvcstnsevvazaawhem

Recursive Visual Attention in Visual Dialog

Yulei Niu, Hanwang Zhang, Manli Zhang, Jianhong Zhang, Zhiwu Lu, Ji-Rong Wen
2019 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image.  ...  Specifically, our dialog agent browses the dialog history until the agent has sufficient confidence in the visual co-reference resolution, and refines the visual attention recursively.  ...  Acknowledgements This work was partially supported by National Natural Science Foundation of China (61573363 and 61832017), the Fundamental Research Funds for the Central Universities and the Research  ... 
doi:10.1109/cvpr.2019.00684 dblp:conf/cvpr/NiuZZZLW19 fatcat:oej7ojhdzfhurjzoaxdpoowyzy

Recursive Visual Attention in Visual Dialog [article]

Yulei Niu, Hanwang Zhang, Manli Zhang, Jianhong Zhang, Zhiwu Lu, Ji-Rong Wen
2019 arXiv   pre-print
Visual dialog is a challenging vision-language task, which requires the agent to answer multi-round questions about an image.  ...  Specifically, our dialog agent browses the dialog history until the agent has sufficient confidence in the visual co-reference resolution, and refines the visual attention recursively.  ...  and natural language are still far from being resolved, especially when the AI agent interacts with humans in continuous communication, such as vision-and-language navigation [4] and visual dialog  ...
arXiv:1812.02664v2 fatcat:uxtitvjyirehvail3bpaq7nc6e

Multimodal Conversational AI: A Survey of Datasets and Approaches [article]

Anirudh Sundar, Larry Heck
2022 arXiv   pre-print
As humans, we experience the world with all our senses or modalities (sound, sight, touch, smell, and taste).  ...  This paper motivates, defines, and mathematically formulates the multimodal conversational research objective.  ...  Vision-and-Dialog Navigation (Thomason et al., 2019) contains natural dialogues grounded in a simulated environment.  ... 
arXiv:2205.06907v1 fatcat:u6kehgeeq5aefdlvv5bpbwsvsa