1,465 Hits in 7.1 sec

Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [article]

Yang Jiao, Zequn Jie, Shaoxiang Chen, Lechao Cheng, Jingjing Chen, Lin Ma, Yu-Gang Jiang
2023 arXiv   pre-print
Besides, a self-boosting learning strategy is further proposed to encourage the model to place more emphasis on challenging objects in computation-expensive temporal stereo matching.  ...  First, a category-specific structural priors mining approach is proposed for enhancing the efficacy of monocular depth generation.  ...  Structural Priors Mining approach (SPM) and Self-Boosting Learning strategy (SBL), respectively.  ... 
arXiv:2312.08004v1 fatcat:x7z4cwjnrzexvdxerwovuzeg6i

3D Object Detection for Autonomous Driving: A Review and New Outlooks [article]

Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
2022 arXiv   pre-print
Second, we conduct a comprehensive survey of the progress in 3D object detection from the aspects of models and sensory inputs, including LiDAR-based, camera-based, and multi-modal detection approaches  ...  This paper reviews the advances in 3D object detection for autonomous driving. First, we introduce the background of 3D object detection and discuss the challenges in this task.  ...  (Section 7.3), and self-supervised learning (Section 7.4) for 3D object detection.  ... 
arXiv:2206.09474v1 fatcat:3skws77uqngjtpo6mycpo4dhny

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

Wu Liu, Tao Mei
2022 ACM Computing Surveys  
Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications.  ...  Furthermore, we analyze the solutions for challenging cases, such as the lack of data, the inherent ambiguity between 2D and 3D, and the complex multi-person scenarios.  ...  parts, structure-aware 91.9 Yang et al. [224] ICCV'17 Hourglass 256 × 256 Pyramid residual module to learn various scales 92.0 Ke et al. [82] ECCV'18 Hourglass 256 × 256 Multi-scale supervision, structure-aware  ... 
doi:10.1145/3524497 fatcat:4pbvntngrnfp7lqhcpjmy7p2fq

2021 Index IEEE Transactions on Image Processing Vol. 30

2021 IEEE Transactions on Image Processing  
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination.  ...  The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages.  ...  ., +, TIP 2021 5613-5625 Salient Object Detection With Purificatory Mechanism and Structural Similarity Loss.  ... 
doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu

Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies [article]

Yu Huang, Yue Chen
2020 arXiv   pre-print
Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task  ...  This is a survey of autonomous driving technologies with deep learning methods.  ...  The depth fusion is also solved efficiently with multi-task learning, with supervising signals as stereo, motion, pose, normal, segmentation and attention etc. 3) 3-D Object Detection Similarly, object  ... 
arXiv:2006.06091v3 fatcat:nhdgivmtrzcarp463xzqvnxlwq

2021 Index IEEE Transactions on Multimedia Vol. 23

2021 IEEE transactions on multimedia  
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination.  ...  The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages.  ...  ., +, TMM 2021 2471-2480 Temporal Locality-Aware Network With Dual Structure for Accurate and Fast Action Detection.  ... 
doi:10.1109/tmm.2022.3141947 fatcat:lil2nf3vd5ehbfgtslulu7y3lq

2020 Index IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 42

2021 IEEE Transactions on Pattern Analysis and Machine Intelligence  
., and Nishino, K., Recognizing Material Properties from Images; 1981-1995 Sebe, N., see Pilzer, A., 2380-2395 Seddik, M., see Tamaazousti, Y., 2212-2224 Shah, M., see Kalayeh, M.M., TPAMI June 2020  ...  ., +, TPAMI Nov. 2020 2874-2886 Learning (artificial intelligence) 3D Human Pose Machines with Self-Supervised Learning.  ...  ., +, TPAMI July 2020 1582-1593 Direction-Aware Spatial Context Features for Shadow Detection and Removal. Hu, X., +, TPAMI Nov. 2020 2795-2808 Feature Boosting Network For 3D Pose Estimation.  ... 
doi:10.1109/tpami.2020.3036557 fatcat:3j6s2l53x5eqxnlsptsgbjeebe

Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [article]

Wu Liu, Qian Bao, Yu Sun, Tao Mei
2021 arXiv   pre-print
2D and 3D, and the complex multi-person scenarios.  ...  Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications.  ...  fusion, pose refinement, multi-task learning, and efficiency-aware design.  ... 
arXiv:2104.11536v1 fatcat:tdag2jq2vjdrjekwukm5nu7l6a

Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [article]

Zhenyu Li, Zehui Chen, Ang Li, Liangji Fang, Qinhong Jiang, Xianming Liu, Junjun Jiang
2022 arXiv   pre-print
Monocular 3D object detection (Mono3D) has achieved unprecedented success with the advent of deep learning techniques and emerging large-scale autonomous driving datasets.  ...  STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.  ...  M3D-RPN [2] implements a single-stage multi-class detector with a region proposal network and depth-aware convolution.  ... 
arXiv:2204.11590v2 fatcat:oho5oi4go5fxtavavfkqeoiqme

2020 Index IEEE Transactions on Image Processing Vol. 29

2020 IEEE Transactions on Image Processing  
., +, TIP 2020 617-627 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features.  ...  ., +, TIP 2020 8107-8119 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features.  ... 
doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

Deep Learning for Omnidirectional Vision: A Survey and New Perspectives [article]

Hao Ai, Zidong Cao, Jinjing Zhu, Haotian Bai, Yucheng Chen, Lin Wang
2022 arXiv   pre-print
with the 2D planar image data; (ii) A structural and hierarchical taxonomy of the DL methods for omnidirectional vision; (iii) A summarization of the latest novel learning strategies and applications;  ...  In recent years, the availability of customer-level 360 cameras has made omnidirectional vision more popular, and the advance of deep learning (DL) has significantly sparked its research and applications  ...  For instance, [213] introduces a multi-modal perception framework based on the visual and LiDAR information for 3D object detection and tracking.  ... 
arXiv:2205.10468v2 fatcat:73fks33oafa6zgxliccydvdbeq

3DLG-Detector: 3D Object Detection via Simultaneous Local-Global Feature Learning [article]

Baian Chen, Liangliang Nan, Haoran Xie, Dening Lu, Fu Lee Wang, Mingqiang Wei
2022 arXiv   pre-print
Capturing both local and global features of irregular point clouds is essential to 3D object detection (3OD).  ...  Second, it develops a Global Context Aggregation module to aggregate multi-scale features from different layers of the encoder to achieve scene context-awareness.  ...  We attempt to mine the global features by designing a Global Context Aggregation module. We propose 3DLG-Detector, a 3D object detection network by simultaneously learning local and global features.  ... 
arXiv:2208.14796v1 fatcat:wokzacdmh5hcjoj24uchdyilni

Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey [article]

Gaoang Wang, Mingli Song, Jenq-Neng Hwang
2024 arXiv   pre-print
Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories.  ...  Unlike other computer vision tasks, such as image classification, object detection, re-identification, and segmentation, embedding methods in MOT have large variations, and they have never been systematically  ...  Moreover, CCPNet [111] decouples different tasks with multiple decoders to learn instance segmentation and ID-based embeddings for multi-object tracking and segmentation (MOTS) tasks.  ... 
arXiv:2205.10766v2 fatcat:hxawibw42nbl3jcb34zncfusey

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision [article]

Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar
2021 arXiv   pre-print
Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision.  ...  Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and is competitive to supervised methods.  ...  C3DPO: Canonical 3d pose networks for non-rigid structure from motion. In ICCV,  ...  Faster R-CNN: Towards real-time object detection with region proposal networks.  ... 
arXiv:2105.06464v2 fatcat:wr5iyiqvivb3novhhkgxwk6mv4

Comprehensive Review of Deep Learning-Based 3D Point Cloud Completion Processing and Analysis [article]

Ben Fei, Weidong Yang, Wenming Chen, Zhijun Li, Yikang Li, Tao Ma, Xing Hu, Lipeng Ma
2022 arXiv   pre-print
The progress of deep learning (DL) has impressively improved the capability and robustness of point cloud completion.  ...  Point cloud completion is a generation and estimation issue derived from the partial point clouds, which plays a vital role in the applications in 3D computer vision.  ...  Moreover, the large 3D environment reconstruction in underground mining space to accurately monitor mining safety. 3D detection.  ... 
arXiv:2203.03311v2 fatcat:e2kvryolufearetp4ujlw2gwwy