Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning.

Besides, a self-boosting learning strategy is further proposed to encourage the model to place more emphasis on challenging objects in computation-expensive temporal stereo matching. ... First, a category-specific structural priors mining approach is proposed for enhancing the efficacy of monocular depth generation. ... Structural Priors Mining approach (SPM) and Self-Boosting Learning strategy (SBL), respectively. ...

arXiv:2312.08004v1 fatcat:x7z4cwjnrzexvdxerwovuzeg6i

Second, we conduct a comprehensive survey of the progress in 3D object detection from the aspects of models and sensory inputs, including LiDAR-based, camera-based, and multi-modal detection approaches ... This paper reviews the advances in 3D object detection for autonomous driving. First, we introduce the background of 3D object detection and discuss the challenges in this task. ... (Section 7.3), and self-supervised learning (Section 7.4) for 3D object detection. ...

arXiv:2206.09474v1 fatcat:3skws77uqngjtpo6mycpo4dhny

Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. ... Furthermore, we analyze the solutions for challenging cases, such as the lack of data, the inherent ambiguity between 2D and 3D, and the complex multi-person scenarios. ... parts, structure-aware 91.9 Yang et al. [224] ICCV'17 Hourglass 256 × 256 Pyramid residual module to learn various scales 92.0 Ke et al. [82] ECCV'18 Hourglass 256 × 256 Multi-scale supervision, structure-aware ...

doi:10.1145/3524497 fatcat:4pbvntngrnfp7lqhcpjmy7p2fq

The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination. ... The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages. ... ., +, TIP 2021 5613-5625 Salient Object Detection With Purificatory Mechanism and Structural Similarity Loss. ...

doi:10.1109/tip.2022.3142569 fatcat:z26yhwuecbgrnb2czhwjlf73qu

Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task ... This is a survey of autonomous driving technologies with deep learning methods. ... The depth fusion is also solved efficiently with multi-task learning, with supervising signals as stereo, motion, pose, normal, segmentation and attention etc. 3) 3-D Object Detection Similarly, object ...

arXiv:2006.06091v3 fatcat:nhdgivmtrzcarp463xzqvnxlwq

The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination. ... The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages. ... ., +, TMM 2021 2471-2480 Temporal Locality-Aware Network With Dual Structure for Accurate and Fast Action Detection. ...

doi:10.1109/tmm.2022.3141947 fatcat:lil2nf3vd5ehbfgtslulu7y3lq

., and Nishino, K., Recognizing Material Properties from Images; 1981-1995 Sebe, N., see Pilzer, A., 2380-2395 Seddik, M., see Tamaazousti, Y., 2212-2224 Shah, M., see Kalayeh, M.M., TPAMI June 2020 ... ., +, TPAMI Nov. 2020 2874-2886 Learning (artificial intelligence) 3D Human Pose Machines with Self-Supervised Learning. ... ., +, TPAMI July 2020 1582-1593 Direction-Aware Spatial Context Features for Shadow Detection and Removal. Hu, X., +, TPAMI Nov. 2020 2795-2808 Feature Boosting Network For 3D Pose Estimation. ...

doi:10.1109/tpami.2020.3036557 fatcat:3j6s2l53x5eqxnlsptsgbjeebe

2D and 3D, and the complex multi-person scenarios. ... Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. ... fusion, pose refinement, multi-task learning, and efficiency-aware design. ...

arXiv:2104.11536v1 fatcat:tdag2jq2vjdrjekwukm5nu7l6a

Monocular 3D object detection (Mono3D) has achieved unprecedented success with the advent of deep learning techniques and emerging large-scale autonomous driving datasets. ... STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset. ... M3D-RPN [2] implements a single-stage multi-class detector with a region proposal network and depth-aware convolution. ...

arXiv:2204.11590v2 fatcat:oho5oi4go5fxtavavfkqeoiqme

., +, TIP 2020 617-627 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features. ... ., +, TIP 2020 8107-8119 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features. ...

doi:10.1109/tip.2020.3046056 fatcat:24m6k2elprf2nfmucbjzhvzk3m

with the 2D planar image data; (ii) A structural and hierarchical taxonomy of the DL methods for omnidirectional vision; (iii) A summarization of the latest novel learning strategies and applications; ... In recent years, the availability of customer-level 360 cameras has made omnidirectional vision more popular, and the advance of deep learning (DL) has significantly sparked its research and applications ... For instance, [213] introduces a multi-modal perception framework based on the visual and LiDAR information for 3D object detection and tracking. ...

arXiv:2205.10468v2 fatcat:73fks33oafa6zgxliccydvdbeq

Capturing both local and global features of irregular point clouds is essential to 3D object detection (3OD). ... Second, it develops a Global Context Aggregation module to aggregate multi-scale features from different layers of the encoder to achieve scene context-awareness. ... We attempt to mine the global features by designing a Global Context Aggregation module. We propose 3DLG-Detector, a 3D object detection network by simultaneously learning local and global features. ...

arXiv:2208.14796v1 fatcat:wokzacdmh5hcjoj24uchdyilni

Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories. ... Unlike other computer vision tasks, such as image classification, object detection, re-identification, and segmentation, embedding methods in MOT have large variations, and they have never been systematically ... Moreover, CCPNet [111] decouples different tasks with multiple decoders to learn instance segmentation and ID-based embeddings for multi-object tracking and segmentation (MOTS) tasks. ...

arXiv:2205.10766v2 fatcat:hxawibw42nbl3jcb34zncfusey

Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. ... Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and is competitive to supervised methods. ... C3DPO: Canonical 3d pose networks for non-rigid structure from motion. In ICCV, ... Faster R-CNN: Towards real-time object detection with region proposal networks. ...

arXiv:2105.06464v2 fatcat:wr5iyiqvivb3novhhkgxwk6mv4

The progress of deep learning (DL) has impressively improved the capability and robustness of point cloud completion. ... Point cloud completion is a generation and estimation issue derived from the partial point clouds, which plays a vital role in the applications in 3D computer vision. ... Moreover, the large 3D environment reconstruction in underground mining space to accurately monitor mining safety. 3D detection. ...

arXiv:2203.03311v2 fatcat:e2kvryolufearetp4ujlw2gwwy

Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning [article]

3D Object Detection for Autonomous Driving: A Review and New Outlooks [article]

Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective

2021 Index IEEE Transactions on Image Processing Vol. 30

Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies [article]

2021 Index IEEE Transactions on Multimedia Vol. 23

2020 Index IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 42

Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [article]

Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [article]

2020 Index IEEE Transactions on Image Processing Vol. 29

Deep Learning for Omnidirectional Vision: A Survey and New Perspectives [article]

3DLG-Detector: 3D Object Detection via Simultaneous Local-Global Feature Learning [article]

Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey [article]

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision [article]

Comprehensive Review of Deep Learning-Based 3D Point Cloud Completion Processing and Analysis [article]
