Instance-aware Multi-Camera 3D Object Detection with Structural Priors Mining and Self-Boosting Learning
[article]
2023
arXiv
pre-print
Besides, a self-boosting learning strategy is further proposed to encourage the model to place more emphasis on challenging objects in computation-expensive temporal stereo matching. ...
First, a category-specific structural priors mining approach is proposed for enhancing the efficacy of monocular depth generation. ...
Structural Priors Mining approach (SPM) and Self-Boosting Learning strategy (SBL), respectively. ...
arXiv:2312.08004v1
fatcat:x7z4cwjnrzexvdxerwovuzeg6i
3D Object Detection for Autonomous Driving: A Review and New Outlooks
[article]
2022
arXiv
pre-print
Second, we conduct a comprehensive survey of the progress in 3D object detection from the aspects of models and sensory inputs, including LiDAR-based, camera-based, and multi-modal detection approaches ...
This paper reviews the advances in 3D object detection for autonomous driving. First, we introduce the background of 3D object detection and discuss the challenges in this task. ...
(Section 7.3), and self-supervised learning (Section 7.4) for 3D object detection. ...
arXiv:2206.09474v1
fatcat:3skws77uqngjtpo6mycpo4dhny
Recent Advances of Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
2022
ACM Computing Surveys
Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. ...
Furthermore, we analyze the solutions for challenging cases, such as the lack of data, the inherent ambiguity between 2D and 3D, and the complex multi-person scenarios. ...
parts, structure-aware 91.9 Yang et al. [224] ICCV'17 Hourglass 256 × 256 Pyramid residual module to learn various scales 92.0 Ke et al. [82] ECCV'18 Hourglass 256 × 256 Multi-scale supervision, structure-aware ...
doi:10.1145/3524497
fatcat:4pbvntngrnfp7lqhcpjmy7p2fq
2021 Index IEEE Transactions on Image Processing Vol. 30
2021
IEEE Transactions on Image Processing
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination. ...
The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages. ...
., +, TIP 2021 5613-5625 Salient Object Detection With Purificatory Mechanism and Structural Similarity Loss. ...
doi:10.1109/tip.2022.3142569
fatcat:z26yhwuecbgrnb2czhwjlf73qu
Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies
[article]
2020
arXiv
pre-print
Due to the limited space, we focus the analysis on several key areas, i.e. 2D and 3D object detection in perception, depth estimation from cameras, multiple sensor fusion on the data, feature and task ...
This is a survey of autonomous driving technologies with deep learning methods. ...
The depth fusion is also solved efficiently with multi-task learning, with supervising signals as stereo, motion, pose, normal, segmentation and attention etc.
3) 3-D Object Detection Similarly, object ...
arXiv:2006.06091v3
fatcat:nhdgivmtrzcarp463xzqvnxlwq
2021 Index IEEE Transactions on Multimedia Vol. 23
2021
IEEE Transactions on Multimedia
The primary entry includes the coauthors' names, the title of the paper or other item, and its location, specified by the publication abbreviation, year, month, and inclusive pagination. ...
The Subject Index contains entries describing the item under all appropriate subject headings, plus the first author's name, the publication abbreviation, month, and year, and inclusive pages. ...
., +, TMM 2021 2471-2480 Temporal Locality-Aware Network With Dual Structure for Accurate and Fast Action Detection. ...
doi:10.1109/tmm.2022.3141947
fatcat:lil2nf3vd5ehbfgtslulu7y3lq
2020 Index IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 42
2021
IEEE Transactions on Pattern Analysis and Machine Intelligence
., and Nishino, K., Recognizing Material Properties from Images; 1981-1995 Sebe, N., see Pilzer, A., 2380-2395 Seddik, M., see Tamaazousti, Y., 2212-2224 Shah, M., see Kalayeh, M.M., TPAMI June 2020 ...
., +, TPAMI Nov. 2020 2874-2886 Learning (artificial intelligence) 3D Human Pose Machines with Self-Supervised Learning. ...
., +, TPAMI July 2020 1582-1593
Direction-Aware Spatial Context Features for Shadow Detection and Removal. Hu, X., +, TPAMI Nov. 2020 2795-2808
Feature Boosting Network For 3D Pose Estimation. ...
doi:10.1109/tpami.2020.3036557
fatcat:3j6s2l53x5eqxnlsptsgbjeebe
Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective
[article]
2021
arXiv
pre-print
2D and 3D, and the complex multi-person scenarios. ...
Estimation of the human pose from a monocular camera has been an emerging research topic in the computer vision community with many applications. ...
fusion, pose refinement, multi-task learning, and efficiency-aware design. ...
arXiv:2104.11536v1
fatcat:tdag2jq2vjdrjekwukm5nu7l6a
Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training
[article]
2022
arXiv
pre-print
Monocular 3D object detection (Mono3D) has achieved unprecedented success with the advent of deep learning techniques and emerging large-scale autonomous driving datasets. ...
STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset. ...
M3D-RPN [2] implements a single-stage multi-class detector with a region proposal network and depth-aware convolution. ...
arXiv:2204.11590v2
fatcat:oho5oi4go5fxtavavfkqeoiqme
2020 Index IEEE Transactions on Image Processing Vol. 29
2020
IEEE Transactions on Image Processing
., +, TIP 2020 617-627 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features. ...
., +, TIP 2020 8107-8119 Object Discovery From a Single Unlabeled Image by Mining Frequent Itemsets With Multi-Scale Features. ...
doi:10.1109/tip.2020.3046056
fatcat:24m6k2elprf2nfmucbjzhvzk3m
Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
[article]
2022
arXiv
pre-print
with the 2D planar image data; (ii) A structural and hierarchical taxonomy of the DL methods for omnidirectional vision; (iii) A summarization of the latest novel learning strategies and applications; ...
In recent years, the availability of customer-level 360 cameras has made omnidirectional vision more popular, and the advance of deep learning (DL) has significantly sparked its research and applications ...
For instance, [213] introduces a multi-modal perception framework based on the visual and LiDAR information for 3D object detection and tracking. ...
arXiv:2205.10468v2
fatcat:73fks33oafa6zgxliccydvdbeq
3DLG-Detector: 3D Object Detection via Simultaneous Local-Global Feature Learning
[article]
2022
arXiv
pre-print
Capturing both local and global features of irregular point clouds is essential to 3D object detection (3OD). ...
Second, it develops a Global Context Aggregation module to aggregate multi-scale features from different layers of the encoder to achieve scene context-awareness. ...
We attempt to mine the global features by designing a Global Context Aggregation module. We propose 3DLG-Detector, a 3D object detection network by simultaneously learning local and global features. ...
arXiv:2208.14796v1
fatcat:wokzacdmh5hcjoj24uchdyilni
Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey
[article]
2024
arXiv
pre-print
Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories. ...
Unlike other computer vision tasks, such as image classification, object detection, re-identification, and segmentation, embedding methods in MOT have large variations, and they have never been systematically ...
Moreover, CCPNet [111] decouples different tasks with multiple decoders to learn instance segmentation and ID-based embeddings for multi-object tracking and segmentation (MOTS) tasks. ...
arXiv:2205.10766v2
fatcat:hxawibw42nbl3jcb34zncfusey
DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision
[article]
2021
arXiv
pre-print
Specifically, we propose a self-ensembling framework where instance segmentation and semantic correspondence are jointly guided by a structured teacher in addition to the bounding box supervision. ...
Our best model achieves 37.9% AP on COCO instance segmentation, surpassing prior weakly supervised methods and is competitive to supervised methods. ...
C3DPO: Canonical 3d pose networks for non-rigid structure from motion. In ICCV, ...
Faster R-CNN: Towards real-time object detection with region proposal networks. ...
arXiv:2105.06464v2
fatcat:wr5iyiqvivb3novhhkgxwk6mv4
Comprehensive Review of Deep Learning-Based 3D Point Cloud Completion Processing and Analysis
[article]
2022
arXiv
pre-print
The progress of deep learning (DL) has impressively improved the capability and robustness of point cloud completion. ...
Point cloud completion is a generation and estimation issue derived from the partial point clouds, which plays a vital role in the applications in 3D computer vision. ...
Moreover, the large 3D environment reconstruction in underground mining space to accurately monitor mining safety. 3D detection. ...
arXiv:2203.03311v2
fatcat:e2kvryolufearetp4ujlw2gwwy