
Self-Supervised Joint Learning Framework of Depth Estimation via Implicit Cues

Release ID: release_ge725spuinhs7asodxkskts2xq

by Jianrong Wang, Ge Zhang, Zhenyu Wu, XueWei Li, and Li Liu

Released as an article.

2020  

Abstract

In self-supervised monocular depth estimation, depth discontinuities and artifacts around moving objects remain challenging problems. Existing self-supervised methods usually utilize a single view to train the depth estimation network. Compared with static views, the abundant dynamic properties between video frames are beneficial to refined depth estimation, especially for dynamic objects. In this work, we propose a novel self-supervised joint learning framework for depth estimation using consecutive frames from monocular and stereo videos. The main idea is to use an implicit depth cue extractor that leverages dynamic and static cues to generate useful depth proposals. These cues can predict distinguishable motion contours and geometric scene structures. Furthermore, a new high-dimensional attention module is introduced to extract a clear global transformation, which effectively suppresses the uncertainty of local descriptors in high-dimensional space, resulting in more reliable optimization of the learning framework. Experiments demonstrate that the proposed framework outperforms the state-of-the-art (SOTA) on the KITTI and Make3D datasets.
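
Context for readers: the abstract relies on the standard self-supervised training signal for this line of work, in which a depth network is supervised by synthesizing one video frame from another and penalizing the photometric difference. Below is a minimal PyTorch sketch of the widely used SSIM + L1 photometric loss; the function names and the 0.85 weighting are assumptions based on common practice in self-supervised depth estimation (e.g. Monodepth2), not the authors' exact formulation.

    import torch
    import torch.nn.functional as F

    def ssim_dist(x, y):
        # Simplified per-pixel SSIM distance over 3x3 windows, as commonly
        # used in self-supervised depth estimation objectives.
        C1, C2 = 0.01 ** 2, 0.03 ** 2
        mu_x = F.avg_pool2d(x, 3, 1, 1)
        mu_y = F.avg_pool2d(y, 3, 1, 1)
        sigma_x = F.avg_pool2d(x * x, 3, 1, 1) - mu_x ** 2
        sigma_y = F.avg_pool2d(y * y, 3, 1, 1) - mu_y ** 2
        sigma_xy = F.avg_pool2d(x * y, 3, 1, 1) - mu_x * mu_y
        num = (2 * mu_x * mu_y + C1) * (2 * sigma_xy + C2)
        den = (mu_x ** 2 + mu_y ** 2 + C1) * (sigma_x + sigma_y + C2)
        return torch.clamp((1 - num / den) / 2, 0, 1)

    def photometric_loss(warped, target, alpha=0.85):
        # Weighted SSIM + L1 error between the view synthesized from the
        # predicted depth ("warped") and the real target frame.
        # alpha = 0.85 is the weighting common in the literature (assumed here).
        l1 = (warped - target).abs().mean(1, keepdim=True)
        return alpha * ssim_dist(warped, target).mean(1, keepdim=True) + (1 - alpha) * l1

In frameworks of this kind, the warped frame is produced by reprojecting a source frame into the target view using the predicted depth and relative camera pose; the per-pixel loss above is then minimized, typically taking a minimum over several source views to handle occlusion.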

Archived Files and Locations

application/pdf  1.1 MB
file_3iwipza7arf5bjhen2n25a2bem
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-06-17
Version   v1
Language   en
arXiv  2006.09876v1
Work Entity
Access all versions, variants, and formats of this work (e.g., pre-prints).
Catalog Record
Revision: e4a94f5f-47e4-449b-a9f1-06fb4c69efd2
API URL: JSON