5,220 Hits in 2.7 sec

MoE-SPNet: A Mixture-of-Experts Scene Parsing Network [article]

Huan Fu, Mingming Gong, Chaohui Wang, Dacheng Tao
2018 arXiv   pre-print
multi-scale features and the spatial inhomogeneity of a scene.  ...  Recently, methods based on fully convolutional neural networks have achieved new records on scene parsing.  ...  aggregating these features for a better solution for scene parsing.  ... 
arXiv:1806.07049v1 fatcat:hrn7trkgizhp5mlwknybinn3ou

CaseNet: Content-Adaptive Scale Interaction Networks for Scene Parsing [article]

Xin Jin, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen
2020 arXiv   pre-print
In this paper, we propose a Content-Adaptive Scale Interaction Network (CASINet) to exploit the multi-scale features for scene parsing.  ...  We achieve state-of-the-art performance on three scene parsing benchmarks Cityscapes, ADE20K and LIP.  ...  To address the above problems, we propose a Content-Adaptive Scale Interaction Network (CASINet) to adaptively exploit multi-scale features for scene parsing.  ... 
arXiv:1904.08170v3 fatcat:rxeulrn225gerdd5mf5mgamqum

Dual Graph-Based Context Aggregation for Scene Parsing

Mengyu Liu, Hujun Yin
2021 British Machine Vision Conference  
Exploiting global contextual information has been shown useful for improving performance of scene parsing and hence is widely used.  ...  In this paper, unlike previous work that captures long-range dependencies with multi-scale feature fusion or attention mechanism, we address the scene parsing tasks by aggregating rich contextual information  ...  LIU AND YIN: DUAL GRAPH-BASED CONTEXT AGGREGATION FOR SCENE PARSING  ... 
dblp:conf/bmvc/LiuY21 fatcat:xbwdb4pkvvgijdwdh7bxsslcuy

Global-residual and Local-boundary Refinement Networks for Rectifying Scene Parsing Predictions

Rui Zhang, Sheng Tang, Min Lin, Jintao Li, Shuicheng Yan
2017 Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence  
Extensive experiments on ADE20K and Cityscapes datasets well demonstrate the effectiveness of the two refinement methods for refining scene parsing predictions.  ...  Most of existing scene parsing methods suffer from the serious problems of both inconsistent parsing results and object boundary shift.  ...  neural network within a uniform framework for scene parsing.  ... 
doi:10.24963/ijcai.2017/479 dblp:conf/ijcai/ZhangTLLY17 fatcat:2zclvahdt5f3thmvxuchozngui

PSANet: Point-wise Spatial Attention Network for Scene Parsing [chapter]

Hengshuang Zhao, Yi Zhang, Shu Liu, Jianping Shi, Chen Change Loy, Dahua Lin, Jiaya Jia
2018 Lecture Notes in Computer Science  
Each position on the feature map is connected to all the other ones through a self-adaptively learned attention mask. Moreover, information propagation in bi-direction for scene parsing is enabled.  ...  complex scenes.  ...  We thank Sensetime Research for providing computing resources.  ... 
doi:10.1007/978-3-030-01240-3_17 fatcat:mqwwq7mr5nfwxigtdibwwgobey

Adaptive Context Network for Scene Parsing [article]

Jun Fu, Jing Liu, Yuhang Wang, Yong Li, Yongjun Bao, Jinhui Tang, Hanqing Lu
2019 arXiv   pre-print
Recent works attempt to improve scene parsing performance by exploring different levels of contexts, and typically train a well-designed convolutional network to exploit useful contexts across all pixels  ...  Furthermore, we import multiple such modules to build several adaptive context blocks in different levels of network to obtain a coarse-to-fine result.  ...  Most recent approaches for scene parsing are based on Fully Convolutional Networks (FCNs) [24]. However, there are two limitations in FCN frameworks.  ... 
arXiv:1911.01664v1 fatcat:mdw3adrzkbfe7hfw2akvlo45oi

3D Scene Parsing via Class-Wise Adaptation [article]

Daichi Ono, Hiroyuki Yabe, Tsutomu Horikawa
2019 arXiv   pre-print
We propose the method that uses only computer graphics datasets to parse the real world 3D scenes. 3D scene parsing based on semantic segmentation is required to implement the categorical interaction in  ...  Our application performs accurate 3D scene parsing in real-time on an actual room.  ...  They all evaluate performance on the dataset of driving scene parsing. Our approach aims for indoor scene parsing which has domain shift between computer graphics data and real data.  ... 
arXiv:1812.03622v2 fatcat:f3p3x42u5vcfre4d7tooh2hkoq

Multi-Branch Adaptive Hard Region Mining Network for Urban Scene Parsing of High-Resolution Remote-Sensing Images

Haiwei Bai, Jian Cheng, Yanzhou Su, Qi Wang, Haoran Han, Yijie Zhang
2022 Remote Sensing  
To deal with these dilemmas, in this paper, we propose a multi-branch adaptive hard region mining network (MBANet) for urban scene parsing of HRRSIs.  ...  In our experiments, the three branches complemented each other in feature extraction and demonstrated state-of-the-art performance for urban scene parsing of HRRSIs.  ...  Conclusions This paper proposes a multi-branch adaptive hard region mining network for urban scene parsing of HRRSIs.  ... 
doi:10.3390/rs14215527 fatcat:zfpbtfi3vre3fpx6sehgwc7wte

Traffic Scene Parsing through the TSP6K Dataset [article]

Peng-Tao Jiang, Yuqi Yang, Yang Cao, Qibin Hou, Ming-Ming Cheng, Chunhua Shen
2024 arXiv   pre-print
Furthermore, considering the vast difference in instance sizes, we propose a detail refining decoder for scene parsing, which recovers the details of different semantic regions in traffic scenes owing  ...  We perform a detailed analysis of the dataset and comprehensively evaluate previous popular scene parsing methods, instance segmentation methods and unsupervised domain adaption methods.  ...  [57] first proposed a fully convolutional network (FCN) that generates dense predictions for scene parsing.  ... 
arXiv:2303.02835v2 fatcat:jyfdzowu4jhzfhhgby5vje4iue

Video Scene Parsing with Predictive Feature Learning [article]

Xiaojie Jin, Xin Li, Huaxin Xiao, Xiaohui Shen, Zhe Lin, Jimei Yang, Yunpeng Chen, Jian Dong, Luoqi Liu, Zequn Jie, Jiashi Feng, Shuicheng Yan
2016 arXiv   pre-print
steering parsing architecture that effectively adapts the learned spatiotemporal features to scene parsing tasks and provides strong guidance for any off-the-shelf parsing model to achieve better video  ...  scene parsing performance.  ...  Related Work Recent image scene parsing progress is mostly stimulated by various new CNN architectures, including the fully convolutional architecture (FCN) with multi-scale or larger receptive fields  ... 
arXiv:1612.00119v2 fatcat:penm6nsyubevhptaeuar6awx7y

Deep Multiphase Level Set for Scene Parsing [article]

Pingping Zhang, Wei Liu, Yinjie Lei, Hongyu Wang, Huchuan Lu
2019 arXiv   pre-print
Recently, Fully Convolutional Network (FCN) seems to be the go-to architecture for image segmentation, including semantic scene parsing.  ...  To address these limitations, in this paper we propose a novel Deep Multiphase Level Set (DMLS) method for semantic scene parsing, which efficiently incorporates multiphase level sets into deep neural  ...  In summary, our contributions are three folds: • We propose a novel MLS framework for large-scale multi-class scene parsing.  ... 
arXiv:1910.03166v2 fatcat:7viff6ta75hy7fgg5rnxcya2wa

FoveaNet: Perspective-aware Urban Scene Parsing [article]

Xin Li, Zequn Jie, Wei Wang, Changsong Liu, Jimei Yang, Xiaohui Shen, Zhe Lin, Qiang Chen, Shuicheng Yan, Jiashi Feng
2017 arXiv   pre-print
Most of the current solutions employ generic image parsing models that treat all scales and locations in the images equally and do not consider the geometry property of car-captured urban scene images.  ...  Thus, they suffer from heterogeneous object scales caused by perspective projection of cameras on actual scenes and inevitably encounter parsing failures on distant objects as well as other boundary and  ...  FCN in FoveaNet FoveaNet is based on the fully convolutional network (FCN) [25] for parsing the images. As a deeper CNN model benefits more for the parsing performance, we here follow Chen et al.  ... 
arXiv:1708.02421v1 fatcat:r57jbwtnsrbb3oodjlgf7tmc44

MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing [article]

Jiashuo Yu, Ying Cheng, Rui-Wei Zhao, Rui Feng, Yuejie Zhang
2022 arXiv   pre-print
Since events may occur in auditory and visual modalities, multimodal detailed perception is essential for complete scene comprehension.  ...  However, they do not consider semantic information at multiple scales, which makes the model difficult to localize events in different lengths.  ...  Our model captures and integrates multimodal pyramid features in distinct temporal scales for comprehensive scene understanding.  ... 
arXiv:2111.12374v2 fatcat:urygwtn3bjgednej3vz5gvm77i

Nonparametric scene parsing with deep convolutional features and dense alignment

Chih-Hao Ma, Chiou-Ting Hsu, Benoit Huet
2015 2015 IEEE International Conference on Image Processing (ICIP)  
Index Terms: scene parsing; object window; deep convolutional network; SIFT flow.  ...  In this paper, we focus on improving scene parsing accuracy in these two issues.  ...  This demonstrates the effectiveness of using deep convolutional features as visual scene descriptor.  ... 
doi:10.1109/icip.2015.7351134 dblp:conf/icip/MaHH15 fatcat:6wbatlo6onfvvdcto2zhmhuz3y

Learning deep representations for semantic image parsing: a comprehensive overview

Lili Huang, Jiefeng Peng, Ruimao Zhang, Guanbin Li, Liang Lin
2018 Frontiers of Computer Science  
Specifically, we first review the general frameworks for each task and introduce the relevant variants. The advantages and limitations of each method are also discussed.  ...  Finally, we explore the future trends and challenges of semantic image parsing.  ...  Zhao et al. proposed a superior framework, the pyramid scene parsing network (PSPNet) [1], for scene parsing in complex scenes.  ... 
doi:10.1007/s11704-018-7195-8 fatcat:p5hvfwhl5rbork5vf4rpnx3h6u