Multiple Anchor Learning for Visual Object Detection
[article]
2019
arXiv
pre-print
In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector. ...
Classification and localization are two pillars of visual object detectors. ...
Conclusion We have proposed an elegant and effective training approach, referred to as Multiple Anchor Learning (MAL), for visual object detection. ...
arXiv:1912.02252v1
fatcat:nwe35ue2nfhmjpbg772weinvtq
Multiple Anchor Learning for Visual Object Detection
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector. ...
Classification and localization are two pillars of visual object detectors. ...
Conclusion We have proposed an elegant and effective training approach, referred to as Multiple Anchor Learning (MAL), for visual object detection. ...
doi:10.1109/cvpr42600.2020.01022
dblp:conf/cvpr/KeZHYLH20
fatcat:xhsmmkxvlrdsfmwgt63cfw6olu
FreeAnchor: Learning to Match Anchors for Visual Object Detection
[article]
2019
arXiv
pre-print
Modern CNN-based object detectors assign anchors for ground-truth objects under the restriction of object-anchor Intersection-over-Union (IoU). ...
In this study, we propose a learning-to-match approach to break IoU restriction, allowing objects to match anchors in a flexible manner. ...
This provides a fresh insight for the visual object detection problem. Acknowledgement. ...
arXiv:1909.02466v2
fatcat:fj2mh5q2ize53kzmohrkrhusuq
NL-FCOS: Improving FCOS through Non-Local Modules for Object Detection
[article]
2022
arXiv
pre-print
An object detection methodology closer to the natural model is anchor-free detection, where models like FCOS or CenterNet have shown competitive results, but these have not yet exploited the concept of ...
In addition, using anchors to fit bounding boxes seems far from how our visual system does the same visual task. ...
Object detection is one of the computer vision tasks with multiple industry applications. Its goal is to localize and classify objects in an image or video. ...
arXiv:2203.15638v1
fatcat:zvxls4u3dffglhjzhb6vsfrw7a
Unifying Visual Perception by Dispersible Points Learning
[article]
2022
arXiv
pre-print
We present a conceptually simple, flexible, and universal visual perception head for variant visual tasks, e.g., classification, object detection, instance segmentation and pose estimation, and different ...
The method, called UniHead, views different visual perception tasks as the dispersible points learning via the transformer encoder architecture. ...
Then, for an anchor point, UniHead obtains multiple points via dispersible points learning. ...
arXiv:2208.08630v2
fatcat:qztzwohhb5df5jhj7kfcmhy77i
Learning the semantic structure of objects from Web supervision
[article]
2021
arXiv
pre-print
Recognizing object parts and attributes has been extensively studied before, yet learning large space of such concepts remains elusive due to the high cost of providing detailed object annotations for ...
We also show that the resulting embedding provides a visually-intuitive mechanism to navigate the learned concepts and their corresponding images. ...
We are grateful for support by XRCE and ERC StG 638009-IDIU. ...
arXiv:1607.01205v2
fatcat:zepwfyx3krft5ag4lxkbw2vt6m
Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association
[article]
2021
arXiv
pre-print
We propose a self-supervised method for incrementally refining visual descriptors to improve performance in the task of object-level visual data association. ...
descriptors for the multi-object tracking task. ...
INTRODUCTION We are interested in matching visual object detections across temporally separated frames, a fundamental capability for a wide range of applications in robotics and computer vision such as ...
arXiv:2011.10471v2
fatcat:eaqtsmj77zeexl3sozttukway4
Learning the Structure of Objects from Web Supervision
[chapter]
2016
Lecture Notes in Computer Science
object annotations for supervision. ...
We also show that the resulting embedding provides a visually-intuitive mechanism to navigate the learned concepts and their corresponding images. ...
We would like to thank Xerox Research Center Europe and ERC 677195-IDIU for supporting this research. ...
doi:10.1007/978-3-319-49409-8_19
fatcat:rkpj4dmjdndfjg6ud5h44dsqke
re-OBJ: Jointly Learning the Foreground and Background for Object Instance Re-identification
[chapter]
2019
Lecture Notes in Computer Science
However, learning appearances of the objects alone might fail when there are multiple objects with similar appearance or multiple instances of same object class present in the scene. ...
We demonstrate the effectiveness of our joint visual feature in the re-identification of objects in the ScanNet dataset and show a relative improvement of around 28.25% in the rank-1 accuracy over the ...
Object Visual Encoding For each object of the input images, we create two sets of images F = {I f , I b }. ...
doi:10.1007/978-3-030-30645-8_37
fatcat:5m6fe377wzcstmkfwlhcs445va
Scale-aware Anchor-free Object Detection via Curriculum Learning for Remote Sensing Images
2021
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Index Terms: Remote sensing images, anchor-free object detection, feature pyramid structure, foreground attention, curriculum learning. ...
In this paper, to address the above challenges, we propose a novel RSI anchor-free object detection framework that consists of two key components: a cross-channel feature pyramid network (CFPN) and multiple ...
Besides, Fig. 10 shows the visual detection results of different methods on DIOR. It can be observed that our method shows better visualization results for the object detection task in RSIs. ...
doi:10.1109/jstars.2021.3115796
fatcat:2er7ee6wrncidj5i2nalqkleoq
Detect-and-describe: Joint learning framework for detection and description of objects
2019
MATEC Web of Conferences
DaD is a deep learning-based approach that extends object detection to object attribute prediction as well. We train our model on aPascal train set and evaluate our approach on aPascal test set. ...
We also show qualitative results for object attribute prediction on unseen objects, which demonstrate the effectiveness of our approach for describing unknown objects. ...
Joint end-to-end learning has multiple advantages over distinct learning. Firstly, simultaneous detection and attribute inference provide additional information about the identified object. ...
doi:10.1051/matecconf/201927702028
fatcat:eiten2xplfg6dms7zvoqh5qgpa
Object Detection in Videos with Tubelet Proposal Networks
2017
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
Different from object detection in static images, temporal information in videos is vital for object detection. ...
) network that incorporates temporal information from tubelet proposals for achieving high object detection accuracy in videos. ...
Object detection in videos. Since the introduction of the VID task by the ImageNet challenge, there have been multiple object detection systems for detecting objects in videos. ...
doi:10.1109/cvpr.2017.101
dblp:conf/cvpr/KangLXOYLW17
fatcat:rjjgoxmpfnejtc3llowizqw7qa
A Self Validation Network for Object-Level Human Attention Estimation
[article]
2019
arXiv
pre-print
Due to the foveated nature of the human vision system, people can focus their visual attention on a small region of their visual field at a time, which usually contains only a single object. ...
A straightforward solution for this problem is to pick the object whose bounding box is hit by the gaze, where eye gaze point estimation is obtained from a traditional eye gaze estimator and object candidates ...
Research, the College of Arts and Sciences, and the School of Informatics, Computing, and Engineering through the Emerging Areas of Research Project "Learning: Brains, Machines, and Children." ...
arXiv:1910.14260v2
fatcat:dr2xqgxy75cgngjzqkpheorcfa
Classifying All Interacting Pairs in a Single Shot
[article]
2020
arXiv
pre-print
In detail, interaction classification is achieved on a dense grid of anchors thanks to a joint multi-task network that learns three complementary tasks simultaneously: (i) prediction of the types of interaction ...
In this paper, we introduce a novel human interaction detection approach, based on CALIPSO (Classifying ALl Interacting Pairs in a Single shOt), a classifier of human-object interactions. ...
[23] learn a visual relation representation combining compositional representation for subject, target and predicate with a visual phrase representation for HOI detection. ...
arXiv:2001.04360v1
fatcat:wcbtmeonbreftcvjzm4kz4vfgu
Classifying All Interacting Pairs in a Single Shot
2020
2020 IEEE Winter Conference on Applications of Computer Vision (WACV)
In detail, interaction classification is achieved on a dense grid of anchors thanks to a joint multi-task network that learns three complementary tasks simultaneously: (i) prediction of the types of interaction ...
State-of-the-art approaches adopt a multi-shot strategy based on a pairwise estimate of interactions for a set of human-object candidate pairs, which leads to a complexity depending, at least, on the number ...
[23] learn a visual relation representation combining compositional representation for subject, target and predicate with a visual phrase representation for HOI detection. ...
doi:10.1109/wacv45572.2020.9093509
dblp:conf/wacv/ChafikOAL20
fatcat:paiqazymsravlaioakbi3ew2sy