12,336 Hits in 4.1 sec

KDAS-ReID: Architecture Search for Person Re-Identification via Distilled Knowledge with Dynamic Temperature

Zhou Lei, Kangkang Yang, Kai Jiang, Shengbo Chen
2021 Algorithms  
To automatically design an effective Re-ID architecture, we propose a pedestrian re-identification algorithm based on knowledge distillation, called KDAS-ReID.  ...  As knowledge is transferred from the teacher model to the student model, the importance of the teacher's knowledge gradually decreases as the student's performance improves.  ...  KDAS-ReID automatically searches its search space for a CNN architecture suited to Re-ID.  ... 
doi:10.3390/a14050137 doaj:87c5119db0c841d1ae92b765bc326981 fatcat:7xsvjxxi5zc3tjord4bcj27yfq
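The snippet above describes a distillation temperature that changes as the student improves. Below is a minimal sketch of that idea, assuming a standard soft-target KD loss and a linear temperature schedule driven by student accuracy; the schedule, function names, and hyperparameters are illustrative assumptions, not the KDAS-ReID formulation.

```python
# Hypothetical sketch: KD loss whose temperature decays as the student improves.
# The exact schedule used by KDAS-ReID is not given in the snippet above.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature, alpha=0.5):
    """Soft-target KD term (KL divergence at temperature T) plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

def dynamic_temperature(t_max, t_min, student_acc):
    """Assumed schedule: temperature shrinks linearly as student accuracy rises,
    so the teacher's soft targets matter less once the student performs well."""
    student_acc = min(max(student_acc, 0.0), 1.0)
    return t_max - (t_max - t_min) * student_acc
```

For example, the temperature for each epoch could be set as `dynamic_temperature(4.0, 1.0, val_acc)`, where `val_acc` is the student's current validation accuracy.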

OVO: One-shot Vision Transformer Search with Online distillation [article]

Zimian Wei, Hengyue Pan, Xin Niu, Dongsheng Li
2023 arXiv   pre-print
OVO samples sub-nets for both teacher and student networks for better distillation results.  ...  Although some existing methods introduce a CNN as a teacher to guide the training process by distillation, the gap between teacher and student networks would lead to sub-optimal performance.  ...  In Section 3.2, we propose an online distillation method during the supernet training, which automatically samples the teacher network and student network for distillation.  ... 
arXiv:2212.13766v2 fatcat:sokp2chsjjc6znfqmlhkpzjoru
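The OVO snippet mentions sampling sub-nets for both teacher and student during supernet training. Below is a hypothetical sketch of one such online-distillation step, assuming a supernet object with a `sample_architecture()` helper and a forward pass that accepts an architecture; the sampling rule (the sub-net with the lower label loss acts as teacher) is an assumption for illustration, not the paper's method.

```python
# Hypothetical sketch of one-shot supernet training with online distillation:
# two sub-nets are sampled per step and one serves as the teacher for the other.
import torch
import torch.nn.functional as F

def online_distill_step(supernet, images, labels, optimizer, temperature=2.0):
    arch_a = supernet.sample_architecture()   # assumed helper, not a real API
    arch_b = supernet.sample_architecture()
    logits_a = supernet(images, arch_a)
    logits_b = supernet(images, arch_b)

    # Assumed rule: the sub-net with the lower label loss acts as the online teacher.
    if F.cross_entropy(logits_a, labels) <= F.cross_entropy(logits_b, labels):
        teacher_logits, student_logits = logits_a, logits_b
    else:
        teacher_logits, student_logits = logits_b, logits_a

    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits.detach() / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = F.cross_entropy(student_logits, labels) + kd

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```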

Search for Better Students to Learn Distilled Knowledge [article]

Jindong Gu, Volker Tresp
2020 arXiv   pre-print
In this work, instead of designing a good student architecture manually, we propose to search for the optimal student automatically.  ...  Knowledge Distillation, as a model compression technique, has received great attention. The knowledge of a well-performed teacher is distilled to a student with a small architecture.  ...  In this work, we propose to search for an architecture configuration for the student automatically, instead of designing student architecture manually.  ... 
arXiv:2001.11612v1 fatcat:ric4dtwpqjc4dd63rezjzz6s3u

Ultra-lightweight CNN design based on neural architecture search and knowledge distillation: a novel method to build the automatic recognition model of space target ISAR images

Hong Yang, Ya-sheng Zhang, Can-bin Yin, Wen-zhe Ding
2021 Defence Technology  
In this paper, a novel method of ultra-lightweight convolutional neural network (CNN) design based on neural architecture search (NAS) and knowledge distillation (KD) is proposed.  ...  In summary, to achieve a lightweight design for the ISAR image recognition model for space targets, we propose a two-stage design scheme based on automatic architecture search and knowledge distillation  ... 
doi:10.1016/j.dt.2021.04.014 fatcat:3jw72ohgjjdadlnf44lxkcrlhm

Differentiable Feature Aggregation Search for Knowledge Distillation [article]

Yushuo Guan, Pengyu Zhao, Bingxuan Wang, Yuanxing Zhang, Cong Yao, Kaigui Bian, Jian Tang
2020 arXiv   pre-print
Some recent works introduce multi-teacher distillation to provide more supervision to the student network.  ...  Knowledge distillation has become increasingly important in model compression.  ...  Bridge Loss for Feature Aggregation Search: To search for an appropriate feature aggregation for the knowledge distillation, we introduce the bridge loss to connect the teacher and student networks, where  ... 
arXiv:2008.00506v1 fatcat:yiftrm5f5zgqpl7nfkoiaydgbi
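The bridge-loss snippet suggests connecting teacher and student features through a learnable, differentiable aggregation. The sketch below illustrates one possible form, assuming teacher feature maps of identical shape and a DARTS-style softmax over mixing weights; the class and its details are illustrative, not the authors' implementation.

```python
# Hypothetical sketch of a differentiable feature-aggregation loss: teacher feature
# maps from several layers are mixed with softmax-normalized weights, and the student
# feature is pulled toward the mixture through a 1x1 adapter.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAggregation(nn.Module):
    def __init__(self, num_teacher_layers, channels):
        super().__init__()
        # One learnable mixing weight per candidate teacher layer (DARTS-style).
        self.alpha = nn.Parameter(torch.zeros(num_teacher_layers))
        # Adapter so the student feature matches the teacher channel dimension.
        self.adapter = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, teacher_feats, student_feat):
        # teacher_feats: list of tensors, each of shape [N, C, H, W]
        weights = F.softmax(self.alpha, dim=0)
        aggregated = sum(w * f for w, f in zip(weights, teacher_feats))
        bridged = self.adapter(student_feat)
        return F.mse_loss(bridged, aggregated.detach())
```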

Towards Oracle Knowledge Distillation with Neural Architecture Search

Minsoo Kang, Jonghwan Mun, Bohyung Han
2020 Proceedings of the AAAI Conference on Artificial Intelligence  
Specifically, we employ a neural architecture search technique to augment useful structures and operations, where the searched network is appropriate for knowledge distillation towards student models and  ...  We also introduce an oracle knowledge distillation loss to facilitate model search and distillation using an ensemble-based teacher model, where a student network is learned to imitate oracle performance  ... 
doi:10.1609/aaai.v34i04.5866 fatcat:e52lnfxj5jdwlddqsoo7hflrta
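The entry mentions an oracle knowledge distillation loss built from an ensemble-based teacher. The sketch below shows one plausible reading: soft targets are averaged over the ensemble members that classify each example correctly, falling back to the plain ensemble mean otherwise. This is an assumption for illustration, not the loss defined in the paper.

```python
# Hypothetical sketch of oracle-style soft targets from an ensemble of teachers.
import torch
import torch.nn.functional as F

@torch.no_grad()
def oracle_targets(teacher_logits_list, labels):
    # teacher_logits_list: list of [N, C] logits, one tensor per ensemble member.
    stacked = torch.stack(teacher_logits_list, dim=0)          # [T, N, C]
    probs = F.softmax(stacked, dim=-1)
    correct = stacked.argmax(dim=-1) == labels.unsqueeze(0)    # [T, N] bool
    mask = correct.float().unsqueeze(-1)                       # [T, N, 1]
    denom = mask.sum(dim=0).clamp(min=1.0)                     # [N, 1]
    # Average probabilities over the teachers that got each example right.
    oracle = (probs * mask).sum(dim=0) / denom                 # [N, C]
    none_correct = (mask.sum(dim=0) == 0).squeeze(-1)          # [N]
    oracle[none_correct] = probs.mean(dim=0)[none_correct]     # fallback: ensemble mean
    return oracle

def oracle_kd_loss(student_logits, teacher_logits_list, labels):
    targets = oracle_targets(teacher_logits_list, labels)
    return F.kl_div(F.log_softmax(student_logits, dim=1), targets, reduction="batchmean")
```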

Towards Oracle Knowledge Distillation with Neural Architecture Search [article]

Minsoo Kang, Jonghwan Mun, Bohyung Han
2019 arXiv   pre-print
Specifically, we employ a neural architecture search technique to augment useful structures and operations, where the searched network is appropriate for knowledge distillation towards student models and  ...  We also introduce an oracle knowledge distillation loss to facilitate model search and distillation using an ensemble-based teacher model, where a student network is learned to imitate oracle performance  ... 
arXiv:1911.13019v1 fatcat:pqxlzkumibgxzewmszam6y63cq

AUTOKD: Automatic Knowledge Distillation Into A Student Architecture Family [article]

Roy Henha Eyono, Fabio Maria Carlucci, Pedro M Esperança, Binxin Ru, Phillip Torr
2021 arXiv   pre-print
While Knowledge Distillation (KD) theoretically enables small student models to emulate larger teacher models, in practice selecting a good student architecture requires considerable human expertise.  ...  In this paper, we propose to instead search for a family of student architectures sharing the property of being good at learning from a given teacher.  ...  for knowledge distillation.  ... 
arXiv:2111.03555v1 fatcat:bnk5rcz6ynh6xleg3u5swtcewe

DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization [article]

Xueqing Deng, Dawei Sun, Shawn Newsam, Peng Wang
2022 arXiv   pre-print
Yet, in KD, automatically searching for an optimal distillation scheme has not yet been well explored.  ...  In the distillation stage, DistPro adopts the learned processes for knowledge distillation, which significantly improves student accuracy, especially when faster training is required.  ...  Knowledge distillation (KD) is proposed to effectively transfer knowledge from a well-performing larger/teacher deep neural network (DNN) to a given smaller/student network, where the learned  ... 
arXiv:2204.05547v1 fatcat:e2ek7xpdtzg27jafdiqorgpe6e

Two-Stage Model Compression and Acceleration: Optimal Student Network for Better Performance

Jialiang Tang, Ning Jiang, Wenxin Yu, Jinjia Zhou, Liuwei Mai
2020 IEEE Access  
The knowledge distillation framework mainly consists of a teacher network and a student network.  ...  But knowledge distillation requires a network with a specific optimized structure as the student network, and it is difficult to extend to other neural network structures.  ... 
doi:10.1109/access.2020.3040823 fatcat:ueed7mlhjnd4vk2lhtsx7fqanq

Teachers Do More Than Teach: Compressing Image-to-Image Models [article]

Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov
2021 arXiv   pre-print
In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation  ...  Finally, we propose to distill knowledge through maximizing feature similarity between teacher and student via an index named Global Kernel Alignment (GKA).  ...  For the layers used for knowledge distillation between teacher and student networks, we follow the same strategy as Li et al. [36] .  ... 
arXiv:2103.03467v2 fatcat:d3rjuwhsdbbsvfna53i3vikmbi
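The snippet names Global Kernel Alignment (GKA) as the feature-similarity index maximized during distillation. The sketch below computes a standard centered (linear) kernel alignment between teacher and student feature matrices; the paper's exact GKA definition may differ.

```python
# Hypothetical sketch of a kernel-alignment similarity between teacher and student
# features (standard centered linear kernel alignment, CKA-style).
import torch

def kernel_alignment(student_feat, teacher_feat, eps=1e-8):
    # student_feat: [N, D_s], teacher_feat: [N, D_t] feature matrices over a batch.
    def centered_gram(x):
        x = x - x.mean(dim=0, keepdim=True)
        return x @ x.t()                       # [N, N] Gram matrix
    ks = centered_gram(student_feat)
    kt = centered_gram(teacher_feat)
    num = (ks * kt).sum()
    den = ks.norm() * kt.norm() + eps
    return num / den                           # in [0, 1]; 1 means identical structure
```

A distillation term would then maximize this alignment, e.g. `loss = 1 - kernel_alignment(student_feat, teacher_feat)`.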

Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search [article]

Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li
2022 arXiv   pre-print
Specifically, we introduce a target-oriented distillation loss to guide the structure search process for finding the student network architecture, and a cost-sensitive loss as constraints for model size  ...  To realize such a goal, we propose AdaRec, a knowledge distillation (KD) framework which compresses knowledge of a teacher model into a student model adaptively according to its recommendation scene by  ...  Specifically, we devise a target-oriented knowledge distillation loss to provide search hints for searching the architecture of student network, and an efficiency-aware loss as search constraints for constraining  ... 
arXiv:2107.07173v2 fatcat:gjveklueevdrrimfwtcbf5ixla

A lightweight network for photovoltaic cell defect detection in electroluminescence images based on neural architecture search and knowledge distillation [article]

Jinxia Zhang, Xinyi Chen, Haikun Wei, Kanjian Zhang
2023 arXiv   pre-print
To solve these problems, we propose a novel lightweight high-performance model for automatic defect detection of PV cells in electroluminescence (EL) images based on neural architecture search and knowledge  ...  To improve the overall performance of the searched lightweight model, we further transfer the knowledge learned by an existing pre-trained large-scale model through knowledge distillation.  ...  Knowledge distillation is one of the most effective methods for model compression. It enables the transfer of knowledge from a teacher model to a student model.  ... 
arXiv:2302.07455v1 fatcat:ktmvxtmnibaftc6shug3ve2hpe

Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation

Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang
2020 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)  
Neural Architecture Search (NAS), aiming at automatically designing network architectures by machines, is expected to bring about a new revolution in machine learning.  ...  Therefore, we propose to distill the neural architecture (DNA) knowledge from a teacher model to supervise our block-wise architecture search, which significantly improves the effectiveness of NAS.  ... 
doi:10.1109/cvpr42600.2020.00206 dblp:conf/cvpr/LiPYWLLC20 fatcat:fvyx35ctwjbg7ai2xcpt5cbxjq
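The DNA entry describes supervising a block-wise architecture search with knowledge distilled from a teacher. The sketch below shows one way such block-wise supervision could look: each student block receives the teacher's feature from the previous stage as input and is trained to reproduce the teacher's feature at the next stage, with the per-block loss used to rank candidate blocks. Function names and data layout are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of block-wise distillation supervision for architecture search.
import torch
import torch.nn.functional as F

def blockwise_distill_step(student_blocks, teacher_feats, optimizers):
    """student_blocks: one candidate block per stage.
    teacher_feats: teacher feature maps; teacher_feats[i] feeds stage i and
    teacher_feats[i + 1] is the regression target for stage i."""
    losses = []
    for i, block in enumerate(student_blocks):
        inp = teacher_feats[i].detach()        # input comes from the teacher, not the student
        target = teacher_feats[i + 1].detach()
        out = block(inp)
        loss = F.mse_loss(out, target)
        optimizers[i].zero_grad()
        loss.backward()
        optimizers[i].step()
        losses.append(loss.item())
    return losses                              # per-block losses, used to rank candidates
```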

Blockwisely Supervised Neural Architecture Search with Knowledge Distillation [article]

Changlin Li, Jiefeng Peng, Liuchun Yuan, Guangrun Wang, Xiaodan Liang, Liang Lin, Xiaojun Chang
2020 arXiv   pre-print
Neural Architecture Search (NAS), aiming at automatically designing network architectures by machines, is expected to bring about a new revolution in machine learning.  ...  Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture.  ... 
arXiv:1911.13053v2 fatcat:ui34d6v6xfbl7k3zij4c2kio74
Showing results 1 — 15 out of 12,336 results