A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2021; you can also visit the original URL.
KDAS-ReID: Architecture Search for Person Re-Identification via Distilled Knowledge with Dynamic Temperature
2021
Algorithms
In order to automatically design an effective Re-ID architecture, we propose a pedestrian re-identification algorithm based on knowledge distillation, called KDAS-ReID. ...
As knowledge is transferred from the teacher model to the student model, the importance of the teacher's knowledge gradually decreases as the student's performance improves ...
KDAS-ReID will automatically search the CNN architecture suitable for Re-ID in its search space. ...
doi:10.3390/a14050137
doaj:87c5119db0c841d1ae92b765bc326981
fatcat:7xsvjxxi5zc3tjord4bcj27yfq
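The KDAS-ReID snippet describes distillation in which the teacher's influence fades as the student improves. A minimal sketch of that idea, assuming a standard Hinton-style distillation loss and a hypothetical schedule (`dynamic_alpha` and its form are illustrative, not taken from the paper):

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, label, T, alpha):
    """Hard-label cross-entropy plus temperature-scaled KL divergence
    to the teacher, mixed by weight alpha (classic distillation form)."""
    p_s = softmax(student_logits)
    ce = -math.log(p_s[label])
    p_t = softmax(teacher_logits, T)
    p_s_T = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s_T))
    return (1 - alpha) * ce + alpha * (T ** 2) * kl

def dynamic_alpha(student_acc, start=0.9):
    """Hypothetical schedule: the teacher's weight shrinks as the
    student's validation accuracy rises, mirroring the snippet's
    'importance of knowledge ... will gradually decrease'."""
    return start * (1.0 - student_acc)

# Early in training the teacher term dominates; later it fades.
early = dynamic_alpha(0.2)
late = dynamic_alpha(0.9)
```

The schedule here is one of many plausible realizations; the paper's actual mechanism (a dynamic temperature) may adjust `T` rather than the mixing weight.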
OVO: One-shot Vision Transformer Search with Online distillation
[article]
2023
arXiv
pre-print
OVO samples sub-nets for both teacher and student networks for better distillation results. ...
Although some existing methods introduce a CNN as a teacher to guide the training process by distillation, the gap between teacher and student networks would lead to sub-optimal performance. ...
In Section 3.2, we propose an online distillation method during the supernet training, which automatically samples the teacher network and student network for distillation. ...
arXiv:2212.13766v2
fatcat:sokp2chsjjc6znfqmlhkpzjoru
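The OVO snippet says both teacher and student are sampled as sub-nets of the same supernet. A toy sketch of that sampling step, assuming a simplified per-layer width search space (the space, `params` proxy, and pairing rule are illustrative assumptions):

```python
import random

# Hypothetical one-shot search space: each layer picks an embedding
# width; a sub-net is one width per layer.
WIDTHS = [192, 256, 384]
NUM_LAYERS = 4

def sample_subnet(rng):
    """Draw one sub-net configuration from the supernet."""
    return [rng.choice(WIDTHS) for _ in range(NUM_LAYERS)]

def params(subnet):
    """Crude capacity proxy: total width across layers."""
    return sum(subnet)

def sample_pair(rng):
    """Sample two sub-nets per training step and let the larger one
    act as the online teacher for the smaller one, so the teacher is
    drawn from the same space as the student."""
    a, b = sample_subnet(rng), sample_subnet(rng)
    return (a, b) if params(a) >= params(b) else (b, a)

rng = random.Random(0)
teacher, student = sample_pair(rng)
```

Because teacher and student share the supernet's weights, this sidesteps the CNN-teacher/ViT-student gap the snippet mentions.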
Search for Better Students to Learn Distilled Knowledge
[article]
2020
arXiv
pre-print
In this work, instead of designing a good student architecture manually, we propose to search for the optimal student automatically. ...
Knowledge Distillation, as a model compression technique, has received great attention. The knowledge of a well-performed teacher is distilled to a student with a small architecture. ...
In this work, we propose to search for an architecture configuration for the student automatically, instead of designing student architecture manually. ...
arXiv:2001.11612v1
fatcat:ric4dtwpqjc4dd63rezjzz6s3u
Ultra-lightweight CNN design based on neural architecture search and knowledge distillation: a novel method to build the automatic recognition model of space target ISAR images
2021
Defence Technology
In this paper, a novel method of ultra-lightweight convolution neural network (CNN) design based on neural architecture search (NAS) and knowledge distillation (KD) is proposed. ...
In summary, in order to achieve a lightweight design for the ISAR image recognition model for space targets, we propose a two-stage design scheme based on automatic architecture search and knowledge distillation ...
doi:10.1016/j.dt.2021.04.014
fatcat:3jw72ohgjjdadlnf44lxkcrlhm
Differentiable Feature Aggregation Search for Knowledge Distillation
[article]
2020
arXiv
pre-print
Some recent works introduce multi-teacher distillation to provide more supervision to the student network. ...
Knowledge distillation has become increasingly important in model compression. ...
Bridge Loss for Feature Aggregation Search: To search for an appropriate feature aggregation for the knowledge distillation, we introduce the bridge loss to connect the teacher and student networks, where ...
arXiv:2008.00506v1
fatcat:yiftrm5f5zgqpl7nfkoiaydgbi
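The snippet above introduces a bridge loss that connects teacher and student while an aggregation of teacher features is searched. A minimal sketch under assumed specifics: softmax-normalized architecture weights mix several teacher feature vectors, and the MSE between the mixture and the student feature serves as the bridge (the exact aggregation and loss form are in the paper):

```python
import math

def softmax(ws):
    exps = [math.exp(w) for w in ws]
    total = sum(exps)
    return [e / total for e in exps]

def bridge_loss(teacher_feats, student_feat, arch_weights):
    """Mix teacher feature vectors with softmax-normalized search
    weights, then measure MSE to the student feature: the 'bridge'
    through which the aggregation is rated during the search."""
    alphas = softmax(arch_weights)
    agg = [sum(a * f[i] for a, f in zip(alphas, teacher_feats))
           for i in range(len(student_feat))]
    return sum((x - y) ** 2 for x, y in zip(agg, student_feat)) / len(student_feat)

# Two teacher feature vectors, one student vector (toy 3-dim case).
t_feats = [[1.0, 0.0, 2.0], [0.0, 1.0, 2.0]]
loss = bridge_loss(t_feats, [0.5, 0.5, 2.0], [0.0, 0.0])  # equal weights
```

With equal weights the mixture matches the student exactly here, so the loss vanishes; in the search, gradients through `arch_weights` would steer the aggregation.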
Towards Oracle Knowledge Distillation with Neural Architecture Search
2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)
Specifically, we employ a neural architecture search technique to augment useful structures and operations, where the searched network is appropriate for knowledge distillation towards student models and ...
We also introduce an oracle knowledge distillation loss to facilitate model search and distillation using an ensemble-based teacher model, where a student network is learned to imitate oracle performance ...
Acknowledgments We truly thank Tackgeun You for helpful discussion. This work was partly supported by Sam- ...
doi:10.1609/aaai.v34i04.5866
fatcat:e52lnfxj5jdwlddqsoo7hflrta
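The snippet describes a student trained to imitate "oracle performance" of an ensemble teacher. A simplified reading, with illustrative names: per example, take the ensemble member most confident on the true class as the distillation target, rather than the plain ensemble average (the paper's actual oracle loss may differ in detail):

```python
def oracle_targets(ensemble_probs, labels):
    """For each example, select the ensemble member that assigns the
    highest probability to the true class and use its distribution as
    the distillation target -- a best-case ('oracle') teacher signal."""
    targets = []
    for i, y in enumerate(labels):
        best = max((m[i] for m in ensemble_probs), key=lambda p: p[y])
        targets.append(best)
    return targets

# Two ensemble members, two examples, three classes.
m1 = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
m2 = [[0.4, 0.5, 0.1], [0.1, 0.8, 0.1]]
targets = oracle_targets([m1, m2], labels=[0, 1])
# Example 0 follows m1 (more confident on class 0);
# example 1 follows m2 (more confident on class 1).
```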
Towards Oracle Knowledge Distillation with Neural Architecture Search
[article]
2019
arXiv
pre-print
Specifically, we employ a neural architecture search technique to augment useful structures and operations, where the searched network is appropriate for knowledge distillation towards student models and ...
We also introduce an oracle knowledge distillation loss to facilitate model search and distillation using an ensemble-based teacher model, where a student network is learned to imitate oracle performance ...
Acknowledgments We truly thank Tackgeun You for helpful discussion. ...
arXiv:1911.13019v1
fatcat:pqxlzkumibgxzewmszam6y63cq
AUTOKD: Automatic Knowledge Distillation Into A Student Architecture Family
[article]
2021
arXiv
pre-print
While Knowledge Distillation (KD) theoretically enables small student models to emulate larger teacher models, in practice selecting a good student architecture requires considerable human expertise. ...
In this paper, we propose to instead search for a family of student architectures sharing the property of being good at learning from a given teacher. ...
for knowledge distillation. ...
arXiv:2111.03555v1
fatcat:bnk5rcz6ynh6xleg3u5swtcewe
DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization
[article]
2022
arXiv
pre-print
Yet, in KD, automatically searching an optimal distillation scheme has not yet been well explored. ...
In the distillation stage, DistPro adopts the learned processes for knowledge distillation, which significantly improves the student accuracy especially when faster training is required. ...
Introduction Knowledge distillation (KD) is proposed to effectively transfer knowledge from a well-performing larger (teacher) deep neural network (DNN) to a given smaller (student) network, where the learned ...
arXiv:2204.05547v1
fatcat:e2ek7xpdtzg27jafdiqorgpe6e
Two-Stage Model Compression and Acceleration: Optimal Student Network for Better Performance
2020
IEEE Access
The knowledge distillation framework mainly includes the teacher network and student network. ...
However, knowledge distillation requires a student network with a specific, optimized structure, which makes the approach difficult to extend to other neural network structures. ...
Kernel matrix (Figure 6). ...
doi:10.1109/access.2020.3040823
fatcat:ueed7mlhjnd4vk2lhtsx7fqanq
Teachers Do More Than Teach: Compressing Image-to-Image Models
[article]
2021
arXiv
pre-print
In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation ...
Finally, we propose to distill knowledge through maximizing feature similarity between teacher and student via an index named Global Kernel Alignment (GKA). ...
For the layers used for knowledge distillation between teacher and student networks, we follow the same strategy as Li et al. [36] . ...
arXiv:2103.03467v2
fatcat:d3rjuwhsdbbsvfna53i3vikmbi
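The snippet names a feature-similarity index, Global Kernel Alignment (GKA), maximized between teacher and student. One plausible reading is Frobenius alignment between Gram matrices of features, sketched below (the paper's exact GKA definition should be taken from the source; `gram` and the alignment form here are assumptions):

```python
import math

def gram(X):
    """Linear-kernel Gram matrix of n feature rows."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def kernel_alignment(K1, K2):
    """Frobenius alignment <K1, K2>_F / (||K1||_F * ||K2||_F),
    i.e. cosine similarity between the two kernel matrices."""
    dot = sum(a * b for r1, r2 in zip(K1, K2) for a, b in zip(r1, r2))
    n1 = math.sqrt(sum(a * a for r in K1 for a in r))
    n2 = math.sqrt(sum(b * b for r in K2 for b in r))
    return dot / (n1 * n2)

# Alignment is invariant to uniform feature scaling: a student whose
# features are a scaled copy of the teacher's aligns perfectly.
X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[2.0, 4.0], [6.0, 8.0]]  # X scaled by 2
score = kernel_alignment(gram(X), gram(Y))
```

Scale invariance is a useful property here, since student features typically live in a lower-capacity space than the teacher's.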
Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search
[article]
2022
arXiv
pre-print
Specifically, we introduce a target-oriented distillation loss to guide the structure search process for finding the student network architecture, and a cost-sensitive loss as constraints for model size ...
To realize such a goal, we propose AdaRec, a knowledge distillation (KD) framework which compresses knowledge of a teacher model into a student model adaptively according to its recommendation scene by ...
Specifically, we devise a target-oriented knowledge distillation loss to provide search hints for searching the architecture of student network, and an efficiency-aware loss as search constraints for constraining ...
arXiv:2107.07173v2
fatcat:gjveklueevdrrimfwtcbf5ixla
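The AdaRec snippets describe two search signals: a target-oriented distillation loss and an efficiency-aware (cost-sensitive) constraint on model size. A hypothetical combination of the two, purely illustrative (the weighting, budget form, and penalty shape are assumptions, not the paper's formulation):

```python
def search_loss(distill_loss, model_params, budget, lam=0.1):
    """Distillation loss guiding the architecture search, plus a
    size penalty that activates only when the candidate student
    exceeds the parameter budget."""
    size_penalty = max(0.0, model_params - budget) / budget
    return distill_loss + lam * size_penalty

within = search_loss(0.5, model_params=8e5, budget=1e6)  # under budget
over = search_loss(0.5, model_params=2e6, budget=1e6)    # over budget
```

A hinge-style penalty like this leaves candidates under the budget unconstrained, so the search is free to trade capacity for accuracy only up to the limit.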
A lightweight network for photovoltaic cell defect detection in electroluminescence images based on neural architecture search and knowledge distillation
[article]
2023
arXiv
pre-print
To solve these problems, we propose a novel lightweight high-performance model for automatic defect detection of PV cells in electroluminescence(EL) images based on neural architecture search and knowledge ...
To improve the overall performance of the searched lightweight model, we further transfer the knowledge learned by the existing pre-trained large-scale model based on knowledge distillation. ...
Knowledge distillation is one of the most effective methods for model compression. It enables the transfer of knowledge from a teacher model to a student model. ...
arXiv:2302.07455v1
fatcat:ktmvxtmnibaftc6shug3ve2hpe
Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation
2020
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Neural Architecture Search (NAS), aiming at automatically designing network architectures by machines, is expected to bring about a new revolution in machine learning. ...
Therefore, we propose to distill the neural architecture (DNA) knowledge from a teacher model to supervise our block-wise architecture search, which significantly improves the effectiveness of NAS. ...
Acknowledgements We thank DarkMatter AI Research team for providing computational resources. C. Li ...
doi:10.1109/cvpr42600.2020.00206
dblp:conf/cvpr/LiPYWLLC20
fatcat:fvyx35ctwjbg7ai2xcpt5cbxjq
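The DNA snippet describes supervising the architecture search block-wise with a teacher. A toy rendering of that idea, with illustrative data: each candidate operation is rated per block by how closely its output matches the teacher block's output, and selection happens independently per block rather than end to end:

```python
def block_loss(candidate_out, teacher_out):
    """MSE between a candidate block's output and the teacher
    block's output on the same input."""
    return sum((a - b) ** 2 for a, b in zip(candidate_out, teacher_out)) / len(teacher_out)

def select_blocks(candidate_outputs, teacher_outputs):
    """Per block, keep the candidate operation whose output best
    imitates the teacher's -- distilling architecture knowledge
    block by block."""
    chosen = []
    for block_idx, t_out in enumerate(teacher_outputs):
        losses = [block_loss(c[block_idx], t_out) for c in candidate_outputs]
        chosen.append(min(range(len(losses)), key=losses.__getitem__))
    return chosen

# Two candidate operations, two blocks (toy 2-dim features).
teacher = [[1.0, 1.0], [0.0, 2.0]]
cand_a = [[1.0, 0.9], [1.0, 1.0]]
cand_b = [[0.0, 0.0], [0.1, 2.0]]
best = select_blocks([cand_a, cand_b], teacher)
```

Block-wise rating shrinks the search problem from one huge space to a product of small per-block spaces, which is what makes this form of NAS tractable.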
Blockwisely Supervised Neural Architecture Search with Knowledge Distillation
[article]
2020
arXiv
pre-print
Neural Architecture Search (NAS), aiming at automatically designing network architectures by machines, is expected to bring about a new revolution in machine learning. ...
Moreover, we find that the knowledge of a network model lies not only in the network parameters but also in the network architecture. ...
Acknowledgements We thank DarkMatter AI Research team for providing computational resources. C. Li ...
arXiv:1911.13053v2
fatcat:ui34d6v6xfbl7k3zij4c2kio74
Showing results 1 — 15 out of 12,336 results