ABSTRACT
The rapid development of deep-learning-powered AI applications on edge devices creates opportunities for real-time health monitoring. To address privacy concerns in the inference phase, homomorphic encryption (HE) offers an alternative: it allows inference on encrypted data without exposing the raw data, and it has several distinct advantages (single-round communication, lightweight bandwidth consumption, and non-interactive computation). However, current HE-based privacy-preserving inference carries substantial computational overhead, making it infeasible for some real-time applications on edge devices. To address this issue, we propose CNN-guardian, a unified and compact neural network structure for real-time HE-based inference on edge GPUs. CNN-guardian combines an HE-friendly neural network design with a GPU engine that optimizes HE operations to accelerate inference in the HE domain.
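The abstract does not detail what an "HE-friendly" network looks like. In this line of work (e.g., CryptoNets-style designs), HE schemes natively evaluate only additions and multiplications, so nonlinearities such as ReLU are replaced with low-degree polynomials (typically x²) and max pooling with average pooling, which is linear. The following plaintext sketch illustrates that idea only; the function names, layer shapes, and kernel values are illustrative assumptions, not CNN-guardian's actual architecture:

```python
import numpy as np

def square_activation(x):
    # HE evaluates polynomials, so x**2 stands in for ReLU
    return x * x

def avg_pool2d(x, k=2):
    # Average pooling is a scaled sum (linear), unlike max pooling,
    # so it can be evaluated directly on ciphertexts
    h, w = x.shape
    return x[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def conv2d(x, kernel):
    # Valid-mode convolution: only additions and plaintext multiplications
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

# Tiny forward pass on plaintext; under HE, x would be a ciphertext
x = np.arange(36, dtype=float).reshape(6, 6)
k = np.ones((3, 3)) / 9.0
y = avg_pool2d(square_activation(conv2d(x, k)))
print(y.shape)  # (2, 2)
```

Because every operation above is a polynomial in the input, the same forward pass can in principle be evaluated homomorphically, at the cost of ciphertext multiplications that dominate the runtime the abstract refers to.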