ABSTRACT
The rapid development of deep-learning-powered AI applications on edge devices creates opportunities for real-time health monitoring. To address privacy concerns in the inference phase, homomorphic encryption (HE) offers an alternative: it allows inference on encrypted data without exposing the raw data, and it has several distinct advantages (single-round communication, lightweight bandwidth consumption, and non-interactive computation). However, current HE-based privacy-preserving inference carries substantial computational overhead, making it infeasible for some real-time applications on edge devices. To address this issue, we propose CNN-guardian, a unified and compact neural network structure for real-time HE-based inference on edge GPUs. CNN-guardian combines an HE-friendly neural network design with a GPU engine that optimizes HE operations to accelerate inference in the HE domain.
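The abstract does not detail what an "HE-friendly" network looks like. In this line of work (e.g., CryptoNets-style designs), HE schemes natively evaluate only additions and multiplications, so nonlinearities such as ReLU are replaced with low-degree polynomials (typically x²) and max pooling with average pooling, which is linear. The following plaintext sketch illustrates that idea only; the function names, layer shapes, and kernel values are illustrative assumptions, not CNN-guardian's actual architecture:

```python
import numpy as np

def square_activation(x):
    # HE evaluates polynomials, so x**2 stands in for ReLU
    return x * x

def avg_pool2d(x, k=2):
    # Average pooling is a scaled sum (linear), unlike max pooling,
    # so it can be evaluated directly on ciphertexts
    h, w = x.shape
    return x[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def conv2d(x, kernel):
    # Valid-mode convolution: only additions and plaintext multiplications
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

# Tiny forward pass on plaintext; under HE, x would be a ciphertext
x = np.arange(36, dtype=float).reshape(6, 6)
k = np.ones((3, 3)) / 9.0
y = avg_pool2d(square_activation(conv2d(x, k)))
print(y.shape)  # (2, 2)
```

Because every operation above is a polynomial in the input, the same forward pass can in principle be evaluated homomorphically, at the cost of ciphertext multiplications that dominate the runtime the abstract refers to.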