
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections

Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, Liyan Zheng, Yuanzhi Li, Kaiyuan Rong, Yuanyong Chen, Zhihao Jia
2021 USENIX Symposium on Operating Systems Design and Implementation  
We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections.  ...  optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels.  ...  This work is partially supported by National Natural Science Foundation of China (U20A20226, 62072262) and Beijing Natural Science Foundation (4202031).  ... 
dblp:conf/osdi/WangZGMTZLRCJ21 fatcat:4sf4vbr4avfffhwfdedpidq7oa
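
A rough illustration of what a partially equivalent transformation plus automated correction can look like, on a toy 1-D analogue I constructed (this is not PET's actual mutant generator or correction kernels): the input is split into two halves that are convolved independently, which parallelizes well but is wrong near the seam, and a small band of affected outputs is then recomputed exactly.

```python
import numpy as np

def same_conv(x, k):
    # Reference "same"-size correlation with zero padding at the borders.
    p = len(k) // 2
    xp = np.pad(x, (p, len(k) - 1 - p))
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

def split_conv_with_correction(x, k):
    n, m, h = len(x), len(k), len(x) // 2
    # Partially equivalent transform: convolve the halves as if independent.
    y = np.concatenate([same_conv(x[:h], k), same_conv(x[h:], k)])
    # Outputs near the seam saw zero padding instead of the neighboring
    # half; recompute a conservative band of them exactly (the correction).
    p = m // 2
    for i in range(max(0, h - m + 1), min(n, h + m - 1)):
        window = [x[j] if 0 <= j < n else 0.0 for j in range(i - p, i - p + m)]
        y[i] = np.dot(window, k)
    return y

x, k = np.arange(10.0), np.array([1.0, -2.0, 1.0])
assert np.allclose(split_conv_with_correction(x, k), same_conv(x, k))
```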

DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks [article]

Houssem Ben Braiek, Foutse Khomh
2019 arXiv   pre-print
One common denominator of these testing techniques is the automated generation of test cases, e.g., new inputs transformed from the original training data with the aim to optimize some test adequacy criteria  ...  So far, the effectiveness of these approaches has been hindered by their reliance on random fuzzing or transformations that do not always produce test cases with a good diversity.  ... 
arXiv:1909.02563v1 fatcat:afz7fvaz3zbera235glfvjueeq
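
A generic sketch of the search loop the snippet describes: transform seed inputs and keep mutants that improve a test-adequacy fitness. The mutation operator and fitness below are stand-ins of my own, not DeepEvolution's actual population-based transformations.

```python
import random

def mutate(x, rng):
    # Stand-in transformation; DeepEvolution applies semantics-preserving
    # image transformations (e.g., rotation, blur) rather than raw noise.
    return [v + rng.uniform(-0.1, 0.1) for v in x]

def search(seeds, fitness, budget=200, seed=0):
    # Keep mutants only when they raise the test-adequacy score, so the
    # search is guided rather than purely random fuzzing.
    rng = random.Random(seed)
    population = [list(s) for s in seeds]
    for _ in range(budget):
        candidate = mutate(rng.choice(population), rng)
        if fitness(candidate) > max(fitness(p) for p in population):
            population.append(candidate)
    return max(population, key=fitness)

# Example: maximize a trivial "coverage" proxy over 3-dimensional inputs.
best = search([[0.0, 0.0, 0.0]], fitness=lambda x: sum(abs(v) for v in x))
```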

Sommelier: Curating DNN Models for the Masses

Peizhen Guo, Bo Hu, Wenjun Hu
2022 Proceedings of the 2022 International Conference on Management of Data  
In this paper, we present Sommelier, an indexing and query system above typical DNN model repositories to interface directly with inference serving or other use cases.  ...  Motivated by manual iterative model search processes and typical model design strategies that generate model variants or models with common segments, Sommelier organizes DNN models based on their semantic  ...  CNS-1815115 and 2112562, and a Google Faculty Research Award.  ... 
doi:10.1145/3514221.3526173 fatcat:evlxt33ogbdevbeswanfswtvj4

Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems [article]

Shail Dave, Alberto Marchisio, Muhammad Abdullah Hanif, Amira Guesmi, Aviral Shrivastava, Ihsen Alouani, Muhammad Shafique
2022 arXiv   pre-print
and secure ML systems based on user-defined constraints and objectives.  ...  This article summarizes the main challenges in agile development of efficient, reliable and secure ML systems, and then presents an outline of an agile design methodology to generate efficient, reliable  ...  Graph Optimizations: Graph-level optimizations help improve computational and memory efficiency by applying various transformations.  ... 
arXiv:2204.09514v1 fatcat:ho7auszvmferrn36evs7oqdpt4
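
The graph-optimization sentence is easiest to see with one concrete rewrite. Below is a toy operator-fusion pass over an invented node list (not the article's framework): a MatMul is folded into its only consuming Add, eliminating one intermediate tensor and one kernel launch.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)

def fuse_matmul_add(nodes):
    # Fuse MatMul -> Add chains when the MatMul has no other consumer.
    by_name = {n.name: n for n in nodes}
    consumers = {}
    for n in nodes:
        for i in n.inputs:
            consumers.setdefault(i, []).append(n.name)
    fuse = {}  # Add node name -> the MatMul node it absorbs
    for n in nodes:
        prod = by_name.get(n.inputs[0]) if n.inputs else None
        if (n.op == "Add" and prod and prod.op == "MatMul"
                and consumers.get(prod.name) == [n.name]):
            fuse[n.name] = prod
    consumed = {mm.name for mm in fuse.values()}
    out = []
    for n in nodes:
        if n.name in consumed:
            continue  # folded into the fused node emitted below
        if n.name in fuse:
            mm = fuse[n.name]
            out.append(Node(n.name, "MatMulAdd", mm.inputs + n.inputs[1:]))
        else:
            out.append(n)
    return out

g = [Node("mm", "MatMul", ["x", "w"]), Node("y", "Add", ["mm", "b"])]
assert fuse_matmul_add(g) == [Node("y", "MatMulAdd", ["x", "w", "b"])]
```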

NeuRI: Diversifying DNN Generation via Inductive Rule Inference [article]

Jiawei Liu, Jinjun Peng, Yuyao Wang, Lingming Zhang
2023 arXiv   pre-print
As such, the recent wave of research has been studying the automated synthesis of test-cases (i.e., DNN models and their inputs) for fuzzing DL systems.  ...  NeuRI finds 100 new bugs for PyTorch and TensorFlow in four months, with 81 already fixed or confirmed.  ...  Consequently, it is crucial to harness the correctness of DL systems via extensive and automated testing.  ... 
arXiv:2302.02261v3 fatcat:l3wkyicuuzccbdcbmgj2ndg66i

Refactoring Neural Networks for Verification [article]

David Shriver, Dong Xu, Sebastian Elbaum, Matthew B. Dwyer
2019 arXiv   pre-print
Unlike with traditional code refactoring, DNN refactoring does not guarantee functional equivalence of the two networks, but rather it aims to preserve the accuracy of the original network while producing  ...  A DNN refactoring defines (a) the transformation of the DNN's architecture, i.e., the number and size of its layers, and (b) the distillation of the learned relationships between the input features and  ...  First, the configuration language provides a simple and unified way to specify architectural transformations, and R4V removes potential errors associated with the transformation itself by automating this  ... 
arXiv:1908.08026v1 fatcat:j4bgxitgcjdupimxmzxsh4appq
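
To make the "configuration language" idea concrete, here is a hypothetical declarative config applied to a list of layer widths. The schema is invented for illustration only; R4V's actual language operates on full network descriptions.

```python
def refactor(widths, config):
    # widths: [input, hidden..., output]. Keep the interface fixed and
    # transform only the hidden layers, since a refactoring should preserve
    # input/output shapes while shrinking the network.
    drop = set(config.get("drop_layers", []))
    s = config.get("scale_widths", 1.0)
    hidden = [w for i, w in enumerate(widths[1:-1], start=1) if i not in drop]
    return [widths[0]] + [max(1, round(w * s)) for w in hidden] + [widths[-1]]

# e.g. halve hidden widths and delete the second hidden layer:
assert refactor([784, 512, 512, 256, 10],
                {"drop_layers": [2], "scale_widths": 0.5}) == [784, 256, 128, 10]
```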

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation [article]

Ziji Shi, Le Jiang, Ang Wang, Jie Zhang, Xianyan Jia, Yong Li, Chencan Wu, Jialin Li, Wei Lin
2023 arXiv   pre-print
Experiments show that TAP is 20×–160× faster than the state-of-the-art automatic parallelism framework, and the performance of its discovered schedules is competitive with the expert-engineered ones.  ...  In this work, we present a model parallelism framework TAP that automatically searches for the best data and tensor parallel schedules.  ...  However, Y_0 is just a partial result, and it needs to be added to Y_1 from device 1 to get Y. As such, an AllReduceSum communication is required to sum them up for mathematical equivalence.  ... 
arXiv:2302.00247v1 fatcat:edwlqi7pqng2dd4vjxvwih2lse
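
The AllReduceSum remark in the snippet is just block matrix multiplication. A two-"device" NumPy sketch (my own toy, not TAP's runtime):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))

# Row-partition W across two devices; X is column-split to match.
Y0 = X[:, :4] @ W[:4]   # partial result held on device 0
Y1 = X[:, 4:] @ W[4:]   # partial result held on device 1

# Each Y_i alone is wrong; AllReduceSum adds them (here, a local add).
Y = Y0 + Y1
assert np.allclose(Y, X @ W)
```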

Joint Program and Layout Transformations to Enable Convolutional Operators on Specialized Hardware Based on Constraint Programming

Dennis Rieber, Axel Acosta, Holger Fröning
2022 ACM Transactions on Architecture and Code Optimization (TACO)  
The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators.  ...  However, implementing such operators efficiently with complex hardware intrinsics such as matrix multiply is a task not yet automated gracefully.  ...  The produced results also motivate research into the interplay of data layouts, their transformation, and how this interacts with the overall DNN and accelerator architecture.  ... 
doi:10.1145/3487922 fatcat:gnuvco7rffcdzcuirotonjnssi

HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation [article]

Hanchen Ye, Xiaofan Zhang, Zhize Huang, Gengsheng Chen, Deming Chen
2020 arXiv   pre-print
To speed up Deep Neural Network (DNN) accelerator design and enable effective implementation, we propose HybridDNN, a framework for building high-performance hybrid DNN accelerators and delivering FPGA-based  ...  This demonstrates that HybridDNN is flexible and scalable and can target both cloud and embedded hardware platforms with vastly different resource constraints.  ...  ACKNOWLEDGMENTS This work is supported in part by the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR) and XMotors.ai.  ... 
arXiv:2004.03804v1 fatcat:2r7ymftbordw5odrfndowsuxg4

Towards Proof Synthesis Guided by Neural Machine Translation for Intuitionistic Propositional Logic [article]

Taro Sekiyama, Akifumi Imanishi, Kohei Suenaga
2017 arXiv   pre-print
We implement the whole framework and empirically observe that a generated proof term is close to a correct proof in terms of the tree edit distance of ASTs.  ...  Inspired by the recent evolution of deep neural networks (DNNs) in machine learning, we explore their application to PL-related topics.  ...  We also appreciate Akihiro Yamamoto; the discussion with him led to the evaluation metrics used in this paper. This paper is partially supported by JST PRESTO Grant Number JPMJPR15E5, Japan.  ... 
arXiv:1706.06462v1 fatcat:7xho4a42dvfhrn4tu7tridyy7a

Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming [article]

Dennis Rieber, Axel Acosta, Holger Fröning
2021 arXiv   pre-print
The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators.  ...  However, implementing such operators efficiently with complex hardware intrinsics such as matrix multiply is a task not yet automated gracefully.  ...  This optimization can reduce the number of nodes and edges in a graph significantly, without losing correctness of representation. • Nodes with labels with only an outgoing self-edge, or no outgoing edges  ... 
arXiv:2104.04731v4 fatcat:g4uzwaivtjhajjwaimdct7y5wu
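
The bullet quoted at the end describes a pruning rule. A toy reading of it (my own graph encoding, iterated to a fixpoint, which may be stricter than the paper's single pass): drop nodes whose only outgoing edge is a self-loop, or that have no outgoing edges at all.

```python
def prune(nodes, edges):
    # Remove nodes with only a self-loop out-edge or no out-edges at all;
    # repeat, since each removal can create new such sinks.
    nodes, edges = set(nodes), set(edges)
    while True:
        out = {n: {d for s, d in edges if s == n} for n in nodes}
        dead = {n for n in nodes if out[n] <= {n}}
        if not dead:
            return nodes, edges
        nodes -= dead
        edges = {(s, d) for s, d in edges if s in nodes and d in nodes}

# a -> b (self-loop on b), a -> c: b and c go first, then a becomes a sink.
assert prune({"a", "b", "c"}, {("a", "b"), ("b", "b"), ("a", "c")}) == (set(), set())
```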

Automated Deep Learning: Neural Architecture Search Is Not the End [article]

Xuanyi Dong, David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys
2022 arXiv   pre-print
It requires grappling with problem formulation and context understanding, data engineering, model development, deployment, continuous monitoring and maintenance, and so on.  ...  Consequently, in response to these issues, a new field has emerged over the last few years: automated deep learning (AutoDL).  ...  Acknowledgments: XD and DJK acknowledge financial support secured by BG, which funded their participation in this study and the ongoing "Automated and Autonomous Machine Learning" project as part of the  ... 
arXiv:2112.09245v3 fatcat:dujfh7pzmzbrtdyoshkl4kpbsm

Mistify: Automating DNN Model Porting for On-Device Inference at the Edge

Peizhen Guo, Bo Hu, Wenjun Hu
2021 Symposium on Networked Systems Design and Implementation  
This often necessitates fitting a model originally designed and trained in the cloud to edge devices with a range of hardware capabilities, which so far has relied on time-consuming manual effort.  ...  Extensive evaluation shows that Mistify reduces the DNN porting time needed by over 10× to cater to a wide spectrum of edge deployment scenarios, incurring orders of magnitude less manual effort.  ...  Acknowledgments We thank the anonymous reviewers and our shepherd, Eric Schkufza, for their insightful comments. Julia McClellan helped with early exploration of the work.  ... 
dblp:conf/nsdi/GuoHH21 fatcat:yf7dnug7wbcyzen7ttlxn6dhme

Exocompilation for productive programming of hardware accelerators

Yuka Ikarashi, Gilbert Louis Bernstein, Alex Reinking, Hasan Genc, Jonathan Ragan-Kelley
2022 Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation  
Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations.  ...  are commonly still coded and optimized by hand, at great expense, in low-level C and assembly.  ...  This work was partially supported by the Applications Driving Architectures (ADA) center, one of six centers of JUMP, a Semiconductor Research Corporation program co-sponsored by DARPA.  ... 
doi:10.1145/3519939.3523446 fatcat:ak3wlrnzvvacrktvdzjqfa5pdm
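
A toy flavor of "schedules as composable rewrites" (an invented mini-IR, not Exo's actual API): a split rewrite that preserves the loop's meaning by substituting the reconstructed index into the body, so rewrites can be chained into a schedule.

```python
from dataclasses import dataclass

@dataclass
class Loop:
    var: str
    extent: int
    body: object  # nested Loop, or a statement template like "C[{i}] = A[{i}]"

def subst(node, var, expr):
    if isinstance(node, Loop):
        return Loop(node.var, node.extent, subst(node.body, var, expr))
    return node.replace("{" + var + "}", expr)

def split(loop, factor):
    # for i in [0, N): S({i})   ==>   for io: for ii: S({io}*f + {ii})
    # Equivalence holds when factor divides the extent (asserted here;
    # Exo's effect analyses discharge such side conditions for real).
    assert loop.extent % factor == 0
    io, ii = loop.var + "o", loop.var + "i"
    body = subst(loop.body, loop.var, f"({{{io}}}*{factor} + {{{ii}}})")
    return Loop(io, loop.extent // factor, Loop(ii, factor, body))

loop = Loop("i", 64, "C[{i}] = A[{i}] + B[{i}]")
tiled = split(split(loop, 8), 2)  # rewrites compose into a schedule
```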

A Survey on the Optimization of Neural Network Accelerators for Micro-AI On-Device Inference

Arnab Neelim Mazumder, Jian Meng, Hasib-Al Rashid, Utteja Kallakuri, Xin Zhang, Jae-sun Seo, Tinoosh Mohsenin
2021 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
techniques, and the realization of the micro-AI models on resource-constrained hardware and different design considerations associated with it.  ...  The efficacy of DNNs coincides with the fact that they can provide state-of-the-art inference accuracy for these applications.  ...  In this case, the transformation matrices are replaced with FFT of both the image and the kernel.  ... 
doi:10.1109/jetcas.2021.3129415 fatcat:nknpy4eernaeljz2hpqafe7sja
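
The last sentence refers to Winograd-style fast convolution, where fixed transformation matrices play the role that DFT matrices play in FFT-based convolution. A self-contained 1-D F(2,3) instance using the standard Lavin–Gray matrices (my own illustration; the survey is not tied to this code):

```python
import numpy as np

# Winograd F(2,3) transformation matrices: 2 outputs of a 3-tap filter
# from 4 inputs, using only 4 elementwise multiplies instead of 6.
Bt = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)   # input transform
G  = np.array([[1.0, 0.0, 0.0],
               [0.5, 0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0, 0.0, 1.0]])               # filter transform
At = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)   # output transform

d = np.array([1.0, 2.0, 3.0, 4.0])   # 4 input samples
g = np.array([0.5, 1.0, -1.0])       # 3-tap filter

y = At @ ((G @ g) * (Bt @ d))        # transform, multiply pointwise, invert
assert np.allclose(y, [np.dot(d[0:3], g), np.dot(d[1:4], g)])
```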
Showing results 1–15 of 1,663.