
PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections

Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, Liyan Zheng, Yuanzhi Li, Kaiyuan Rong, Yuanyong Chen, Zhihao Jia
2021 USENIX Symposium on Operating Systems Design and Implementation  
We propose PET, the first DNN framework that optimizes tensor programs with partially equivalent transformations and automated corrections.  ...  optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels.  ...  This work is partially supported by National Natural Science Foundation of China (U20A20226, 62072262) and Beijing Natural Science Foundation (4202031).  ... 
dblp:conf/osdi/WangZGMTZLRCJ21 fatcat:4sf4vbr4avfffhwfdedpidq7oa
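
A rough illustration of what a partially equivalent transformation plus automated correction can look like, on a toy 1-D analogue I constructed (this is not PET's actual mutant generator or correction kernels): the input is split into two halves that are convolved independently, which parallelizes well but is wrong near the seam, and a small band of affected outputs is then recomputed exactly.

```python
import numpy as np

def same_conv(x, k):
    # Reference "same"-size correlation with zero padding at the borders.
    p = len(k) // 2
    xp = np.pad(x, (p, len(k) - 1 - p))
    return np.array([np.dot(xp[i:i + len(k)], k) for i in range(len(x))])

def split_conv_with_correction(x, k):
    n, m, h = len(x), len(k), len(x) // 2
    # Partially equivalent transform: convolve the halves as if independent.
    y = np.concatenate([same_conv(x[:h], k), same_conv(x[h:], k)])
    # Outputs near the seam saw zero padding instead of the neighboring
    # half; recompute a conservative band of them exactly (the correction).
    p = m // 2
    for i in range(max(0, h - m + 1), min(n, h + m - 1)):
        window = [x[j] if 0 <= j < n else 0.0 for j in range(i - p, i - p + m)]
        y[i] = np.dot(window, k)
    return y

x, k = np.arange(10.0), np.array([1.0, -2.0, 1.0])
assert np.allclose(split_conv_with_correction(x, k), same_conv(x, k))
```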

DeepEvolution: A Search-Based Testing Approach for Deep Neural Networks [article]

Houssem Ben Braiek, Foutse Khomh
2019 arXiv   pre-print
One common denominator of these testing techniques is the automated generation of test cases, e.g., new inputs transformed from the original training data with the aim to optimize some test adequacy criteria  ...  So far, the effectiveness of these approaches has been hindered by their reliance on random fuzzing or transformations that do not always produce test cases with a good diversity.  ... 
arXiv:1909.02563v1 fatcat:afz7fvaz3zbera235glfvjueeq
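
A generic sketch of the search loop the snippet describes: transform seed inputs and keep mutants that improve a test-adequacy fitness. The mutation operator and fitness below are stand-ins of my own, not DeepEvolution's actual population-based transformations.

```python
import random

def mutate(x, rng):
    # Stand-in transformation; DeepEvolution applies semantics-preserving
    # image transformations (e.g., rotation, blur) rather than raw noise.
    return [v + rng.uniform(-0.1, 0.1) for v in x]

def search(seeds, fitness, budget=200, seed=0):
    # Keep mutants only when they raise the test-adequacy score, so the
    # search is guided rather than purely random fuzzing.
    rng = random.Random(seed)
    population = [list(s) for s in seeds]
    for _ in range(budget):
        candidate = mutate(rng.choice(population), rng)
        if fitness(candidate) > max(fitness(p) for p in population):
            population.append(candidate)
    return max(population, key=fitness)

# Example: maximize a trivial "coverage" proxy over 3-dimensional inputs.
best = search([[0.0, 0.0, 0.0]], fitness=lambda x: sum(abs(v) for v in x))
```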

Sommelier: Curating DNN Models for the Masses

Peizhen Guo, Bo Hu, Wenjun Hu
2022 Proceedings of the 2022 International Conference on Management of Data  
In this paper, we present Sommelier, an indexing and query system above typical DNN model repositories to interface directly with inference serving or other use cases.  ...  Motivated by manual iterative model search processes and typical model design strategies that generate model variants or models with common segments, Sommelier organizes DNN models based on their semantic  ...  CNS-1815115 and 2112562, and a Google Faculty Research Award.  ... 
doi:10.1145/3514221.3526173 fatcat:evlxt33ogbdevbeswanfswtvj4

Special Session: Towards an Agile Design Methodology for Efficient, Reliable, and Secure ML Systems [article]

Shail Dave, Alberto Marchisio, Muhammad Abdullah Hanif, Amira Guesmi, Aviral Shrivastava, Ihsen Alouani, Muhammad Shafique
2022 arXiv   pre-print
and secure ML systems based on user-defined constraints and objectives.  ...  This article summarizes the main challenges in agile development of efficient, reliable and secure ML systems, and then presents an outline of an agile design methodology to generate efficient, reliable  ...  Graph Optimizations: Graph-level optimizations help improve computational and memory efficiency by applying various transformations.  ... 
arXiv:2204.09514v1 fatcat:ho7auszvmferrn36evs7oqdpt4
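
The graph-optimization sentence is easiest to see with one concrete rewrite. Below is a toy operator-fusion pass over an invented node list (not the article's framework): a MatMul is folded into its only consuming Add, eliminating one intermediate tensor and one kernel launch.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)

def fuse_matmul_add(nodes):
    # Fuse MatMul -> Add chains when the MatMul has no other consumer.
    by_name = {n.name: n for n in nodes}
    consumers = {}
    for n in nodes:
        for i in n.inputs:
            consumers.setdefault(i, []).append(n.name)
    fuse = {}  # Add node name -> the MatMul node it absorbs
    for n in nodes:
        prod = by_name.get(n.inputs[0]) if n.inputs else None
        if (n.op == "Add" and prod and prod.op == "MatMul"
                and consumers.get(prod.name) == [n.name]):
            fuse[n.name] = prod
    consumed = {mm.name for mm in fuse.values()}
    out = []
    for n in nodes:
        if n.name in consumed:
            continue  # folded into the fused node emitted below
        if n.name in fuse:
            mm = fuse[n.name]
            out.append(Node(n.name, "MatMulAdd", mm.inputs + n.inputs[1:]))
        else:
            out.append(n)
    return out

g = [Node("mm", "MatMul", ["x", "w"]), Node("y", "Add", ["mm", "b"])]
assert fuse_matmul_add(g) == [Node("y", "MatMulAdd", ["x", "w", "b"])]
```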

NeuRI: Diversifying DNN Generation via Inductive Rule Inference [article]

Jiawei Liu, Jinjun Peng, Yuyao Wang, Lingming Zhang
2023 arXiv   pre-print
As such, the recent wave of research has been studying the automated synthesis of test-cases (i.e., DNN models and their inputs) for fuzzing DL systems.  ...  NeuRI finds 100 new bugs for PyTorch and TensorFlow in four months, with 81 already fixed or confirmed.  ...  Consequently, it is crucial to harness the correctness of DL systems via extensive and automated testing.  ... 
arXiv:2302.02261v3 fatcat:l3wkyicuuzccbdcbmgj2ndg66i

Refactoring Neural Networks for Verification [article]

David Shriver, Dong Xu, Sebastian Elbaum, Matthew B. Dwyer
2019 arXiv   pre-print
Unlike with traditional code refactoring, DNN refactoring does not guarantee functional equivalence of the two networks, but rather it aims to preserve the accuracy of the original network while producing  ...  A DNN refactoring defines (a) the transformation of the DNN's architecture, i.e., the number and size of its layers, and (b) the distillation of the learned relationships between the input features and  ...  First, the configuration language provides a simple and unified way to specify architectural transformations, and R4V removes potential errors associated with the transformation itself by automating this  ... 
arXiv:1908.08026v1 fatcat:j4bgxitgcjdupimxmzxsh4appq
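
To make the "configuration language" idea concrete, here is a hypothetical declarative config applied to a list of layer widths. The schema is invented for illustration only; R4V's actual language operates on full network descriptions.

```python
def refactor(widths, config):
    # widths: [input, hidden..., output]. Keep the interface fixed and
    # transform only the hidden layers, since a refactoring should preserve
    # input/output shapes while shrinking the network.
    drop = set(config.get("drop_layers", []))
    s = config.get("scale_widths", 1.0)
    hidden = [w for i, w in enumerate(widths[1:-1], start=1) if i not in drop]
    return [widths[0]] + [max(1, round(w * s)) for w in hidden] + [widths[-1]]

# e.g. halve hidden widths and delete the second hidden layer:
assert refactor([784, 512, 512, 256, 10],
                {"drop_layers": [2], "scale_widths": 0.5}) == [784, 256, 128, 10]
```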

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation [article]

Ziji Shi, Le Jiang, Ang Wang, Jie Zhang, Xianyan Jia, Yong Li, Chencan Wu, Jialin Li, Wei Lin
2023 arXiv   pre-print
Experiments show that TAP is 20×–160× faster than the state-of-the-art automatic parallelism framework, and the performance of its discovered schedules is competitive with the expert-engineered ones.  ...  In this work, we present a model parallelism framework TAP that automatically searches for the best data and tensor parallel schedules.  ...  However, Y_0 is just a partial result, and it needs to be added to Y_1 from device 1 to get Y. As such, an AllReduceSum communication is required to sum them up for mathematical equivalence.  ... 
arXiv:2302.00247v1 fatcat:edwlqi7pqng2dd4vjxvwih2lse
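
The AllReduceSum remark in the snippet is just block matrix multiplication. A two-"device" NumPy sketch (my own toy, not TAP's runtime):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))

# Row-partition W across two devices; X is column-split to match.
Y0 = X[:, :4] @ W[:4]   # partial result held on device 0
Y1 = X[:, 4:] @ W[4:]   # partial result held on device 1

# Each Y_i alone is wrong; AllReduceSum adds them (here, a local add).
Y = Y0 + Y1
assert np.allclose(Y, X @ W)
```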

Joint Program and Layout Transformations to Enable Convolutional Operators on Specialized Hardware Based on Constraint Programming

Dennis Rieber, Axel Acosta, Holger Fröning
2022 ACM Transactions on Architecture and Code Optimization (TACO)  
The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators.  ...  However, implementing such operators efficiently with complex hardware intrinsics such as matrix multiply is a task not yet automated gracefully.  ...  The produced results also motivate research into the interplay of data layouts, their transformation, and how this interacts with the overall DNN and accelerator architecture.  ... 
doi:10.1145/3487922 fatcat:gnuvco7rffcdzcuirotonjnssi

HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation [article]

Hanchen Ye, Xiaofan Zhang, Zhize Huang, Gengsheng Chen, Deming Chen
2020 arXiv   pre-print
To speed up Deep Neural Network (DNN) accelerator design and enable effective implementation, we propose HybridDNN, a framework for building high-performance hybrid DNN accelerators and delivering FPGA-based  ...  This demonstrates that HybridDNN is flexible and scalable and can target both cloud and embedded hardware platforms with vastly different resource constraints.  ...  ACKNOWLEDGMENTS This work is supported in part by the IBM-Illinois Center for Cognitive Computing Systems Research (C3SR) and XMotors.ai.  ... 
arXiv:2004.03804v1 fatcat:2r7ymftbordw5odrfndowsuxg4

Towards Proof Synthesis Guided by Neural Machine Translation for Intuitionistic Propositional Logic [article]

Taro Sekiyama, Akifumi Imanishi, Kohei Suenaga
2017 arXiv   pre-print
We implement the whole framework and empirically observe that a generated proof term is close to a correct proof in terms of the tree edit distance of ASTs.  ...  Inspired by the recent evolution of deep neural networks (DNNs) in machine learning, we explore their application to PL-related topics.  ...  We also appreciate Akihiro Yamamoto; the discussion with him led to the evaluation metrics used in this paper. This paper is partially supported by JST PRESTO Grant Number JPMJPR15E5, Japan.  ... 
arXiv:1706.06462v1 fatcat:7xho4a42dvfhrn4tu7tridyy7a

Joint Program and Layout Transformations to enable Convolutional Operators on Specialized Hardware based on Constraint Programming [article]

Dennis Rieber, Axel Acosta, Holger Fröning
2021 arXiv   pre-print
The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators.  ...  However, implementing such operators efficiently with complex hardware intrinsics such as matrix multiply is a task not yet automated gracefully.  ...  This optimization can reduce the number of nodes and edges in a graph significantly, without losing correctness of representation. • Nodes with labels with only an outgoing self-edge, or no outgoing edges  ... 
arXiv:2104.04731v4 fatcat:g4uzwaivtjhajjwaimdct7y5wu
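
The bullet quoted at the end describes a pruning rule. A toy reading of it (my own graph encoding, iterated to a fixpoint, which may be stricter than the paper's single pass): drop nodes whose only outgoing edge is a self-loop, or that have no outgoing edges at all.

```python
def prune(nodes, edges):
    # Remove nodes with only a self-loop out-edge or no out-edges at all;
    # repeat, since each removal can create new such sinks.
    nodes, edges = set(nodes), set(edges)
    while True:
        out = {n: {d for s, d in edges if s == n} for n in nodes}
        dead = {n for n in nodes if out[n] <= {n}}
        if not dead:
            return nodes, edges
        nodes -= dead
        edges = {(s, d) for s, d in edges if s in nodes and d in nodes}

# a -> b (self-loop on b), a -> c: b and c go first, then a becomes a sink.
assert prune({"a", "b", "c"}, {("a", "b"), ("b", "b"), ("a", "c")}) == (set(), set())
```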

Automated Deep Learning: Neural Architecture Search Is Not the End [article]

Xuanyi Dong, David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys
2022 arXiv   pre-print
It requires grappling with problem formulation and context understanding, data engineering, model development, deployment, continuous monitoring and maintenance, and so on.  ...  Consequently, in response to these issues, a new field has emerged over the last few years: automated deep learning (AutoDL).  ...  Acknowledgments: XD and DJK acknowledge financial support secured by BG, which funded their participation in this study and the ongoing "Automated and Autonomous Machine Learning" project as part of the  ... 
arXiv:2112.09245v3 fatcat:dujfh7pzmzbrtdyoshkl4kpbsm

Mistify: Automating DNN Model Porting for On-Device Inference at the Edge

Peizhen Guo, Bo Hu, Wenjun Hu
2021 Symposium on Networked Systems Design and Implementation  
This often necessitates fitting a model originally designed and trained in the cloud to edge devices with a range of hardware capabilities, which so far has relied on time-consuming manual effort.  ...  Extensive evaluation shows that Mistify reduces the DNN porting time needed by over 10× to cater to a wide spectrum of edge deployment scenarios, incurring orders of magnitude less manual effort.  ...  Acknowledgments We thank the anonymous reviewers and our shepherd, Eric Schkufza, for their insightful comments. Julia McClellan helped with early exploration of the work.  ... 
dblp:conf/nsdi/GuoHH21 fatcat:yf7dnug7wbcyzen7ttlxn6dhme

Exocompilation for productive programming of hardware accelerators

Yuka Ikarashi, Gilbert Louis Bernstein, Alex Reinking, Hasan Genc, Jonathan Ragan-Kelley
2022 Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation  
Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations.  ...  are commonly still coded and optimized by hand, at great expense, in low-level C and assembly.  ...  This work was partially supported by the Applications Driving Architectures (ADA) center, one of six centers of JUMP, a Semiconductor Research Corporation program co-sponsored by DARPA.  ... 
doi:10.1145/3519939.3523446 fatcat:ak3wlrnzvvacrktvdzjqfa5pdm
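
A toy flavor of "schedules as composable rewrites" (an invented mini-IR, not Exo's actual API): a split rewrite that preserves the loop's meaning by substituting the reconstructed index into the body, so rewrites can be chained into a schedule.

```python
from dataclasses import dataclass

@dataclass
class Loop:
    var: str
    extent: int
    body: object  # nested Loop, or a statement template like "C[{i}] = A[{i}]"

def subst(node, var, expr):
    if isinstance(node, Loop):
        return Loop(node.var, node.extent, subst(node.body, var, expr))
    return node.replace("{" + var + "}", expr)

def split(loop, factor):
    # for i in [0, N): S({i})   ==>   for io: for ii: S({io}*f + {ii})
    # Equivalence holds when factor divides the extent (asserted here;
    # Exo's effect analyses discharge such side conditions for real).
    assert loop.extent % factor == 0
    io, ii = loop.var + "o", loop.var + "i"
    body = subst(loop.body, loop.var, f"({{{io}}}*{factor} + {{{ii}}})")
    return Loop(io, loop.extent // factor, Loop(ii, factor, body))

loop = Loop("i", 64, "C[{i}] = A[{i}] + B[{i}]")
tiled = split(split(loop, 8), 2)  # rewrites compose into a schedule
```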

A Survey on the Optimization of Neural Network Accelerators for Micro-AI On-Device Inference

Arnab Neelim Mazumder, Jian Meng, Hasib-Al Rashid, Utteja Kallakuri, Xin Zhang, Jae-sun Seo, Tinoosh Mohsenin
2021 IEEE Journal on Emerging and Selected Topics in Circuits and Systems  
techniques, and the realization of the micro-AI models on resource-constrained hardware and different design considerations associated with it.  ...  The efficacy of DNNs coincides with the fact that they can provide state-of-the-art inference accuracy for these applications.  ...  In this case, the transformation matrices are replaced with FFT of both the image and the kernel.  ... 
doi:10.1109/jetcas.2021.3129415 fatcat:nknpy4eernaeljz2hpqafe7sja
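
The last sentence refers to Winograd-style fast convolution, where fixed transformation matrices play the role that DFT matrices play in FFT-based convolution. A self-contained 1-D F(2,3) instance using the standard Lavin–Gray matrices (my own illustration; the survey is not tied to this code):

```python
import numpy as np

# Winograd F(2,3) transformation matrices: 2 outputs of a 3-tap filter
# from 4 inputs, using only 4 elementwise multiplies instead of 6.
Bt = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)   # input transform
G  = np.array([[1.0, 0.0, 0.0],
               [0.5, 0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0, 0.0, 1.0]])               # filter transform
At = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)   # output transform

d = np.array([1.0, 2.0, 3.0, 4.0])   # 4 input samples
g = np.array([0.5, 1.0, -1.0])       # 3-tap filter

y = At @ ((G @ g) * (Bt @ d))        # transform, multiply pointwise, invert
assert np.allclose(y, [np.dot(d[0:3], g), np.dot(d[1:4], g)])
```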
Showing results 1–15 of 1,663.