A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
The file type is application/pdf.
Mapping of Neural Network Models onto Systolic Arrays
2000
Journal of Parallel and Distributed Computing
This paper presents a mapping scheme for the proposed implementation of neural network models on systolic arrays. ...
systolic array. ...
The same mapping strategy has been used in [8] for mapping the hidden Markov model (HMM) and the recursive back-propagation network (RBP) onto the ring systolic array. ...
doi:10.1006/jpdc.2000.1634
fatcat:p3pelrmeonh7bnm6blbcm3w3f4
REACT
2022
Proceedings of the 59th ACM/IEEE Design Automation Conference
Unlike conventional dynamic NoCs, REACT's NoCs have no buffer queues, flow control or routing, as they are entirely configured by software for each neural network. ...
On-chip training improves model accuracy on personalised user data and preserves privacy. ...
Tushar Krishna and Sheng-Chun (Felix) Kao from Georgia Institute of Technology for their help using the SCALE-Sim and MAESTRO tools. ...
doi:10.1145/3489517.3530406
fatcat:odymeerkc5cfhl4bt7bhybeyum
Trends in systolic and cellular computation
1991
Journal of Computational and Applied Mathematics
Using annotated lists of recent references, snapshots of active research areas are given on systolic linear solvers, the singular value decomposition, artificial neural networks, and the simulated annealing ...
The focus of the presentation is on linear algebraic techniques. A systolization process is illustrated by matching an arbitrary gaxpy operation onto a fixed-size square array processor. ...
Acknowledgements The author is grateful to colleagues at Hughes Research Laboratories, Greg Nash, Wojtek Przytula and David Schwartz, who provided much of the information contained in this paper. ...
doi:10.1016/0377-0427(91)90178-m
fatcat:ncp26oglknet3bjejr6klqhioa
A fuzzy clustering neural networks (FCNs) system design methodology
2000
IEEE Transactions on Neural Networks
Two mapping strategies both from FCN model to system architecture and from the given architecture to systolic arrays are described. ...
A system design methodology for fuzzy clustering neural networks (FCNs) is presented. ...
, and mapping the neural networks onto the corresponding systolic arrays (see Fig. 1 ). ...
doi:10.1109/72.870048
pmid:18249843
fatcat:zevwhbabxbhkrhuywug6hfpls4
SYSTOLIC ARRAY METHODOLOGY FOR A NEURAL MODEL TO SOLVE THE MIXTURE PROBLEM
[chapter]
2002
Series in Machine Perception and Artificial Intelligence
The proposed method is supported by a linear recurrent neural network based on the Hopfield model (HRNN). ...
Both structures are used for realising the iterative process of the Neural Network. ...
Alejandro Curado Fuentes for his linguistic revision of this paper. ...
doi:10.1142/9789812778086_0002
fatcat:xvzov3catnb4pee53z66ox6vea
Design and Scaffolded Training of an Efficient DNN Operator for Computer Vision on the Edge
[article]
2021
arXiv
pre-print
Further, by combining NOS with NAS, we design networks that define state-of-the-art models improving on both accuracy and latency on systolic arrays. ...
The resultant computation efficiently maps to systolic arrays. The optimal dataflow, called Spatial-Tiled Output Stationary (ST-OS), maximizes the efficiency of FuSeConv on systolic arrays. ...
We thank Surya Selvam for his contributions to an earlier version of this research effort [49] . ...
arXiv:2108.11441v1
fatcat:duhrhvp3g5dapl4b43fkokfiku
FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays
[article]
2021
arXiv
pre-print
With FuSeConv, we achieve a significant speed-up of 3x-7x with the MobileNet family of networks on a systolic array of size 64x64, with comparable accuracy on the ImageNet dataset. ...
Both efficient neural networks and hardware accelerators are being explored to speed up DNN inference on edge devices. ...
We thank Gokulan for his help in modeling systolic-arrays. Finally, we thank the anonymous reviewers for their insightful comments and suggestions towards improving the work. ...
arXiv:2105.13434v1
fatcat:gjnzf7mnabeoti2cc5zol47iaq
A Full-stack Accelerator Search Technique for Vision Applications
[article]
2021
arXiv
pre-print
Although FAST can be used on any number and type of deep learning workload, in this paper we focus on optimizing for a single or small set of vision models, resulting in significantly faster and more power-efficient ...
The rapidly-changing ML model landscape presents a unique opportunity for building hardware accelerators optimized for specific datacenter-scale workloads. ...
However, depthwise-separable convolutions do not map well onto TPUs due to poor systolic array utilization and operational intensity. ...
arXiv:2105.12842v1
fatcat:mtunvjdcdrcr5pc5bpyfye7mea
Array Aware Training/Pruning: Methods for Efficient Forward Propagation on Array-based Neural Network Accelerators
2020
2020 IEEE 31st International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Our goal is to compress the model based on the size of the array so as to reduce the number of computation cycles. ...
In practice, due to the mismatch between matrix and array sizes, the computation does not map on the array exactly. ...
CONCLUSION We addressed the problem of optimizing the size of DNN weight matrices to suit the hardware specifications for efficient forward propagation on array-based neural network accelerators. ...
doi:10.1109/asap49362.2020.00016
dblp:conf/asap/Chitty-VenkataS20
fatcat:ossyqk2wdffstohgwgnh2azmrm
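The matrix/array size mismatch the snippet above mentions can be made concrete with a minimal tile-count model (a hedged sketch; the function name and cost model below are illustrative, not taken from the paper):

```python
import math

def tile_passes(m, n, rows, cols):
    """Passes needed to map an M x N weight matrix onto a rows x cols
    systolic array: each pass handles one tile, and a partial tile
    still occupies a full pass."""
    return math.ceil(m / rows) * math.ceil(n / cols)

# A 130 x 130 matrix on a 128 x 128 array costs 4 passes instead of 1,
# which is why pruning weights down to the array size saves cycles.
print(tile_passes(130, 130, 128, 128))  # 4
print(tile_passes(128, 128, 128, 128))  # 1
```

Even a slight overshoot of the array dimensions multiplies the pass count, which motivates array-aware training/pruning.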
Systolic array exploitation of a neural network inherent parallelism solving the nearest neighbor problem
1997
Nonlinear Analysis
The inherent parallelism of the proposed neural network model is further exploited by its efficient parallel implementation, based on a systolic network architecture. ...
In this paper, we suggest the employment of artificial neural networks for solving the problem of document clustering and nearest neighbor matching, in a dynamic and self-adapting way. ...
One is in mapping the systolic algorithms for neural networks onto parallel computers, such as Warp, MasPar, MP-1, and Transputer arrays [10] [11] [12] [13] [14] , another is in designing programmable ...
doi:10.1016/s0362-546x(97)00345-3
fatcat:alz7hgjjbjevfpzwzbawlhoqo4
AIRCHITECT: Learning Custom Architecture Design and Mapping Space
[article]
2021
arXiv
pre-print
We use three case studies involving the optimal array design, SRAM buffer sizing, mapping, and schedule determination for systolic-array-based custom architecture design and mapping space. ...
In this paper we investigate the possibility of learning the optimization task using machine learning and hence using the learnt model to predict optimal parameters for the design and mapping space of ...
CONCLUSION This paper presents AIRCHITECT, a recommendation neural network for learning the architecture design and mapping space of systolic array based accelerators. ...
arXiv:2108.08295v1
fatcat:oynpbnf6mzc4vive7em5egfexy
FireFly: A High-Throughput and Reconfigurable Hardware Accelerator for Spiking Neural Networks
[article]
2023
arXiv
pre-print
With the introduction of the backpropagation algorithm and surrogate gradient, the structure of spiking neural networks has become more complex, and the performance gap with artificial neural networks ...
Spiking neural networks (SNNs) have been widely used due to their strong biological interpretability and high energy efficiency. ...
Input spike map channels are split into multiple tiles to fit into the height of the systolic array. Output spike map channels are calculated N at a time according to the width of the systolic array. ...
arXiv:2301.01905v2
fatcat:ngzoen5ohrabxfiztnlc32zkfq
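The channel tiling described in the FireFly snippet above can be sketched as follows (a hedged sketch: the helper and the array dimensions `array_h`/`array_w` are hypothetical illustrations, not FireFly's actual code):

```python
def channel_tiles(in_channels, out_channels, array_h, array_w):
    """Input spike-map channels are split into ceil(in/array_h) tiles to
    fit the array height; output channels are produced array_w at a time
    according to the array width."""
    in_tiles = -(-in_channels // array_h)     # ceiling division
    out_groups = -(-out_channels // array_w)  # ceiling division
    return in_tiles, out_groups

# 256 input channels on a 64-row array -> 4 input tiles;
# 128 output channels on a 32-column array -> 4 output groups.
print(channel_tiles(256, 128, 64, 32))  # (4, 4)
```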
SCALE-Sim: Systolic CNN Accelerator Simulator
[article]
2019
arXiv
pre-print
However, the research community lacks tools that provide insights on both the design trade-offs and efficient mapping strategies for systolic-array based accelerators. ...
To the best of our knowledge, this is the first systolic-array simulator tuned for running DNNs. ...
As mentioned earlier, there are many possible ways of mapping the compute onto the array. Each such mapping is called a data-flow. ...
arXiv:1811.02883v2
fatcat:x252tx3qgnffhiolmllj4jhmsa
SalvageDNN: salvaging deep neural network accelerators with permanent faults through saliency-driven fault-aware mapping
2019
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Deep neural networks (DNNs) have proliferated in most of the application domains that involve data processing, predictive analysis and knowledge inference. ...
We also present novel modifications in a systolic array design to further improve the yield of the accelerators while ensuring reliable DNN execution using 'SalvageDNN' and negligible overheads in terms ...
This Work is supported in parts by the German Research Foundation (DFG) as part of the GetSURE project in the scope of SPP-1500 (http://spp1500.itec.kit.edu) priority program, 'Dependable Embedded Systems ...
doi:10.1098/rsta.2019.0164
pmid:31865875
pmcid:PMC6939235
fatcat:5xw2nvfzgzbnfjnl3pfkqq3o7a
Sparse Winograd Convolutional neural networks on small-scale systolic arrays
[article]
2018
arXiv
pre-print
In this paper, we implement an accelerator on FPGA by combining the sparse Winograd convolution, clusters of small-scale systolic arrays, and a tailored memory layout design. ...
We also provide an analytical model analysis for the general Winograd convolution algorithm as a design reference. ...
layout down to small blocks then map these blocks onto small-scale systolic arrays to perform multiplications of submatrices, and share these submatrices among working arrays to reduce required memory ...
arXiv:1810.01973v1
fatcat:g44643olkvh23fkdp5lsraipe4
Showing results 1 — 15 out of 1,631 results