Efficient Realization of BCD Multipliers Using FPGAs
2017
International Journal of Reconfigurable Computing
Pipelined BCD multipliers were implemented for 4 × 4, 8 × 8, and 16 × 16-digit multipliers. ...
The main highlight of the proposed architecture is the generation of the partial products and parallel binary operations based on 2-digit columns. 1 × 1-digit multipliers used for the partial product generation ...
Pipelined Multipliers. Based on the architecture of the BCD multiplier, a 4-stage pipelined BCD multiplier is illustrated in Figure 11. ...
doi:10.1155/2017/2410408
fatcat:vfmxlsxzzzg7decvavupaffvpu
Low Power Fir Filter Design Using Truncated Multiplier
English
2014
International Journal of Engineering Trends and Technology
Multiple constant multiplication/accumulation in a pipelined direct FIR structure is implemented using an improved version of truncated multipliers. ...
Index Terms-Digital signal processing (DSP), faithful rounding, truncated multipliers, FIR filter design. ...
The correction term that is generated is based on the following arguments, 1) The biggest column in the entire partial product array of a full-width multiplier is the Nth column. 2) The Nth column contributes ...
doi:10.14445/22315381/ijett-v10p208
fatcat:c4tlwsccp5cbhlt7lsoyivvm7m
An Efficient Pipeline Architecture and Memory Bit-Width Analysis for Discrete Wavelet Transform of the 9/7 Filter for JPEG 2000
2009
Journal of Signal Processing Systems
In general, using pipeline architectures can increase the processing speed of 1-D column processor, but more pipeline registers also increase the internal memory size of row processor for 2-D DWT [3] [ ...
The proposed 1-D column processor requires fewer pipeline registers to achieve about the same critical path compared with other lifting-based architectures. ...
Direct mapping architecture requires more pipeline registers to achieve one multiplier delay (T_M) for the 1-D column processor. ...
doi:10.1007/s11265-009-0375-y
fatcat:5g5cbmtzkvcf5a3jo3snw7igne
Prototyping design of a flexible DSP block with pipeline structure for FPGA
2016
IEICE Electronics Express
To alleviate timing degradation caused by more congested routing, we implement a pipelined design in the Compressor Array. ...
Considering the situation in Fig. 2(a), if a (4:2) compressor is used to compress data in column n, the compression result will be 3 bits in column n + 1 and 1 bit in column n in the next stage. ...
To perform multioperand addition, signals in a rectangular region from column 0 to column 18 should be compressed. ...
doi:10.1587/elex.13.20160676
fatcat:uhyt6yhvlndk3biz66uefvncv4
A deeply-pipelined FPGA-based SpMV accelerator with a hardware-friendly storage scheme
2015
IEICE Electronics Express
the nested block compression and variable-bit-width column-index encoding schemes. ...
Based on the proposed compression scheme, a deeply-pipelined SpMV accelerator is implemented on a Xilinx Virtex XC7VX485T FPGA platform, which can handle sparse matrices with arbitrary size and sparsity ...
the redundant computations and memory accesses by exploiting nested block compression and column indices compression. (2) Based on the proposed compression scheme, a deeply-pipelined SpMV accelerator ...
doi:10.1587/elex.12.20150161
fatcat:p3g7mtuutnf7zkymhg4lbbopxm
Design of Low Power 2-D Dct Architecture Using Reconfigurable Architecture
2012
IOSR Journal of Electronics and Communication Engineering
This area efficient and low error DCT is obtained by using shifters and adders in place of multipliers. ...
A pipelining technique is also introduced here, which reduces the processing time. Design unit frequency report: 2-D IDCT, 115.85 MHz ...
In this paper we present a VLSI implementation of a fully pipelined, multiplierless architecture for the 8x8 2D DCT/IDCT. This architecture is used as the core of JPEG compression hardware. ...
doi:10.9790/2834-0312025
fatcat:vspk7bd66bepxfma5phazwkcgm
Reconfigurable Fixed Point Dense and Sparse Matrix-Vector Multiply/Add Unit
2006
IEEE 17th International Conference on Application-specific Systems, Architectures and Processors (ASAP'06)
In this paper, we propose a reconfigurable hardware accelerator for fixed-point matrix-vector multiply/add operations, capable of working on dense and sparse matrix formats. ...
Table 3. Matrix-Vector Multiply/Add Unit on a Xilinx XC2VP100 (pipeline stages and file registers: time delay, hardware use): Partial Multiply, Multiple-Addition. ...
Control: refers to the control register that holds the Column and EOR information in the pipeline. ...
doi:10.1109/asap.2006.58
dblp:conf/asap/CalderonV06
fatcat:34zdegtga5glzic3eh75z2phma
Design and Implementation of High-Speed and Energy-Efficient Variable-Latency Speculating Booth Multiplier (VLSBM)
2013
IEEE Transactions on Circuits and Systems Part 1: Regular Papers
modified Booth multiplier when . ...
1.0 to 1.4 times, and reduces the cycle count ratio by approximately 1.3 to 1.8 times in comparison to the fastest conventional two-stage pipelined Booth multiplier. ...
Because all partial product bits within each column are summed in parallel, the Wallace tree compression is superior during the second step. ...
doi:10.1109/tcsi.2013.2248851
fatcat:37uyvarjivg5ha4xludkr3mp6a
High speed VLSI architectures for DWT in biometric image compression: A study
2010
Procedia Computer Science
Image compression is a vital part of the process. ...
This paper studies various techniques that help in realizing the fast operation of the transform stage of the image compression processes. ...
This architecture uses 4 multipliers and 6.3 CSAs. The pipelining architecture proposed by Mansouri et al. ...
doi:10.1016/j.procs.2010.11.028
fatcat:rgslstx6sbc7jlpkjrp4qphmxa
Low Latency CMOS Hardware Acceleration for Fully Connected Layers in Deep Neural Networks
[article]
2020
arXiv
pre-print
We have achieved this considerable improvement by fully utilizing the HBM units for storing and reading out column-specific FC-layer weights in 1 cycle with a novel column-row-column schedule, and implementing ...
The FC accelerator, FC-ACCL, is based on 128 8x8 or 16x16 processing elements (PEs) for matrix-vector multiplication, and 128 multiply-accumulate (MAC) units integrated with 128 High Bandwidth Memory ( ...
The recently described EIE ASIC [12] accelerates both CONV and FC layers by using compression to derive a compressed network model. ...
arXiv:2011.12839v1
fatcat:luyzr74a75eavhimgc6iikio2m
High performance compressive sensing reconstruction hardware with QRD process
2012
2012 IEEE International Symposium on Circuits and Systems
This paper presents a high performance architecture for the reconstruction of compressive sampled signals using Orthogonal Matching Pursuit (OMP) algorithm. ...
In this paper, multiply and add is divided into 3 pipeline stages, which decreases the delay of this block. Multiplication takes place in the first stage of the pipeline. ...
It uses a single multiplier and is pipelined to perform one multiplication per cycle thus producing the result in 6 clock cycles.
E. ...
doi:10.1109/iscas.2012.6271921
dblp:conf/iscas/StanislausM12
fatcat:ymn6wh7pyzaelexodnhtkaevha
Optimizing multiplier design for enhanced processor performance
2024
Applied and Computational Engineering
The crux of multiplier design lies in reducing the count of partial products and compressing them. ...
This paper presents the design of a multiplier that utilizes the Booth algorithm and the Wallace tree structure for optimization, along with the incorporation of registers for secondary pipeline processing ...
of vertical expansion calculation of the base 10 column as shown in Figure 1. ...
doi:10.54254/2755-2721/38/20230564
fatcat:kdebriqq3bdj3j4qor2ymjaile
Pipeline Architecture of 2d Dct for High Efficiency Video Coding
2017
International Journal of Engineering Research and
Pipelining technique is introduced to reduce the processing time. ...
12.74%, and it reduced the execution time of DCT operations in HEVC HM software encoder up to 37.27%. Currently different types of transform techniques are used by different video codes to achieve data compression ...
doi:10.17577/ijertv6is050522
fatcat:eggb67ai7vd4fe54kazii27nie
A framework for propagation of uncertainties in the Kepler data analysis pipeline
2010
Software and Cyberinfrastructure for Astronomy
We describe the POU Framework and SVD compression scheme and its implementation in the Kepler SOC pipeline. ...
We present a novel framework used to implement standard propagation of uncertainties (POU) in the Kepler Science Operations Center (SOC) data processing pipeline. ...
Some of the metadata is compressible across cadences in a lossless fashion. ...
doi:10.1117/12.857758
fatcat:6quggsa325gvhiwfjtwunpsw7a
Page 2113 of American Society of Civil Engineers. Collected Journals Vol. 109, Issue 9
[page]
1986
American Society of Civil Engineers. Collected Journals
Duncan (9) considered multiply and continuously loaded columns but scaled all loads to use only one load variable. ...
Willers (16) studied the buckling of heavy columns with movably hinged lower end and compressive end load. ...