Abstract
While FPGAs continue to promise greater speedup and additional benefits such as high performance per watt, chief among the challenges of the emerging paradigm of reconfigurable computing is the complexity of application design and implementation. Before a lengthy development effort is undertaken to map a given application to hardware, it is important that a high-level parallel algorithm crafted for that application first be analyzed relative to the target platform, so as to ascertain the likelihood of success in terms of potential speedup. This article presents the RC Amenability Test, or RAT, a methodology and model developed for this purpose, supporting rapid exploration and prediction of strategic design tradeoffs during the formulation stage of application development.
Index Terms
- RAT: RC Amenability Test for Rapid Performance Prediction