1,336 Hits in 3.4 sec

Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform [article]

Shiwei Zhang, Lansong Diao, Siyu Wang, Zongyan Cao, Yiliang Gu, Chang Si, Ziji Shi, Zhen Zheng, Chuan Wu, Wei Lin
2023 arXiv   pre-print
Aiming to efficiently search for a near-optimal parallel execution plan, our analysis of production clusters reveals general heuristics to speed up the strategy search.  ...  We present Rhino, a system for accelerating tensor programs with automatic parallelization on AI platform for real production environment.  ...  Based on the observations and statistics from optimizing common models in a production cluster, we design a three-level subgraph merging algorithm for searching SPMD strategies efficiently and near-optimally  ... 
arXiv:2302.08141v1 fatcat:6mnkaxvlivah7ekzwdp2xclgle

Gate-Level Simulation with GPU Computing

Debapriya Chatterjee, Andrew Deorio, Valeria Bertacco
2011 ACM Transactions on Design Automation of Electronic Systems  
Noting the vast available parallelism in the hardware of modern GPUs, and the inherently parallel structures of gate-level netlists, we propose novel algorithms for the efficient mapping of complex designs  ...  The experimental results show that our GPU-based simulator is capable of handling the validation of industrial-size designs while delivering more than an order-of-magnitude performance improvements on  ...  The baseline clustering algorithm (a) groups cones of logic by degree of logic sharing, while profiling (b) is based on the activation frequency of logic cones.  ... 
doi:10.1145/1970353.1970363 fatcat:msv44q4wffh6vcmzwplmsoi254

Mapping-Aware Constrained Scheduling for LUT-Based FPGAs

Mingxing Tan, Steve Dai, Udit Gupta, Zhiru Zhang
2015 Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '15  
In this paper, we propose MAPS, a mapping-aware constrained scheduling algorithm for LUT-based FPGAs.  ...  We also present an efficient incremental scheduling technique for MAPS to effectively handle resource constraints.  ...  Algorithm 2 lists the pseudo-code for our incremental scheduling algorithm.  ... 
doi:10.1145/2684746.2689063 dblp:conf/fpga/TanDGZ15 fatcat:tbjwxp6slrhnxazejnyvtnga3a

A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems

M. Mezmaz, N. Melab, Y. Kessaci, Y.C. Lee, E.-G. Talbi, A.Y. Zomaya, D. Tuyttens
2011 Journal of Parallel and Distributed Computing  
In terms of completion time, the obtained schedules are also shorter than those of other algorithms. Furthermore, our study demonstrates the potential of DVS.  ...  Our new method is based on dynamic voltage scaling (DVS) to minimize energy consumption.  ...  We would like to thank the technical staffs of the Grid'5000 and the clusters of University of Mons for making their clusters accessible and fully operational.  ... 
doi:10.1016/j.jpdc.2011.04.007 fatcat:qggy2h7qtrahzna26gjfviwhkq

A parallel insertion heuristic for vehicle routing with side constraints

M.W.P. Savelsbergh
1990 Statistica neerlandica (Print)  
In this paper, we discuss some of the strong and weak points of this heuristic, and take its basic ideas to develop a new parallel insertion heuristic for the vehicle routing and scheduling problem that  ...  In the early eighties, Fisher and Jaikumar developed a generalized assignment heuristic for vehicle routing problems.  ...  In developing construction procedures for vehicle routing and scheduling problems based on insertion, the following three key questions serve as guidelines: (a) How is the set of m initial routes, each  ... 
doi:10.1111/j.1467-9574.1990.tb01278.x fatcat:kxqvydcatvgajj23wwholjaj4u

Learning to Schedule Heuristics for the Simultaneous Stochastic Optimization of Mining Complexes [article]

Yassine Yaakoubi, Roussos Dimitrakopoulos
2022 arXiv   pre-print
This work proposes a data-driven framework for heuristic scheduling in a fully self-managed hyper-heuristic to solve the SSOMC.  ...  The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm.  ...  all heuristics 2: Add all heuristics to a list L_H (list of not yet selected heuristics) 3: while number of heuristics in L_H > 0 do 4: Choose a heuristic h_i randomly from the list and generate a new  ... 
arXiv:2202.12866v1 fatcat:a7oa3bgy7zg6djbr4yeqqmb25a

An improved meta-heuristic approach to extraction sequencing and block routing

Y.A. Sari
2016 Journal of the Southern African Institute of Mining and Metallurgy  
In this paper, a new approach based on a meta-heuristic is proposed. Meta-heuristic approaches use processing, inference, and memory at the same time in order to learn how to improve the solution.  ...  Simulated annealing, mine production scheduling, open pit mining, mine planning, heuristic memory.  ...  The ranked positional weight (RPW) algorithm is a heuristic algorithm that draws a downward cone from each block and the block gains a score according to the economic values of the blocks in the downward  ... 
doi:10.17159/2411-9717/2016/v116n7a9 fatcat:3ede2cfyejborj6f5kmgu6sujq

Combinatorial optimization and vehicle fleet planning: Perspectives and prospects

T. L. Magnanti
1981 Networks  
ACKNOWLEDGEMENTS I am grateful to Bruce Golden, Steve Graves, Jan Hammond, and Paul Mireault for their comments on an earlier version of this paper.  ...  The cluster point for each cone is placed along the ray from the depot bisecting that cone and at a radial distance from the depot equal to that of some demand point so that (approximately) 25% of all  ...  And, what are prospects for future breakthroughs in optimization based methods for vehicle routing and scheduling?  ... 
doi:10.1002/net.3230110209 fatcat:kmmvt63wbjcppep5ak6zvhc7em

Scan chain clustering for test power reduction

Melanie Elm, Hans-Joachim Wunderlich, Michael E. Imhof, Christian G. Zoellin, Jens Leenstra, Nicolas Maeding
2008 Proceedings of the 45th annual conference on Design automation - DAC '08  
An effective technique to save power during scan based test is to switch off unused scan chains.  ...  In this paper, a new method to cluster flip-flops into scan chains is presented, which minimizes the power consumption during test.  ...  Chapter 3 presents an efficient hyper-graph based partitioning algorithm.  ... 
doi:10.1145/1391469.1391680 dblp:conf/dac/ElmWIZLM08 fatcat:byctuduk2rat3istfk5ttneb3a

Sparse Beamforming and User-Centric Clustering for Downlink Cloud Radio Access Network [article]

Binbin Dai, Wei Yu
2014 arXiv   pre-print
BS cluster is fixed for each user and we jointly optimize the user scheduling and the beamforming vector to account for the backhaul constraints.  ...  This paper shows that the proposed dynamic clustering algorithm can achieve significant performance gain over existing naive clustering schemes.  ...  In Section IV, the user scheduling and beamforming vectors are jointly optimized under fixed BS clustering and two heuristic static clustering algorithms are proposed.  ... 
arXiv:1410.5020v1 fatcat:d6ompw63zfc3vgfhqprnb774ua

Drone-Aided Delivery Methods, Challenge, and the Future: A Methodological Review

Xueping Li, Jose Tupayachi, Aliza Sharmin, Madelaine Martinez Ferguson
2023 Drones  
We then categorize the literature according to the characteristics and objectives of the problems and thoroughly analyze them based on mathematical formulations and solution techniques.  ...  With the increasing interest in this technology, it is crucial for researchers and practitioners to understand the current state of the art in drone delivery.  ...  It is based on an Integer Programming (IP) formulation and a clustered and generalized TSP, where the LKH heuristic and cross-entropy (CE) meta-heuristic algorithms are employed.  ... 
doi:10.3390/drones7030191 fatcat:inuqdjb5m5e6dk2qvxtdlioepe

Module Allocation for Dynamically Reconfigurable Systems [chapter]

Xue-jie Zhang, Kam-wing Ng
2000 Lecture Notes in Computer Science  
We propose a configuration bundling driven module allocation technique that can be used for component clustering.  ...  The synthesis of dynamically reconfigurable systems poses some new challenges for high-level synthesis tools.  ...  It is based on a configuration bundling heuristic that tries to allocate configurable logic resources by maintaining a global view of the resource requirements of all temporal templates.  ... 
doi:10.1007/3-540-45591-4_128 fatcat:3nspw37pcfhjxfahlf3zog25km

An integer linear programming approach for identifying instruction-set extensions

Kubilay Atasu, Günhan Dündar, Can Özturan
2005 Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis - CODES+ISSS '05  
A selection algorithm that ranks the generated templates based on isomorphism testing and potential evaluation is described.  ...  An algorithm that iteratively generates and solves a set of ILP problems in order to generate a set of templates is proposed.  ...  In [10], a simulated annealing based algorithm is employed to generate clusters based on schedule time and resource usage of dataflow graph nodes. The work of Choi et al.  ... 
doi:10.1145/1084834.1084880 dblp:conf/codes/AtasuDO05 fatcat:5oqno6jpcbcj3oo77qke2vp5oa

Scalable Real-Time Shadows using Clustering and Metric Trees

François Deves, Frédéric Mora, Lilian Aveneau, Djamchid Ghazanfarpour
2018 Eurographics Symposium on Rendering  
Real-time shadow algorithms based on geometry generally produce high quality shadows. Recent works have considerably improved their efficiency.  ...  We present a new real-time shadow algorithm for non-deformable models that scales the geometric complexity.  ...  In this paper, we propose a new geometry based algorithm for rendering pixel accurate hard shadows which is both fast and scalable.  ... 
doi:10.2312/sre.20181175 dblp:conf/rt/DevesMAG18 fatcat:7gnjykyvp5cr3coejncxejepta

A Broad Review on Various VLSI CAD Algorithms for Circuit Partitioning Problems

R. Manikandan et al.
2018 International Journal of Mechanical and Production Engineering Research and Development  
algorithms, and nature-based heuristics.  ...  With the main objective of minimizing the cutsize, numerous algorithms have been proposed for circuit partition which includes genetic and evolutionary algorithms, probability-based algorithms, clustering  ...  The discrete firefly algorithm is a swarm based heuristic algorithm that helps in solving the min cut circuit partitioning.  ... 
doi:10.24247/ijmperdfeb2018115 fatcat:kbnon3xtwncjbelzmumefqpbra