Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform
[article]
2023
arXiv
pre-print
Aiming to efficiently search for a near-optimal parallel execution plan, our analysis of production clusters reveals general heuristics to speed up the strategy search. ...
We present Rhino, a system for accelerating tensor programs with automatic parallelization on AI platform for real production environment. ...
Based on the observations and statistics from optimizing common models in a production cluster, we design a three-level subgraph merging algorithm for searching SPMD strategies efficiently and near-optimally ...
arXiv:2302.08141v1
fatcat:6mnkaxvlivah7ekzwdp2xclgle
Gate-Level Simulation with GPU Computing
2011
ACM Transactions on Design Automation of Electronic Systems
Noting the vast available parallelism in the hardware of modern GPUs, and the inherently parallel structures of gate-level netlists, we propose novel algorithms for the efficient mapping of complex designs ...
The experimental results show that our GPU-based simulator is capable of handling the validation of industrial-size designs while delivering more than an order-of-magnitude performance improvements on ...
The baseline clustering algorithm (a) groups cones of logic by degree of logic sharing, while profiling (b) is based on the activation frequency of logic cones. ...
doi:10.1145/1970353.1970363
fatcat:msv44q4wffh6vcmzwplmsoi254
Mapping-Aware Constrained Scheduling for LUT-Based FPGAs
2015
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '15
In this paper, we propose MAPS, a mapping-aware constrained scheduling algorithm for LUT-based FPGAs. ...
We also present an efficient incremental scheduling technique for MAPS to effectively handle resource constraints. ...
Algorithm 2 lists the pseudo-code for our incremental scheduling algorithm. ...
doi:10.1145/2684746.2689063
dblp:conf/fpga/TanDGZ15
fatcat:tbjwxp6slrhnxazejnyvtnga3a
A parallel bi-objective hybrid metaheuristic for energy-aware scheduling for cloud computing systems
2011
Journal of Parallel and Distributed Computing
In terms of completion time, the obtained schedules are also shorter than those of other algorithms. Furthermore, our study demonstrates the potential of DVS. ...
Our new method is based on dynamic voltage scaling (DVS) to minimize energy consumption. ...
We would like to thank the technical staff of Grid'5000 and of the clusters of the University of Mons for making their clusters accessible and fully operational. ...
doi:10.1016/j.jpdc.2011.04.007
fatcat:qggy2h7qtrahzna26gjfviwhkq
A parallel insertion heuristic for vehicle routing with side constraints
1990
Statistica neerlandica (Print)
In this paper, we discuss some of the strong and weak points of this heuristic, and take its basic ideas to develop a new parallel insertion heuristic for the vehicle routing and scheduling problem that ...
In the early eighties, Fisher and Jaikumar developed a generalized assignment heuristic for vehicle routing problems. ...
In developing construction procedures for vehicle routing and scheduling problems based on insertion, the following three key questions serve as guidelines: (a) How is the set of m initial routes, each ...
doi:10.1111/j.1467-9574.1990.tb01278.x
fatcat:kxqvydcatvgajj23wwholjaj4u
Learning to Schedule Heuristics for the Simultaneous Stochastic Optimization of Mining Complexes
[article]
2022
arXiv
pre-print
This work proposes a data-driven framework for heuristic scheduling in a fully self-managed hyper-heuristic to solve the SSOMC. ...
The proposed learn-to-perturb (L2P) hyper-heuristic is a multi-neighborhood simulated annealing algorithm. ...
all heuristics 2: Add all heuristics to a list L_H (list of not yet selected heuristics) 3: while number of heuristics in L_H > 0 do 4: Choose a heuristic h_i randomly from the list and generate a new ...
arXiv:2202.12866v1
fatcat:a7oa3bgy7zg6djbr4yeqqmb25a
An improved meta-heuristic approach to extraction sequencing and block routing
2016
Journal of the Southern African Institute of Mining and Metallurgy
In this paper, a new approach based on a meta-heuristic is proposed. Meta-heuristic approaches use processing, inference, and memory at the same time in order to learn how to improve the solution. ...
Simulated annealing, mine production scheduling, open pit mining, mine planning, heuristic memory. ...
The ranked positional weight (RPW) algorithm is a heuristic algorithm that draws a downward cone from each block and the block gains a score according to the economic values of the blocks in the downward ...
doi:10.17159/2411-9717/2016/v116n7a9
fatcat:3ede2cfyejborj6f5kmgu6sujq
Combinatorial optimization and vehicle fleet planning: Perspectives and prospects
1981
Networks
ACKNOWLEDGEMENTS I am grateful to Bruce Golden, Steve Graves, Jan Hammond, and Paul Mireault for their comments on an earlier version of this paper. ...
The cluster point for each cone is placed along the ray from the depot bisecting that cone and at a radial distance from the depot equal to that of some demand point so that (approximately) 25% of all ...
And, what are prospects for future breakthroughs in optimization based methods for vehicle routing and scheduling? ...
doi:10.1002/net.3230110209
fatcat:kmmvt63wbjcppep5ak6zvhc7em
Scan chain clustering for test power reduction
2008
Proceedings of the 45th annual conference on Design automation - DAC '08
An effective technique to save power during scan based test is to switch off unused scan chains. ...
In this paper, a new method to cluster flip-flops into scan chains is presented, which minimizes the power consumption during test. ...
Chapter 3 presents an efficient hyper-graph based partitioning algorithm. ...
doi:10.1145/1391469.1391680
dblp:conf/dac/ElmWIZLM08
fatcat:byctuduk2rat3istfk5ttneb3a
Sparse Beamforming and User-Centric Clustering for Downlink Cloud Radio Access Network
[article]
2014
arXiv
pre-print
BS cluster is fixed for each user and we jointly optimize the user scheduling and the beamforming vector to account for the backhaul constraints. ...
This paper shows that the proposed dynamic clustering algorithm can achieve significant performance gain over existing naive clustering schemes. ...
In Section IV, the user scheduling and beamforming vectors are jointly optimized under fixed BS clustering and two heuristic static clustering algorithms are proposed. ...
arXiv:1410.5020v1
fatcat:d6ompw63zfc3vgfhqprnb774ua
Drone-Aided Delivery Methods, Challenge, and the Future: A Methodological Review
2023
Drones
We then categorize the literature according to the characteristics and objectives of the problems and thoroughly analyze them based on mathematical formulations and solution techniques. ...
With the increasing interest in this technology, it is crucial for researchers and practitioners to understand the current state of the art in drone delivery. ...
It is based on an Integer Programming (IP) formulation and a clustered and generalized TSP, where the LKH heuristic and cross-entropy (CE) meta-heuristic algorithms are employed. ...
doi:10.3390/drones7030191
fatcat:inuqdjb5m5e6dk2qvxtdlioepe
Module Allocation for Dynamically Reconfigurable Systems
[chapter]
2000
Lecture Notes in Computer Science
We propose a configuration bundling driven module allocation technique that can be used for component clustering. ...
The synthesis of dynamically reconfigurable systems poses some new challenges for high-level synthesis tools. ...
It is based on a configuration bundling heuristic that tries to allocate configurable logic resources by maintaining a global view of the resource requirements of all temporal templates. ...
doi:10.1007/3-540-45591-4_128
fatcat:3nspw37pcfhjxfahlf3zog25km
An integer linear programming approach for identifying instruction-set extensions
2005
Proceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis - CODES+ISSS '05
A selection algorithm that ranks the generated templates based on isomorphism testing and potential evaluation is described. ...
An algorithm that iteratively generates and solves a set of ILP problems in order to generate a set of templates is proposed. ...
In [10], a simulated annealing based algorithm is employed to generate clusters based on schedule time and resource usage of dataflow graph nodes. The work of Choi et al. ...
doi:10.1145/1084834.1084880
dblp:conf/codes/AtasuDO05
fatcat:5oqno6jpcbcj3oo77qke2vp5oa
Scalable Real-Time Shadows using Clustering and Metric Trees
2018
Eurographics Symposium on Rendering
Real-time shadow algorithms based on geometry generally produce high quality shadows. Recent works have considerably improved their efficiency. ...
We present a new real-time shadow algorithm for non-deformable models that scales the geometric complexity. ...
In this paper, we propose a new geometry based algorithm for rendering pixel accurate hard shadows which is both fast and scalable. ...
doi:10.2312/sre.20181175
dblp:conf/rt/DevesMAG18
fatcat:7gnjykyvp5cr3coejncxejepta
A Broad Review on Various VLSI CAD Algorithms for Circuit Partitioning Problems
2018
International Journal of Mechanical and Production Engineering Research and Development
algorithms, and nature-based heuristics. ...
With the main objective of minimizing the cutsize, numerous algorithms have been proposed for circuit partition which includes genetic and evolutionary algorithms, probability-based algorithms, clustering ...
The discrete firefly algorithm is a swarm based heuristic algorithm that helps in solving the min cut circuit partitioning. ...
doi:10.24247/ijmperdfeb2018115
fatcat:kbnon3xtwncjbelzmumefqpbra