
Efficient data layouts for a three-dimensional electrostatic Particle-in-Cell code

Yann Barsamian, Sever A. Hirstoaga, Éric Violard
2018 Journal of Computational Science  
To accurately solve realistic problems, the method requires to use trillions of particles and therefore, there is a strong demand for high performance code on modern architectures.  ...  The Particle-in-Cell (PIC) method is a widely used tool in plasma physics.  ...  We thus explored several space-filling curves for the data layout of E and ρ, and different loop transformations for the particle loops, with the aim of improving the cache reuse and achieving efficient  ... 
doi:10.1016/j.jocs.2018.06.004 fatcat:eztadpep6vd6llqy2r6v4cfibe

Enhancing locality for recursive traversals of recursive structures

Youngjoon Jo, Milind Kulkarni
2011 SIGPLAN notices  
We develop a novel optimization called point blocking, inspired by the classic tiling loop transformation, and show that it can substantially enhance temporal locality in traversal codes.  ...  In this paper, we argue that, for a class of irregular applications we call traversal codes, there exists substantial data reuse and hence opportunity for locality exploitation.  ...  We would also like to thank the anonymous referees for providing insightful comments. This research was supported in part with funding from Intel and a grant from the Purdue Research Foundation.  ... 
doi:10.1145/2076021.2048104 fatcat:qaflx5pc7nbbtmhc3hurh2v2dq

Enhancing locality for recursive traversals of recursive structures

Youngjoon Jo, Milind Kulkarni
2011 Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications - OOPSLA '11  
We develop a novel optimization called point blocking, inspired by the classic tiling loop transformation, and show that it can substantially enhance temporal locality in traversal codes.  ...  In this paper, we argue that, for a class of irregular applications we call traversal codes, there exists substantial data reuse and hence opportunity for locality exploitation.  ...  We would also like to thank the anonymous referees for providing insightful comments. This research was supported in part with funding from Intel and a grant from the Purdue Research Foundation.  ... 
doi:10.1145/2048066.2048104 dblp:conf/oopsla/JoK11 fatcat:6cudnrjtffbdfkcfc7w3s5iaee

Mesh Layouts for Block-Based Caches

Sung-eui Yoon, Peter Lindstrom
2006 IEEE Transactions on Visualization and Computer Graphics  
Index Terms: Mesh and graph layouts, cache-aware and cache-oblivious layouts, metrics for cache coherence, data locality. The authors are with the Lawrence  ...  In addition to guiding the layout process, our metrics can be used to quantify the quality of a layout, e.g. for comparing different layouts of the same mesh and for determining whether a given layout  ...  This work was supported by the LOCAL LLNL LDRD project (05-ERD-018) and was performed under the auspices of the U.S.  ... 
doi:10.1109/tvcg.2006.162 pmid:17080854 fatcat:zevggpjkw5dqhcyxljiahj3jhq

Optimizing an MPI weather forecasting model via processor virtualization

Eduardo R. Rodrigues, Philippe O. A. Navaux, Jairo Panetta, Celso L. Mendes, Laxmikant V. Kale
2010 2010 International Conference on High Performance Computing  
These models are typically executed in parallel machines and a major obstacle for their scalability is load imbalance.  ...  In this paper, we demonstrate the effectiveness of processor virtualization for dynamically balancing the load in BRAMS, a mesoscale weather forecasting model based on MPI parallelization.  ...  Because the Hilbert curve preserves spatial locality, threads corresponding to sub-domains that are close in space are likely to be assigned to the same processor.  ... 
doi:10.1109/hipc.2010.5713171 dblp:conf/hipc/RodriguesNPMK10 fatcat:m2mmh3iiu5dolgy6yalavutoee

Assignment as a location-based service in outsourced databases

Ahmet Salih BÜYÜKKAYHAN, Taflan İmre GÜNDEM
2017 Turkish Journal of Electrical Engineering and Computer Sciences  
A recent work [12] proposes to use Moore curves and asymmetric cryptography together for KNN queries.  ...  Assignment query processing needs global data, since a small local change could totally change the result.  ...  Conclusions: In this paper, we propose a novel method to protect location privacy not only for static local queries but also for dynamic global queries such as assignment queries.  ... 
doi:10.3906/elk-1511-190 fatcat:tan4xx6v3ve5xavifyfbuiieuu

KOLAM: a cross-platform architecture for scalable visualization and tracking in wide-area imagery

Joshua Fraser, Anoop Haridas, Guna Seetharaman, Raghuveer M. Rao, Kannappan Palaniappan, Matthew F. Pellechia, Richard J. Sorensen, Kannappan Palaniappan
2013 Geospatial InfoFusion III  
KOLAM is an open, cross-platform, interoperable, scalable and extensible framework supporting a novel multiscale spatiotemporal dual-cache data structure for big data visualization and visual analytics  ...  The KOLAM software architecture was extended to support airborne wide-area motion imagery by organizing spatiotemporal tiles in very large format video frames using a temporal cache of tiled pyramid cached  ...  Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.  ... 
doi:10.1117/12.2018162 fatcat:ux7qxsmlqjgxtiosglmoubvs2i

Efficient query processing on unstructured tetrahedral meshes

Stratos Papadomanolakis, Anastassia Ailamaki, Julio C. Lopez, Tiankai Tu, David R. O'Hallaron, Gerd Heber
2006 Proceedings of the 2006 ACM SIGMOD international conference on Management of data - SIGMOD '06  
Finally, we present a new data layout approach for tetrahedral mesh datasets that provides better performance compared to the traditional space filling curves.  ...  We develop Directed Local Search (DLS), an efficient indexing algorithm based on mesh topology information that is practically insensitive to the geometric properties of meshes.  ...  , an IBM faculty partnership award and a NASA AISR fund.  ... 
doi:10.1145/1142473.1142535 dblp:conf/sigmod/PapadomanolakisALTOH06 fatcat:2cyruhfwznh65nho4tsuzk5jia

Using Evolutionary Algorithms to Find Cache-Friendly Generalized Morton Layouts for Arrays [article]

Stephen Nicholas Swatman, Ana-Lucia Varbanescu, Andy D. Pimentel, Andreas Salzburger, Attila Krasznahorkay
2024 arXiv   pre-print
To this end, we propose a chromosomal representation for such layouts as well as a methodology for estimating the fitness of array layouts using cache simulation.  ...  The layout of multi-dimensional data can have a significant impact on the efficacy of hardware caches and, by extension, the performance of applications.  ...  Many of the experimental results shown in this paper were gathered on the Advanced School for Computing and Imaging (ASCI) DAS-6 compute cluster [11].  ... 
arXiv:2309.07002v2 fatcat:igplhgez5rhrnn5gf2i7pqbnoe

Efficient band approximation of Gram matrices for large scale kernel methods on GPUs

Mohamed Hussein, Wael Abd-Almageed
2009 Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis - SC '09  
Our method relies on the locality preserving properties of space filling curves, and the special structure of Gram matrices. Our approach has several important merits.  ...  We introduce a novel method to approximate a Gram matrix with a band matrix.  ...  However, this locality preserving property varies from one type of curves to another. The family of Hilbert curves [3] is known to have good locality preserving properties.  ... 
doi:10.1145/1654059.1654091 dblp:conf/sc/HusseinA09 fatcat:ujfljhpjazbb3ns3y2bkdl7gwe

Advanced optimization strategies in the Rice dHPF compiler

J. Mellor-Crummey, V. Adve, B. Broom, D. Chavarría-Miranda, R. Fowler, G. Jin, K. Kennedy, Q. Yi
2002 Concurrency and Computation  
For loop nests with complex data dependences, such as the example of Figure 4, we have developed an algorithm to eliminate inner-loop communication without excessive loss of cache reuse.  ...  Communication is vectorized out of any loop as long as doing so will not cause any loop-carried or loop-independent data dependence to be violated. The compiler can coalesce messages for arbitrary affine  ...  For example, adaptive distributions based on space-filling (for instance, Hilbert) curves [25] would enable many irregular applications to be implemented efficiently in HPF.  ... 
doi:10.1002/cpe.647 fatcat:taze6xqwpzhw3hw27yglleqnei

A divide-and-conquer/cellular-decomposition framework for million-to-billion atom simulations of chemical reactions

Aiichiro Nakano, Rajiv K. Kalia, Ken-ichi Nomura, Ashish Sharma, Priya Vashishta, Fuyuki Shimojo, Adri C.T. van Duin, William A. Goddard, Rupak Biswas, Deepak Srivastava
2007 Computational materials science  
The EDC and THCD frameworks expose maximal data localities, and consequently the isogranular parallel efficiency on 1920 processors is as high as 0.953.  ...  Cells are traversed along either a Morton curve (Fig. 3) or a Hilbert curve, instead of the traditional raster-scan order.  ... 
doi:10.1016/j.commatsci.2006.04.012 fatcat:muocll6lfvazrj4qmy2geazy6m

2PCP: Two-phase CP decomposition for billion-scale dense tensors

Xinsheng Li, Shengyu Huang, K. Selcuk Candan, Maria Luisa Sapino
2016 2016 IEEE 32nd International Conference on Data Engineering (ICDE)  
In this paper, we introduce 2PCP, a two-phase, block-based CP decomposition system with intelligent buffer sensitive task scheduling and buffer management mechanisms. 2PCP aims to reduce I/O costs in the  ...  Tensors are multi-dimensional arrays; consequently, tensor decomposition operations (CP and Tucker) are the bases for many high-dimensional data analysis tasks, from clustering, trend detection, anomaly  ...  As we see here, Hilbert traversal relies on "U" shaped curve-segments (as opposed to the "Z" shaped curve-segments of the Z-order traversal) and this helps better preserve the adjacency property (i.e.,  ... 
doi:10.1109/icde.2016.7498294 dblp:conf/icde/LiHCS16 fatcat:gbltoqe3bnhhfpijqkyfrzjknu

High-Performance and Scalable Agent-Based Simulation with BioDynaMo [article]

Lukas Breitwieser, Ahmad Hesam, Fons Rademakers, Juan Gómez Luna, Onur Mutlu
2023 arXiv   pre-print
To overcome this limitation, we present a novel high-performance simulation engine. We identify three key challenges for which we present the following solutions.  ...  Second, we reduce the memory access latency with a NUMA-aware agent iterator, agent sorting with a space-filling curve, and a custom heap memory allocator.  ...  Acknowledgments We want to thank the CERN Knowledge Transfer office (https: //kt.cern/) for supporting this work.  ... 
arXiv:2301.06984v1 fatcat:vrwqse7vxfg2hfk2tzqpaanjay

Cost-Based Predictive Spatiotemporal Join

Wook-Shin Han, Jaehwa Kim, Byung Suk Lee, Yufei Tao, R. Rantzau, V. Markl
2009 IEEE Transactions on Knowledge and Data Engineering  
In this paper we present CoPST, the first and foremost algorithm for such a join using two spatio-temporal indexes.  ...  CoPST adapts gracefully to large scale databases, by dynamically switching between main-memory buffering and disk-based buffering, through a novel probabilistic cost model.  ...  It is not possible to show the results for a varying cache size (in the same manner as in Figures 13 to 19) because the cache size is fixed for a given PC hardware.  ... 
doi:10.1109/tkde.2008.159 fatcat:lbbdro7rhzcttb6kxrxbdmgkwy