
NURA

Sina Darabi, Negin Mahani, Hazhir Baxishi, Ehsan Yousefzadeh, Mohammad Sadrosadati, Hamid Sarbazi-Azad
2022 Abstract Proceedings of the 2022 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems  
Some pieces of prior work (e.g. spatial multitasking) have limited opportunity to improve resource utilization, while others, e.g. simultaneous multi-kernel, provide fine-grained resource sharing at the  ...  In terms of fairness, NURA has almost similar results to spatial multitasking, while it outperforms simultaneous multi-kernel by 76%, on average.  ...  There are two known approaches of multitasking in GPUs: (1) spatial multi-tasking [3], and (2) simultaneous multi-kernel (SMK) execution [2].  ... 
doi:10.1145/3489048.3522656 fatcat:xcmtppre3rer3etvsjnrvzjtei

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 SIGARCH Computer Architecture News  
Recent proposals on multitasking GPUs have focused on either spatial multitasking, which partitions GPU resource at a streaming multiprocessor (SM) granularity, or simultaneous multikernel (SMK), which  ...  In this paper, we propose GPU Maestro that performs dynamic resource management for efficient utilization of multitasking GPUs.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093337.3037707 fatcat:7vikinfjtbbmnperrnxretampm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 ACM SIGOPS Operating Systems Review  
Recent proposals on multitasking GPUs have focused on either spatial multitasking, which partitions GPU resource at a streaming multiprocessor (SM) granularity, or simultaneous multikernel (SMK), which  ...  In this paper, we propose GPU Maestro that performs dynamic resource management for efficient utilization of multitasking GPUs.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093315.3037707 fatcat:5xjasiupnrcctp6oarqwrtbukm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS '17  
Recent proposals on multitasking GPUs have focused on either spatial multitasking, which partitions GPU resource at a streaming multiprocessor (SM) granularity, or simultaneous multikernel (SMK), which  ...  In this paper, we propose GPU Maestro that performs dynamic resource management for efficient utilization of multitasking GPUs.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3037697.3037707 dblp:conf/asplos/ParkPM17 fatcat:flmnbk4x3je4loearqmn2uzwkm

Dynamic Resource Management for Efficient Utilization of Multitasking GPUs

Jason Jong Kyu Park, Yongjun Park, Scott Mahlke
2017 SIGPLAN notices  
Recent proposals on multitasking GPUs have focused on either spatial multitasking, which partitions GPU resource at a streaming multiprocessor (SM) granularity, or simultaneous multikernel (SMK), which  ...  In this paper, we propose GPU Maestro that performs dynamic resource management for efficient utilization of multitasking GPUs.  ...  Acknowledgments We would like to thank the anonymous reviewers as well as the fellow members of CCCP research group for their valuable comments and feedbacks.  ... 
doi:10.1145/3093336.3037707 fatcat:xkqmepvakbh23gfyrnkjot6uge

Characterizing Fine-Grained Resource Utilization for Multitasking GPGPU in Cloud Systems

Kyungwoon Cho, Hyokyung Bahn
2021 IEEE Access  
In this article, we show that efficient resource sharing in GPGPU is possible without run-time profiling if resource usage characteristics of workloads are analyzed down to a fine-grained unit level.  ...  To determine the co-location of workloads, previous studies have shown that run-time performance profiling and dynamic relocation of workloads is necessary due to interference between workloads.  ...  ., device memory and shared memory. The coarse-grained utilization columns in the tables are calculated by the maximum value of the fine-grained utilizations within the same resource classifications.  ... 
doi:10.1109/access.2021.3132492 fatcat:y5yucr4qbbeevcmwkfpydzrqbe

A Survey of Multi-Tenant Deep Learning Inference on GPU [article]

Fuxun Yu, Di Wang, Longfei Shangguan, Minjia Zhang, Chenchen Liu, Xiang Chen
2022 arXiv   pre-print
With such strong computing scaling of GPUs, multi-tenant deep learning inference by co-locating multiple DL models onto the same GPU becomes widely deployed to improve resource utilization, enhance serving  ...  This survey aims to summarize and categorize the emerging challenges and optimization opportunities for multi-tenant DL inference on GPU.  ...  However, as we introduced before, achieving fine-grained resource partitioning is non-achievable until recently GPU vendors release a series of resource sharing and partitioning support like multistreams  ... 
arXiv:2203.09040v3 fatcat:utvpoyvvajfhfghgpf45nxnbne

On the Performance and Isolation of Asymmetric Microkernel Design for Lightweight Manycores

Pedro Henrique Penna, Joao Vicente Souto, Davidson Francis Lima, Marcio Castro, Francois Broquedis, Henrique Freitas, Jean-Francois Mehaut
2019 2019 IX Brazilian Symposium on Computing Systems Engineering (SBESC)  
Multikernel operating systems (OSs) were introduced to match the architectural characteristics of lightweight manycores.  ...  While several multikernel OS designs are possible, in this work we argue on one that is structured in asymmetric microkernel instances.  ...  Notwithstanding, due to a lack of context information, this is not enough for the kernel to either run a fine-grain data prefetch algorithm or a coherency protocol.  ... 
doi:10.1109/sbesc49506.2019.9046080 dblp:conf/sbesc/PennaSLCBFM19 fatcat:oe576cky2jfp3h4hynrszytmbq

AMOEBA: A Coarse Grained Reconfigurable Architecture for Dynamic GPU Scaling [article]

Xianwei Cheng, Hui Zhao, Mahmut Kandemir, Beilei Jiang, Gayatri Mehta
2019 arXiv   pre-print
A GPU consists of several Streaming Multi-processors (SMs) that collectively determine how shared resources are partitioned and accessed.  ...  However, neither scaling up nor scaling out can meet the scalability requirement of all applications running on a given GPU system, which inevitably results in performance degradation and resource under-utilization  ...  Dhar et al. proposed fine grained and coarse grained reconfigurations of SMs in GPUs in order to reduce the underutilization of resources and power consumption [15].  ... 
arXiv:1911.03364v1 fatcat:tcmbakgikjhtdl2khxzei3zf74

Concurrent query processing in a GPU-based database system

Hao Li, Yi-Cheng Tu, Bo Zeng, Rashid Mehmood
2019 PLoS ONE  
The unrivaled computing capabilities of modern GPUs meet the demand of processing massive amounts of data seen in many application domains.  ...  Comparing to earlier studies of enabling concurrent tasks support on GPU such as MultiQx-GPU, we use a different approach that is to control the launching parameters of multiple GPU kernels as provided  ...  fine-grained context switches.  ... 
doi:10.1371/journal.pone.0214720 pmid:30990851 pmcid:PMC6467383 fatcat:4u2hmql235c4fkxajva5mcx6m4

AtomNAS: Fine-Grained End-to-End Neural Architecture Search [article]

Jieru Mei, Yingwei Li, Xiaochen Lian, Xiaojie Jin, Linjie Yang, Alan Yuille, Jianchao Yang
2020 arXiv   pre-print
We propose a fine-grained search space comprised of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms.  ...  Instead of a search-and-retrain two-stage paradigm, our method simultaneously searches and trains the target architecture.  ...  This perspective enables a much larger and more fine-grained search space.  ... 
arXiv:1912.09640v2 fatcat:qyjyg33dkvc53pglyppjrnee44

Classification-Driven Search for Effective SM Partitioning in Multitasking GPUs

Xia Zhao, Zhiying Wang, Lieven Eeckhout
2018 Proceedings of the 2018 International Conference on Supercomputing - ICS '18  
Spatial multitasking in which independent applications co-execute on different sets of SMs is a promising solution to share GPU resources.  ...  Graphics processing units (GPUs) feature an increasing number of streaming multiprocessors (SMs) with each successive generation.  ...  We thank CalcUA for letting us use the NVIDIA P100 GPU. This work is supported by the European Research Council (ERC) Ad-  ... 
doi:10.1145/3205289.3205311 dblp:conf/ics/ZhaoWE18 fatcat:b4hzqrztwrabbjcydmgka3kqcq

A State-of-the-Art Survey on Real-Time Issues in Embedded Systems Virtualization

Zonghua Gu, Qingling Zhao
2012 Journal of Software Engineering and Applications  
that are not specific to any virtualization approach, but we believe they are of sufficient importance to dedicate separate  ...  fine-grained control, e.g., an important task in the GPOS can be assigned a higher priority than a less important task in the RTOS.  ...  [75] presented a redesign of SPUMONE as a multikernel architecture. (Multikernel means that a separate copy of the kernel or VMM runs on each core on a multicore processor.)  ... 
doi:10.4236/jsea.2012.54033 fatcat:iiqszwe3brhjxl4ycyf6ynbp2u

Algorithms for Preemptive Co-scheduling of Kernels on GPUs

Lionel Eyraud-Dubois, Cristiana Bentes
2020 2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)  
Currently, the decision on the simultaneous execution of kernels is performed by the hardware, which can lead to unreasonable use of resources.  ...  In this work, we tackle the problem of co-scheduling for GPUs in high competition scenarios.  ...  [23] proposed Simultaneous Multikernel (SMK) that allows fine-grain sharing within each SM, kernels with complementary resource usage are co-scheduled in the same SM to achieve resource fairness and  ... 
doi:10.1109/hipc50609.2020.00033 fatcat:piwx3puyyfdhpbdtxkk7hpzllq

Elastic Multi-resource Fairness: Balancing Fairness and Efficiency in Coupled CPU-GPU Architectures

Shanjiang Tang, BingSheng He, Shuhao Zhang, Zhaojie Niu
2016 SC16: International Conference for High Performance Computing, Networking, Storage and Analysis  
We show that EMRF satisfies fairness properties of sharing incentive, envy-freeness and pareto efficiency.  ...  Heterogeneous computing poses new challenging issues on the fair allocation of computational resources among users due to the availability of different kinds of computing devices (e.g., CPU and GPU).  ...  Shuhao Zhang's work is partially funded by the Economic Development Board and the National Research Foundation of Singapore.  ... 
doi:10.1109/sc.2016.74 dblp:conf/sc/TangHZN16 fatcat:l4hcko577favfa35c7jyxudvka