DOI: 10.1145/1555271.1555273
research-article

Automatic exploration of datacenter performance regimes

Published: 19 June 2009

ABSTRACT

Horizontally scalable Internet services present an opportunity to use automatic resource allocation strategies for system management in the datacenter. In most of the previous work, a controller employs a performance model of the system to make decisions about the optimal allocation of resources. However, these models are usually trained offline or on a small-scale deployment and will not accurately capture the performance of the controlled application. To achieve accurate control of the web application, the models need to be trained directly on the production system and adapted to changes in workload and performance of the application. In this paper we propose to train the performance model using an exploration policy that quickly collects data from different performance regimes of the application. The goal of our approach for managing the exploration process is to strike a balance between not violating the performance SLAs and the need to collect sufficient data to train an accurate performance model, which requires pushing the system close to its capacity. We show that by using our exploration policy, we can train a performance model of a Web 2.0 application in less than an hour and then immediately use the model in a resource allocation controller.
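The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the authors' exploration policy. It is a toy hill-climbing loop under stated assumptions: a hypothetical latency SLA of 100 ms, a hypothetical safety margin of 80% of that SLA, and a made-up latency curve (`simulated_latency`) standing in for measurements from a real system. All names and constants are illustrative.

```python
SLA_LATENCY_MS = 100.0   # hypothetical SLA threshold (assumption)
SAFETY_MARGIN = 0.8      # stop pushing at 80% of the SLA (assumption)


def simulated_latency(load, capacity=100.0, base_ms=10.0):
    """Toy latency curve that rises sharply near capacity (not a real system)."""
    utilization = min(load / capacity, 0.99)
    return base_ms / (1.0 - utilization)


def explore(measure, start_load=20.0, step=2.0, steps=60):
    """Collect (load, latency) samples while staying below the SLA.

    The load on the explored server is raised in small increments while the
    measured latency is under the safety margin, and lowered otherwise.
    """
    load = start_load
    samples = []
    for _ in range(steps):
        latency = measure(load)
        samples.append((load, latency))
        if latency < SAFETY_MARGIN * SLA_LATENCY_MS:
            load += step            # push closer to capacity
        elif load > start_load:
            load -= step            # back off before the SLA is violated
    return samples


if __name__ == "__main__":
    data = explore(simulated_latency)
    print("highest load explored:", max(x for x, _ in data))
    print("worst latency seen (ms):", round(max(y for _, y in data), 2))
```

The collected samples span low-load and near-capacity regimes, which is the kind of data a performance model would need. A small step size keeps each move from overshooting the SLA in this toy setting; a real system would also have to cope with measurement noise and workload changes.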



              • Published in

                ACDC '09: Proceedings of the 1st workshop on Automated control for datacenters and clouds
                June 2009
                64 pages
                ISBN: 9781605585857
                DOI: 10.1145/1555271

                Copyright © 2009 ACM

                Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

                Publisher

                Association for Computing Machinery

                New York, NY, United States



