DOI: 10.1145/2592798.2592820
research-article

Algorithmic improvements for fast concurrent Cuckoo hashing

Published: 14 April 2014

ABSTRACT

Fast concurrent hash tables are an increasingly important building block as we scale systems to greater numbers of cores and threads. This paper presents the design, implementation, and evaluation of a high-throughput and memory-efficient concurrent hash table that supports multiple readers and writers. The design arises from careful attention to systems-level optimizations such as minimizing critical section length and reducing interprocessor coherence traffic through algorithm re-engineering. As part of the architectural basis for this engineering, we include a discussion of our experience and results adopting Intel's recent hardware transactional memory (HTM) support to this critical building block. We find that naively allowing concurrent access using a coarse-grained lock on existing data structures reduces overall performance with more threads. While HTM mitigates this slowdown somewhat, it does not eliminate it. Algorithmic optimizations that benefit both HTM and designs for fine-grained locking are needed to achieve high performance.
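
The contrast between one coarse-grained lock and fine-grained locking can be illustrated with a minimal lock-striping sketch. This is not the paper's design: the class name `StripedMap`, the `num_stripes` parameter, and the use of per-stripe Python dictionaries are all assumptions made for illustration. Because of CPython's global interpreter lock, the sketch shows only the structure (short critical sections, each guarded by its own lock), not a real throughput gain.

```python
import threading


class StripedMap:
    """Hypothetical map guarded by several locks instead of one global lock."""

    def __init__(self, num_stripes=16):
        # One lock and one shard per stripe; a key only ever touches its own stripe.
        self.locks = [threading.Lock() for _ in range(num_stripes)]
        self.shards = [dict() for _ in range(num_stripes)]

    def _stripe(self, key):
        return hash(key) % len(self.locks)

    def put(self, key, value):
        i = self._stripe(key)
        with self.locks[i]:  # critical section covers a single shard update
            self.shards[i][key] = value

    def get(self, key, default=None):
        i = self._stripe(key)
        with self.locks[i]:
            return self.shards[i].get(key, default)

    def increment(self, key, delta=1):
        i = self._stripe(key)
        with self.locks[i]:  # read-modify-write stays atomic per key
            self.shards[i][key] = self.shards[i].get(key, 0) + delta
```

Threads that operate on keys in different stripes never contend for the same lock, whereas a single global lock would serialize every operation.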

Our performance results demonstrate that our new hash table design, based around optimistic cuckoo hashing, outperforms other optimized concurrent hash tables by up to 2.5x for write-heavy workloads, even while using substantially less memory for small key-value items. On a 16-core machine, our hash table executes almost 40 million insert and more than 70 million lookup operations per second.
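
For readers unfamiliar with cuckoo hashing, the following is a minimal single-threaded sketch of the basic two-table scheme. It is not the concurrent, optimistic design evaluated in the paper. The class name `CuckooHashTable`, the `max_kicks` limit, the seeded use of Python's built-in `hash`, and the grow-and-rehash fallback are all assumptions chosen for illustration.

```python
import random


class CuckooHashTable:
    """Illustrative two-table cuckoo hash: each key has one candidate slot per table."""

    def __init__(self, capacity=16, max_kicks=32):
        self.capacity = capacity
        self.max_kicks = max_kicks
        self.size = 0
        self._new_tables()

    def _new_tables(self):
        # Fresh random seeds act as a fresh pair of hash functions.
        self.seeds = (random.getrandbits(64), random.getrandbits(64))
        self.tables = [[None] * self.capacity, [None] * self.capacity]

    def _slot(self, which, key):
        return hash((self.seeds[which], key)) % self.capacity

    def get(self, key, default=None):
        # A lookup inspects at most two slots, one in each table.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        # Overwrite in place if the key is already stored.
        for which in (0, 1):
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return
        self._place((key, value))
        self.size += 1

    def _place(self, entry):
        # Put the entry in its slot, evicting any occupant to the other table.
        which = 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, entry[0])
            entry, self.tables[which][idx] = self.tables[which][idx], entry
            if entry is None:
                return
            which = 1 - which
        # Too many displacements: grow, pick new hash functions, reinsert everything.
        self._rehash(entry)

    def _rehash(self, pending):
        entries = [e for t in self.tables for e in t if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self._new_tables()
        for e in entries:
            self._place(e)
```

Lookups touch a bounded number of slots regardless of table occupancy, while inserts may displace a chain of existing entries before finding a free slot.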


  • Published in

    EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems
    April 2014
    388 pages
    ISBN:9781450327046
    DOI:10.1145/2592798

    Copyright © 2014 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Acceptance Rates

    EuroSys '14 Paper Acceptance Rate: 27 of 147 submissions, 18%
    Overall Acceptance Rate: 241 of 1,308 submissions, 18%
