DOI: 10.1145/1871437.1871622
poster

BP-tree: an efficient index for similarity search in high-dimensional metric spaces

Published: 26 October 2010

ABSTRACT

Similarity search in high-dimensional metric spaces is a key operation in many applications, such as multimedia databases, image retrieval, and object recognition. The high dimensionality of the data requires special index structures to make the search efficient. Most existing indexes are constructed by partitioning the data set using distance-based criteria. However, those methods either produce disjoint partitions but ignore the distribution properties of the data, or produce non-disjoint groups, which greatly degrade search performance. In this paper, we study the performance of a new index structure, called Ball-and-Plane tree (BP-tree), which overcomes these disadvantages. BP-tree is constructed by recursively dividing the data set into compact clusters. Unlike other techniques, it integrates the advantages of both the disjoint and non-disjoint paradigms to achieve a structure of tight, low-overlap clusters, yielding significantly improved performance. Results from an extensive experimental evaluation with real-world data sets show that BP-tree consistently outperforms state-of-the-art solutions.
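
The abstract does not spell out how BP-tree builds or searches its clusters, so the following is not the authors' algorithm. It is a minimal sketch of the two generic pruning tests that distance-based metric indexes rely on: a ball test (covering regions that may overlap) and a generalized-hyperplane test (disjoint regions). All function names and the flat cluster list are hypothetical, and Euclidean distance is assumed only for the demonstration; any metric satisfying the triangle inequality would do.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function obeying the triangle inequality works."""
    return math.dist(a, b)


def ball_may_contain(dist: Metric, q: Point, r: float,
                     center: Point, radius: float) -> bool:
    """Ball test: objects o with d(o, center) <= radius.

    By the triangle inequality, if d(q, center) > radius + r then no object
    in the ball can lie within distance r of q, so the ball can be skipped.
    """
    return dist(q, center) <= radius + r


def plane_side_may_contain(dist: Metric, q: Point, r: float,
                           own_pivot: Point, other_pivot: Point) -> bool:
    """Hyperplane test: objects closer to own_pivot than to other_pivot.

    Any such object within r of q forces
    d(q, own_pivot) - d(q, other_pivot) <= 2 * r, so a larger gap
    rules out this side of the partition.
    """
    return dist(q, own_pivot) - dist(q, other_pivot) <= 2 * r


Cluster = Tuple[Point, float, List[Point]]  # (center, covering radius, members)


def range_search(dist: Metric, clusters: List[Cluster],
                 q: Point, r: float) -> List[Point]:
    """Exact range query over a flat list of ball-shaped clusters."""
    hits: List[Point] = []
    for center, radius, members in clusters:
        if not ball_may_contain(dist, q, r, center, radius):
            continue  # whole cluster pruned without examining its members
        hits.extend(o for o in members if dist(q, o) <= r)
    return hits


if __name__ == "__main__":
    clusters = [
        ((0.0, 0.0), 1.5, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]),
        ((10.0, 10.0), 1.0, [(10.0, 10.0), (11.0, 10.0)]),
    ]
    print(range_search(euclidean, clusters, (0.5, 0.0), 0.6))
```

In this sketch, tighter covering radii and less overlap between balls mean fewer clusters pass the ball test for a given query, which is the intuition behind the abstract's emphasis on tight, low-overlap clusters.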


Published in

CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
October 2010
2036 pages
ISBN: 9781450300995
DOI: 10.1145/1871437

Copyright © 2010 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions, 22%
