DOI: 10.1145/354756.354832

High performance clustering based on the similarity join

Published: 6 November 2000


Published in

CIKM '00: Proceedings of the ninth international conference on Information and knowledge management
November 2000
532 pages
ISBN: 1581133200
DOI: 10.1145/354756

Copyright © 2000 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States
