- 1.Ankerst M., Breunig M. M., Kriegel H.-P., Sander J.: 'OP- TICS: Ordering Points To Identify the Clustering Structure', Proc. ACM SIGMOD'99 Int. Conf. on Management of Data, Philadelphia, PA, 1999, pp. 49-60. Google ScholarDigital Library
- 2.Agrawal R., Gehrke J., Gunopulos D., Raghavan P.: 'Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications', Proc. ACM SIGMOD'98 Int. Conf. on Management of Data, Seattle, WA, 1998, pp. 94-105. Google ScholarDigital Library
- 3.Agrawal R., Imielinski T., Swami A.: 'Mining Association Rules between Sets of Items in Large Databases', Proc. ACM SIGMOD'93 Int. Conf. on Management of Data, Washington, D.C., 1993, pp. 207-216. Google ScholarDigital Library
- 4.Berchtold S., Keim D., Kriegel H.-P.: 'The X-Tree: An Index Structure for High-Dimensional Data', 22nd Int. Conf. on Very Large DataBases, 1996, Bombay, India, pp. 28-39. Google ScholarDigital Library
- 5.van den Bercken J., Seeger B., Widmayer P.:'A General Approach to Bulk Loading Multidimensional Index Structures', 23rd Conf. on Very Large Databases, 1997, Athens, Greece. Google ScholarDigital Library
- 6.Berchtold S., B~hm C., Kriegel H.-P.: 'Improving the Query Performance of High-Dimensional Index Structures Using Bulk-Load Operations', 6th. Int. Conf. on Extending Database Technology, 1998. Google ScholarDigital Library
- 7.Breunig S., Kriegel H.-P., Ng R., Sander J.: 'LOF: Identifying Density-Based Local Outliers', ACM SIGMOD Int. Conf. on Management of Data, Dallas, TX, 2000. Google ScholarDigital Library
- 8.Brinkhoff T., Kriegel H.-P., Seeger B.: 'Efficient Processing of Spatial Joins Using R-trees', Proc. ACM SIGMOD Int. Conf. on Management of Data, Washington D.C., 1993, pp. 237-246. Google ScholarDigital Library
- 9.Brinkhoff T., Kriegel H.-P., Seeger B.: 'Parallel Processing of Spatial Joins Using R-trees', Proc. 12th Int. Conf. on Data Engineering, New Orleans, LA, 1996. Google ScholarDigital Library
- 10.Beckmann N., Kriegel H.-P., Schneider R., Seeger B.: 'The R*-tree: An Efficient and Robust Access Method for Points and Rectangles', Proc. ACM SIGMOD Int. Conf. on Management of Data, Atlantic City, NJ, 1990, pp. 322-331. Google ScholarDigital Library
- 11.Ester M., Frommelt A., Kriegel H.-P., Sander J.: 'Algorithms for Characterization and Trend Detection in Spatial Data-bases', Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York, NY, 1998, pp. 44-50.Google Scholar
- 12.Ester M., Kriegel H.-P., Sander J., Wimmer M. Xu X.: 'Incremental Clustering for Mining in a Data Warehousing Environment', Proc. 24th Int. Conf. on Very Large Databases, New York, NY, 1998, pp. 323-333. Google ScholarDigital Library
- 13.Ester M., Kriegel H.-P., Sander J., Xu X.: 'A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise', Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, AAAI Press, 1996, pp. 226-231.Google Scholar
- 14.Faloutsos C., Lin K.-I.: 'FastMap: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Data', Proc. ACM SIGMOD Int. Conf. on Management of Data, San Jose, CA, 1995, pp. 163-174. Google ScholarDigital Library
- 15.Gaede V., G~nther O.:'Multidimensional Access Methods', ACM Computing Surveys, Vol. 30, No. 2, 1998, pp.170-231. Google ScholarDigital Library
- 16.Guha S., Rastogi R., Shim K.: 'CURE: An Efficient Clustering Algorithms for Large Databases', Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998, pp.73-84. Google ScholarDigital Library
- 17.Guttman A.: 'R-trees: A Dynamic Index Structure for Spatial Searching', Proc. ACM SIGMOD Int. Conf. on Management of Data, Boston, MA, 1984, pp. 47-57. Google ScholarDigital Library
- 18.Huang Y.-W., Jing N., Rundensteiner E. A.:'Spatial Joins Using R-trees: Breadth-First Traversal with Global Optimizations', Proc. Int. Conf. on Very Large Databases, Athens, Greece, 1997, pp. 396-405. Google ScholarDigital Library
- 19.Hinneburg A., Keim D.A.: 'An Efficient Approach to Clustering in Large Multimedia Databases with Noise', Proc. 4th Int. Conf. on Knowledge Discovery & Data Mining, New York City, NY, 1998, pp. 58-65.Google Scholar
- 20.Hattori K., Torii Y.: 'Effective algorithms for the nearest neighbor method in the clustering problem'. Pattern Recognition, 1993, Vol. 26, No. 5, pp. 741-746.Google ScholarCross Ref
- 21.Huang, Z.: 'A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining'. In Proc. SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Tech. Report 97-07, UBC, Dept. of CS, 1997.Google Scholar
- 22.Jagadish H. V.: 'A Retrieval Technique for Similar Shapes', Proc. ACM SIGMOD Int. Conf. on Management of Data, Denver, CO, 1991, pp. 208-217. Google ScholarDigital Library
- 23.Jain A. K., Dubes R. C.: 'Algorithms for Clustering Data', Prentice-Hall, 1988. Google ScholarDigital Library
- 24.Keim D. A.: 'Visual Database Exploration Techniques', Proc. Tutorial Int. Conf. on Knowledge Discovery and Data Mining, Newport Beach, CA, 1997 (http://www.informatik.unihalle.de/~keim/PS/KDD97.pdf).Google Scholar
- 25.Koperski K., Han J.: 'Discovery of Spatial Association Rules in Geographic Information Databases', Proc. 4th Int. Symp. on Large Spatial Databases, Portland, ME, 1995, pp. 47-66. Google ScholarDigital Library
- 26.Knorr E.M., Ng R.T.: 'Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining', IEEE Trans. on Knowledge and Data Engineering, Vol. 8, No. 6, 1996, pp. 884-897. Google ScholarDigital Library
- 27.Knorr E.M., Ng R.T.: 'Algorithms for Mining Distance- Based Outliers in Large Datasets', Proc. 24th Int. Conf. on Very Large DataBases, 1998, New York City, NY, pp. 392-403. Google ScholarDigital Library
- 28.Kaufman L., Rousseeuw P. J.: 'Finding Groups in Data: An Introduction to Cluster Analysis', John Wiley & Sons, 1990.Google Scholar
- 29.Koudas N., Sevcik C.: 'Size Separation Spatial Join', Proc. ACM SIGMOD Int. Conf. on Management of Data, 1997, pp. 324-335. Google ScholarDigital Library
- 30.Koudas N., Sevcik C.: 'High Dimensional Similarity Joins: Algorithms and Performance Evaluation', Proc. 14th Int. Conf on Data Engineering, Best Paper Award, Orlando, FL, 1998, pp. 466-475. Google ScholarDigital Library
- 31.Kriegel H.-P., Seidl T.: 'Approximation-Based Similarity Search for 3-D Surface Segments', GeoInformatica Journal, Kluwer Academic Publishers, 1998, Vol.2, No. 2, pp. 113-147. Google ScholarDigital Library
- 32.Korn F., Sidiropoulos N., Faloutsos C., Siegel E., Protopapas Z.: 'Fast Nearest Neighbor Search in Medical Image Databases', Proc. 22nd Int. Conf. on Very Large DataBases, Mumbai, India, 1996, pp. 215-226. Google ScholarDigital Library
- 33.Lin K., Jagadish H. V., Faloutsos C.: 'The TV-Tree: An Index Structure for High-Dimensional Data', VLDB Journal, 1995, Vol. 3, pp. 517-542. Google ScholarDigital Library
- 34.Lo M.-L., Ravishankar C. V.: 'Spatial Joins Using Seeded Trees', Proc. ACM SIGMOD Int. Conf. on Management of Data, Denver, 1994, pp. 517-542 Google ScholarDigital Library
- 35.Lo M.-L., Ravishankar C. V.: 'Spatial Hash Joins', Proc. ACM SIGMOD Int. Conf. on Management of Data, 1996, pp. 247-258. Google ScholarDigital Library
- 36.MacQueen, J.: 'Some Methods for Classification and Analysis of Multivariate Observations', 5th Berkeley Symp. Math. Statist. Prob., Vol. 1, pp. 281-297.Google Scholar
- 37.Mitchell T.M.: 'Machine Learning', McCraw-Hill, 1997. Google ScholarDigital Library
- 38.Murtagh F.: 'A Survey of Recent Advances in Hierarchical Clustering Algorithms', The Computer Journal Vol. 26, No. 4, 1983, pp.354-359.Google ScholarCross Ref
- 39.Ng R. T., Han J.: 'Efficient and Effective Clustering Methods for Spatial Data Mining', Proc. 20th Int. Conf. on Very Large DataBases, Santiago de Chile, Chile, 1994, pp. 144-155. Google ScholarDigital Library
- 40.Patel J.M., DeWitt D.J., 'Partition Based Spatial-Merge Join', Proc. ACM SIGMOD Int. Conf. on Management of Data, 1996, pp. 259-270. Google ScholarDigital Library
- 41.Piatetsky-Shapiro G., Frawley W. J.: 'Knowledge Discovery in Databases', AAAI/MIT Press, 1991. Google ScholarDigital Library
- 42.Richards A.J. 'Remote Sensing Digital Image Analysis. An Introduction', Berlin: Springer Verlag, 1983. Google ScholarDigital Library
- 43.Robinson J. T.: 'The K-D-B-tree: A Search Structure for Large Multidimensional Dynamic Indexes', Proc. ACM SIGMOD Int. Conf. on Management of Data, 1981, pp. 10-18. Google ScholarDigital Library
- 44.Sellis T., Roussopoulos N., Faloutsos C.: 'The R+-Tree: A Dynamic Index for Multi-Dimensional Objects', Proc. 13th Int. Conf. on Very Large Databases, Brighton, 1987, pp.507-518. Google ScholarDigital Library
- 45.Sheikholeslami G., Chatterjee S., Zhang A.: 'WaveCluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases', Proc. Int. Conf. on Very Large DataBases, New York, NY, 1998, pp. 428 - 439. Google ScholarDigital Library
- 46.Sibson R.: 'SLINK: an optimally efficient algorithm for the single-link cluster method', The Computer Journal Vol. 16, No. 1, 1973, pp.30-34.Google ScholarCross Ref
- 47.Shim K., Srikant R., Agrawal R.: 'The e-KDB tree: A Fast Index Structure for High-dimensional Similarity Joins', IEEE Int. Conf on Data Engineering, 1997, 301-311. Google ScholarDigital Library
- 48.Ullman J.D.: 'Database and Knowledge-Base System', Vol. II,Compute Science Press, Rockville, MD, 1989.Google Scholar
Index Terms
- High performance clustering based on the similarity join
Recommendations
The k-Nearest Neighbour Join: Turbo Charging the KDD Process
The similarity join has become an important database primitive for supporting similarity searches and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Two types of the ...
String similarity join with different similarity thresholds based on novel indexing techniques
String similarity join is an essential operation of many applications that need to find all similar string pairs from two given collections. A quantitative way to determine whether two strings are similar is to compute their similarity based on a ...
High-Dimensional Similarity Joins
Many emerging data mining applications require a similarity join between points in a high-dimensional domain. We present a new algorithm that utilizes a new index structure, called the $\epsilon$ tree, for fast spatial similarity joins on high-...
Comments