4,092 Hits in 4.9 sec

Estimating incremental dimensional algorithm with sequence data set

S. Adaekalavan
2013 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering  
This method avoids the need to compute the distance of each data object to the cluster center. It saves running time.  ...  Hierarchical clustering is the grouping of objects of interest according to their similarity into a hierarchy, with different levels reflecting the degree of inter-object resemblance.  ...  Each level of a dendrogram can be evaluated by a cluster validation method and the best level and its corresponding clusters are returned. HAC algorithms are non-parametric.  ... 
doi:10.1109/icprime.2013.6496461 fatcat:djw3tm7tkne4hme7svhjilppou
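The snippet above describes evaluating each level of a dendrogram with a cluster validation method and returning the best level. A minimal sketch of that idea, assuming synthetic Gaussian data and the silhouette score as the validation index (neither is from the paper):

```python
# Sketch: cut a dendrogram at every level from 2 to 9 clusters and keep the
# cut with the best silhouette score. Data and parameters are invented.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2)) for c in (0, 3, 6)])

Z = linkage(X, method="average")          # agglomerative hierarchical clustering
best = None
for k in range(2, 10):                    # each k is one level of the dendrogram
    labels = fcluster(Z, t=k, criterion="maxclust")
    score = silhouette_score(X, labels)   # cluster validation index for this level
    if best is None or score > best[1]:
        best = (k, score, labels)

print(f"best level: {best[0]} clusters, silhouette = {best[1]:.3f}")
```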

Understanding outside collaborations of the Chinese Academy of Sciences using Jensen-Shannon divergence

Russell Duhon, Katy Börner, Jinah Park
2009 Visualization and Data Analysis 2009  
Applying the approach to data on the outside collaborations of the Chinese Academy of Sciences and visualizing the results reveals interesting structure relevant for science policy decisions.  ...  about how they collaborate with each other.  ...  Since it is a metric, techniques well-founded in a metric space, such as agglomerative hierarchical clustering with Ward's method, can be brought to bear.  ... 
doi:10.1117/12.812383 dblp:conf/vda/Duhon09 fatcat:nbrp6lpy45cnvmwipzs5mjygzi
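Because the square root of the Jensen-Shannon divergence is a proper metric, it can feed directly into agglomerative clustering, as the snippet notes. A hedged sketch with invented "collaboration profile" distributions; SciPy's jensenshannon already returns the square-root distance, and average linkage is used here since SciPy's Ward linkage formally assumes Euclidean input:

```python
# Sketch: pairwise Jensen-Shannon distances between discrete distributions,
# then agglomerative clustering on the resulting metric. Toy data only.
import numpy as np
from scipy.spatial.distance import jensenshannon, squareform
from scipy.cluster.hierarchy import linkage, fcluster

# Three "collaboration profiles" as probability distributions over 4 partners.
profiles = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.65, 0.25, 0.05, 0.05],
    [0.10, 0.10, 0.40, 0.40],
])

n = len(profiles)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        # SciPy returns the JS *distance* (square root of the divergence),
        # which satisfies the triangle inequality.
        D[i, j] = D[j, i] = jensenshannon(profiles[i], profiles[j], base=2)

Z = linkage(squareform(D), method="average")   # average linkage on the metric
print(fcluster(Z, t=2, criterion="maxclust"))  # e.g. [1 1 2]
```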

Algorithms for hierarchical clustering: an overview, II

Fionn Murtagh, Pedro Contreras
2017 Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery  
We survey agglomerative hierarchical clustering algorithms and discuss efficient implementations that are available in R and other software environments.  ...  We look at hierarchical self-organizing maps, and mixture models. We review grid-based clustering, focusing on hierarchical density-based approaches.  ...  They also address the question of metrics: results are valid in a wide class of distances including those associated with the Minkowski metrics.  ... 
doi:10.1002/widm.1219 fatcat:4cdvfpypibe3petriqyuaiunk4
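The survey's point that results hold for a wide class of distances, including the Minkowski metrics, can be illustrated by rerunning the same linkage under several Minkowski orders p. The data and the choice of p values below are arbitrary:

```python
# Sketch: the same agglomerative clustering run under several Minkowski
# metrics. Purely illustrative; p values and data are invented.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 3)), rng.normal(4, 0.5, (20, 3))])

for p in (1, 2, 4):                              # L1, L2, L4 Minkowski metrics
    d = pdist(X, metric="minkowski", p=p)        # condensed distance vector
    labels = fcluster(linkage(d, method="complete"), t=2, criterion="maxclust")
    print(f"p={p}: cluster sizes {np.bincount(labels)[1:]}")
```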

Happy and Immersive Clustering Segmentations of Biological Co-Expression Patterns [article]

Richard Tjörnhammar
2024 arXiv   pre-print
In this work, we present an approach for evaluating segmentation strategies and solving the biological problem of creating robust interpretable maps of biological data by employing Ward's agglomerative  ...  Finally, we find that the cluster representations and label annotations, in the case with clusters of high immersiveness, correspond to compositionally inferred labels with the highest specificity.  ...  Figure 3: Synthetic data example employing agglomerative hierarchical clustering and complete linkage (maximum distance) method for cluster distance attribution.  ... 
arXiv:2402.06928v1 fatcat:2opogdohifckre2zc2x77wyicu
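Figure 3 of this entry is described as a synthetic-data example with complete (maximum-distance) linkage. A rough, invented reproduction of that setup with scikit-learn:

```python
# Sketch: complete-linkage (maximum distance) agglomerative clustering on
# synthetic blobs, loosely mirroring the figure description. Data invented.
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering

X, _ = make_blobs(n_samples=150, centers=4, cluster_std=0.8, random_state=0)
model = AgglomerativeClustering(n_clusters=4, linkage="complete")
labels = model.fit_predict(X)
print(labels[:10])
```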

Ultrametric Component Analysis with Application to Analysis of Text and of Emotion [article]

Fionn Murtagh
2013 arXiv   pre-print
It is assumed that the data set, to begin with, is endowed with a metric, and we include discussion of how this can be brought about if a dissimilarity, only, holds.  ...  The basis for part of the metric-endowed data set being ultrametric is to consider triplets of the observables (vectors). We develop a novel consensus of hierarchical clusterings.  ...  Hierarchical agglomerative clustering algorithms are a general and widely-used class of algorithm for inducing an ultrametric on dissimilarity or distance input, or coordinate data on which a metric or  ... 
arXiv:1309.3611v1 fatcat:4sunhvuhwvcihgkp2nvspa73ua
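The snippet frames hierarchical agglomerative clustering as inducing an ultrametric on distance input; that induced ultrametric is the cophenetic distance of the resulting dendrogram, which SciPy exposes directly. A toy sketch (single linkage is an arbitrary choice here):

```python
# Sketch: the ultrametric induced by agglomerative clustering is the
# cophenetic distance of its dendrogram. Toy data for illustration.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, cophenet

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 4))

Z = linkage(pdist(X), method="single")
U = squareform(cophenet(Z))        # induced ultrametric, as a square matrix

# Ultrametric (strong triangle) inequality: d(i,k) <= max(d(i,j), d(j,k)).
ok = all(U[i, k] <= max(U[i, j], U[j, k]) + 1e-12
         for i in range(6) for j in range(6) for k in range(6))
print("ultrametric inequality holds:", ok)
```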

Semi-supervised Hierarchical Clustering

Li Zheng, Tao Li
2011 2011 IEEE 11th International Conference on Data Mining  
In this paper, we propose a novel semi-supervised hierarchical clustering framework based on ultra-metric dendrogram distance.  ...  Semi-supervised clustering (i.e., clustering with knowledge-based constraints) has emerged as an important variant of the traditional clustering paradigms.  ...  Based on the way the clusters are generated, clustering methods can be divided into two categories: partitional clustering and hierarchical clustering [2] [3] .  ... 
doi:10.1109/icdm.2011.130 dblp:conf/icdm/ZhengL11 fatcat:7kwlsml3u5gkjcjc5zxcoaw5z4
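The paper's framework is built on ultra-metric dendrogram distances with knowledge-based constraints. A much cruder illustration of constraint-aware agglomerative clustering, and explicitly not the paper's method, is to zero out the distance of must-link pairs before linkage:

```python
# Sketch: crude must-link handling for agglomerative clustering by zeroing
# the distance of constrained pairs before linkage. Not the paper's method;
# purely an illustration of clustering with knowledge-based constraints.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 2))
must_link = [(0, 5), (2, 7)]            # hypothetical background knowledge

D = squareform(pdist(X))
for i, j in must_link:
    D[i, j] = D[j, i] = 0.0             # force early merging of linked pairs

Z = linkage(squareform(D), method="average")
print(fcluster(Z, t=3, criterion="maxclust"))
```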

Semi-supervised Hierarchical Co-clustering [chapter]

Feifei Huang, Yan Yang, Tao Li, Jinyuan Zhang, Tonny Rutayisire, Amjad Mahmood
2012 Lecture Notes in Computer Science  
In this paper, we propose a novel semi-supervised hierarchical clustering framework based on ultra-metric dendrogram distance.  ...  Semi-supervised clustering (i.e., clustering with knowledge-based constraints) has emerged as an important variant of the traditional clustering paradigms.  ...  Based on the way the clusters are generated, clustering methods can be divided into two categories: partitional clustering and hierarchical clustering [2] [3] .  ... 
doi:10.1007/978-3-642-31900-6_39 fatcat:misnzmthgnevdlrxm7g3g5k2wi

Partially Supervised Speaker Clustering

Hao Tang, S. M. Chu, M. Hasegawa-Johnson, T. S. Huang
2012 IEEE Transactions on Pattern Analysis and Machine Intelligence  
Our speaker clustering experiments on the GALE database clearly indicate that 1) our speaker clustering methods based on the GMM mean supervector representation and vector-based distance metrics outperform traditional speaker clustering methods based on the "bag of acoustic features" representation and statistical model based distance metrics, 2) our advocated use of the cosine distance metric yields consistent  ...  These two metrics are standard for evaluating (general) data clustering results [42].  ... 
doi:10.1109/tpami.2011.174 pmid:21844626 fatcat:g7m7wki6pvcb3gotxydj4k6ewq
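The entry advocates cosine distance on GMM mean supervector representations. A hedged sketch of that pairing, with random vectors standing in for real supervectors:

```python
# Sketch: agglomerative speaker clustering with cosine distance on
# (placeholder) GMM mean supervectors. Real supervectors would come from
# adapted GMMs; random vectors stand in here purely for illustration.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)
supervectors = rng.normal(size=(12, 256))     # 12 utterances, 256-dim stand-ins

d = pdist(supervectors, metric="cosine")      # 1 - cosine similarity
Z = linkage(d, method="average")
speaker_labels = fcluster(Z, t=3, criterion="maxclust")
print(speaker_labels)
```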

A Survey of Partitional and Hierarchical Clustering Algorithms [chapter]

Chandan K. Reddy, Bhanukiran Vinzamuri
2018 Data Clustering  
K-modes is a non-parametric clustering algorithm suitable for handling categorical data that optimizes a matching metric (L0 loss function) without using any explicit distance metric.  ...  Ward's method chooses the initial centroids by using the sum of squared errors to evaluate the distance between two clusters.  ... 
doi:10.1201/9781315373515-4 fatcat:nv3tftuhyzcbdfi6g7invscl5u
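The matching metric (L0 loss) used by k-modes simply counts the attributes on which two categorical records disagree. A tiny sketch with invented records:

```python
# Sketch: the simple matching dissimilarity used by k-modes, i.e. the count
# of categorical attributes on which two records differ. Records invented.
def matching_dissimilarity(a, b):
    """Number of positions where the two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

r1 = ("red", "small", "round")
r2 = ("red", "large", "round")
r3 = ("blue", "large", "square")

print(matching_dissimilarity(r1, r2))  # 1
print(matching_dissimilarity(r1, r3))  # 3
```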

Improving Test Distance for Failure Clustering with Hypergraph Modelling [article]

Gabin An, Juyeon Yoon, Joyce Jiyoung Whang, Shin Yoo
2021 arXiv   pre-print
We introduce a new test distance metric based on hypergraphs and evaluate their accuracy using multi-fault benchmarks that we have built on top of Defects4J and SIR.  ...  Results show that our technique, Hybiscus, can automatically achieve perfect clustering (i.e., the same number of clusters as the ground truth number of root causes, with all failing tests with the same  ...  Our empirical evaluation shows that, when used with Agglomerative Hierarchical Clustering (AHC) and a distance-based estimation of cluster numbers, Hybiscus can significantly outperform other failure clustering  ... 
arXiv:2104.10360v1 fatcat:cfa7r6wsonaphcwzoif5hxp5fy
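The entry couples agglomerative hierarchical clustering with a distance-based estimate of the cluster count. One common distance-based heuristic, not necessarily the one Hybiscus uses, cuts the dendrogram at the largest gap between successive merge heights:

```python
# Sketch: pick the number of clusters by the largest gap between successive
# merge heights in the dendrogram. A generic heuristic; data is invented.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=60, centers=3, cluster_std=0.5, random_state=0)
Z = linkage(X, method="average")

heights = Z[:, 2]                      # merge distances, non-decreasing
gaps = np.diff(heights)
k = len(X) - (np.argmax(gaps) + 1)     # cut just before the largest jump
labels = fcluster(Z, t=k, criterion="maxclust")
print("estimated clusters:", k)
```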

Meta Clustering

Rich Caruana, Mohamed Elhawary, Nam Nguyen, Casey Smith
2006 IEEE International Conference on Data Mining. Proceedings  
We present methods for automatically generating a diverse set of alternate clusterings, as well as methods for grouping clusterings into meta clusters.  ...  We evaluate meta clustering on four test problems and two case studies. Surprisingly, clusterings that would be of most interest to users often are not very compact clusterings.  ...  Pedro Artigas, Anna Goldenberg, and Anton Likhodedov helped with early experiments in meta clustering as part of a class project at CMU.  ... 
doi:10.1109/icdm.2006.103 dblp:conf/icdm/CaruanaENS06 fatcat:t7fij6li3rdmhh23ulwwkx7yfq
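Meta clustering groups many alternative clusterings of the same data. One way to sketch "clustering the clusterings", with the dissimilarity 1 - ARI and all data and parameters invented, is:

```python
# Sketch: "clustering the clusterings" by turning pairwise adjusted Rand
# index into a dissimilarity and running agglomerative clustering on it.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

X, _ = make_blobs(n_samples=100, centers=4, random_state=0)

# Generate a diverse set of alternative clusterings (different k and seeds).
clusterings = [KMeans(n_clusters=k, n_init=5, random_state=s).fit_predict(X)
               for k in (2, 3, 4, 5) for s in (0, 1)]

n = len(clusterings)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = 1.0 - adjusted_rand_score(clusterings[i],
                                                      clusterings[j])

meta = fcluster(linkage(squareform(D), method="average"), t=3,
                criterion="maxclust")
print("meta-cluster assignment of each clustering:", meta)
```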

Considerably Improving Clustering Algorithms Using UMAP Dimensionality Reduction Technique: A Comparative Study [chapter]

Mebarka Allaoui, Mohammed Lamine Kherfi, Abdelhakim Cheriet
2020 Lecture Notes in Computer Science  
We compare the results of many well-known clustering algorithms such as k-means, HDBSCAN, GMM and Agglomerative Hierarchical Clustering when they operate on the low-dimension feature space yielded by UMAP  ...  A series of experiments on several image datasets demonstrate that the proposed method allows each of the clustering algorithms studied to improve its performance on each dataset considered.  ...  Evaluation Metrics: In order to validate the performance of unsupervised clustering algorithms, we use the two standard evaluation metrics, accuracy (ACC) and Normalized Mutual Information (NMI).  ... 
doi:10.1007/978-3-030-51935-3_34 fatcat:6yrc4jamwne7nhisg5mod4k3te
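The study clusters on UMAP-reduced features and scores with ACC and NMI. A hedged sketch of that pipeline using the umap-learn package, k-means, and NMI only (dataset and parameters are illustrative, not the paper's exact setup):

```python
# Sketch: reduce with UMAP, cluster the embedding, score with NMI.
# Requires the umap-learn package; dataset and parameters are illustrative.
import umap                                   # pip install umap-learn
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

X, y = load_digits(return_X_y=True)

embedding = umap.UMAP(n_components=2, n_neighbors=15,
                      min_dist=0.1, random_state=42).fit_transform(X)

labels = KMeans(n_clusters=10, n_init=10, random_state=42).fit_predict(embedding)
print("NMI:", normalized_mutual_info_score(y, labels))
```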

Russian News Clustering and Headline Selection Shared Task [article]

Ilya Gusev, Ivan Smurov
2021 arXiv   pre-print
As a part of it, we propose the tasks of Russian news event detection, headline selection, and headline generation. These tasks are accompanied by datasets and baselines.  ...  This paper presents the results of the Russian News Clustering and Headline Selection shared task.  ...  Acknowledgements We would like to thank the participants of all three tracks, especially Tatiana Shavrina, Ivan Bondarenko, and Nikita Yudin for helpful comments and valuable suggestions.  ... 
arXiv:2105.00981v3 fatcat:6oiewmaj7rd37ephovhlxeufsu

Identification and Investigation of the User Session for Lan Connectivity Via Enhanced Partition Approach of Clustering Techniques

Gunasekaran K
2012 International Journal of Computer Science Engineering and Information Technology  
This paper mainly presents some technical discussions on the identification and analysis of "LAN user-sessions". The identification of a user-session is non-trivial.  ...  We have defined a clustering-based approach in detail, discussed the positives and negatives of this approach, and applied it to real traffic traces.  ...  To efficiently position, in our unidimensional metric space, the representatives at procedure start-up, we evaluate the distance between any two adjacent samples. According to the distance metric, we take  ... 
doi:10.5121/ijcseit.2012.2604 fatcat:6qkz4b7z7vhfhouftjgkcusiqa
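The snippet mentions evaluating the distance between adjacent samples in a unidimensional metric space. A loose illustration, with an invented gap threshold, splits sorted one-dimensional samples wherever the adjacent gap exceeds that threshold:

```python
# Sketch: a gap-based split of sorted one-dimensional samples, loosely
# following the idea of evaluating distances between adjacent samples.
# The threshold and data are invented.
import numpy as np

samples = np.sort(np.array([0.1, 0.2, 0.25, 5.0, 5.1, 9.7, 9.9, 10.0]))
threshold = 1.0                                  # hypothetical session gap

gaps = np.diff(samples)
labels = np.concatenate([[0], np.cumsum(gaps > threshold)])
print(labels)                                    # [0 0 0 1 1 2 2 2]
```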

KL divergence based agglomerative clustering for automated Vitiligo grading

Mithun Das Gupta, Srinidhi Srinivasa, J. Madhukara, Meryl Antony
2015 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)  
This leads to a very powerful yet elegant method for bottom-up agglomerative clustering with strong theoretical guarantees.  ...  We introduce albedo and reflectance fields as features for the distance computations. We compare against other established methods to bring out possible pros and cons of the proposed method.  ...  A number of recent works present agglomerative schemes for clustering with exponential families.  ... 
doi:10.1109/cvpr.2015.7298886 dblp:conf/cvpr/GuptaSMA15 fatcat:os3gvi6efvg3xgljdclfkzmnwa
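KL divergence is asymmetric, so a common trick before feeding it to an agglomerative clusterer is to symmetrize it; the paper may formulate this differently, so the following is only an illustrative sketch with toy histograms:

```python
# Sketch: symmetrized KL divergence between histograms as a pairwise
# dissimilarity for agglomerative clustering. The symmetrization and toy
# histograms are illustrative, not the paper's exact formulation.
import numpy as np
from scipy.special import rel_entr
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, fcluster

def sym_kl(p, q):
    """0.5 * (KL(p||q) + KL(q||p)) for discrete distributions."""
    return 0.5 * (rel_entr(p, q).sum() + rel_entr(q, p).sum())

hists = np.array([
    [0.60, 0.30, 0.10],
    [0.55, 0.35, 0.10],
    [0.10, 0.20, 0.70],
    [0.15, 0.15, 0.70],
])

n = len(hists)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = sym_kl(hists[i], hists[j])

Z = linkage(squareform(D), method="average")
print(fcluster(Z, t=2, criterion="maxclust"))   # e.g. [1 1 2 2]
```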
Showing results 1 — 15 out of 4,092 results