Cluster Ensemble Approach for High Dimensional Data

2018 Australian Journal of Basic and Applied Sciences  
This paper discusses one method of clustering a high dimensional dataset using dimensionality reduction and context dependency measures (CDM).  ...  In this paper, we address the problem of combining multiple weighted clusters which belong to different subspaces of the input space.  ...  In high dimensional data, clusters often exist in different subspaces. Ensemble clustering based on full space clustering algorithms fails to cluster such data.  ... 
doi:10.22587/ajbas.2018.12.1.9 fatcat:cnsct4lqarhp7ar66ggi53b4mq

Exploiting multi–core and many–core parallelism for subspace clustering

Amitava Datta, Amardeep Kaur, Tobias Lauer, Sami Chabbouh
2019 International Journal of Applied Mathematics and Computer Science  
Finding clusters in high dimensional data is a challenging research problem.  ...  Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data.  ...  Due to the curse of dimensionality, data points lose contrast in high-dimensional space, making it difficult to cluster data based on similarity measures (Steinbach et al., 2004; Aggarwal and Reddy, 2013  ... 
doi:10.2478/amcs-2019-0006 fatcat:3fytlizww5g2rgdl4fkkohtrw4

2020 Index IEEE Open Journal of Signal Processing Vol. 1

2020 IEEE Open Journal of Signal Processing  
., +, OJSP 2020 177-186 Feature selection A Compressive Classification Framework for High-Dimensional Data.  ...  Donmez, M.A., +, OJSP 2020 77-89 Feature extraction A Compressive Classification Framework for High-Dimensional Data.  ... 
doi:10.1109/ojsp.2021.3053848 fatcat:23vjqfgxf5efjbgsc2mqni5m6m

M-Grid: a distributed framework for multidimensional indexing and querying of location based data

Shashank Kumar, Sanjay Madria, Mark Linderman
2017 Distributed and parallel databases  
We use Hilbert Space Filling Curve based linearization technique which preserves the data locality to efficiently manage multi-dimensional data in a key-value store.  ...  Such applications require multi-attribute query processing, handling of high access scalability, support for millions of users, real time querying capability and analysis of large volumes of data.  ...  CAN supports multidimensional queries but it has a high routing cost for low dimensional data. Baton, P-Grid and P-ring support one dimensional range queries.  ... 
doi:10.1007/s10619-017-7194-0 fatcat:5vz2aivp2nde7hfofqcvgsofda

Parallel Clustering of High-Dimensional Social Media Data Streams

Xiaoming Gao, Emilio Ferrara, Judy Qiu
2015 2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing  
Due to the sparsity of the high-dimensional vectors, the size of centroids grows quickly as new data points are assigned to the clusters.  ...  Traditional synchronization that directly broadcasts cluster centroids becomes too expensive and limits the scalability of the parallel algorithm.  ...  For the problem of high-dimensional data stream clustering, techniques such as projected/subspace clustering [8] [9] [38] and density-based approaches [1] [17] [38] have been proposed and investigated  ... 
doi:10.1109/ccgrid.2015.19 dblp:conf/ccgrid/GaoFQ15 fatcat:tpwqlkuu6zgkfgqwmkaa3zg424

PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data

Daniele Apiletti, Elena Baralis, Tania Cerquitelli, Paolo Garza, Pietro Michiardi, Fabio Pulvirenti
2015 2015 IEEE International Conference on Data Mining Workshop (ICDMW)  
This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on the Carpenter algorithm.  ...  The other two datasets were synthetically generated and tuned to simulate use cases characterized by extremely high-dimensional data, i.e., with massive numbers of features.  ... 
doi:10.1109/icdmw.2015.18 dblp:conf/icdm/ApilettiBCGMP15 fatcat:wr5pbzfz2nb2pea4nsinhzthsi

Symmetry-Independent Stability Analysis of Synchronization Patterns

Yuanzhao Zhang, Adilson E. Motter
2020 SIAM Review  
Here, we establish a generalization of the MSF formalism that can characterize the stability of any cluster synchronization pattern, even when the oscillators and/or their interaction functions are nonidentical  ...  This leads to an algorithm that is error-tolerant and orders of magnitude faster than existing symmetry-based algorithms.  ...  The cluster synchronization subspace can be defined as an M d-dimensional subspace of the full nd-dimensional state space, in which oscillators from the same cluster have exactly the same dynamics.  ... 
doi:10.1137/19m127358x fatcat:b244hpb7pjcxvdaqsqjbjs6vaq

Symmetry-independent stability analysis of synchronization patterns [article]

Yuanzhao Zhang, Adilson E. Motter
2020 arXiv   pre-print
Here, we establish a generalization of the MSF formalism that can characterize the stability of any cluster synchronization pattern, even when the oscillators and/or their interactions are nonidentical  ...  The field of network synchronization has seen tremendous growth following the introduction of the master stability function (MSF) formalism, which enables the efficient stability analysis of synchronization  ...  The cluster synchronization subspace can be defined as an M d-dimensional subspace of the full nd-dimensional state space, in which oscillators from the same cluster have exactly the same dynamics.  ... 
arXiv:2003.05461v1 fatcat:w6acsvn7sncqtg5ajabwletnc4

Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams

Jinping Sui, Zhen Liu, Li Liu, Alexander Jung, Xiang Li
2020 IEEE Transactions on Cybernetics  
It has been observed that high-dimensional data are usually distributed in a union of low-dimensional subspaces.  ...  In an era of ubiquitous large-scale evolving data streams, data stream clustering (DSC) has received lots of attention because the scale of the data streams far exceeds the ability of expert human analysts  ...  Particularly, the ORSC achieves state-of-the-art performance in accuracy and scalability based on the synchronization clustering theory [29]-[33].  ... 
doi:10.1109/tcyb.2020.3023973 pmid:33232249 fatcat:jqjwgmevkffpzcn2gqtvla2ome

Scalable video summarization of cultural video documents in cross-media space based on data cube approach

Karina R. Perez-Daniel, Mariko Nakano Miyatake, Jenny Benois-Pineau, Sofian Maabout, Gabriel Sargent
2014 2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI)  
This paper proposes a scalable video summarization approach which provides multiple views and levels of details. Our method relies on the usage of cross media space and consensus clustering method.  ...  A video document is modelled as a data cube where the level of details is refined over nonconsensual features of the space.  ...  However, the strong requirements of those applications in terms of scale, time response and high dimensional information make the scalability a very challenging problem.  ... 
doi:10.1109/cbmi.2014.6849824 dblp:conf/cbmi/Perez-DanielNBMS14 fatcat:nvn7e226szdjlma2if7ejfgceq

SyMP

Hichem Frigui
2002 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02  
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators.  ...  The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set.  ...  Acknowledgments This material is based upon work supported by the National Science Foundation under Grant No. IIS-0133415. REFERENCES  ... 
doi:10.1145/775047.775121 dblp:conf/kdd/Frigui02 fatcat:ytt65l4juvdtnjvnpl4gn3kk4q

SyMP

Hichem Frigui
2002 Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '02  
We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators.  ...  The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set.  ...  Acknowledgments This material is based upon work supported by the National Science Foundation under Grant No. IIS-0133415. REFERENCES  ... 
doi:10.1145/775107.775121 fatcat:75tnmqkwqrfenov6tqv7q5lf7q

Asynchronous Parallel Solvers for Linear Systems arising in Computational Engineering

P.K. Jimack, M.A. Walkley
2011 Computational Technology Reviews  
A consequence of this is that over the next decade it will be necessary to develop and apply new numerical algorithms that are far more scalable than has historically been required.  ...  This chapter explores these challenges in the context of the solution of large systems of algebraic equations arising from the discretization of partial differential equations.  ...  Hence any practical, highly scalable, iterative algorithm must be based upon more advanced techniques such as Krylov subspace methods [46] .  ... 
doi:10.4203/ctr.3.1 fatcat:6ho4zlpt6vdqbkubltrfnm57cq

3D Grand Tour for Multidimensional Data and Clusters [chapter]

Li Yang
1999 Lecture Notes in Computer Science  
Grand tour is a method for viewing multidimensional data via linear projections onto a sequence of two dimensional subspaces and then moving continuously from one projection to the next.  ...  This paper extends the method to 3D grand tour where projections are made onto three dimensional subspaces. 3D cluster-guided tour is proposed where sequences of projections are determined by cluster centroids  ...  There is a loss of information in projecting high-dimensional data to low-dimensions.  ... 
doi:10.1007/3-540-48412-4_15 fatcat:mnfxhqh75rcxpbpdgfljsoil3u

Implicit Multidimensional Projection of Local Subspaces [article]

Rongzheng Bian, Yumeng Xue, Liang Zhou, Jian Zhang, Baoquan Chen, Daniel Weiskopf, Yunhai Wang
2020 arXiv   pre-print
The usefulness of our method is demonstrated using various multi- and high-dimensional benchmark datasets.  ...  Here, we understand the local subspace as the multidimensional local neighborhood of data points.  ...  manifold embedded in high-dimensional data.  ... 
arXiv:2009.03259v1 fatcat:6idkiz64hbcyfcfh4dwrukbqfe