Synchronization-based scalable subspace clustering of high-dimensional data.

This paper discusses one method of clustering a high dimensional dataset using dimensionality reduction and context dependency measures (CDM). ... In this paper, we address the problem of combining multiple weighted clusters which belong to different subspaces of the input space. ... In high dimensional data, clusters often exist in different subspaces. Ensemble clustering based on full space clustering algorithms fails to cluster such data. ...

doi:10.22587/ajbas.2018.12.1.9 fatcat:cnsct4lqarhp7ar66ggi53b4mq

Finding clusters in high dimensional data is a challenging research problem. ... Subspace clustering algorithms aim to find clusters in all possible subspaces of the dataset, where a subspace is a subset of dimensions of the data. ... Due to the curse of dimensionality, data points lose contrast in high-dimensional space, making it difficult to cluster data based on similarity measures (Steinbach et al., 2004; Aggarwal and Reddy, 2013 ...

doi:10.2478/amcs-2019-0006 fatcat:3fytlizww5g2rgdl4fkkohtrw4

DOAJ Szczepanski

., +, OJSP 2020 177-186 Feature selection A Compressive Classification Framework for High-Dimensional Data. ... Donmez, M.A., +, OJSP 2020 77-89 Feature extraction A Compressive Classification Framework for High-Dimensional Data. ...

doi:10.1109/ojsp.2021.3053848 fatcat:23vjqfgxf5efjbgsc2mqni5m6m

We use a Hilbert space-filling curve based linearization technique, which preserves data locality, to efficiently manage multi-dimensional data in a key-value store. ... Such applications require multi-attribute query processing, handling of high access scalability, support for millions of users, real time querying capability and analysis of large volumes of data. ... CAN supports multidimensional queries but it has a high routing cost for low dimensional data. Baton, P-Grid and P-ring support one dimensional range queries. ...

doi:10.1007/s10619-017-7194-0 fatcat:5vz2aivp2nde7hfofqcvgsofda

Due to the sparsity of the high-dimensional vectors, the size of centroids grows quickly as new data points are assigned to the clusters. ... Traditional synchronization that directly broadcasts cluster centroids becomes too expensive and limits the scalability of the parallel algorithm. ... For the problem of high-dimensional data stream clustering, techniques such as projected/subspace clustering [8] [9] [38] and density-based approaches [1] [17] [38] have been proposed and investigated ...

doi:10.1109/ccgrid.2015.19 dblp:conf/ccgrid/GaoFQ15 fatcat:tpwqlkuu6zgkfgqwmkaa3zg424

This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on the Carpenter algorithm. ... The other two datasets were synthetically generated and tuned to simulate use cases characterized by extremely high-dimensional data, i.e., with massive numbers of features. ...

doi:10.1109/icdmw.2015.18 dblp:conf/icdm/ApilettiBCGMP15 fatcat:wr5pbzfz2nb2pea4nsinhzthsi

Here, we establish a generalization of the MSF formalism that can characterize the stability of any cluster synchronization pattern, even when the oscillators and/or their interaction functions are nonidentical ... This leads to an algorithm that is error-tolerant and orders of magnitude faster than existing symmetry-based algorithms. ... The cluster synchronization subspace can be defined as an M d-dimensional subspace of the full nd-dimensional state space, in which oscillators from the same cluster have exactly the same dynamics. ...

doi:10.1137/19m127358x fatcat:b244hpb7pjcxvdaqsqjbjs6vaq

Here, we establish a generalization of the MSF formalism that can characterize the stability of any cluster synchronization pattern, even when the oscillators and/or their interactions are nonidentical ... The field of network synchronization has seen tremendous growth following the introduction of the master stability function (MSF) formalism, which enables the efficient stability analysis of synchronization ... The cluster synchronization subspace can be defined as an M d-dimensional subspace of the full nd-dimensional state space, in which oscillators from the same cluster have exactly the same dynamics. ...

arXiv:2003.05461v1 fatcat:w6acsvn7sncqtg5ajabwletnc4

It has been observed that high-dimensional data are usually distributed in a union of low-dimensional subspaces. ... In an era of ubiquitous large-scale evolving data streams, data stream clustering (DSC) has received lots of attention because the scale of the data streams far exceeds the ability of expert human analysts ... Particularly, the ORSC achieves state-of-the-art performance in accuracy and scalability based on the synchronization clustering theory [29]-[33]. ...

doi:10.1109/tcyb.2020.3023973 pmid:33232249 fatcat:jqjwgmevkffpzcn2gqtvla2ome

This paper proposes a scalable video summarization approach which provides multiple views and levels of details. Our method relies on the usage of cross media space and consensus clustering method. ... A video document is modelled as a data cube where the level of details is refined over nonconsensual features of the space. ... However, the strong requirements of those applications in terms of scale, time response and high dimensional information make the scalability a very challenging problem. ...

doi:10.1109/cbmi.2014.6849824 dblp:conf/cbmi/Perez-DanielNBMS14 fatcat:nvn7e226szdjlma2if7ejfgceq

We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. ... The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set. ... Acknowledgments This material is based upon work supported by the National Science Foundation under Grant No. IIS-0133415. REFERENCES ...

doi:10.1145/775047.775121 dblp:conf/kdd/Frigui02 fatcat:ytt65l4juvdtnjvnpl4gn3kk4q

We propose a new clustering algorithm, called SyMP, which is based on synchronization of pulse-coupled oscillators. ... The scalable version of SyMP uses an efficient incremental approach that requires a simple pass through the data set. ... Acknowledgments This material is based upon work supported by the National Science Foundation under Grant No. IIS-0133415. REFERENCES ...

doi:10.1145/775107.775121 fatcat:75tnmqkwqrfenov6tqv7q5lf7q

A consequence of this is that over the next decade it will be necessary to develop and apply new numerical algorithms that are far more scalable than has historically been required. ... This chapter explores these challenges in the context of the solution of large systems of algebraic equations arising from the discretization of partial differential equations. ... Hence any practical, highly scalable, iterative algorithm must be based upon more advanced techniques such as Krylov subspace methods [46] . ...

doi:10.4203/ctr.3.1 fatcat:6ho4zlpt6vdqbkubltrfnm57cq

Grand tour is a method for viewing multidimensional data via linear projections onto a sequence of two dimensional subspaces and then moving continuously from one projection to the next. ... This paper extends the method to 3D grand tour where projections are made onto three dimensional subspaces. 3D cluster-guided tour is proposed where sequences of projections are determined by cluster centroids ... There is a loss of information in projecting high-dimensional data to low-dimensions. ...

doi:10.1007/3-540-48412-4_15 fatcat:mnfxhqh75rcxpbpdgfljsoil3u

The usefulness of our method is demonstrated using various multi- and high-dimensional benchmark datasets. ... Here, we understand the local subspace as the multidimensional local neighborhood of data points. ... manifold embedded in high-dimensional data. ...

arXiv:2009.03259v1 fatcat:6idkiz64hbcyfcfh4dwrukbqfe

Cluster Ensemble Approach for High Dimensional Data

Exploiting multi–core and many–core parallelism for subspace clustering

2020 Index IEEE Open Journal of Signal Processing Vol. 1

M-Grid: a distributed framework for multidimensional indexing and querying of location based data

Parallel Clustering of High-Dimensional Social Media Data Streams

PaMPa-HD: A Parallel MapReduce-Based Frequent Pattern Miner for High-Dimensional Data

Symmetry-Independent Stability Analysis of Synchronization Patterns

Symmetry-independent stability analysis of synchronization patterns [article]

Dynamic Sparse Subspace Clustering for Evolving High-Dimensional Data Streams

Scalable video summarization of cultural video documents in cross-media space based on data cube approach

SyMP

SyMP

Asynchronous Parallel Solvers for Linear Systems arising in Computational Engineering

3D Grand Tour for Multidimensional Data and Clusters [chapter]

Implicit Multidimensional Projection of Local Subspaces [article]
