Clustering is one of the fundamental operations in data mining. It is widely used to solve business problems such as customer segmentation and fraud detection. In real applications of clustering, we ar
In this work we use the well-known DBSCAN algorithm (Ester et al. 1996) for cluster detection and the recently developed manifold learning algorithm UMAP (McInnes et al. 2018) to infer the topological structure of a dataset. To be specific, “inferring the topological structure” as we...
The goal of clustering in data mining is to divide objects into partitions (clusters) based on given criteria. Visualization methods and techniques may give users an intuitively appealing interpretation of cluster structures. Having visually well-separated groups of the studied data is beneficial for...
challenge to their detection and characterization in cancer research [9,10]. Therefore, in addition to commonly used tools like Seurat [11] that comprehensively identify major cell types, developing specialized methods to accurately and effectively identify and characterize these rare cell types has become a majo...
KMC has also been used to detect and remove undesirable speckles in tumor-affected areas [32,31], because the appearance of speckles on the restored image reduces the perceived quality of visualization. Histogram threshold and watershed segmentation algorithms are used in conjunction with KMC ...
Clustering can also be used for anomaly detection, to find outliers, that is, data points that are not part of any cluster. Clustering is used to identify groups of similar objects in datasets with two or more variable quantities. In practice, this data may be collected from marketing, biomedical, ...
clusterDBSCAN assigns these detections to a single detection. The DBSCAN algorithm assumes that clusters are dense regions in data space separated by regions of lower density and that all dense regions have similar densities. To measure density at a point, the algorithm counts the number of data ...
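The density-based idea described above can be sketched in a few lines of plain Python. This is a minimal illustration of the textbook DBSCAN procedure, not the clusterDBSCAN implementation; the names `region_query`, `dbscan`, `eps`, and `min_pts` are chosen here for illustration. It assumes that density at a point is measured by counting the points within a radius `eps` of it, and that a point with at least `min_pts` such neighbors (itself included) is treated as lying in a dense region.

```python
from math import dist

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            # Too sparse to start a cluster; may later become a border point.
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a dense point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                # j is itself in a dense region, so the cluster grows through it.
                queue.extend(j_neighbors)
    return labels

# Two dense 2x2 squares and one isolated point (hypothetical toy data).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point receives the label -1 because its `eps`-neighborhood contains only itself. The brute-force neighbor search is quadratic in the number of points; a spatial index would be the usual choice for larger datasets.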
This is often a ‘natural’ metric for biologically related graphs, such as gene interactions or evolutionary graphs of related species, and works well for community detection. We can also split the graph into two parts by cutting a carefully selected set of edges, so as to maximize the ...
for cluster detection in atom probe microscopy [20]. In the context of super-resolution microscopy as introduced here the Voronoi sites would correspond to the experimentally determined molecular coordinates of individual fluorophores. A Voronoi cell represents an area of influence of the data point it ...
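To make the notion of a Voronoi cell concrete, the sketch below estimates the area of each cell by brute force: every point of a regular grid is assigned to its nearest site, and the grid-cell areas are summed per site. This is only an illustrative approximation under assumed names (`approx_voronoi_areas`, the bounding box arguments, and the resolution `n` are all hypothetical), not the method of the excerpt; an exact tessellation would normally come from a computational geometry library.

```python
from math import dist

def approx_voronoi_areas(sites, xmin, xmax, ymin, ymax, n=200):
    """Approximate the area of each site's Voronoi cell inside a bounding box.

    The box is sampled on an n x n grid of cell centers; each sample is
    credited to the nearest site.
    """
    dx = (xmax - xmin) / n
    dy = (ymax - ymin) / n
    areas = [0.0] * len(sites)
    for ix in range(n):
        for iy in range(n):
            p = (xmin + (ix + 0.5) * dx, ymin + (iy + 0.5) * dy)
            nearest = min(range(len(sites)), key=lambda k: dist(p, sites[k]))
            areas[nearest] += dx * dy
    return areas

# Two sites placed symmetrically in the unit square (hypothetical coordinates):
# each cell should cover about half of the square.
areas = approx_voronoi_areas([(0.25, 0.5), (0.75, 0.5)], 0.0, 1.0, 0.0, 1.0, n=100)
print(areas)
```

Sites in crowded regions end up with small cells and isolated sites with large ones, which is why cell area can serve as a local density measure.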
All the experiments are conducted on three data sets: a real Hadoop Cluster Monitoring Dataset, the Cover Type Dataset, and the TV News Channel Commercial Detection Dataset. The Hadoop Cluster Monitoring data are collected by a Ganglia system from a real Hadoop cluster. The cluster has 30 nodes and each...