The proposed IKMN+ algorithm, a modification of the incrementalKMN, uses this best distance measure to obtain a partition-based clustering. Our findings revealed that IKMN+ could overcome the issue of initial ce
In this paper, three partition-based algorithms, PAM, CLARA and CLARANS, are combined with k-medoid distance-based outlier detection to improve the outlier detection and removal process. The experimental results show that the CLARANS clustering algorithm, when combined with medoid distance-based outlier ...
To solve the problem, this paper presents a spatial distance-based spatial clustering algorithm for sparse image data (SDBSCA-SID). Firstly, the imaging range of the image sensor constitutes a two-dimensional (2D) constraint space. Under the constraint, spatial clustering was carried out based ...
Furthermore, a clustering algorithm can be described as density-based if it operates based on the density of a region of the dataset or as similarity-based if it is based on the similarities among the members of a dataset [1], [2], [3], [4], [5], [6], [7]. On the other ...
The k-means algorithm, which relies on the partition-based clustering technique, is popular and widely applied across a variety of domains. K-means clustering results are extremely sensitive to the initial centroids; this is one of the major drawbacks of the k-means algorithm. Due to such sensitivity, ...
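The sensitivity to the initial centroids can be illustrated with a minimal sketch. It assumes plain Lloyd-style k-means on one-dimensional data; the function name `kmeans_1d`, the toy data and both initialisations are hypothetical and are not taken from the cited work.

```python
def kmeans_1d(points, initial_centroids, max_iter=100):
    """Plain Lloyd-style k-means on 1-D data (illustrative sketch only)."""
    centroids = [float(c) for c in initial_centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    # Sum of squared errors of every point to its nearest final centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Hypothetical toy data with three well-separated groups.
data = [0, 1, 2, 10, 11, 12, 20, 21, 22]

good_centroids, good_sse = kmeans_1d(data, [1, 11, 21])  # one seed per group
bad_centroids, bad_sse = kmeans_1d(data, [0, 1, 2])      # all seeds in one group
```

With the first initialisation the run recovers the three groups (SSE 6.0); with the second it stalls at centroids 0.0, 1.5 and 16.0 (SSE 154.5), a poorer local optimum reached from the same data.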
[2] Ester, Martin, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.” In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 226–31. KDD’96. Portland, Oregon: ...
Huang H, Cheng Y, Zhao R (2008) A semi-supervised clustering algorithm based on must-link set. In: Proceedings of the international conference on advanced data mining and applications, pp 492–499
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
BMC Bioinformatics (2020) 21:121 Page 6 of 14
Algorithm 1 Hellinger distance based stable sparse selection (sssHD)
1: (X0, y0): predictor matrix and class label;
2: n0, n1: the size of the majority, minority respectively;
3: r0, r1: the ratio of subsampling from the majority, ...
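For orientation, the Hellinger distance named in Algorithm 1 can be sketched in its standard form for two discrete probability distributions, H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2). This is only the textbook definition; how sssHD applies it to the majority and minority classes is not shown in the excerpt, and the function name `hellinger` is an assumption.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (standard form).

    p and q are sequences of non-negative probabilities over the same
    outcomes; the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of outcomes")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2)


same = hellinger([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = hellinger([1.0, 0.0], [0.0, 1.0])  # no shared support
```

Identical distributions give 0 and distributions with disjoint support give 1, so the value can be read as a bounded measure of how separable two distributions are.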
However, the fundamental challenge in using distance-based clustering techniques is the “curse of dimensionality”. Since geostatistical models contain a huge amount of data, they should be treated as high-dimensional data sets. High-dimensional data may include a large number of ...
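One common way to make the “curse of dimensionality” concrete is distance concentration: as the dimension grows, the nearest and farthest neighbours of a query point become almost equally far away. The sketch below assumes uniformly random points in the unit hypercube, not geostatistical data; `relative_contrast`, the sample size and the seed are hypothetical choices for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(d_max - d_min) / d_min for Euclidean distances from a random query
    point to n_points uniform random points in the unit hypercube."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


low_dim_contrast = relative_contrast(2)     # distances vary widely
high_dim_contrast = relative_contrast(500)  # distances nearly uniform
```

A small contrast in high dimensions means that "near" and "far" carry little information, which is why distance-based clustering tends to degrade there.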
The parallel algorithm of Fig. 6.15 finds the distance function of binary objects by propagation. Note that, as in the computation of convex hulls, this algorithm performs a final pass in which nothing happens. Again, this is inevitable if it is to be certain that the process runs to comple...
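The algorithm of Fig. 6.15 is not reproduced in this excerpt, so the following is only a sketch of the general idea under stated assumptions: a city-block (4-neighbour) distance to the background, synchronous updates that read only the previous pass (mimicking a parallel operation), and neighbours outside the image ignored. The function name `distance_by_propagation` is hypothetical.

```python
def distance_by_propagation(image):
    """Distance function of binary objects by repeated propagation.

    image: list of rows of 0 (background) / 1 (object).
    Returns (distance map, number of passes, including the final idle pass).
    """
    rows, cols = len(image), len(image[0])
    big = rows + cols  # exceeds any city-block distance inside the image
    dist = [[big if px else 0 for px in row] for row in image]
    passes = 0
    while True:
        passes += 1
        new = [row[:] for row in dist]
        changed = False
        for r in range(rows):
            for c in range(cols):
                if not image[r][c]:
                    continue  # background pixels stay at 0
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        candidate = dist[nr][nc] + 1  # previous pass only
                        if candidate < new[r][c]:
                            new[r][c] = candidate
                            changed = True
        dist = new
        if not changed:
            break  # a pass in which nothing happened confirms completion
    return dist, passes


square = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist_map, n_passes = distance_by_propagation(square)
```

For this 3×3 object the boundary pixels settle at 1 in the first pass and the centre at 2 in the second; the third pass changes nothing, which is the only way the loop can know it has finished.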