The proposed IKMN+ algorithm, a modification of incrementalKMN, uses this best distance measure to obtain a partition-based clustering. Our findings revealed that IKMN+ could overcome the issue of initial ce
In this paper, three partition-based algorithms, PAM, CLARA and CLARANS, are combined with k-medoid distance-based outlier detection to improve the outlier detection and removal process. The experimental results prove that the CLARANS clustering algorithm, when combined with medoid distance-based outlier ...
The k-means algorithm, a partition-based clustering technique, is popular and widely applied across a variety of domains. K-means clustering results are extremely sensitive to the initial centroids; this is one of the major drawbacks of the k-means algorithm. Due to such sensitivity, ...
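The sensitivity to initial centroids described above can be seen on a tiny example. The following is a minimal sketch, not the method of the excerpted paper: the helper names `sq_dist` and `lloyd`, the four-point rectangle, and both starting configurations are assumptions chosen for illustration. It runs standard Lloyd-style iterations from two different starts and compares the final sum of squared errors (SSE).

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lloyd(points, centroids, max_iter=100):
    """Lloyd-style k-means iterations from explicitly given initial centroids."""
    centroids = list(centroids)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    sse = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, sse


# Corners of a wide rectangle (hypothetical data): the natural split is left/right.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid on each short side.
labels_a, _, sse_a = lloyd(points, [(0.0, 0.0), (4.0, 0.0)])
# Start B: both centroids on the same short side.
labels_b, _, sse_b = lloyd(points, [(0.0, 0.0), (0.0, 1.0)])

print(labels_a, sse_a)
print(labels_b, sse_b)
```

In this sketch, start B settles into a top/bottom split that no further iteration can escape, even though its SSE is much larger than the left/right split found from start A. This is the usual motivation for repeated restarts or more careful seeding.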
To solve the problem, this paper presents a spatial distance-based spatial clustering algorithm for sparse image data (SDBSCA-SID). Firstly, the imaging range of the image sensor constitutes a two-dimensional (2D) constraint space. Under the constraint, spatial clustering was carried out based ...
The parallel algorithm of Fig. 6.15 finds the distance function of binary objects by propagation. Note that, as in the computation of convex hulls, this algorithm performs a final pass in which nothing happens. Again, this is inevitable if it is to be certain that the process runs to comple...
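Fig. 6.15 itself is not reproduced in this excerpt, so the following is only a rough sketch of one common way to compute a distance function by propagation, not the figure's algorithm. It assumes a city-block (4-neighbour) metric, treats pixels outside the image as background, and simulates the parallel update by writing each pass into a fresh copy of the array. The function name `distance_by_propagation` and the test image are hypothetical.

```python
def distance_by_propagation(image):
    """City-block distance function of a binary image via repeated parallel passes.

    image: list of rows of 0/1 values, where 1 marks an object pixel.
    Returns (dist, passes), where passes counts every pass, including
    the last one in which nothing changes.
    """
    h, w = len(image), len(image[0])
    # Object pixels start at 1; background pixels stay at 0.
    dist = [[1 if image[r][c] else 0 for c in range(w)] for r in range(h)]

    def value(r, c):
        # Assumption: anything outside the image counts as background.
        return dist[r][c] if 0 <= r < h and 0 <= c < w else 0

    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        new = [row[:] for row in dist]  # parallel update: read old, write new
        for r in range(h):
            for c in range(w):
                if dist[r][c] == 0:
                    continue
                neighbours = (value(r - 1, c), value(r + 1, c),
                              value(r, c - 1), value(r, c + 1))
                candidate = min(neighbours) + 1
                if candidate > dist[r][c]:
                    new[r][c] = candidate
                    changed = True
        dist = new
    return dist, passes


# A 3x3 object inside a 5x5 image (hypothetical test data).
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
dist, passes = distance_by_propagation(image)
for row in dist:
    print(row)
print("passes:", passes)
```

The loop can only detect completion by running a pass that changes nothing, which is why the pass count is always one more than the number of passes that actually updated a pixel.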
[2] Ester, Martin, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu. “A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.” In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 226–31. KDD’96. Portland, Oregon: ...
Huang H, Cheng Y, Zhao R (2008) A semi-supervised clustering algorithm based on must-link set. In: Proceedings of the international conference on advanced data mining and applications, pp 492–499
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
We propose a sequence clustering algorithm and compare the partition quality and execution time of the proposed algorithm with those of a popular existing algorithm. The proposed clustering algorithm uses a grammar-based distance metric to determine partitioning for a set of biological sequences. The ...
The appropriate measure should be chosen according to the requirement of the clustering algorithm, the type of data (continuous, ordinal, nominal, binary, count or mixed), and whether the data have outliers (e.g., Manhattan distance is less sensitive to outliers compared with Euclidean distance)...
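The remark about outlier sensitivity can be checked numerically. The sketch below is illustrative only: the function names and the sample vectors are assumptions, not taken from the source. It compares how much each distance grows when a single coordinate of one point is replaced by an extreme value.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1, city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


p = (0.0, 0.0, 0.0, 0.0)
q_clean = (1.0, 1.0, 1.0, 1.0)
q_outlier = (1.0, 1.0, 1.0, 10.0)  # last coordinate is a hypothetical outlier

# Factor by which each distance grows once the outlier is introduced.
euclid_growth = euclidean(p, q_outlier) / euclidean(p, q_clean)
manhattan_growth = manhattan(p, q_outlier) / manhattan(p, q_clean)

print(f"Euclidean grows by a factor of {euclid_growth:.2f}")
print(f"Manhattan grows by a factor of {manhattan_growth:.2f}")
```

Because the Euclidean distance squares each coordinate difference, the single large difference dominates the sum; the Manhattan distance weights it only linearly, so it grows by a smaller factor here.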
14.1.4.1 K-Means Clustering
In the K-means clustering algorithm, which is a hard-clustering algorithm, we partition the dataset points into K clusters based on their pairwise distances. We typically use the Euclidean distance, defined by Eq. (14.2), that is, for two data points xi = (xi...
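As a rough illustration of hard clustering with the Euclidean distance, here is a minimal K-means sketch. It is not the source's formulation: Eq. (14.2) is truncated in this excerpt, so the code uses the standard Euclidean distance, and the function names, the random seeding from the data, the `seed` parameter, and the sample points are all assumptions.

```python
import math
import random


def euclidean(a, b):
    """Standard Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, n_iter=100, seed=0):
    """Hard-assign each point to one of k clusters; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: seed with k distinct data points
    labels = None
    for _ in range(n_iter):
        # Each point belongs to exactly one cluster: that of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the coordinate-wise mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Two well-separated groups of points (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 depends on the random seeding, so only the grouping itself is meaningful, not the label values.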