In this paper, we introduce two undersampling strategies in which a clustering technique is used during the data preprocessing step. Specifically, the number of clusters in the majority class is set to be equal
Furthermore, the relatively large spread of values in these three measures shows that, except for SilhoutteOD, all methods have a low resilience concerning the type of data under analysis even when using their best possible parameter configurations. Overall, both the clustering-based and the non-...
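sampling algorithm (random under-sampling algorithm, RUS algorithm) was applied as a preprocessing step to reduce the degree of imbalance in the dataset. Because this experiment is an imbalanced data classification experiment, the classification accuracy metric used for traditional algorithms cannot fully reflect the performance of the classifier. To evaluate classifier performance effectively on imbalanced data classification problems, this paper uses ...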
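A minimal sketch of random under-sampling may help make the preprocessing step concrete. It is not the excerpt's implementation: the function name `random_under_sample`, the fixed seed, and the choice to shrink every class down to the size of the smallest class (a fully balanced 1:1 result) are assumptions made for illustration.

```python
import random
from collections import defaultdict


def random_under_sample(samples, labels, seed=0):
    """Randomly discard samples from larger classes.

    Assumption: every class is reduced to the size of the smallest
    class; the excerpt only says the imbalance is reduced.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    # Target size per class: the size of the smallest class.
    n_keep = min(len(idx) for idx in by_class.values())

    # Draw n_keep indices without replacement from each class.
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n_keep))
    kept.sort()  # preserve the original ordering of the kept samples

    return [samples[i] for i in kept], [labels[i] for i in kept]
```

Because classification accuracy can be misleading on imbalanced data, a resampled training set of this kind is usually paired with metrics that account for the minority class.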
This paper is based on research sponsored by the Air Force Research Laboratory and the Office of the Secretary of Defense (OSD) under agreement number FA8750-15-2-0116. Also, this work is partially funded through the National Science Foundation (NSF) under grant number 2000320. Funding This ...
In this step, the optimization targets the data base of the FCM model and is based on the sampling data as in Eq. (3). The MATLAB Fuzzy Logic Toolbox (genfis) is used to generate the FIS of the FCM model. To improve the precision while limiting the loss of interpretability,...
In addition, the dataset was maintained at a constant outcome ratio through stratified sampling for analysis, but this may not necessarily be the case with a new dataset. Nonetheless, a neural network-based cluster model was first applied to stroke patients from a real-world dataset. Second, ...
Under the premises above, let us introduce the MRTA clustering as $C = \{C_1, C_2, \ldots, C_{N_c}\}$ (1), where $C_h$, $h = 1, \ldots, N_c$, are disjoint clusters of robots and tasks verifying $\bigcup_h C_h = N$, $C_{h_a} \cap C_{h_b} = \emptyset$, $\forall C_{h_a}, C_{h_b} \in C$, $C_{h_a} \neq C_{h_b}$ (2). Notice that the number of different clustering alternatives for C is upp...
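The two conditions on the clustering (the clusters together cover N, and distinct clusters are pairwise disjoint) can be checked mechanically. The sketch below is illustrative only: the function name `is_valid_clustering` and the representation of robots and tasks as hashable identifiers are assumptions, not part of the excerpt.

```python
from itertools import combinations


def is_valid_clustering(clusters, elements):
    """Check the two conditions on C from the text.

    clusters: iterable of collections C_1, ..., C_Nc
    elements: the full set N of robots and tasks
    """
    clusters = [set(c) for c in clusters]

    # Condition 1: the union of all clusters equals N.
    covers = set().union(*clusters) == set(elements)

    # Condition 2: any two distinct clusters have an empty intersection.
    disjoint = all(a.isdisjoint(b) for a, b in combinations(clusters, 2))

    return covers and disjoint
```

For example, with N = {r1, r2, t1, t2}, the clustering {{r1, t1}, {r2, t2}} passes both checks, whereas {{r1, t1}, {r1, r2, t2}} fails because r1 appears in two clusters.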
flowEMMi slightly underestimated the abundances of cell clusters, which might be caused by the fact that manually set clusters do not follow statistical conditions, e.g. confidence intervals. Cell clusters only containing a small number of cells typically (at least for our data) do ...
Under the simulation condition, we show to what extent our proposed DPC-GS-MND approach outperforms a basic density peaks clustering (DPC) algorithm and, finally, we compare the approach to three other challengers from the literature: DPCG, MDPCA and DPC-DLP. The rest of the paper is organized ...
We first investigated the performance of the eight methods under different dropout rates, which are defined as the proportion of expressed genes being knocked out of read counts. To do this we varied the midpoint parameter of the dropout logistic function to generate datasets with different dropout...
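To illustrate how a midpoint parameter can control the dropout rate, the sketch below models the dropout probability as a logistic function of a gene's log mean expression. The exact functional form, the negative `shape` value (so that lowly expressed genes drop out more often), and the names `dropout_probability` and `apply_dropout` are assumptions for illustration; the excerpt does not specify them.

```python
import math
import random


def dropout_probability(log_mean, midpoint, shape=-1.0):
    """Logistic dropout probability (assumed form).

    At log_mean == midpoint the probability is 0.5. With a negative
    shape, raising the midpoint raises the probability for a given
    expression level.
    """
    return 1.0 / (1.0 + math.exp(-shape * (log_mean - midpoint)))


def apply_dropout(counts, log_means, midpoint, shape=-1.0, seed=0):
    """Set each count to zero with its logistic dropout probability."""
    rng = random.Random(seed)
    return [
        0 if rng.random() < dropout_probability(m, midpoint, shape) else c
        for c, m in zip(counts, log_means)
    ]
```

Under these assumptions, sweeping `midpoint` over a range of values yields simulated datasets whose share of zeroed-out counts increases with the midpoint.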