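1. What is feature selection? Feature selection, also called feature subset selection (FSS), means choosing N features out of the M available features so that a specific metric of the system is optimized. Feature selection must be distinguished from feature extraction. Feature extraction means using the existing features to compute a feature set at a higher level of abstraction; it also refers to computing some...
(1) What is feature selection? Feature selection, also called feature subset selection (FSS) or attribute selection, means selecting a subset of all the features so that the resulting model is better. (2) Why do feature selection? In practical machine learning applications the number of features is often large; some of them may be irrelevant, and features may also...
Feature selection consists of two stages: feature subset search and feature subset evaluation. Subset search strategies are forward search, backward search, and bidirectional search. Forward search adds relevant features step by step; backward search tries removing one feature at a time; bidirectional search combines the two. All of these strategies are greedy: they only consider the best choice in the current round. Feature subset evaluation can be carried out by computing the feature...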
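The greedy forward search described above can be sketched as follows. This is a minimal illustration, not any particular library's implementation: `forward_search`, `toy_score`, and the `WEIGHTS` table are hypothetical names, and the scoring function (sum of per-feature weights minus a size penalty) is an assumption chosen only to make the example self-contained. In practice the evaluation step would be a real subset criterion such as cross-validated model accuracy.

```python
from typing import Callable, List, Sequence, Tuple


def forward_search(
    features: Sequence[str],
    evaluate: Callable[[Tuple[str, ...]], float],
) -> Tuple[List[str], float]:
    """Greedy forward selection: add the single best feature each round,
    and stop as soon as no candidate improves the subset score."""
    selected: List[str] = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining:
        round_best, round_score = None, float("-inf")
        # Evaluate every one-feature extension of the current subset.
        for f in remaining:
            score = evaluate(tuple(selected + [f]))
            if score > round_score:
                round_best, round_score = f, score
        # Greedy stopping rule: only this round's best choice is considered.
        if round_score <= best_score:
            break
        selected.append(round_best)
        remaining.remove(round_best)
        best_score = round_score
    return selected, best_score


# Hypothetical toy criterion: "a" and "b" are useful, "c" and "d" are not,
# and every selected feature costs 0.5.
WEIGHTS = {"a": 3.0, "b": 2.0, "c": 0.0, "d": 0.0}


def toy_score(subset: Tuple[str, ...]) -> float:
    return sum(WEIGHTS[f] for f in subset) - 0.5 * len(subset)


selected, score = forward_search(["a", "b", "c", "d"], toy_score)
print(selected, score)
```

Because each round commits to the locally best feature, the search can miss subsets whose features are only useful in combination; that is the price of avoiding an exhaustive search over all subsets.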
J. Novovičová, P. Somol, and P. Pudil, "Oscillating Feature Subset Search Algorithm for Text Categorization," Lecture Notes in Computer Science, vol. 4225, pp. 572-587, Springer, 2006. Novovičová J., Somol P., Pudil P.: Oscillating feature subset search algorithm for text categori-...
data mining or machine learning application. Feature subset selection basically depends on selecting a criterion function for evaluation of the feature subset and a search strategy to find the best feature subset from a large number of feature subsets. Lots of techniques have been developed so far,...
This method can be integrated in many feature subset search algorithms. We have applied it with sequential search algorithms and have been able to reduce the number of quality calculations for finding accurate feature subsets by about 70%. We show these improvements by applying our approach to ...
This paper describes several known and some new methods for feature subset selection on large text data. Experimental comparison given on real-world data collected from Web users shows that characteristics of the problem domain and machine learning algor
gene subset where the intra-class distance is small and the inter-class distance is large. A higher local modularity of the gene subset corresponds to greater discriminative ability of the gene subset. With the use of a forward search strategy, a more informative gene subset as a group can be ...
This work focuses on the inconsistency measure, according to which a feature subset is inconsistent if there exist at least two instances with the same feature values but different class labels. We compare the inconsistency measure with other measures and study different search strategies such as exhaustive, ...
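The inconsistency definition quoted above can be checked directly. The sketch below is an illustration under assumptions: `is_inconsistent`, the sample `rows`, and the `labels` are hypothetical, and the function implements only the yes/no definition in the snippet, not whatever inconsistency rate or search procedure the cited work may use.

```python
from typing import Dict, Hashable, Sequence, Tuple


def is_inconsistent(
    rows: Sequence[Sequence[Hashable]],
    labels: Sequence[Hashable],
    subset: Sequence[int],
) -> bool:
    """Return True if two instances agree on every feature in `subset`
    (given as column indices) but carry different class labels."""
    seen: Dict[Tuple[Hashable, ...], Hashable] = {}
    for row, label in zip(rows, labels):
        # Project the instance onto the chosen feature subset.
        key = tuple(row[i] for i in subset)
        if key in seen and seen[key] != label:
            return True
        seen.setdefault(key, label)
    return False


# Hypothetical data: three features per instance, one class label each.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["x", "y", "z", "z"]

print(is_inconsistent(rows, labels, [0]))     # feature 0 alone
print(is_inconsistent(rows, labels, [0, 1]))  # features 0 and 1 together
```

With this data, feature 0 alone cannot separate the first two instances, while features 0 and 1 together give every instance a distinct projection, so the second subset is consistent.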
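Subset search; subset evaluation; filter selection; wrapper selection; embedded selection and L1 regularization. Subset search and evaluation: for a learning task with a given attribute set, some attributes may be critical and very useful, while others may be of little use. We call attributes "features"; attributes that are useful for the current learning task are called "relevant features", and those that are of no...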