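Paper: Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author). The author is Leo Breiman, the renowned inventor of random forests. Published in 2001, the article identifies a second culture that had emerged within statistics and two models representative of it, random forests and SVMs, and argues that these two models overturned conventional views on model multiplicity, model complexity...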
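This is an article about two cultures in statistics by Leo Breiman, a statistics professor at UC Berkeley and a principal contributor to Random Forest. In this very well-known article, he argues that two cultures exist in the field of statistical analysis, and I personally agree strongly with the author's view! One…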
the data: Prediction. To be able to predict what the responses are going to be to future input variables; Information. To extract some information about how nature is associating the response variables to the input variables. There are two different approaches toward these goals: The Data Modeling Culture The...
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown.
In Breiman's "Two Cultures" paper, he contrasted statistical modeling (such as logistic regression) with prediction algorithms (such as random forests) and argued that inferences about traditional statistical models are risky unless the models also demonstrate high predictive accuracy.
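The point about predictive accuracy can be illustrated with a toy sketch. Everything here is an assumption made for illustration and none of it comes from the paper: the synthetic data (y = x^2 on a grid), a straight-line fit standing in for a "data model", and a 1-nearest-neighbor predictor standing in for an "algorithmic model". The helper names `fit_line`, `fit_nearest_neighbor`, and `mse` are hypothetical.

```python
# Toy comparison of held-out accuracy: an assumed linear "data model"
# versus a 1-nearest-neighbor "algorithmic model" (illustrative stand-ins only).

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_nearest_neighbor(xs, ys):
    """Predict the response of the closest training input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared prediction error on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (an assumption): the true mechanism is quadratic, not linear.
xs = [i / 10 for i in range(-30, 31)]
ys = [x * x for x in xs]

# Alternate points go to the training set and the held-out test set.
train_x, train_y = xs[0::2], ys[0::2]
test_x, test_y = xs[1::2], ys[1::2]

linear_mse = mse(fit_line(train_x, train_y), test_x, test_y)
nn_mse = mse(fit_nearest_neighbor(train_x, train_y), test_x, test_y)

print(f"linear model test MSE:     {linear_mse:.3f}")
print(f"nearest-neighbor test MSE: {nn_mse:.3f}")
```

On this data the fitted line has a slope near zero, so reading that slope as "x has no effect on y" would be misleading. The poor held-out error of the line is the warning sign that the assumed model does not describe the mechanism.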
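Occam: the conflict between simplicity and accuracy; Bellman: dimensionality: curse or blessing? The three questions here are, respectively: - First, for the same data there may be several equally good models, so an analysis based purely on statistical significance is not enough - Second, as a rule of thumb, Occam's razor advocates simple models; but this and predictive accuracy still...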
