How to find important factors in a support vector machine


Question

The original data set is large, so I cannot post it here. I am using the e1071 package in R to do a support vector machine analysis. The original data have 100 factors, and the predicted result is 1 or 0. As an example, I generate a random data frame with 10 factors:

# 10 observations of 10 random factors, each drawn uniformly from [5, 10]
value <- matrix(runif(100, 5, 10), nrow = 10)
# random binary outcome
y <- sample(0:1, 10, replace = TRUE)
data <- as.data.frame(cbind(y, value))

I have done the prediction part, but I wonder how to determine which of the 10 factors are important (more strongly related) to the result.

For example, the answer might be that factors 2, 4, 5, and 10 contribute to the final result.

Can you help me with this? Thank you so much.

Recommended answer

A complete answer to this question is not simple. Here is an example for getting started on the subject:

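library(rpart)
library(e1071)

# Species is a factor, so rpart fits a classification tree here
cat('Regression tree case:\n')
fit1 <- rpart(Species ~ ., data = iris)
print(fit1$variable.importance)

cat('SVM model case:\n')
fit2 <- svm(Species ~ ., data = iris)  # default kernel is radial
# One weight vector per column of coefs, in the (scaled) feature space
w <- t(fit2$coefs) %*% fit2$SV
# Per-feature importance: Euclidean norm across the weight vectors
w <- apply(w, 2, function(v) sqrt(sum(v^2)))
w <- sort(w, decreasing = TRUE)
print(w)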

The output of the script above is:

Regression tree case:
 Petal.Width Petal.Length Sepal.Length  Sepal.Width 
    88.96940     81.34496     54.09606     36.01309 

SVM model case:
Petal.Length  Petal.Width Sepal.Length  Sepal.Width 
   12.160093    11.737364     6.623965     4.722632 

Both models rank the two petal measurements above the two sepal measurements, so the variable importance results are similar.
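To connect this back to the binary 0/1 setting in the question, here is a minimal sketch of the same weight-based idea with a linear kernel, where the product of `coefs` and `SV` is the actual weight vector of the decision function (in the scaled feature space). The simulated data, the column names `V1` to `V10`, and the choice of `V2` and `V5` as the only informative factors are assumptions made for illustration; they are not from the original post.

```r
library(e1071)

set.seed(42)
n <- 200

# Hypothetical data: 10 factors on [5, 10]; only V2 and V5 drive the outcome
x <- matrix(runif(n * 10, 5, 10), nrow = n)
colnames(x) <- paste0("V", 1:10)
y <- factor(as.integer(x[, 2] - x[, 5] + rnorm(n, sd = 0.5) > 0))
dat <- data.frame(y = y, x)

# With a linear kernel, w = sum_i coef_i * SV_i is the weight vector itself
fit <- svm(y ~ ., data = dat, kernel = "linear")
w <- drop(t(fit$coefs) %*% fit$SV)

# Rank the factors by absolute weight
importance <- sort(abs(w), decreasing = TRUE)
print(importance)
```

Because `svm` scales the inputs by default, the weights are comparable across factors with different units. With a nonlinear kernel the same product is only a rough heuristic.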

This is one of many methods of interpreting SVM results.
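As one example of an alternative, here is a hedged sketch of permutation importance. This technique is not part of the answer above: each predictor is shuffled in turn, and the importance is the resulting drop in accuracy. It works with any kernel because it only needs `predict`. For brevity the accuracy is measured on the training data, which is a simplification; a held-out set would be preferable in practice. The name `perm_importance` and the choice of 20 repeats are illustrative.

```r
library(e1071)

set.seed(1)
fit <- svm(Species ~ ., data = iris)

# Accuracy of the fitted model on the unshuffled data
baseline <- mean(predict(fit, iris) == iris$Species)

# For each predictor: mean accuracy drop over 20 random shuffles of that column
perm_importance <- sapply(setdiff(names(iris), "Species"), function(v) {
  drops <- replicate(20, {
    shuffled <- iris
    shuffled[[v]] <- sample(shuffled[[v]])
    baseline - mean(predict(fit, shuffled) == iris$Species)
  })
  mean(drops)
})

print(sort(perm_importance, decreasing = TRUE))
```

A larger drop means the model relies more on that predictor. Values near zero suggest the predictor contributes little to this particular fit.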

See the following paper for more information: "An Introduction to Variable and Feature Selection", http://jmlr.csail.mit.edu/papers/v3/guyon03a.html
