How to put more weight on certain features in machine learning?

Question

If using a library like scikit-learn, how do I assign more weight to certain features in the input to a classifier like SVM? Is this something people do, or is there another solution to my problem?

Recommended answer

First of all, you should probably not do it. The whole concept of machine learning is to use statistical analysis to assign optimal weights. You are interfering with that concept here, so you need really strong evidence that this feature is crucial to the process you are trying to model and that, for some reason, your model is currently missing it.

That being said, there is no general answer. This is purely model-specific, and only some models let you weight features. In a random forest, you could bias the distribution from which candidate features are sampled toward the ones you are interested in. In an SVM, it should be enough to multiply the given feature by a constant. Remember when you were told to normalize your features for an SVM? This is why: you can use the scale of the features to 'steer' your classifier toward given features. The ones with larger values will be preferred. This will actually work for any weight-norm-regularized model (regularized logistic regression, ridge regression, lasso, etc.).
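To make the scaling argument concrete, here is a minimal, dependency-free sketch using ridge regression, one of the norm-regularized models mentioned above. It is an illustration, not scikit-learn code: the helper `ridge_2d`, the toy data, the penalty `lam`, and the boost factor `c` are all made up for this example. Multiplying a feature by `c` means the penalty on its contribution per original unit shrinks by `c**2`, so the fitted model leans on it more.

```python
def ridge_2d(x1, x2, y, lam):
    """Closed-form ridge regression for two features, no intercept (hypothetical helper)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    b1 = sum(a * t for a, t in zip(x1, y))
    b2 = sum(b * t for b, t in zip(x2, y))
    # Solve (X^T X + lam * I) w = X^T y with Cramer's rule.
    a11, a22 = s11 + lam, s22 + lam
    det = a11 * a22 - s12 * s12
    w1 = (b1 * a22 - s12 * b2) / det
    w2 = (a11 * b2 - s12 * b1) / det
    return w1, w2


# Toy data (assumed): two correlated features that matter equally for y.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 1.0, 2.0, 2.0]
y = [a + b for a, b in zip(x1, x2)]

lam = 1.0  # regularization strength (assumed)
c = 10.0   # boost factor applied to feature 1 (assumed)

# Fit on the original features.
w1, w2 = ridge_2d(x1, x2, y, lam)

# Fit again with feature 1 multiplied by c.
w1_scaled, w2_scaled = ridge_2d([c * a for a in x1], x2, y, lam)

# Express the scaled model's weight per original unit of x1 for a fair comparison.
eff1_scaled = c * w1_scaled

print("original weights:       ", w1, w2)
print("after boosting feature 1:", eff1_scaled, w2_scaled)
```

On this toy data the effective weight on feature 1 goes up and the weight on feature 2 goes down after the boost. With scikit-learn, the same idea would amount to standardizing the feature matrix and then multiplying the chosen column by a constant before calling `fit`; how large that constant should be is a judgment call to validate on held-out data.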
