Multi-output regression


Problem description

I have been looking into multi-output regression for the last few weeks. I am working with the scikit-learn package. My machine learning problem has an input of 3 features and needs to predict two output variables. Some ML models in the sklearn package support multi-output regression natively. If a model does not support this, the sklearn multioutput regression algorithm can be used to convert it. The multioutput class fits one regressor per target.
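
For context, below is a minimal sketch of the wrapper approach described above; the base estimator (GradientBoostingRegressor) and the random data are only illustrative assumptions:

    # Sketch: wrap a single-output estimator so it fits one regressor per target.
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.multioutput import MultiOutputRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(100, 3)   # 3 input features, as in the question
    y = rng.rand(100, 2)   # 2 output variables

    # GradientBoostingRegressor handles only one target at a time, so wrap it:
    model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
    model.fit(X, y)
    print(model.predict(X[:5]))   # shape (5, 2): one column per target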

  1. Does the multioutput regression class, or an algorithm that supports multi-output regression natively, take the underlying relationships of the input variables into account?
  2. Should I use a neural network instead of a multi-output regression algorithm?

Answer

1) For your first question, I have divided it into two parts.

  • The first part is answered in the documentation you linked, and also here:

    As MultiOutputRegressor fits one regressor per target, it cannot take advantage of correlations between targets.

  • The second part of the first question asks about other algorithms which support this. For that you can look at the "inherently multiclass" section in the user guide. Inherently multi-class means that they don't use a One-vs-Rest or One-vs-One strategy to handle multiple classes (OvO and OvR fit multiple models for the multiple classes and so may not use the relationship between targets); instead, they can structure the multi-class setting into a single model. It lists the following:

    sklearn.naive_bayes.BernoulliNB
    sklearn.tree.DecisionTreeClassifier
    sklearn.tree.ExtraTreeClassifier
    sklearn.ensemble.ExtraTreesClassifier
    sklearn.naive_bayes.GaussianNB
    sklearn.neighbors.KNeighborsClassifier
    sklearn.semi_supervised.LabelPropagation
    sklearn.semi_supervised.LabelSpreading
    sklearn.discriminant_analysis.LinearDiscriminantAnalysis
    sklearn.svm.LinearSVC (setting multi_class="crammer_singer")
    sklearn.linear_model.LogisticRegression (setting multi_class="multinomial")
    ...
    ...
    ...
    

    Try replacing the 'Classifier' at the end with 'Regressor' and look at the documentation of the fit() method there. For example, let's take DecisionTreeRegressor.fit():

    y : array-like, shape = [n_samples] or [n_samples, n_outputs]
    
        The target values (real numbers). 
        Use dtype=np.float64 and order='C' for maximum efficiency.
    

    You can see that it supports a 2-d array for the targets (y), so it may be able to use the correlations and underlying relationships of the targets.
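
    As an illustration, here is a short sketch (with made-up data) that fits DecisionTreeRegressor directly on a 2-d target array, without any wrapper:

    # Sketch: a natively multi-output regressor fitted on a 2-d target array.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(100, 3)
    y = rng.rand(100, 2)   # both targets passed in one 2-d array

    tree = DecisionTreeRegressor(random_state=0)
    tree.fit(X, y)         # no MultiOutputRegressor wrapper needed
    print(tree.predict(X[:5]).shape)   # (5, 2)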

2) Now, for your second question about whether to use a neural network, it depends on personal preference, the type of problem, the amount and type of data you have, and the number of training iterations you want to do. Maybe you can try multiple algorithms and choose whichever gives the best output for your data and problem.
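
A rough sketch of that "try multiple algorithms" suggestion, comparing a few candidates with cross-validation and R^2 scoring; the candidate models and the random data are only illustrative assumptions:

    # Sketch: compare a few multi-output-capable models with 5-fold cross-validation.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.multioutput import MultiOutputRegressor
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.RandomState(0)
    X = rng.rand(200, 3)
    y = rng.rand(200, 2)

    candidates = {
        "decision_tree": DecisionTreeRegressor(random_state=0),
        "knn": KNeighborsRegressor(),
        "svr_wrapped": MultiOutputRegressor(SVR()),  # SVR is single-output, so wrap it
    }

    for name, model in candidates.items():
        scores = cross_val_score(model, X, y, cv=5, scoring="r2")
        print(name, "mean R^2:", round(scores.mean(), 3))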
