In gbm with a multinomial distribution, how do I use predict to get categorical output?

Question

My response is a categorical variable (a set of letters), so I used distribution='multinomial' when fitting the model. Now I want to predict the response and get the output as those letters instead of a matrix of probabilities.

However, predict(model, newdata, type='response') gives probabilities, the same as the result of type='link'.

Is there a way to obtain categorical output?

BST = gbm(V1~.,data=training,distribution='multinomial',n.trees=2000,interaction.depth=4,cv.folds=5,shrinkage=0.005)

predBST = predict(BST,newdata=test,type='response')

Recommended answer

The predict.gbm documentation states:

If type="response" then gbm converts back to the same scale as the outcome. Currently the only effect this will have is returning probabilities for bernoulli and expected counts for poisson. For the other distributions "response" and "link" return the same.

As Dominic suggests, pick the class with the highest probability in each row of the resulting predBST array by calling apply(predBST, 1, which.max). This returns column indices; index colnames(predBST) with them to get the class labels. Here is a code sample with the iris dataset:

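library(gbm)

data(iris)

# Drops column 1. Note: in the built-in iris data this is Sepal.Length, not an index column.
df <- iris[, -1]

df <- df[sample(nrow(df)), ]  # shuffle the rows

df.train <- df[1:100, ]
df.test <- df[101:150, ]

BST <- gbm(Species ~ ., data = df.train,
           distribution = 'multinomial',
           n.trees = 200,
           interaction.depth = 4,
           # cv.folds = 5,
           shrinkage = 0.005)

# Class probabilities, one column per species
predBST <- predict(BST, n.trees = 200, newdata = df.test, type = 'response')

# Column index of the most probable class for each test row
p.predBST <- apply(predBST, 1, which.max)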

> predBST[1:6,,]
         setosa versicolor  virginica
[1,] 0.89010862 0.05501921 0.05487217
[2,] 0.09370400 0.45616148 0.45013452
[3,] 0.05476228 0.05968445 0.88555327
[4,] 0.05452803 0.06006513 0.88540684
[5,] 0.05393377 0.06735331 0.87871292
[6,] 0.05416855 0.06548646 0.88034499

> head(p.predBST)
[1] 1 2 3 3 3 3
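
The integers from which.max are column positions, not the category names the question asks for. A minimal sketch of the last step, assuming the prediction output carries the class names as its column names (as the printed output suggests). The helper name prob_to_label and the small probs matrix are made up for illustration; the numbers reuse the first three rows printed above:

```r
# Hypothetical helper: turn class probabilities into class labels.
# `prob` is a matrix (rows = observations, columns = classes) or an
# n x k x 1 array such as the predBST object above.
prob_to_label <- function(prob) {
  classes <- colnames(prob)          # class names, in column order
  idx <- apply(prob, 1, which.max)   # most probable class per row
  factor(classes[idx], levels = classes)
}

# Illustrative input: the first three rows of the printed output
probs <- matrix(c(0.89010862, 0.05501921, 0.05487217,
                  0.09370400, 0.45616148, 0.45013452,
                  0.05476228, 0.05968445, 0.88555327),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("setosa", "versicolor", "virginica")))

labels <- prob_to_label(probs)
print(labels)
```

With the fitted model, prob_to_label(predBST) should give the predicted species for each test row, and table(predicted = prob_to_label(predBST), actual = df.test$Species) would give a quick confusion table. Returning a factor with all classes as levels keeps classes that were never predicted visible in such a table.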
