Generalized additive models for calibration

Problem description

I work on calibration of probabilities. I'm using a probability mapping approach called generalized additive models.

The algorithm I wrote is:

    probMapping = function(x, y, datax, datay) {

        if (length(x) < length(y)) stop("train smaller than test")
        if (length(datax) < length(datay)) stop("train smaller than test")

        datax$prob = x # trainset: data and raw probabilities
        datay$prob = y # testset: data and raw probabilities

        prob_map = gam(Target ~ prob, data = datax, familiy = binomial, trace = TRUE)
        prob_map_prob = predict(prob_map, newdata = datay, type = "prob")

        # return(str(datax))
        return(prob_map_prob)
    }

The package I'm using is mgcv.

  1. x - predictions on the training dataset
  2. y - predictions on the test dataset
  3. datax - the training data
  4. datay - the test data

Problems:

  1. The output values are not between 0 and 1
  2. I get the following warning message:

    In predict.gam(prob_map, newdata = datay, type = "prob") :
    Unknown type, reset to terms.
    

Solution

The warning is telling you that predict.gam doesn't recognize the value you passed to the type parameter. Because "prob" is not one of the supported types, predict.gam resets type to "terms", as the warning message states. (When type is omitted altogether, the default is "link".)

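Note that predict.gam with type = "terms" returns the contribution of each model term on the linear predictor scale, not probabilities. Hence the output values are not between 0 and 1. To get predicted probabilities from a binomial model, use type = "response".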
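
To make the difference between the prediction types concrete, here is a minimal sketch on simulated data. The data frames train and test, the simulated scores, and the seed are assumptions made up for illustration; only gam, predict, and the type values come from mgcv itself.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for the real data (an assumption for illustration):
# raw scores in [0, 1] and a binary target that depends on them.
n <- 500
train <- data.frame(prob = runif(n))
train$Target <- rbinom(n, size = 1, prob = plogis(4 * train$prob - 2))
test <- data.frame(prob = runif(100))

fit <- gam(Target ~ prob, data = train, family = binomial)

# Per-term contributions on the linear predictor (log-odds) scale
p_terms <- predict(fit, newdata = test, type = "terms")

# Linear predictor (log-odds); this is the default type
p_link <- predict(fit, newdata = test, type = "link")

# Predictions on the response scale, i.e. probabilities for a binomial model
p_response <- predict(fit, newdata = test, type = "response")

range(p_terms)     # not restricted to [0, 1]
range(p_response)  # always within [0, 1]
```

The response-scale values are the inverse logit of the link-scale values, so plogis(p_link) reproduces p_response.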

For more information about mgcv::predict.gam, see its help page (?predict.gam in R).
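
Putting this together, one possible corrected version of the function might look like the sketch below. It rests on a few assumptions. The argument spelled familiy in the question would not be matched to gam's family argument, so the model may silently have been fitted with the default family; the sketch spells it family. The size checks are replaced with checks that each prediction vector has one entry per row, which is a guess at the intent, since length() on a data frame counts columns rather than rows. The trace argument is dropped for brevity.

```r
library(mgcv)

# Sketch of a corrected probMapping(); assumes datax contains a binary
# column named Target, as in the question's formula.
probMapping <- function(x, y, datax, datay) {
  # Assumed intent: one raw probability per observation in each data set
  stopifnot(length(x) == nrow(datax), length(y) == nrow(datay))

  datax$prob <- x  # training set: data plus raw probabilities
  datay$prob <- y  # test set: data plus raw probabilities

  # family spelled correctly so that a binomial (logit) model is fitted
  prob_map <- gam(Target ~ prob, data = datax, family = binomial)

  # type = "response" returns predictions on the probability scale
  predict(prob_map, newdata = datay, type = "response")
}
```

Called with the raw predictions and data frames described in the question, this returns one calibrated probability per row of datay.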
