Predict probability from Cox PH model


Question

I am trying to use a Cox model to predict the probability of failure after time 3 (the time variable is named stop).

library(survival)  # provides coxph(), Surv(), and the bladder data

bladder1 <- bladder[bladder$enum < 5, ]
coxmodel <- coxph(Surv(stop, event) ~ rx + size + number + cluster(id),
                  data = bladder1)
# Ranges of the four prediction types
range(predict(coxmodel, bladder1, type = "lp"))
range(predict(coxmodel, bladder1, type = "risk"))
range(predict(coxmodel, bladder1, type = "terms"))
range(predict(coxmodel, bladder1, type = "expected"))

However, none of the outputs of the predict function fall in the 0-1 range. Is there a function that returns probabilities, or how can I use the lp prediction and the baseline hazard function to calculate them?

Recommended answer

Please read the help page for predict.coxph. None of those are supposed to be probabilities. The linear predictor for a specific set of covariates is the log hazard ratio relative to a hypothetical (and very possibly non-existent) case whose predictors are all at their mean values. The 'expected' type comes closest to a probability, since it is a predicted number of events, but it would require specifying the time and then dividing by the number at risk at the beginning of observation.
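For reference, the standard proportional-hazards relationship between a linear predictor, a baseline cumulative hazard, and an event probability can be written as below. This is a sketch in notation that does not appear in the original answer: it assumes H_0(t) is the cumulative baseline hazard for the same reference case that the linear predictor is centered on (the covariate means, \bar{x}).

```latex
\eta(x) = \beta^{\top}(x - \bar{x}), \qquad
S(t \mid x) = \exp\!\bigl(-H_0(t)\, e^{\eta(x)}\bigr), \qquad
P(T \le t \mid x) = 1 - S(t \mid x)
```

If an uncentered baseline is used instead (all covariates equal to zero), the same formula holds with \eta(x) = \beta^{\top} x.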

In the example offered on that help page for predict, you can see that the sum of predicted events is close to the actual number:

> sum(predict(fit,type="expected"), na.rm=TRUE)
[1] 163

> sum(lung$status==2)
[1] 165
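The same help page states that the survival probability for a subject equals exp(-expected), which gives one way to turn the 'expected' prediction into a probability. The following is a minimal sketch on the question's bladder model, not part of the original answer. The names newdat, expected3, surv3 and event3 are made up for illustration, and the follow-up time of 3 comes from the question.

```r
library(survival)

bladder1 <- bladder[bladder$enum < 5, ]
coxmodel <- coxph(Surv(stop, event) ~ rx + size + number + cluster(id),
                  data = bladder1)

# Illustrative new data: the first record of the first three patients,
# with the follow-up time (stop) set to 3. For type = "expected" the
# response columns (stop, event) must be present in newdata.
newdat <- transform(bladder1[bladder1$enum == 1, ][1:3, ], stop = 3)

# Expected number of events by time 3: a cumulative hazard, not a probability
expected3 <- predict(coxmodel, newdata = newdat, type = "expected")

# Convert to probabilities
surv3  <- exp(-expected3)  # probability of surviving past time 3
event3 <- 1 - surv3        # probability of an event by time 3
event3
```

Each value of event3 lies between 0 and 1 and depends on that row's covariates as well as on the chosen time.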

I suspect you may want to work with the survfit function instead, since the probability of an event is 1 minus the probability of survival.

?survfit.coxph

The code for a similar question appears here: Adding column of predicted Hazard Ratio to dataframe after Cox Regression in R

Since you suggested using the bladder1 dataset, this would be the code for a specified time of 5:

 summary(survfit(coxmodel), time=5)
#------------------
Call: survfit(formula = coxmodel)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    5    302      26    0.928  0.0141        0.901        0.956

That returns a list, with the survival prediction in the element named $surv:

> str(summary(survfit(coxmodel), time=5))
List of 14
 $ n       : int 340
 $ time    : num 5
 $ n.risk  : num 302
 $ n.event : num 26
 $ conf.int: num 0.95
 $ type    : chr "right"
 $ table   : Named num [1:7] 340 340 340 112 NA 51 NA
  ..- attr(*, "names")= chr [1:7] "records" "n.max" "n.start" "events" ...
 $ n.censor: num 19
 $ surv    : num 0.928
 $ std.err : num 0.0141
 $ lower   : num 0.901
 $ upper   : num 0.956
 $ cumhaz  : num 0.0744
 $ call    : language survfit(formula = coxmodel)
 - attr(*, "class")= chr "summary.survfit"
> summary(survfit(coxmodel), time=5)$surv
[1] 0.9282944
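Note that survfit(coxmodel) without a newdata argument describes a single reference case at the mean covariate values. To get event probabilities for specific covariate values, one option is to pass newdata to survfit. The sketch below, which is not part of the original answer, does that and also reproduces the numbers by hand from the baseline cumulative hazard and the linear predictor, as the question asked. The names newdat, p_survfit, p_manual, H0_5 and lp_raw are made up for illustration, and the choice of rows is arbitrary.

```r
library(survival)

bladder1 <- bladder[bladder$enum < 5, ]
coxmodel <- coxph(Surv(stop, event) ~ rx + size + number + cluster(id),
                  data = bladder1)

# Illustrative covariate profiles: first record of the first three patients
newdat <- bladder1[bladder1$enum == 1, ][1:3, ]

# Route 1: survfit() with newdata gives one survival curve per row
sf <- survfit(coxmodel, newdata = newdat)
p_survfit <- 1 - as.numeric(summary(sf, times = 5)$surv)

# Route 2: by hand, from the baseline cumulative hazard and the linear predictor
bh <- basehaz(coxmodel, centered = FALSE)  # baseline with all covariates = 0
H0_5 <- max(bh$hazard[bh$time <= 5])       # step function evaluated at time 5
X <- as.matrix(newdat[, names(coef(coxmodel))])
lp_raw <- as.numeric(X %*% coef(coxmodel)) # uncentered linear predictor
p_manual <- 1 - exp(-H0_5 * exp(lp_raw))

# Probability of an event by time 5 for each profile, computed both ways
cbind(p_survfit, p_manual)
```

The two columns should agree up to floating-point error. The manual route uses the uncentered baseline (centered = FALSE) together with an uncentered linear predictor, so the reference case is the same on both sides of the calculation.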
