Write a summary() to as.data.frame for use in ggplot / R

Problem description

Please find a data sample t below.

I am conducting a competing risks analysis using etmCIF from the etm package. The default output is nice, but I need better graphics.

There used to be a ggtrans.etm function to get the data into ggplot. However, this function has apparently been removed.

I would therefore like to turn my summary() into a data.frame, but I receive an error:

library(etm)
cum_in <- etmCIF(Surv(os, event %in% c(1,2)) ~ 1, t, etype = event, failcode = c(1,2))
summary(cum_in)

Which gives

CIF 1 
          P        time         var      lower     upper n.risk n.event
 0.00000000   0.3297396 0.000000000 0.00000000 0.0000000    100       0
 0.00000000  57.5268750 0.000000000 0.00000000 0.0000000     90       0
 0.00000000 178.0340104 0.000000000 0.00000000 0.0000000     54       0
 0.06387317 271.0966667 0.001897498 0.01643949 0.2311213     22       0
 0.21669472 369.4858854 0.007605761 0.09511485 0.4494356     11       1
 0.21669472 925.1224479 0.007605761 0.09511485 0.4494356      2       0

CIF 2 
          P        time          var       lower     upper n.risk n.event
 0.01000000   0.3297396 0.0000990000 0.001414712 0.0688628    100       1
 0.07065711  57.5268750 0.0006633366 0.034315233 0.1425376     90       1
 0.14846026 178.0340104 0.0015118082 0.087973840 0.2445705     54       1
 0.23751402 271.0966667 0.0031735841 0.146981679 0.3703251     22       1
 0.23751402 369.4858854 0.0031735841 0.146981679 0.3703251     11       0
 0.56839997 925.1224479 0.0281468521 0.287757542 0.8751468      2       1

I need P, time, lower and upper in a data frame for ggplot2, so I tried

library(ggplot2)
ggplot(as.data.frame(cum_in), aes(x=time, y=P))  +
  geom_ribbon(data=cum_in, aes(ymin=lower, ymax=upper))

Which gives

Error in as.data.frame.default(cum_in) : cannot coerce class ‘"etmCIF"’ to a data.frame

Any idea how to transform summary() into something useful for ggplot? I would prefer not to downgrade the package.

UPDATED QUESTION

So I tried the function by @PoGibas, which worked nicely at first. However, there seems to be a problem with the function.

I have updated the data sample t below.

I have three cumulative incidence curves, stratified by t$ki67in, which defines three different groups.

The cumulative incidence curves are estimated as follows:

library(etm)
cum_in <- etmCIF(Surv(event.tid, event!=0) ~ ki67in, t, etype = event, failcode = 2)

Here plot(cum_in) correctly plots the three curves.

But when I try (based on the function etm_to_df)

res <- etm_to_df(cum_in)
ggplot(res, aes(time, P)) + 
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = CIF), alpha = 0.2) +
  geom_line(aes(color = CIF))

I get this nonsense in ggplot (which does not seem to have three groups):

> head(res)
   CIF           P     time          var       lower      upper n.risk n.event
1: 0 1 0.009259259  0.25000 8.494005e-05 0.001309500 0.06390547    108       1
2: 0 1 0.018605870  1.75000 1.698800e-04 0.004685795 0.07234945    106       1
3: 0 1 0.028419811 11.83333 2.618497e-04 0.009249879 0.08556618    100       1
4: 0 1 0.028419811 12.00000 2.618497e-04 0.009249879 0.08556618     99       0
5: 0 1 0.028419811 15.00000 2.618497e-04 0.009249879 0.08556618     97       0
6: 0 1 0.038334927 18.00000 3.538387e-04 0.014552186 0.09898410     96       1
> tail(res)
   CIF          P     time         var      lower     upper n.risk n.event
1: 0 1 0.12156863 56.00000 0.006511402 0.03179904 0.4054164      9       0
2: 0 1 0.38184459 96.66667 0.049327707 0.10529823 0.8750079      3       1
3: 0 2 0.00000000  1.50000 0.000000000 0.00000000 0.0000000     17       0
4: 0 2 0.00000000  3.00000 0.000000000 0.00000000 0.0000000     15       0
5: 0 2 0.09760349 56.00000 0.008548335 0.01442923 0.5160136      9       1
6: 0 2 0.09760349 96.66667 0.008548335 0.01442923 0.5160136      3       0

My data sample

    t <- structure(list(ki67in = structure(c(0, 2, 0, 0, 1, 0, 2, 2, 1, 
0, 1, 2, 0, 2, 0, 1, 1, 1, 0, 2, 2, 0, 2, 1, 0, 0, 0, 1, 0, 1, 
2, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 
0, 0, 0, 1, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 1, 
0, 0, 1, 0, 0, 1, 2, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 2, 1, 2, 0, 2, 0, 0, 
1, 0, 0, 0, 0, 0, 0, 1, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 
0, 2, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 
0, 0, 0), class = "AsIs"), event = structure(c(1, 1, 1, 1, 1, 
0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 0, 2, 
0, 0, 2, 0, 0, 1, 0, 2, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 2, 2, 0, 
0, 0, 2, 0, 0, 0, 2, 2, 0, 2, 1, 0, 2, 0, 2, 0, 2, 0, 0, 0, 1, 
0, 1, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 2, 0, 0, 0, 0), class = "AsIs"), event.tid = c(1.75, 1.5, 
11.83333333, 0.25, 1.75, 1, 2, 96.66666667, 2, 106.5833333, 3, 
3, 3, 4, 4, 4, 141.9166667, 5, 6, 7, 8, 8, 8, 9, 11, 12, 13, 
13, 15, 15, 15, 40.91666667, 17, 17, 18, 173, 28, 29, 30, 33, 
34, 35, 178.5833333, 37, 38, 39, 40, 41, 45, 49, 49, 50, 52, 
53, 54, 56, 56, 194.4166667, 56, 57, 58, 58, 60, 60, 60, 60, 
61, 275.75, 63, 189.75, 66, 67, 67, 72, 72, 74, 78, 80, 80, 80, 
81, 82, 83, 83, 84, 84, 85, 85, 86, 86, 88, 88, 88, 88, 89, 89, 
89, 90, 90, 91, 91, 92, 92, 251.8333333, 92, 93, 93, 93, 93, 
93, 93, 94, 97, 98, 98, 99, 99, 99, 100, 101, 101, 101, 103, 
103, 103, 103, 104, 104, 106, 106, 109, 110, 111, 111, 112, 114, 
114, 115, 116, 117, 299.8333333, 118, 118, 119, 120, 120, 120, 
120, 120, 120, 121, 121, 123, 124, 124, 125, 125, 125, 125)), class = "data.frame", row.names = c(1L, 
2L, 3L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 18L, 19L, 20L, 
21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 
34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 44L, 45L, 46L, 47L, 48L, 
49L, 50L, 51L, 52L, 53L, 54L, 55L, 57L, 59L, 60L, 61L, 62L, 63L, 
64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 73L, 74L, 75L, 76L, 
77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 87L, 89L, 90L, 91L, 
92L, 93L, 94L, 96L, 97L, 98L, 99L, 100L, 101L, 102L, 103L, 104L, 
105L, 106L, 107L, 109L, 110L, 111L, 112L, 113L, 114L, 115L, 116L, 
117L, 118L, 119L, 120L, 121L, 123L, 124L, 125L, 126L, 127L, 128L, 
130L, 131L, 132L, 133L, 134L, 135L, 136L, 137L, 138L, 139L, 140L, 
141L, 142L, 143L, 144L, 145L, 146L, 147L, 148L, 149L, 150L, 151L, 
152L, 153L, 154L, 155L, 156L, 157L, 158L, 159L, 160L, 161L, 162L, 
163L, 164L, 165L, 166L, 167L, 168L, 169L, 170L, 171L, 172L, 173L, 
174L, 175L))

Solution

ggtransfo.etm was removed in this commit. One option is to adapt that function; instead, I modified the etm:::summary.etmCIF function so that it returns a single row-bound data frame (this introduces data.table as a dependency):

# NEW VERSION (adapted according to question update)
# Works with multiple groups
etm_to_df <- function(object, ci.fun = "cloglog", level = 0.95, ...) {
  # Number of groups (strata) in the fitted etmCIF object
  l.X <- ncol(object$X)
  res <- list()
  for (i in seq_len(l.X)) {
    # Summary for group i; extra arguments are forwarded via ...
    temp <- summary(object[[i]], ci.fun = ci.fun, level = level, ...)
    # Keep only the requested failure codes, stack them, and label each
    # row with the CIF and the group name
    res[[i]] <- data.table::rbindlist(
      temp[object$failcode + 1], idcol = "CIF"
    )[, CIF := paste0("CIF", CIF, "; ", names(object)[i])]
  }
  do.call(rbind, res)
}

This function returns a data frame (a data.table) with a column CIF that identifies the curve and the group each row belongs to.
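The identifier column comes from the idcol argument of data.table::rbindlist(). As a minimal sketch of that mechanism, independent of etm: the list below is a made-up stand-in for what summary() returns for one group (the element names "0 1" and "0 2" mirror the labels visible in the head(res) output in the question; the numbers and the group label "ki67in=0" are hypothetical).

```r
library(data.table)

# Toy stand-in for the per-group summary: a named list with one
# data frame per transition (values are invented for illustration).
toy_summary <- list(
  "0 1" = data.frame(P = c(0.00, 0.10), time = c(1, 5)),
  "0 2" = data.frame(P = c(0.05, 0.20), time = c(1, 5))
)

# rbindlist() stacks the list elements; with idcol = "CIF" the name of
# each element is stored in a new first column called CIF.
stacked <- rbindlist(toy_summary, idcol = "CIF")

# Append a group label to the identifier (hypothetical label here;
# etm_to_df uses names(object)[i] for this).
stacked[, CIF := paste0("CIF ", CIF, "; ", "ki67in=0")]

print(stacked)
```

If the list were unnamed, rbindlist() would fill the id column with the integer position of each element instead of its name.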

# With the OP's data (called t in the question) one can use
library(etm)
cum_in <- etmCIF(Surv(os, event %in% c(1,2)) ~ 1, t, etype = event, failcode = c(1,2))
res <- etm_to_df(cum_in)

Then it's easy to plot it using ggplot2:

library(ggplot2)
ggplot(res, aes(time, P)) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = CIF), alpha = 0.2) +
  geom_line(aes(color = CIF)) +
  scale_fill_manual(values = c("red", "blue")) +
  scale_color_manual(values = c("red", "blue")) +
  theme_classic()
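With more than two curves, the two-colour manual scales above are not enough: ggplot2 stops with an error when a manual scale has fewer values than there are levels. One possible approach, sketched here on made-up rows shaped like the output of etm_to_df(), is to split the group label out of the identifier and let ggplot2 choose default colours. The "; " separator follows the paste0() call in etm_to_df; the label format "ki67in=0" is an assumption about what names(object) contains.

```r
library(ggplot2)

# Invented rows in the shape returned by etm_to_df(); real values
# would come from the fitted etmCIF object.
res <- data.frame(
  CIF   = rep(c("CIF0 2; ki67in=0", "CIF0 2; ki67in=1", "CIF0 2; ki67in=2"),
              each = 2),
  time  = rep(c(10, 50), times = 3),
  P     = c(0.02, 0.10, 0.05, 0.20, 0.08, 0.30),
  lower = c(0.01, 0.05, 0.02, 0.10, 0.03, 0.15),
  upper = c(0.06, 0.20, 0.12, 0.35, 0.18, 0.50)
)

# Treat everything after "; " as the group label (assumed format).
res$group <- sub("^.*; ", "", res$CIF)

# Default discrete scales supply one colour per group automatically.
p <- ggplot(res, aes(time, P)) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = group), alpha = 0.2) +
  geom_line(aes(color = group)) +
  theme_classic()
print(p)
```

If specific colours are wanted, scale_fill_manual() and scale_color_manual() still work as long as values has one entry per group.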


Old function:

# Same functionality as etm:::summary.etmCIF, but returns a data frame
etm_to_df <- function(object, ci.fun = "cloglog", level = 0.95, ...) {
  l.X <- ncol(object$X)
  l.trans <- nrow(object[[1]]$trans)
  temp <- lapply(object[seq_len(l.X)], function(ll) {
    res <- summary(ll, ci.fun = ci.fun, level = level, ...)
    data.table::rbindlist(res[seq_len(l.trans) + 1], idcol = "CIF")
  })
  do.call(rbind, temp)
}
