Pathways: Manipulate list of events in parent-child 'nodes' in R

Question

I am interested in visualizing pathways patients have based on a pre-specified list of events (e.g. diagnosis, surgery, treatment1, treatment2, death).

A test data set might look like this:

df <- structure(list(ID = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
3L, 3L, 3L, 3L), .Label = c("a", "b", "c"), class = "factor"), 
    Event = structure(c(2L, 3L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 
    5L, 1L), .Label = c("death", "diagnosis", "surgery", "treatment1", 
    "treatment2"), class = "factor"), date = structure(c(14610, 
    14619, 16667, 14975, 14976, 14977, 15074, 15084, 15006, 15050, 
    15051, 15053), class = "Date")), .Names = c("ID", "Event", 
"date"), row.names = c(NA, 12L), class = "data.frame")

> df
   ID      Event       date
1   a  diagnosis 2010-01-01
2   a    surgery 2010-01-10
3   a      death 2015-08-20
4   b  diagnosis 2011-01-01
5   b    surgery 2011-01-02
6   b treatment1 2011-01-03
7   b treatment2 2011-04-10
8   b      death 2011-04-20
9   c  diagnosis 2011-02-01
10  c    surgery 2011-03-17
11  c treatment2 2011-03-18
12  c      death 2011-03-20

The data have been ordered by ID and date.

What I am after is:

> result
  ID     parent      child datediff
1  a  diagnosis    surgery        9
2  a    surgery      death     1950
3  b  diagnosis    surgery        1
4  b    surgery treatment1        1
5  b treatment1 treatment2       90
6  b treatment2      death       10
7  c  diagnosis    surgery       45
8  c    surgery treatment2        1
9  c treatment2      death        2

That is, a series of parent-child nodes with the difference in dates between them. (Note that the numbers in the datediff column above are illustrative, not the actual values.)

This will allow me to plot the nodes, and do some further descriptive analysis on time between events.

I found a package to plot nodes (see below). However, if someone knows a way or a package that lets the arrow width reflect the number of parent-child combinations, that would be awesome!

library(igraph) # possible package to use
parents  <- c("A","A","A","A","A","A","C","C","F","F","H","I")
children <- c("I","I","I","I","B","A","D","H","G","H","I","J")
begats <- data.frame(parents = parents, children = children)
# graph_from_data_frame() is the current name of graph.data.frame()
graph_begats <- graph_from_data_frame(begats)
tkplot(graph_begats)

Cheers, Luc

Recommended answer

Collapse your data up to give each parent-child combo and a count of how many times they occurred, e.g.:

# put the previous event against the current event, and drop the rows before the first event:
df$Event <- as.character(df$Event)
df$PreEvent <- with(df, ave(Event,ID,FUN=function(x) c(NA,head(x,-1)) ) )
result <- df[!is.na(df$PreEvent),c("ID","PreEvent","Event")]

# aggregate the combos by how often they occur:
result <- aggregate(list(count=rownames(result)),result[c("PreEvent","Event")],FUN=length)
#    PreEvent      Event count
#1    surgery      death     1
#2 treatment2      death     2
#3  diagnosis    surgery     3
#4    surgery treatment1     1
#5    surgery treatment2     1
#6 treatment1 treatment2     1

# plot in igraph, adjusting the edge.width to account for how many cases of each
# parent-child combo exist:
library(igraph)
g <- graph_from_data_frame(result)  # current name of graph.data.frame()
plot(g,edge.width=result$count)
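
The answer above counts the parent-child combinations but does not build the per-patient table with datediff that the question asks for. The following is a minimal base-R sketch of one way to get it, using the same lag-within-ID idea via ave(). The names prev_date, pathways and median_days are illustrative choices, not part of the original answer, and the data frame is rebuilt here from the dates printed in the question so the block runs on its own.

```r
# Rebuild the example data from the question (events as character, dates as Date).
df <- data.frame(
  ID    = c("a", "a", "a", "b", "b", "b", "b", "b", "c", "c", "c", "c"),
  Event = c("diagnosis", "surgery", "death",
            "diagnosis", "surgery", "treatment1", "treatment2", "death",
            "diagnosis", "surgery", "treatment2", "death"),
  date  = as.Date(c("2010-01-01", "2010-01-10", "2015-08-20",
                    "2011-01-01", "2011-01-02", "2011-01-03",
                    "2011-04-10", "2011-04-20",
                    "2011-02-01", "2011-03-17", "2011-03-18", "2011-03-20")),
  stringsAsFactors = FALSE
)

# Make sure rows are ordered by patient and date before lagging.
df <- df[order(df$ID, df$date), ]

# Shift a vector down by one position, padding the first slot with NA.
lag1 <- function(x) c(NA, head(x, -1))

# Previous event and previous date within each patient.
df$parent <- ave(df$Event, df$ID, FUN = lag1)
prev_date <- ave(as.numeric(df$date), df$ID, FUN = lag1)

# Days elapsed since the previous event (Date values are day counts).
df$datediff <- as.numeric(df$date) - prev_date

# Drop each patient's first event (it has no parent) and tidy the columns.
pathways <- df[!is.na(df$parent), c("ID", "parent", "Event", "datediff")]
names(pathways)[names(pathways) == "Event"] <- "child"
rownames(pathways) <- NULL

# Optional descriptive summary: median days for each parent-child transition.
median_days <- aggregate(datediff ~ parent + child, data = pathways, FUN = median)
```

With the question's dates, patient a gets 9 days from diagnosis to surgery and 2048 days from surgery to death. The pathways table can also be passed through aggregate() as in the answer above to get the counts used for edge.width.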
