Gompertz Aging analysis in R


Problem description

I have survival data from an experiment in flies which examines rates of aging in various genotypes. The data is available to me in several layouts so the choice of which is up to you, whichever suits the answer best.

One dataframe (wide.df) looks like this, where each genotype (Exp, of which there are ~640) has a row, and the days run in sequence horizontally from day 4 to day 98, with counts of new deaths every two days.

Exp      Day4   Day6    Day8    Day10   Day12   Day14    ...
A        0      0       0       2       3       1        ...

I make the example using this:

wide.df2<-data.frame("A",0,0,0,2,3,1,3,4,5,3,4,7,8,2,10,1,2)
colnames(wide.df2)<-c("Exp","Day4","Day6","Day8","Day10","Day12","Day14","Day16","Day18","Day20","Day22","Day24","Day26","Day28","Day30","Day32","Day34","Day36")

Another version is like this, where each day has a row for each 'Exp' and the number of deaths on that day are recorded.

Exp     Deaths  Day     
A       0       4    
A       0       6
A       0       8
A       2       10
A       3       12
..      ..      ..

To make this example:

df2<-data.frame(c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A"),c(0,0,0,2,3,1,3,4,5,3,4,7,8,2,10,1,2),c(4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36))
colnames(df2)<-c("Exp","Deaths","Day")
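
For reference, the wide layout can be reshaped into the long one with base R. The following is only a minimal sketch, assuming every column other than Exp is named Day followed by a number; `day_cols` and `long.df` are illustrative names that do not appear in the original post.

```r
# Rebuild the wide example: one row per genotype, one column per census day
deaths <- c(0, 0, 0, 2, 3, 1, 3, 4, 5, 3, 4, 7, 8, 2, 10, 1, 2)
days   <- seq(4, 36, by = 2)
wide.df2 <- data.frame("A", t(deaths))
colnames(wide.df2) <- c("Exp", paste0("Day", days))

# Assumption: every column except Exp is a "Day<number>" count column
day_cols <- setdiff(colnames(wide.df2), "Exp")

# Transposing before flattening keeps all days of one genotype together
long.df <- data.frame(
  Exp    = rep(wide.df2$Exp, each = length(day_cols)),
  Deaths = as.vector(t(as.matrix(wide.df2[day_cols]))),
  Day    = rep(as.numeric(sub("^Day", "", day_cols)), times = nrow(wide.df2))
)
head(long.df)
```

With more than one genotype in the wide table, the same code should produce one block of rows per genotype, in the original row order.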

What I would like to do is perform a Gompertz Analysis (See second paragraph of "the life table" here). The equation is:

μ_x = α e^{β x}

Where μx is the probability of death at a given time x, α is the initial mortality rate, and β is the rate of aging.

I would like to be able to get a dataframe which has α and β estimates for each of my ~640 genotypes for further analysis later.

I need help going from the above dataframes to an output of these values for each of my genotypes in R.

I have looked through the package flexsurv which may house the answer but I have failed in attempts to find and implement it.

Solution

This should get you started...

Firstly, for the flexsurvreg function to work, you need to specify your input data as a Surv object (from package:survival). This means one row per observation.

The first thing is to re-create the 'raw' data from the summary tables you provide. (I know rbind is not efficient, but you can always switch to data.table for large sets).

### get rows with >1 death
df3 <- df2[df2$Deaths > 1, 2:3]
### expand to give one row per death per time (both columns hold the day here)
df3 <- sapply(df3, FUN = function(x) rep(df3[, 2], df3[, 1]))
### each death is 1 (occurs once)
df3[, 1] <- 1
### add this to the rows with <=1 death
### (rows with 0 deaths enter Surv() as censored observations)
df3 <- rbind(df3, df2[df2$Deaths <= 1, 2:3])
### convert to Surv object
library(survival)
s1 <- with(df3, Surv(Day, Deaths))
### get parameters for Gompertz distribution
library(flexsurv)
f1 <- flexsurvreg(s1 ~ 1, dist = "gompertz")

giving

> f1$res
              est         L95%        U95%
shape 0.165351912 0.1281016481 0.202602176
rate  0.001767956 0.0006902161 0.004528537

Note that this is an intercept-only model as all your genotypes are A. You can loop this over multiple survival objects once you have re-created the per-observation data as above.
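
One way that loop might look is sketched below. It is an illustration under stated assumptions, not a tested pipeline: `expand_deaths` and `fit_gompertz` are hypothetical helper names, every fly is assumed to have an observed death (no censoring), and each genotype is assumed to have enough deaths for the fit to converge. Unlike the code above, days with zero deaths are dropped rather than entered as censored rows, so the estimates need not match the output shown.

```r
# Long-format counts as in the question (a single genotype here)
df2 <- data.frame(
  Exp    = rep("A", 17),
  Deaths = c(0, 0, 0, 2, 3, 1, 3, 4, 5, 3, 4, 7, 8, 2, 10, 1, 2),
  Day    = seq(4, 36, by = 2)
)

# Hypothetical helper: one row per dead fly; zero-death days add no rows
expand_deaths <- function(d) {
  idx <- rep(seq_len(nrow(d)), times = d$Deaths)
  data.frame(Exp = d$Exp[idx], Day = d$Day[idx], status = 1)
}

# Hypothetical helper: intercept-only Gompertz fit per genotype.
# alpha is taken from flexsurv's "rate", beta from its "shape".
fit_gompertz <- function(d) {
  per_fly <- expand_deaths(d)
  fits <- lapply(split(per_fly, per_fly$Exp), function(g) {
    f <- flexsurvreg(Surv(Day, status) ~ 1, data = g, dist = "gompertz")
    data.frame(Exp   = g$Exp[1],
               alpha = f$res["rate", "est"],
               beta  = f$res["shape", "est"])
  })
  do.call(rbind, fits)
}

# Run the fits only when flexsurv is installed
if (requireNamespace("flexsurv", quietly = TRUE)) {
  library(survival)
  library(flexsurv)
  print(fit_gompertz(df2))
}
```

With the full table of ~640 genotypes in place of `df2`, the result should be a data frame with one row per genotype. Wrapping the `flexsurvreg` call in `tryCatch` may be worthwhile for genotypes with very few deaths.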

From the flexsurv docs:

The Gompertz distribution with shape parameter a and rate parameter b has hazard function

h(x | a, b) = b e^{ax}

So it appears your alpha is b, the rate, and beta is a, the shape.
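
To make that mapping concrete, the short sketch below plugs the estimates printed by f1$res into the question's equation. `gompertz_hazard` is an illustrative helper, not a flexsurv function, and the chosen ages are arbitrary.

```r
# Gompertz hazard in the question's notation: mu_x = alpha * exp(beta * x)
gompertz_hazard <- function(x, alpha, beta) alpha * exp(beta * x)

# Point estimates copied from the f1$res output above
shape <- 0.165351912  # a in the flexsurv docs, beta in the question
rate  <- 0.001767956  # b in the flexsurv docs, alpha in the question

days   <- c(0, 10, 20, 30)
hazard <- gompertz_hazard(days, alpha = rate, beta = shape)

# log(hazard) is linear in age: intercept log(alpha), slope beta
log_slope <- diff(log(hazard)) / diff(days)

# Time for the mortality rate to double under this model
doubling_time <- log(2) / shape

print(hazard)
print(log_slope)
print(doubling_time)
```

The hazard at day 0 equals alpha, and the slope of the log hazard equals beta, which is why beta is read as the rate of aging.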
