Phylogenetic model using multiple entries for each species


Question

I am relatively new to phylogenetic regression models. In the past I used PGLS when I had only one entry for each species in my tree. Now I have a dataset with thousands of records covering a total of 9 species, and I would like to run a phylogenetic model. I have read the tutorials for the most common packages (e.g. caper), but I am unsure how to build the model.

When I try to create the object for caper, i.e. using:

obj <- comparative.data(phy = Tree, data = Data, names.col = species, vcv = TRUE, na.omit = FALSE, warn.dropped = TRUE)

I get this message:

Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Species1’, ‘Species2’, ‘Species3’, ‘Species4’, ‘Species5’, ‘Species6’, ‘Species7’, ‘Species8’, ‘Species9’
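The error message suggests that the species column is being used as row names, which must be unique in a data frame. As a minimal base-R sketch (the toy data frame `Data` and its column names are illustrative assumptions, not the asker's real data), one way to confirm that species are repeated before building the object is:

```r
## Hypothetical toy data: several records per species (names are illustrative)
Data <- data.frame(
  species = rep(c("Species1", "Species2", "Species3"), times = c(3, 2, 4)),
  trait   = seq_len(9)
)

## Number of records per species
records_per_species <- table(Data$species)
print(records_per_species)

## TRUE if any species occurs on more than one row
has_replicates <- anyDuplicated(Data$species) > 0
print(has_replicates)
```

If `has_replicates` is `TRUE`, the data have more than one row per species, which is the situation described in the question.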

I understand that I may be able to solve this by fitting an MCMCglmm model, but I am unfamiliar with Bayesian models.

Thanks in advance for your help.

Recommended answer

This is indeed not going to work with a simple PGLS from caper, because it cannot treat individuals as a random effect. I suggest using MCMCglmm, which is not much more complex to understand and lets you include individuals as a random effect. The package's author provides excellent documentation, and there is alternative documentation that deals more with some specific aspects of the package (namely tree uncertainty).

Here is a brief walk-through:

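## Your comparative data
## (comparative.data() needs unique names in names.col, so this call
## only succeeds when each species appears on a single row)
library(caper)
comp_data <- comparative.data(phy = my_tree, data = my_data,
      names.col = species, vcv = TRUE)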

Note that you can have a specimen column, so that the data look like this:

   taxa        var1 var2 specimen
1     A  0.08730689    a    spec1
2     B  0.47092692    a    spec1
3     C -0.26302706    b    spec1
4     D  0.95807782    b    spec1
5     E  2.71590217    b    spec1
6     A -0.40752058    a    spec2
7     B -1.37192856    a    spec2
8     C  0.30634567    b    spec2
9     D -0.49828379    b    spec2
10    E  1.42722363    b    spec2
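As a rough sketch of how a table in this layout might be put together and checked in base R (the values are rounded stand-ins, and `tip_labels` is a hypothetical placeholder for `my_tree$tip.label`), the following builds the data frame and verifies that every taxon in the data also appears among the tree tips:

```r
## Toy data in the same layout as the table above (values are made up)
my_data <- data.frame(
  taxa     = rep(c("A", "B", "C", "D", "E"), times = 2),
  var1     = c(0.09, 0.47, -0.26, 0.96, 2.72, -0.41, -1.37, 0.31, -0.50, 1.43),
  var2     = rep(c("a", "a", "b", "b", "b"), times = 2),
  specimen = rep(c("spec1", "spec2"), each = 5)
)

## Stand-in for my_tree$tip.label (hypothetical; use your own tree's labels)
tip_labels <- c("A", "B", "C", "D", "E")

## Taxa present in the data but missing from the tree (should be empty)
missing_in_tree <- setdiff(unique(my_data$taxa), tip_labels)
print(missing_in_tree)

## Number of specimens recorded for each taxon
specimens_per_taxon <- table(my_data$taxa)
print(specimens_per_taxon)
```

A non-empty `missing_in_tree` would indicate a naming mismatch between the data and the tree that is worth resolving before fitting any model.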

You can then set up your formula (similar to a simple lm formula):

## Your formula (column names as in the example table above)
my_formula <- var1 ~ var2

And your MCMC settings:

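## Setting the prior list (see the MCMCglmm course notes for details)
## One G element is needed per random term (here: animal and specimen)
prior <- list(R = list(V = 1, nu = 0.002),
              G = list(G1 = list(V = 1, nu = 0.002),
                       G2 = list(V = 1, nu = 0.002)))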

## Setting the MCMC parameters
## Number of iterations
nitt <- 12000

## Length of burnin
burnin <- 2000

## Amount of thinning
thin <- 5
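To see what these settings imply, the number of posterior samples that are kept can be worked out from the three values. This is a small arithmetic sketch; the variable `n_samples` is introduced here only for illustration:

```r
## MCMC settings as above
nitt   <- 12000  # total number of iterations
burnin <- 2000   # iterations discarded at the start
thin   <- 5      # keep every 5th iteration after burn-in

## Number of posterior samples retained
n_samples <- (nitt - burnin) / thin
print(n_samples)
```

With these values 2000 samples are stored. Longer chains or heavier thinning may be needed if the samples turn out to be strongly autocorrelated.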

You should then be able to run a default MCMCglmm:

## Extracting the comparative data
mcmc_data <- comp_data$data

## MCMCglmm requires a column named animal to identify this as a phylogenetic
## model, so we include an extra column with the species names in it.
mcmc_data <- cbind(animal = rownames(mcmc_data), mcmc_data)
mcmc_tree <- comp_data$phy

## The MCMCglmm model
library(MCMCglmm)
mod_mcmc <- MCMCglmm(fixed = my_formula,
                     random = ~ animal + specimen, 
                     family = "gaussian",
                     pedigree = mcmc_tree, 
                     data = mcmc_data,
                     nitt = nitt,
                     burnin = burnin,
                     thin = thin,
                     prior = prior)
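Once the model has run, `summary(mod_mcmc)` prints the posterior summaries, and the sampled variance components are stored in `mod_mcmc$VCV` (one column per random term plus `units`). One commonly used summary, offered here only as a hedged sketch, is the share of total variance attributed to the phylogenetic term. The helper `phylo_signal()` and the `fake_vcv` matrix below are hypothetical and are not part of MCMCglmm:

```r
## Proportion of variance attributed to the phylogenetic term, per posterior draw
## (hypothetical helper; vcv is a matrix shaped like mod_mcmc$VCV)
phylo_signal <- function(vcv, phylo_col = "animal") {
  vcv <- as.matrix(vcv)
  unname(vcv[, phylo_col] / rowSums(vcv))
}

## Made-up posterior draws standing in for mod_mcmc$VCV
fake_vcv <- cbind(animal   = c(2, 1, 3),
                  specimen = c(1, 1, 1),
                  units    = c(1, 2, 2))

lambda <- phylo_signal(fake_vcv)
print(lambda)
```

With a fitted model the call would be `phylo_signal(mod_mcmc$VCV)`, and the mean or a credible interval of the result could then be reported.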
