JAGS错误-可能涉及以下某些或所有节点的有向循环 [英] JAGS error - Possible directed cycle involving some or all of the following nodes

查看:97
本文介绍了JAGS错误-可能涉及以下某些或所有节点的有向循环的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

完整的数据集包含〜11,000行.在检查代码是否运行的同时,我一直在以K = 400运行代码.

所有行都与地图上的特定单元格相关,并包含从Sentinel-2图像和数字高程图提取的信息.

117个细胞的子集还包含在实地考察中记录的栖息地协变量.因此,某些列(包括响应变量(S1和S2)和tussac)的特征在于NA的比例很高.

代码:

  add_c4<-模型{for(i在1:K中){S1 [i]〜dpois(lambda1 [i])lambda1 [i] <-exp(a0 + a1 * DEM_slope [i] + a2 * DEM_elevation [i] + a3 * tussac [i] + a4 * S2 [i])S2 [i]〜dpois(lambda2 [i])lambda2 [i]< -exp(c0 + c1 * DEM_slope [i] + c2 * DEM_elevation [i] + c3 * tussac [i] + c4 * S1 [i])muLogit_tussac [i] -b0 +哨兵1 [i] +哨兵3 [i] +哨兵7 [i] +哨兵8 [i] +哨兵9 [i] + DEM_slope [i]Logit_tussac [i]〜dnorm(muLogit_tussac [i],tau)logit(tussac [i])< -Logit_tussac [i]}#先验a0〜dnorm(0,10)a1〜dnorm(0,10)a2〜dnorm(0,10)a3〜dnorm(0,10)a4〜dnorm(0,10)b0〜dnorm(0,10)b1〜dnorm(0,10)b2〜dnorm(0,10)b3〜dnorm(0,10)c0〜dnorm(0,10)c1〜dnorm(0,10)c2〜dnorm(0,10)c3〜dnorm(0,10)c4〜dnorm(0,10)tau〜dgamma(0.001,0.001)#data#S1,S2,K,sentinel1,sentinel3,sentinel7,sentinel8,sentinel9,DEM_slope,DEM_elevation#inits#a0,a2,a3,a4,b0,b1,b2,b3,c0,c2,c3,c4#显示器#a0,a1,a2,a3,a4,b0,b1,b2,b3,tau,ped,dic,c0,c1,c2,c3,c4}" 

当我包含'c4 * S1 [i]'时,出现以下错误:

 涉及以下某些或所有节点的可能的定向循环 

然后继续列出S1,S2,lambda1和lambda2的所有值.

删除'c4 * S1 [i]'将导致代码运行.

我通过以下线程进行了研究:

JAGS中可能的定向循环错误

https://stats.stackexchange.com/questions/220312/coding-a-jags-error-model-for-a-dependent-variable-that-has-increasing-variance

其中的问题似乎是由发布者在等式两边使用"y"引起的.我认为我的问题是由于a4将代码发送到S2部分,而c4将代码发送回S1部分引起的,这有点像定向循环.知道如何解决这个问题吗?

我将数据集的前几行包括在内,以防万一:

  S1 S2 Logit_tussac湿度DEM_slope DEM_aspect DEM_elevation不适用不适用不适用2.434239 168.5011 0.588606366 0.0413 0.0499 0.0531 0.1035 0.1862 0.1968 0.1808 0.1318 0.0400 0.0199不适用不适用不适用3.705001 178.1289 1.007037127 0.0966 0.1108 0.1212 0.0855 0.0917 0.1063 0.0937 0.1842 0.0341 0.0161不适用不适用不适用5.006181 180.0000 1.883010797 0.1309 0.1472 0.1361 0.0855 0.0917 0.1063 0.0937 0.1572 0.0341 0.0161不适用不适用不适用5.006181 180.0000 2.758984468 0.0542 0.0512 0.0472 0.0145 0.0127 0.0092 0.0166 0.0510 0.0148 0.0080 

数据集子集,以便仅包含远程和本地感测数据的117行:

  S1 S2 Logit_tussac湿度DEM_slope DEM_aspect DEM_elevation不适用不适用不适用14.917334 256.1612 12.24432 0.0513 0.0588 0.0541 0.1145 0.1676 0.1988 0.1977 0.1658 0.1566 0.07700 0 -9.210240 1 23.803741 225.1231 16.88028 0.1058 0.1370 0.2139 0.2387 0.2654 0.2933 0.3235 0.2928 0.3093 0.1601不适用不适用不适用20.789165 306.0945 18.52480 0.0287 0.0279 0.0271 0.0276 0.0290 0.0321 0.0346 0.0452 0.0475 0.0219不适用不适用-9.210240 1 6.689442 287.9641 36.08975 0.0462 0.0679 0.1274 0.1535 0.1797 0.2201 0.2982 0.2545 0.4170 0.22520 0 -9.210240 1 25.476444 203.0659 23.59964 0.0758 0.1041 0.1326 0.1571 0.2143 0.2486 0.2939 0.2536 0.3336 0.19371 0 -1.385919 3 1.672511 270.0000 39.55215 0.0466 0.0716 0.1227 0.1482 0.2215 0.2715 0.3334 0.2903 0.3577 0.1957 

解决方案

正确识别后,您的问题就是模型图中的有向循环.这是一个问题,其原因是,证明DAG(有向无环图)不包含任何有向环非常重要,否则不能保证我们可以定义稳定的后验样本.

例如,采用以下包含定向循环的模型:

  model<-'model {for(1:N中的i){a [i]〜dnorm(b [i],tau)b [i]〜dnorm(a [i],tau)}tau〜dgamma(0.01,0.01)#monitor#tau#data#N}'N <-10runjags :: run.jags(模型) 

没有明智的方法来估算此模型,JAGS会告诉您.但从理论上讲,可以估计此模型:

  model<-'model {for(1:N中的i){a [i]〜dnorm(b [i],tau)b [i]〜dnorm(a [i],tau)}tau〜dgamma(0.01,0.01)#monitor#tau#data#N,一个}'N <-10<-rnorm(N)runjags :: run.jags(模型) 

发生的变化是,所有a []现在都是固定的(观察到的),因此我们实际上可以估计此模型.但是JAGS仍然可以检测到有向循环,因此需要一种解决方法:

  model<-'model {for(1:N中的i){a [i]〜dnorm(b [i],tau)b [i]〜dnorm(aa [i],tau)}tau〜dgamma(0.01,0.01)#monitor#tau#data#N,a,aa}'N <-10<-rnorm(N)aa<-arunjags :: run.jags(模型) 

这通过诱使JAGS认为a []和aa []不相关而隐藏了有向循环.但这仅在观察到所有a []/并将其固定后才起作用,否则在模型中不会估计或定义丢失的aa [].在您的情况下,似乎部分观察到了S1 []和S2 [],因此,除非您简单地省略缺少S1或S2的行/观测值,否则此技巧将不起作用(这可能不可行,因为您说它们具有较高的NA的比例.

否则,您将不得不以某种方式重新构建模型以打破有针对性的循环.这将涉及考虑系统底层的生物过程,以及如何在不创建定向循环的情况下表示所需的关系,因此我们实际上无法提供帮助.

希望有帮助,

马特

The full dataset contains ~11,000 rows. I have been running the code with K=400 whilst checking that the code runs.

All of the rows relate to a specific cell on a map and contain information extracted from Sentinel-2 images and a digital elevation map.

A subset of 117 cells also contain habitat covariates recorded on a field trip. As such, some of the columns, including the response variables (S1 and S2) and tussac, are characterised by a high proportion of NAs.

The code:

add_c4 <- "model{
for(i in 1:K) {
S1[i]~dpois(lambda1[i])
lambda1[i]<-exp(a0+a1*DEM_slope[i]+a2*DEM_elevation[i]+a3*tussac[i]+a4*S2[i])

S2[i]~dpois(lambda2[i])
lambda2[i]<-exp(c0+c1*DEM_slope[i]+c2*DEM_elevation[i]+c3*tussac[i]+c4*S1[i])

muLogit_tussac[i]<-b0 + sentinel1[i] + sentinel3[i] + sentinel7[i] + sentinel8[i] + sentinel9[i] + DEM_slope[i]

Logit_tussac[i]~dnorm(muLogit_tussac[i], tau)
logit(tussac[i])<-Logit_tussac[i]
}

# Priors

a0~dnorm(0, 10)
a1~dnorm(0, 10)
a2~dnorm(0, 10)
a3~dnorm(0, 10)
a4~dnorm(0, 10)

b0~dnorm(0, 10)
b1~dnorm(0, 10)
b2~dnorm(0, 10)
b3~dnorm(0, 10)

c0~dnorm(0, 10)
c1~dnorm(0, 10)
c2~dnorm(0, 10)
c3~dnorm(0, 10)
c4~dnorm(0, 10)

tau~dgamma(0.001, 0.001)

#data# S1, S2, K, sentinel1, sentinel3, sentinel7, sentinel8, sentinel9, DEM_slope, DEM_elevation
#inits# a0, a2, a3, a4, b0, b1, b2, b3, c0, c2, c3, c4
#monitor# a0, a1, a2, a3, a4, b0, b1, b2, b3, tau, ped, dic, c0, c1, c2, c3, c4
}"

When I include 'c4*S1[i]' I get the following error:

Possible directed cycle involving some or all of the following nodes

It then proceeds to list all values of S1, S2, lambda1 and lambda2.

Removing 'c4*S1[i]' results in the code running.

I've had a look through the following threads:

Possible directed cycle error in JAGS

https://stats.stackexchange.com/questions/220312/coding-a-jags-error-model-for-a-dependent-variable-that-has-increasing-variance

The issues in them seems to have been caused by the poster using 'y' on both sides of an equation. I think that my issue is caused by the fact that a4 sends the code to the S2 section and c4 sends it back to the S1 section, which is a bit like a directed cycle. Any idea how to get around this?

I've included the top rows of the dataset in case it's of any use:

S1 S2 Logit_tussac moisture DEM_slope DEM_aspect DEM_elevation sentinel1 sentinel2 sentinel3 sentinel4 sentinel5 sentinel6 sentinel7 sentinel8 sentinel9 sentinel10
NA NA     NA            NA  2.434239   168.5011   0.588606366    0.0413    0.0499    0.0531    0.1035    0.1862    0.1968    0.1808    0.1318    0.0400     0.0199
NA NA     NA            NA  3.705001   178.1289   1.007037127    0.0966    0.1108    0.1212    0.0855    0.0917    0.1063    0.0937    0.1842    0.0341     0.0161
NA NA     NA            NA  5.006181   180.0000   1.883010797    0.1309    0.1472    0.1361    0.0855    0.0917    0.1063    0.0937    0.1572    0.0341     0.0161
NA NA     NA            NA  5.006181   180.0000   2.758984468    0.0542    0.0512    0.0472    0.0145    0.0127    0.0092    0.0166    0.0510    0.0148     0.0080

Dataset subset so that only the 117 rows that contain remote and locally sensed data:

S1 S2 Logit_tussac moisture DEM_slope DEM_aspect DEM_elevation sentinel1 sentinel2 sentinel3 sentinel4 sentinel5 sentinel6 sentinel7 sentinel8 sentinel9 sentinel10
NA NA        NA        NA   14.917334   256.1612      12.24432    0.0513    0.0588    0.0541    0.1145    0.1676    0.1988    0.1977    0.1658    0.1566     0.0770
0  0  -9.210240         1   23.803741   225.1231      16.88028    0.1058    0.1370    0.2139    0.2387    0.2654    0.2933    0.3235    0.2928    0.3093     0.1601
NA NA        NA        NA   20.789165   306.0945      18.52480    0.0287    0.0279    0.0271    0.0276    0.0290    0.0321    0.0346    0.0452    0.0475     0.0219
NA NA -9.210240         1    6.689442   287.9641      36.08975    0.0462    0.0679    0.1274    0.1535    0.1797    0.2201    0.2982    0.2545    0.4170     0.2252
0  0  -9.210240         1   25.476444   203.0659      23.59964    0.0758    0.1041    0.1326    0.1571    0.2143    0.2486    0.2939    0.2536    0.3336     0.1937
1  0  -1.385919         3    1.672511   270.0000      39.55215    0.0466    0.0716    0.1227    0.1482    0.2215    0.2715    0.3334    0.2903    0.3577     0.1957

解决方案

As you have correctly identified, your problem is a directed cycle in the model graph. The reason this is a problem is that it turns out to be quite important that a DAG (directed acyclic graph) does not contain any directed cycles, otherwise there is no guarantee that we can define stable posteriors to sample from.

For example, take the following model containing a directed cycle:

model <- 'model{

    for(i in 1:N){
        a[i] ~ dnorm(b[i], tau)
        b[i] ~ dnorm(a[i], tau)
    }

    tau ~ dgamma(0.01,0.01)

    #monitor# tau
    #data# N

}'

N <- 10
runjags::run.jags(model)

There is no sensible way to estimate this model, and JAGS will tell you so. But it is in theory possible to estimate this model:

model <- 'model{

    for(i in 1:N){
        a[i] ~ dnorm(b[i], tau)
        b[i] ~ dnorm(a[i], tau)
    }

    tau ~ dgamma(0.01,0.01)

    #monitor# tau
    #data# N, a

}'

N <- 10
a <- rnorm(N)
runjags::run.jags(model)

What has changed is that all a[] are now fixed (observed), so we actually can estimate this model. But JAGS still detects the directed cycle, so a workaround is needed:

model <- 'model{

    for(i in 1:N){
        a[i] ~ dnorm(b[i], tau)
        b[i] ~ dnorm(aa[i], tau)
    }

    tau ~ dgamma(0.01,0.01)

    #monitor# tau
    #data# N, a, aa

}'

N <- 10
a <- rnorm(N)
aa <- a
runjags::run.jags(model)

This hides the directed cycle by tricking JAGS into thinking that a[] and aa[] are unrelated. But this only works when all a[] are observed/fixed, otherwise the missing aa[] are not estimated or defined within the model. In your case, S1[] and S2[] seem to be partially observed, so this trick will not work unless you can simply omit the rows/observations with missing S1 or S2 (which may not be feasible as you say they have a high proportion of NA).

Otherwise you will have to re-formulate your model somehow to break the directed cycle. This would involve thinking about the biological process underlying your system and how to represent the relationship you want without creating a directed cycle, so it is not really something we can help with.

Hope that helps,

Matt

这篇关于JAGS错误-可能涉及以下某些或所有节点的有向循环的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆