Changing behaviour of stats::lag when loading dplyr package

Question

I am having trouble with the stats::lag function when using the dplyr package. Specifically, I get different results from the lag function before and after loading dplyr.

For example, here is a sample time series. If I calculate the lag with k = -1, the lagged series starts in 1971.

data <- ts(1:10, start = 1970, frequency = 1)
lag1 <- stats::lag(data, k = -1)
start(lag1)[1]

## [1] 1971
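
To make the expected behaviour concrete, here is a minimal sketch, assuming a fresh R session in which dplyr has not been loaded. It illustrates that stats::lag leaves the observations untouched and only shifts the time attributes; the variable names x and lagged are illustrative.

```r
# Assumes a fresh R session with dplyr not loaded.
x <- ts(1:10, start = 1970, frequency = 1)
lagged <- stats::lag(x, k = -1)

# The observations themselves are unchanged ...
same_values <- identical(as.vector(x), as.vector(lagged))

# ... only the time attributes (start, end, frequency) are shifted.
original_tsp <- tsp(x)
lagged_tsp <- tsp(lagged)

print(same_values)
print(original_tsp)
print(lagged_tsp)
```

With a negative k the series is moved later in time, which is why the lagged series is expected to start one year after the original.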

Now, if I load dplyr, the same call yields a lagged series starting in 1970.

library(dplyr)

## 
## Attaching package: 'dplyr'
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

lag2 <- stats::lag(data, k = -1)
start(lag2)[1]

## [1] 1970

start(lag1)[1] == start(lag2)[1]

## [1] FALSE

Given the messages printed when loading dplyr, my guess is that this has to do with environments. But detaching dplyr doesn't seem to help.

detach("package:dplyr", unload = TRUE, character.only = TRUE)
lag3 <- stats::lag(data, k = -1)
start(lag3)[1]

## [1] 1970

start(lag1)[1] == start(lag3)[1]

## [1] FALSE

Any suggestions are greatly appreciated. My only solution so far is to restart the R session between calculating lag1 and lag2.

Here is my session info:

##  setting  value                       
##  version  R version 3.2.0 (2015-04-16)
##  system   i386, mingw32               
##  ui       RTerm                       
##  language (EN)                        
##  collate  English_Canada.1252         
##  tz       America/New_York            
## 
##  package    * version  date       source        
##  assertthat   0.1      2013-12-06 CRAN (R 3.2.0)
##  bitops       1.0-6    2013-08-17 CRAN (R 3.2.0)
##  DBI          0.3.1    2014-09-24 CRAN (R 3.2.0)
##  devtools     1.8.0    2015-05-09 CRAN (R 3.2.0)
##  digest       0.6.8    2014-12-31 CRAN (R 3.2.0)
##  dplyr        0.4.1    2015-01-14 CRAN (R 3.2.0)
##  evaluate     0.7      2015-04-21 CRAN (R 3.2.0)
##  formatR      1.2      2015-04-21 CRAN (R 3.2.0)
##  git2r        0.10.1   2015-05-07 CRAN (R 3.2.0)
##  htmltools    0.2.6    2014-09-08 CRAN (R 3.2.0)
##  httr       * 0.6.1    2015-01-01 CRAN (R 3.2.0)
##  knitr        1.10.5   2015-05-06 CRAN (R 3.2.0)
##  magrittr     1.5      2014-11-22 CRAN (R 3.2.0)
##  memoise      0.2.1    2014-04-22 CRAN (R 3.2.0)
##  Rcpp         0.11.6   2015-05-01 CRAN (R 3.2.0)
##  RCurl        1.95-4.6 2015-04-24 CRAN (R 3.2.0)
##  rmarkdown    0.6.1    2015-05-07 CRAN (R 3.2.0)
##  rversions    1.0.0    2015-04-22 CRAN (R 3.2.0)
##  stringi      0.4-1    2014-12-14 CRAN (R 3.2.0)
##  stringr      1.0.0    2015-04-30 CRAN (R 3.2.0)
##  XML          3.98-1.1 2013-06-20 CRAN (R 3.2.0)
##  yaml         2.1.13   2014-06-12 CRAN (R 3.2.0)

I've also tried unloadNamespace, as suggested by @BondedDust:

unloadNamespace("dplyr")  
lag4 <- stats::lag(data, k = -1)  

## Warning: namespace 'dplyr' is not available and has been replaced  
## by .GlobalEnv when processing object 'sep'  

start(lag4)[1]  

## [1] 1970  

start(lag1)[1] == start(lag4)[1]  

## [1] FALSE


Recommended answer

The dplyr package is effectively overriding 'lag'. dplyr does not define a function named lag (which is why lag is not listed as masked when the package is attached); instead there are two copies of lag.default, one in 'stats' and one in 'dplyr'. The 'dplyr' copy is registered as an S3 method and is found first when stats::lag dispatches. You can force the stats version to be used with the :::-mechanism:

> lag2 <- stats::lag.default(data, k = -1)
Error: 'lag.default' is not an exported object from 'namespace:stats'

> lag2 <- stats:::lag.default(data, k = -1)
> stats::start(lag2)[1]
[1] 1971
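
If typing ::: at every call site feels fragile, one possible workaround is a small wrapper. This is only a sketch: ts_lag is a hypothetical helper name, not part of stats or dplyr, and it assumes that stats keeps an internal lag.default, as in the transcript above.

```r
# Hypothetical helper (not part of stats or dplyr): always use the
# time-series implementation of lag from the stats namespace,
# regardless of which S3 method is currently registered.
ts_lag <- function(x, k = 1) {
  stopifnot(stats::is.ts(x))
  # Fetch the internal (unexported) method directly from the namespace.
  lag_impl <- utils::getFromNamespace("lag.default", "stats")
  lag_impl(x, k = k)
}

data <- ts(1:10, start = 1970, frequency = 1)
lag_fixed <- ts_lag(data, k = -1)
print(start(lag_fixed)[1])
```

Because the wrapper bypasses S3 dispatch entirely, it should not matter whether dplyr has been loaded, detached, or unloaded earlier in the session.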

dplyr:::lag.default does not use the time-series-specific functions. I'm not able to explain why unloadNamespace fails to remove the method's definition, but it is still there:

> unloadNamespace("dplyr")
> getAnywhere(lag.default)
2 differing objects matching ‘lag.default’ were found
in the following places
  registered S3 method for lag from namespace dplyr
  namespace:stats
Use [] to view one of them

Further weirdness: after unloading the dplyr namespace I see this:

> environment(getAnywhere(lag.default)[1])
<environment: namespace:dplyr>
> environment(getAnywhere(lag.default)[2])
<environment: namespace:dplyr>
> environment(getAnywhere(lag.default)[3])
<environment: namespace:stats>

(And then restarting and loading dplyr, I see the same apparent double-entry.)
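
One way to check which implementation the dispatcher would pick is to ask for the registered method and look at its environment. The following is a hedged sketch using utils::getS3method; the result depends on which packages have been loaded in the session and on the dplyr version, so no expected output is shown.

```r
# Look up the method that S3 dispatch would use for lag() on objects
# without a more specific class, and report the namespace it lives in.
registered_method <- utils::getS3method("lag", "default")
method_home <- environmentName(environment(registered_method))

print(method_home)
```

If this reports anything other than "stats", calls to stats::lag on a ts object are presumably being routed to a method from another package.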

There's also something weird about the help page for dplyr::lag:

> help(lag,pac=dplyr)
No documentation for ‘lag’ in specified packages and libraries:
you could try ‘??lag’
> help(`lag`,pac=`dplyr`)
No documentation for ‘lag’ in specified packages and libraries:
you could try ‘??lag’
> help(`lag.default`,pac=`dplyr`)  # This finally succeeds!

Looking at GitHub (after confirming that I had the latest version of dplyr on CRAN), I see that this was an issue for the R CMD check process: https://github.com/hadley/dplyr/commit/f8a46e030b7b899900f2091f41071619d0a46288 . Apparently lag.default will not be overwritten in future versions, but lag will mask the stats version. I wonder what happens to lag.zoo and lag.zooreg. Maybe the package will also announce the overwriting or masking when it is loaded?
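
Whether lag is masked in a given session can be checked directly. Here is a minimal sketch, assuming only that the stats package is attached (the default); with a dplyr version that exports lag and is attached, its package would be expected to appear in the result as well.

```r
# Which attached packages provide an object called "lag"?
lag_locations <- utils::find("lag")

# Which namespace does a bare lag() call resolve to right now?
active_namespace <- environmentName(environment(lag))

print(lag_locations)
print(active_namespace)
```

When more than one package is listed, qualifying the call explicitly (stats::lag or dplyr::lag) avoids depending on the order in which packages were attached.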
