Evaluation Error when tidyverse is loaded after Hmisc

Question

I am using R 3.3.3, dplyr 0.7.4, and Hmisc 4.1-1. I noticed that the order in which I load packages affects whether or not dplyr::summarise works. I understand that loading packages in a different order masks certain functions, but I am using the package::function() syntax to avoid that issue. The exact issue revolves around labeled variables. I know that there have been issues in the past with tidyverse and variable labels, but none seem to address why this particular situation is occurring.

First example, which works: I load only Hmisc and then dplyr, and I am able to summarise the data.

#this works fine
library(Hmisc)
library(dplyr)

Hmisc::label(iris$Petal.Width) <- "Petal Width"

sumpct <- iris %>% 
  dplyr::group_by(Species) %>% 
  dplyr::summarise(med =median(Petal.Width),A40 = round(100*ecdf(Petal.Width)(.40),1),
            A50 =round(100*ecdf(Petal.Width)(.50),1),
            mns = mean(Petal.Width),
            lowermean = mean(Petal.Width)-sd(Petal.Width),
            lowermedian = median(Petal.Width) - sd(Petal.Width))

The second example below breaks. I start a new session and load tidyverse after Hmisc, still using the package::function() syntax, but this throws the error:

Error in summarise_impl(.data, dots) : Evaluation error: x and labels must be same type.

Second example:

###restart session 
#this example does not work

library(Hmisc)
library(tidyverse)


Hmisc::label(iris$Petal.Width) <- "Petal Width"

sumpct <- iris %>% 
  dplyr::group_by(Species) %>% 
  dplyr::summarise(med =median(Petal.Width),A40 = round(100*ecdf(Petal.Width)(.40),1),
                   A50 =round(100*ecdf(Petal.Width)(.50),1),
                   mns = mean(Petal.Width),
                   lowermean = mean(Petal.Width)-sd(Petal.Width),
                   lowermedian = median(Petal.Width) - sd(Petal.Width))

However, the third example does work: I just restart the session and load tidyverse before Hmisc.

Third example:

###switch order of loading packages and this works

library(tidyverse)
library(Hmisc)


Hmisc::label(iris$Petal.Width) <- "Petal Width"

sumpct <- iris %>% 
  dplyr::group_by(Species) %>% 
  dplyr::summarise(med =median(Petal.Width),A40 = round(100*ecdf(Petal.Width)(.40),1),
                   A50 =round(100*ecdf(Petal.Width)(.50),1),
                   mns = mean(Petal.Width),
                   lowermean = mean(Petal.Width)-sd(Petal.Width),
                   lowermedian = median(Petal.Width) - sd(Petal.Width)) 

So my question is why does the order in which I load packages matter when I am using the package::function() syntax specifically with respect to labeled variables and tidyverse?

Update: session info below for the error:

sessionInfo()

R version 3.3.3 (2017-03-06)
Running under: Windows 7 x64

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] bindrcpp_0.2    forcats_0.3.0   stringr_1.3.0   dplyr_0.7.4
 [5] purrr_0.2.4     readr_1.1.1     tidyr_0.8.0     tibble_1.4.2
 [9] tidyverse_1.2.1 Hmisc_4.1-1     ggplot2_2.2.1   Formula_1.2-2
[13] survival_2.41-3 lattice_0.20-35

loaded via a namespace (and not attached):
 [1] reshape2_1.4.3      splines_3.3.3       haven_1.1.1
 [4] colorspace_1.3-2    htmltools_0.3.6     base64enc_0.1-3
 [7] rlang_0.2.0         pillar_1.2.1        foreign_0.8-69
[10] glue_1.2.0          RColorBrewer_1.1-2  readxl_1.0.0
[13] modelr_0.1.1        plyr_1.8.4          bindr_0.1.1
[16] cellranger_1.1.0    munsell_0.4.3       gtable_0.2.0
[19] rvest_0.3.2         htmlwidgets_1.0     psych_1.7.8
[22] latticeExtra_0.6-28 knitr_1.20          parallel_3.3.3
[25] htmlTable_1.11.2    broom_0.4.3         Rcpp_0.12.16
[28] acepack_1.4.1       scales_0.5.0        backports_1.1.2
[31] checkmate_1.8.5     jsonlite_1.5        gridExtra_2.3
[34] mnormt_1.5-5        hms_0.4.2           digest_0.6.15
[37] stringi_1.1.7       grid_3.3.3          cli_1.0.0
[40] tools_3.3.3         magrittr_1.5        lazyeval_0.2.1
[43] cluster_2.0.6       crayon_1.3.4        pkgconfig_2.0.1
[46] Matrix_1.2-12       xml2_1.2.0          data.table_1.10.4-3
[49] lubridate_1.7.3     assertthat_0.2.0    httr_1.3.1
[52] rstudioapi_0.7      R6_2.2.2            rpart_4.1-13
[55] nnet_7.3-12         nlme_3.1-131.1

Recommended answer

UPDATE: As of haven version 2.0.0 this issue has been resolved, as the haven "labelled" class was renamed to "haven_labelled" to avoid conflicts with Hmisc.
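
To see which side of that fix an installation is on, the installed haven version can be compared against 2.0.0. This is only a sketch: `haven_has_renamed_class` is a hypothetical helper name, not a function from haven or Hmisc. It uses `utils::packageVersion`, which reads the version without loading the haven namespace.

```r
# Hypothetical helper (not part of haven or Hmisc).
# Returns TRUE if the installed haven is >= 2.0.0 (class renamed to
# "haven_labelled"), FALSE if it is older, and NA if haven is not installed.
haven_has_renamed_class <- function() {
  tryCatch(
    utils::packageVersion("haven") >= "2.0.0",
    error = function(e) NA  # packageVersion() errors for a missing package
  )
}

haven_has_renamed_class()
```

If this returns FALSE, the load-order behaviour described below should still be expected.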

tl;dr: Order matters.

For a more detailed answer, let's first reproduce the error:

library(Hmisc)
#> Loading required package: lattice
#> Loading required package: survival
#> Loading required package: Formula
#> Loading required package: ggplot2
#> 
#> Attaching package: 'Hmisc'
#> The following objects are masked from 'package:base':
#> 
#>     format.pval, units
library(tidyverse)
#> Warning: package 'forcats' was built under R version 3.4.4

After removing elements piece by piece from the original summarise example, I managed to reduce the reproduction of the error to just these lines of code:

Hmisc::label(iris$Petal.Width) <- "Petal Width"
head(iris)
#> Error: `x` and `labels` must be same type

We can have a look at the traceback to see if we can locate a function that could be causing the error:

traceback()
#> 8: stop("`x` and `labels` must be same type", call. = FALSE)
#> 7: labelled(NextMethod(), attr(x, "labels"))
#> 6: `[.labelled`(xj, i)
#> 5: xj[i]
#> 4: `[.data.frame`(x, seq_len(n), , drop = FALSE)
#> 3: x[seq_len(n), , drop = FALSE]
#> 2: head.data.frame(iris)
#> 1: head(iris)

The [.labelled call looks suspicious. Why is it even called?

lapply(iris, class)
#> $Sepal.Length
#> [1] "numeric"
#> 
#> $Sepal.Width
#> [1] "numeric"
#> 
#> $Petal.Length
#> [1] "numeric"
#> 
#> $Petal.Width
#> [1] "labelled" "numeric" 
#> 
#> $Species
#> [1] "factor"

Ah, setting a label for Petal.Width with Hmisc::label also added the S3 class. We can inspect where the method is defined with getAnywhere:

getAnywhere("[.labelled")
#> 2 differing objects matching '[.labelled' were found
#> in the following places
#>   registered S3 method for [ from namespace haven
#>   namespace:Hmisc
#>   namespace:haven
#> Use [] to view one of them

Indeed, both haven and Hmisc define the method. And since haven is loaded after Hmisc, its definition is found first, and thus gets used:

getAnywhere("[.labelled")[1]
#> function (x, ...) 
#> {
#>     labelled(NextMethod(), attr(x, "labels"))
#> }
#> <environment: namespace:haven>

haven expects labelled objects to have a labels attribute, which Hmisc::label doesn't create:

attr(iris$Petal.Width, "labels")
#> NULL
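
The same failure can be imitated in base R alone. The sketch below is not haven's code: the class name `demo_labelled` and its `[` method are made-up stand-ins that, like the haven method shown above, insist on a labels attribute, while the column carries only the class and a label attribute, roughly as Hmisc::label leaves it.

```r
# Made-up stand-in for haven's `[.labelled`: it requires a "labels" attribute.
`[.demo_labelled` <- function(x, ...) {
  labels <- attr(x, "labels")
  if (is.null(labels)) {
    stop("`x` and `labels` must be same type", call. = FALSE)
  }
  out <- NextMethod()
  structure(out, labels = labels, class = class(x))
}

df <- data.frame(a = 1:3)
# Imitate the Hmisc side: add a class and a "label" attribute, but no "labels".
df$a <- structure(df$a, label = "A", class = c("demo_labelled", "integer"))

# Subsetting dispatches on the column's class, so the stand-in method runs
# and fails because attr(x, "labels") is NULL.
msg <- tryCatch(df$a[1:2], error = function(e) conditionMessage(e))
msg
```

Dispatch depends only on the class attribute of the object being subset, which is why qualifying the calling functions with package:: makes no difference.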

And that's where the error comes from.

But wait: why is haven even loaded? It's not attached with library(tidyverse). It turns out that haven is listed as an imported package in tidyverse, which causes it to be loaded when the package is attached (see e.g. here). And loading a package, among other things, registers its S3 methods, which is where the conflict comes from.
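
The difference between a loaded and an attached package can be checked directly. In this sketch the tools package, which ships with R, stands in for haven so that the code runs without any extra installation; the variable names are illustrative only.

```r
# Load a namespace without attaching it, as happens to an imported package.
loadNamespace("tools")

# Loaded: the namespace exists and its S3 methods are registered.
is_loaded <- "tools" %in% loadedNamespaces()

# Attached: the package is on the search path. This is normally FALSE here,
# unless something else in the session already called library(tools).
is_attached <- "package:tools" %in% search()

c(loaded = is_loaded, attached = is_attached)
```

In the session from the question, the equivalent check would be `"haven" %in% loadedNamespaces()` after library(tidyverse), which matches haven appearing under "loaded via a namespace (and not attached)" in the sessionInfo() output.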

As it is, if you want to use both Hmisc and tidyverse, order matters. To address the issue further would likely require source level changes in the packages' use of the labelled S3 class.
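
If the load order cannot be controlled, one possible workaround (an assumption on my part, not something the packages document for this case) is to strip the "labelled" class from the affected columns before handing the data to tidyverse functions, while keeping the label attribute. `drop_labelled_class` is a hypothetical helper written in base R:

```r
# Hypothetical helper: remove the "labelled" S3 class from every column that
# has it, leaving other classes and the "label" attribute untouched.
drop_labelled_class <- function(df) {
  df[] <- lapply(df, function(col) {
    if (inherits(col, "labelled")) {
      class(col) <- setdiff(class(col), "labelled")
    }
    col
  })
  df
}

# Stand-in for a column labelled with Hmisc::label (built by hand here so the
# sketch does not depend on Hmisc being installed).
demo <- data.frame(w = c(0.2, 0.4, 1.3))
demo$w <- structure(demo$w, label = "Petal Width",
                    class = c("labelled", "numeric"))

clean <- drop_labelled_class(demo)
class(clean$w)          # the "labelled" class is gone
attr(clean$w, "label")  # the label text is still attached
```

The trade-off is that Hmisc methods that dispatch on the labelled class no longer apply to those columns, so this is best done on a copy used only for the summarise step.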

Created on 2018-03-21 by the reprex package (v0.2.0).
