Long variable name fails in dplyr


Problem description


Variable names longer than 39 characters fail in dplyr's group_by(), returning the error: "Error: index out of bounds".

Am I missing something or is this a bug?

A 40-character name does not work:

library(dplyr)

names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"

iris %>%
  dplyr::group_by(vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40) %>%
  dplyr::summarise(n())

This gives me the error: Error: index out of bounds

A 39-character name works:

names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39"

iris %>%
  dplyr::group_by(vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39) %>%
  dplyr::summarise(n())

This works fine and gives me the desired output:

Source: local data frame [3 x 2]

      vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39 n()
    1                                  setosa  50
    2                              versicolor  50
    3                               virginica  50

SessionInfo()

> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Danish_Denmark.1252  LC_CTYPE=Danish_Denmark.1252    LC_MONETARY=Danish_Denmark.1252 LC_NUMERIC=C                   
[5] LC_TIME=Danish_Denmark.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.3.0.2

loaded via a namespace (and not attached):
[1] assertthat_0.1 DBI_0.3.1      lazyeval_0.1.9 magrittr_1.0.1 parallel_3.1.1 Rcpp_0.11.3    tools_3.1.1  

Solution

This seems to be a known issue, to be fixed in dplyr 0.3.1. From the reply by @romainfrancois in the post:

"It happens here [...]

new_groups <- lazyeval::auto_name(new_groups)

because:

lazyeval::auto_name
function (x, max_width = 40)
{
    names(x) <- auto_names(x, max_width = max_width)
    x
}
<environment: namespace:lazyeval>

"
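To make the role of max_width = 40 more concrete, here is a minimal base R sketch of how a width-limited labelling step could turn a 40-character name into a label that no longer matches any column. The helper truncate_label() is hypothetical and written only for illustration; it is an assumption about the general mechanism, not lazyeval's actual implementation.

```r
# Hypothetical helper (NOT lazyeval's code): keep a label as-is while it is
# shorter than max_width, otherwise cut it and append "...".
truncate_label <- function(label, max_width = 40) {
  if (nchar(label) < max_width) {
    return(label)
  }
  paste0(substr(label, 1, max_width - 3), "...")
}

name_39 <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vv39"
name_40 <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"

label_39 <- truncate_label(name_39)
label_40 <- truncate_label(name_40)

# The 39-character name survives unchanged and can still be looked up.
identical(label_39, name_39)

# The 40-character name is altered, so a lookup by that label finds nothing.
identical(label_40, name_40)
match(label_40, name_40)
```

Under this assumption, the grouping variable would be searched for under a label that is not a column name, which is consistent with an "index out of bounds" error appearing exactly at 40 characters.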

Update
In dplyr 0.4.0 "group_by() supports variables with more than 39 characters thanks to a fix in lazyeval":

library(dplyr)
# Variable name with 40 characters
names(iris)[5] <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"

iris %>% 
  group_by(vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40) %>%
  summarise(n())
#   vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40 n()
# 1                                   setosa  50
# 2                               versicolor  50
# 3                                virginica  50
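If upgrading is not possible, one option is to group under a short alias and restore the long name afterwards. The sketch below is an illustration only: is_affected() and the alias "grp" are hypothetical names introduced here, and the 0.4.0 threshold is simply the release that the update above reports as working.

```r
# Hypothetical check: does a given dplyr version predate the 0.4.0 release
# that the update above reports as fixed?
is_affected <- function(version) {
  package_version(version) < package_version("0.4.0")
}

long_name <- "vvv_5vvv10vvv15vvv20vvv25vvv30vvv35vvv40"
df <- iris
names(df)[5] <- long_name

# Hypothetical short alias, used only while grouping.
short_alias <- "grp"
names(df)[names(df) == long_name] <- short_alias

if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  installed <- as.character(utils::packageVersion("dplyr"))
  message("dplyr ", installed, " affected: ", is_affected(installed))

  counts <- df %>%
    group_by(grp) %>%
    summarise(n = n())

  # Restore the long name on the result once grouping is done.
  counts <- as.data.frame(counts)
  names(counts)[names(counts) == short_alias] <- long_name
  print(counts)
}
```

The alias never reaches the automatic naming step with more than a few characters, so the width limit should not come into play; the long name is only written back onto the finished result.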
