Best practice to get a dropped column in dplyr tbl_df
Problem description
I remember a comment on r-help in 2001 saying that drop = TRUE in [.data.frame was the worst design decision in R history.
dplyr corrects that and does not drop implicitly. When converting old code to dplyr style, this introduces some nasty bugs wherever d[, 1] or d[1] is assumed to be a vector.
My current workaround uses unlist, as shown below, to obtain a 1-column vector. Any better ideas?
library(dplyr)
d2 = data.frame(x = 1:5, y = (1:5) ^ 2)
str(d2[,1]) # implicit drop = TRUE
# int [1:5] 1 2 3 4 5
str(d2[,1, drop = FALSE])
# 'data.frame': 5 obs. of 1 variable:
# $ x: int 1 2 3 4 5
# With dplyr functions
d1 = data_frame(x = 1:5, y = x ^ 2)
str(d1[,1])
# Classes ‘tbl_df’ and 'data.frame': 5 obs. of 1 variable:
# $ x: int 1 2 3 4 5
str(unlist(d1[,1]))
# This ugly construct gives the same as str(d2[,1])
str(d1[,1][[1]])
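The kind of bug described above can be reproduced in a few lines. This is a minimal sketch, assuming a current dplyr installation that re-exports as_tibble(); the names d_base and d_tbl are made up for illustration. Note that the single-index form d[1] already returns a one-column data frame in base R, so only the d[, 1] form changes behavior.

```r
library(dplyr)

# Hypothetical example data: the same columns as a base data.frame and as a tibble
d_base <- data.frame(x = 1:5, y = (1:5)^2)
d_tbl  <- as_tibble(d_base)

# Base R drops to a vector, so length() counts the values
length(d_base[, 1])

# A tibble keeps the data frame structure, so length() counts the columns
length(d_tbl[, 1])

# Single-index subsetting returns a data frame in both cases
is.data.frame(d_base[1])
is.data.frame(d_tbl[1])
```

Code that relied on the first result being a vector of five values silently gets a one-column tibble instead.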
You can just use the [[ extract function instead of [.
d1[[1]]
## [1] 1 2 3 4 5
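One practical difference from the unlist workaround is that unlist attaches names to the result, while [[ returns the bare column. The following sketch compares the two. It assumes a current dplyr, where tibble() is the replacement for the deprecated data_frame(); the variable names are illustrative only.

```r
library(dplyr)

# tibble() is the current name for the data_frame() constructor
d1 <- tibble(x = 1:5, y = x^2)

v_unlist <- unlist(d1[, 1])  # integer vector with names derived from the column name
v_bare   <- d1[[1]]          # plain integer vector, selected by position
v_name   <- d1[["x"]]        # the same column, selected by name
v_dollar <- d1$x             # the same column, via $

names(v_unlist)              # names added by unlist
is.null(names(v_bare))       # [[ adds no names
identical(v_bare, v_name)
identical(v_bare, v_dollar)
```

Selecting by name with [["x"]] or $x is usually more robust than selecting by position when the column order may change.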
If you use a lot of piping with dplyr, you may also want to use the convenience functions extract and extract2 from the magrittr package:
d1 %>% magrittr::extract(1) %>% str
## Classes ‘tbl_df’ and 'data.frame': 5 obs. of 1 variable:
## $ x: int 1 2 3 4 5
d1 %>% magrittr::extract2(1) %>% str
## int [1:5] 1 2 3 4 5
Or if extract is too verbose for you, you can just use [ directly in the pipe:
d1 %>% `[`(1) %>% str
## Classes ‘tbl_df’ and 'data.frame': 5 obs. of 1 variable:
## $ x: int 1 2 3 4 5
d1 %>% `[[`(1) %>% str
## int [1:5] 1 2 3 4 5
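Later dplyr releases (0.7.0 and newer, as far as I am aware) also ship a dedicated verb, pull(), for extracting a single column as a vector inside a pipe. This is not part of the original answer; the sketch below assumes such a dplyr version is installed.

```r
library(dplyr)

d1 <- tibble(x = 1:5, y = x^2)

d1 %>% pull(x) %>% str()   # select the column by name
d1 %>% pull(1) %>% str()   # select the column by position
d1 %>% pull() %>% str()    # with no argument, pull() takes the last column (y here)
```

pull() reads more naturally in a pipeline than backquoted `[[`, at the cost of requiring a newer dplyr.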