dplyr: add rows within group_by groups
Question
Is there a better way to add rows within group_by() groups than using bind_rows()? Here's an example that's a little clunky:
library(dplyr)
df <- data.frame(a = c(1, 1, 1, 2, 2), b = 1:5)
df %>%
  group_by(a) %>%
  # prepend a b = 0 row and append a b = 10 row to each group
  do(bind_rows(data.frame(a = .$a[1], b = 0), ., data.frame(a = .$a[1], b = 10)))
The idea is that columns that we're already grouping on could be inferred from the groups.
I was wondering whether something like this could work instead:
df %>%
group_by(a) %>%
insert(b=0, .at=0) %>%
insert(b=10)
Like append(), it could default to inserting after all existing elements, and it could be smart enough to use the group values for any unspecified columns. Unspecified non-grouping columns could perhaps be filled with NA.
Is there an existing convenient syntax I've missed, or would this be helpful?
Recommended answer
Here is an approach using data.table:
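library(data.table)
setDT(df)  # convert df to a data.table by reference
# add one b = 0 and one b = 10 row per group, then sort; the new rows land
# first and last only because 0 and 10 bracket the existing values of b
rbind(df, expand.grid(b = c(0, 10), a = df[ , unique(a)]))[order(a, b)]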
Depending on your actual context this much simpler alternative would work too:
df[ , .(b = c(0, b, 10)), by = a]
(and we can simply use c(0, b, 10) in j if we don't care about keeping the name b; data.table then names the resulting column V1)
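For readers who would rather stay in dplyr, the same per-group idea can be sketched with group_modify() and tibble::add_row(). This is only an illustration, assuming dplyr 0.8.1 or later; the helper name pad_group is made up for this example and is not part of either package. group_modify() hands each group to the function without the grouping column and adds it back afterwards, which is close to the "infer the grouping columns" behaviour the question asks for.

```r
library(dplyr)
library(tibble)

df <- data.frame(a = c(1, 1, 1, 2, 2), b = 1:5)

# Hypothetical helper: `rows` holds one group's rows without the grouping
# column, `key` is a one-row tibble with the group's value of `a` (unused).
pad_group <- function(rows, key) {
  rows %>%
    add_row(b = 0, .before = 1) %>%  # insert before the group's first row
    add_row(b = 10)                  # default: append after the last row
}

res <- df %>%
  group_by(a) %>%
  group_modify(pad_group) %>%
  ungroup()

print(res)
```

Any non-grouping column that is not mentioned in add_row() is filled with NA, which matches the behaviour suggested in the question.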
The former has the advantage that it will work even if df has more columns; just set fill = TRUE for rbind.data.table.
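As a rough illustration of the fill = TRUE remark, the sketch below adds an extra column c, which is an assumption made only for this example, and builds the new rows with data.table's CJ() in place of expand.grid() so that both inputs are already data.tables. Columns missing from the new rows are filled with NA.

```r
library(data.table)

# `c` is a hypothetical extra column, not part of the original example
dt <- data.table(a = c(1, 1, 1, 2, 2), b = 1:5, c = letters[1:5])

# one b = 0 row and one b = 10 row for every group in `a`
new_rows <- CJ(a = unique(dt$a), b = c(0, 10))

# fill = TRUE lets rbind accept inputs with different column sets;
# the missing `c` values in new_rows become NA
res <- rbind(dt, new_rows, fill = TRUE)[order(a, b)]

print(res)
```

As with the original one-liner, sorting on b places the new rows first and last only because 0 and 10 bracket the existing values.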