dplyr: add rows within group_by groups

Question

Is there a better way to add rows within group_by() groups than using bind_rows()? Here is a somewhat clunky example:

library(dplyr)

df <- data.frame(a=c(1,1,1,2,2), b=1:5)

# For each group, put a row with b = 0 first and a row with b = 10 last
df %>%
  group_by(a) %>%
  do(bind_rows(data.frame(a=.$a[1], b=0), ., data.frame(a=.$a[1], b=10)))

The idea is that the values of the columns we are already grouping on could be inferred from the groups instead of being repeated by hand.

I was wondering whether something like this could work instead:

# insert() is hypothetical; dplyr has no such verb
df %>%
  group_by(a) %>%
  insert(b=0, .at=0) %>%
  insert(b=10)

Like append(), it could default to inserting after all existing elements, and it could be smart enough to use the group values for any grouping columns left unspecified. Non-grouping columns left unspecified could be filled with NA.

Is there an existing convenient syntax I've missed, or would this be helpful?
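
For illustration, the behaviour described above can be approximated with functions that already exist. The following is only a sketch: the helper name insert_row() and its .at argument are hypothetical (taken from the proposal, not from dplyr), and it relies on group_modify() dropping the grouping columns from each chunk and restoring them afterwards, with tibble::add_row() filling any unspecified column with NA.

```r
library(dplyr)

df <- data.frame(a = c(1, 1, 1, 2, 2), b = 1:5)

# Hypothetical helper (not part of dplyr): add one row to every group.
# .at = 0 puts the row first; the default appends it after the last row.
insert_row <- function(.data, ..., .at = NULL) {
  group_modify(.data, function(rows, key) {
    # `rows` holds one group's data without the grouping columns,
    # so only the non-grouping values need to be supplied.
    if (is.null(.at)) {
      tibble::add_row(rows, ...)
    } else {
      tibble::add_row(rows, ..., .before = .at + 1)
    }
  })
}

res <- df %>%
  group_by(a) %>%
  insert_row(b = 0, .at = 0) %>%
  insert_row(b = 10)
```

Here res stays grouped by a, and each group gains one leading and one trailing row. The sketch assumes the input is already grouped; it does no validation of .at.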

Recommended answer

Here is an approach using data.table:

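library(data.table)
setDT(df)  # convert df to a data.table by reference

# Append rows with b = 0 and b = 10 for every value of a, then sort within groups
rbind(df, expand.grid(b = c(0, 10), a = df[ , unique(a)]))[order(a, b)]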

Depending on your actual context, this much simpler alternative would work too:

df[ , .(b = c(0, b, 10)), by = a]

(If we don't care about keeping the name b, we can simply use c(0, b, 10) in j; the result column is then named V1.)

The former has the advantage that it still works when df has more columns: pass fill = TRUE to rbind() (rbind.data.table), and the extra columns are filled with NA in the new rows.
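
As a minimal sketch of that fill = TRUE case, assume df carries one more column, here called extra (a made-up name and made-up values, purely for illustration):

```r
library(data.table)

# Same data as in the question plus a hypothetical column `extra`
df <- data.table(a = c(1, 1, 1, 2, 2), b = 1:5, extra = letters[1:5])

# One new row for every combination of a and the boundary values of b
new_rows <- expand.grid(b = c(0, 10), a = unique(df$a))

# Columns are matched by name; fill = TRUE pads the missing `extra`
# column with NA in the new rows instead of raising an error
res <- rbind(df, new_rows, fill = TRUE)[order(a, b)]
```

The new rows end up with NA in extra, while the original rows keep their values.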
