dplyr broadcasting single value per group in mutate


Question

I am trying to do something very similar to "Scale relative to a value in each group (via dplyr)" (however, that solution seems to crash R for me). I would like to replicate a single value for each group and add a new column with this value repeated. As an example, I have:

library(dplyr)

data = expand.grid(
  category = LETTERS[1:2],
  year = 2000:2003)
data$value = runif(nrow(data))

data

  category year     value
1        A 2000 0.6278798
2        B 2000 0.6112281
3        A 2001 0.2170495
4        B 2001 0.6454874
5        A 2002 0.9234604
6        B 2002 0.9311204
7        A 2003 0.5387899
8        B 2003 0.5573527

I would like a result like:

data

  category year     value    value2
1        A 2000 0.6278798 0.6278798
2        B 2000 0.6112281 0.6112281
3        A 2001 0.2170495 0.6278798
4        B 2001 0.6454874 0.6112281
5        A 2002 0.9234604 0.6278798
6        B 2002 0.9311204 0.6112281
7        A 2003 0.5387899 0.6278798
8        B 2003 0.5573527 0.6112281

That is, value2 for each category is that category's value from year 2000. I was trying to think of a general solution that extends to any given filtering criterion, i.e. something like:

data %>% group_by(category) %>% mutate(value = filter(data, year==2002))

However, this does not work because the assignment has an incorrect length.

Recommended answer

Do this:

data %>% group_by(category) %>%
  mutate(value2 = value[year == 2000])
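
One caveat with value[year == 2000]: it only works when every group has exactly one matching row, since mutate() needs a result of length 1 (or of the group's length). The following is a minimal sketch, not part of the original answer, of a more forgiving variant that uses base R's match(); the `incomplete` data frame is a hypothetical example in which category B has no year-2000 row, and the values are copied from the question's printed table.

```r
library(dplyr)

# Same layout as the question's data, with the printed values hard-coded
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- c(0.6278798, 0.6112281, 0.2170495, 0.6454874,
                0.9234604, 0.9311204, 0.5387899, 0.5573527)

# Hypothetical case: category B is missing its year-2000 row
incomplete <- filter(data, !(category == "B" & year == 2000))

result <- incomplete %>%
  group_by(category) %>%
  # match() returns the position of the first year-2000 row in the group,
  # or NA when there is none, so the subset always has length 1
  mutate(value2 = value[match(2000, year)]) %>%
  ungroup()

result
```

With this variant, groups that lack the reference year get NA in value2 instead of raising an error, and groups with several matching rows use the first one.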

You could also do it in any of these ways:

data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = value[1])

data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = first(value))

data %>% group_by(category) %>%
  mutate(value2 = nth(value, n = 1, order_by = year))
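
These alternatives all take the value that comes first once each group is ordered by year, which coincides with the year-2000 value only because 2000 is the earliest year in the data. As a small sketch of the difference between them, the example below shuffles the rows (the permutation is arbitrary and chosen only for illustration) and passes the year column itself, unquoted, as order_by, so the result does not depend on the current row order.

```r
library(dplyr)

# Same layout as the question's data, with the printed values hard-coded
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- c(0.6278798, 0.6112281, 0.2170495, 0.6454874,
                0.9234604, 0.9311204, 0.5387899, 0.5573527)

# Arbitrary fixed permutation, so the rows are no longer sorted by year
shuffled <- data[c(8, 3, 5, 2, 7, 1, 6, 4), ]

result <- shuffled %>%
  group_by(category) %>%
  # order_by = year sorts within each group before taking the first element
  mutate(value2 = nth(value, n = 1, order_by = year)) %>%
  ungroup()

result
```

The value[1] and first(value) variants, by contrast, rely on the preceding arrange(year) call to put the earliest year first.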

or probably several other ways.

Your attempt with mutate(value = filter(data, year==2002)) doesn't make sense for a few reasons.

  1. When you explicitly pass in data again, it's not part of the chain that got grouped earlier, so it doesn't know about the grouping.

  2. All dplyr verbs take a data frame as their first argument and return a data frame, including filter. When you do value = filter(...), you're trying to assign a full data frame to the single column value.
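
If you do want to express the criterion with filter(), one option that fits how the verbs work is to build a small lookup table and join it back. This is a sketch of an alternative and not part of the original answer; the name `reference` is chosen here only for illustration, and the values are copied from the question's printed table.

```r
library(dplyr)

# Same layout as the question's data, with the printed values hard-coded
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- c(0.6278798, 0.6112281, 0.2170495, 0.6454874,
                0.9234604, 0.9311204, 0.5387899, 0.5573527)

# filter() returns a data frame: here, one row per category for year 2000
reference <- data %>%
  filter(year == 2000) %>%
  select(category, value2 = value)

is.data.frame(reference)

# Join the per-category reference value back onto every row
result <- left_join(data, reference, by = "category")

result
```

Any filtering criterion can replace year == 2000, as long as it leaves at most one row per category; if it leaves several, the join duplicates rows of data.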
