dplyr broadcasting single value per group in mutate

Problem description

I am trying to do something very similar to Scale relative to a value in each group (via dplyr) (however this solution seems to crash R for me). I would like to replicate a single value for each group and add a new column with this value repeated. As an example I have

library(dplyr)

data = expand.grid(
  category = LETTERS[1:2],
  year = 2000:2003)
data$value = runif(nrow(data))

data

  category year     value
1        A 2000 0.6278798
2        B 2000 0.6112281
3        A 2001 0.2170495
4        B 2001 0.6454874
5        A 2002 0.9234604
6        B 2002 0.9311204
7        A 2003 0.5387899
8        B 2003 0.5573527

And I would like a dataframe like

data

  category year     value    value2
1        A 2000 0.6278798 0.6278798
2        B 2000 0.6112281 0.6112281
3        A 2001 0.2170495 0.6278798
4        B 2001 0.6454874 0.6112281
5        A 2002 0.9234604 0.6278798
6        B 2002 0.9311204 0.6112281
7        A 2003 0.5387899 0.6278798
8        B 2003 0.5573527 0.6112281

i.e. the value for each category is the value from year 2000. I was trying to think of a general solution that extends to any given filtering criterion, e.g. something like

data %>% group_by(category) %>% mutate(value = filter(data, year==2002))

However, this does not work because of an incorrect length in the assignment.

Recommended answer

data %>% group_by(category) %>%
  mutate(value2 = value[year == 2000])

You could also do it like this:

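# Sort by year, then take the first value in each group by index
data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = value[1])

# Same idea, using dplyr's first()
data %>% group_by(category) %>%
  arrange(year) %>%
  mutate(value2 = first(value))

# No sorting needed: nth() orders by the year column itself
# (order_by takes the column, not the string "year")
data %>% group_by(category) %>%
  mutate(value2 = nth(value, n = 1, order_by = year))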



or probably several other ways.
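One caveat with the subsetting approach: value[year == 2000] has to yield exactly one value per group, because mutate() only recycles results of length 1 (or the full group size). The following is a minimal sketch, not part of the original answer, of one way to tolerate a group that has no matching row. The seed, the incomplete data frame and the result name are assumptions made up for illustration; wrapping the subset in first() is just one possible guard.

```r
library(dplyr)

# Same shape as the example data; the seed is only here for reproducibility
set.seed(1)
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- runif(nrow(data))

# Hypothetical case: category B has no row for year 2000
incomplete <- data[!(data$category == "B" & data$year == 2000), ]

# value[year == 2000] is empty for B; first() of an empty vector
# returns NA, which mutate() can recycle across the whole group
result <- incomplete %>%
  group_by(category) %>%
  mutate(value2 = first(value[year == 2000])) %>%
  ungroup()

result
```

With this guard, category A keeps its year-2000 value on every row and category B gets NA instead of an error. If a group could match more than one row, first() would silently keep the first match, so it may be worth checking for duplicates beforehand.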

Your attempt with mutate(value = filter(data, year==2002)) doesn't make sense for a few reasons.


  1. When you explicitly pass data in again, it is not part of the grouped pipeline: filter receives the original, ungrouped data frame.

  2. All dplyr verbs take a data frame as first argument and return a data frame, including filter. When you do value = filter(...) you're trying to assign a full data frame to the single column value.
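Since filter() returns a data frame, one way to still use it for this task is to build a small per-category lookup table and join it back. This is a sketch of an alternative, not something the original answer proposes; the names baseline and joined and the seed are assumptions chosen for illustration.

```r
library(dplyr)

# Same shape as the example data; the seed is only here for reproducibility
set.seed(1)
data <- expand.grid(category = LETTERS[1:2], year = 2000:2003)
data$value <- runif(nrow(data))

# filter() returns a data frame (one row per category here), not a vector
baseline <- data %>%
  filter(year == 2000) %>%
  select(category, value2 = value)

# Join the per-category baseline back onto every row of the original data
joined <- data %>%
  left_join(baseline, by = "category")

joined
```

The join version is more verbose than the grouped mutate, but the filtering condition lives in one ordinary filter() call, and categories without a matching row simply end up with NA in value2.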
