Custom function after grouping a data.frame
Problem description
Given the following data.frame
d <- rep(c("a", "b"), each=5)
l <- rep(1:5, 2)
v <- 1:10
df <- data.frame(d=d, l=l, v=v*v)
df
d l v
1 a 1 1
2 a 2 4
3 a 3 9
4 a 4 16
5 a 5 25
6 b 1 36
7 b 2 49
8 b 3 64
9 b 4 81
10 b 5 100
Now I want to add another column after grouping by l. Within each group, the extra column e should contain v_b - v_a, that is, the value of v in the row where d is "b" minus the value of v in the row where d is "a":
d l v e
1 a 1 1 35 (36-1)
2 a 2 4 45 (49-4)
3 a 3 9 55 (64-9)
4 a 4 16 65 (81-16)
5 a 5 25 75 (100-25)
6 b 1 36 35 (36-1)
7 b 2 49 45 (49-4)
8 b 3 64 55 (64-9)
9 b 4 81 65 (81-16)
10 b 5 100 75 (100-25)
The parentheses show how each value is calculated.
I'm looking for a way to do this with dplyr, so I started with something like this:
# %>% replaces the old %.% chaining operator, which newer dplyr versions no longer provide
df %>%
group_by(l) %>%
mutate(e=myCustomFunction)
But how should I define myCustomFunction? I thought that grouping the data.frame would produce a (sub-)data.frame for each group, which would then be passed to this function as its argument. But it isn't...
Recommended answer
I guess this is the dplyr equivalent to @jlhoward's data.table solution:
df %>%
group_by(l) %>%
mutate(e = v[d == "b"] - v[d == "a"])
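The mutate() call above works because, inside a grouped mutate(), v and d refer only to the current group's values. The subtraction therefore yields a single number per group, which is recycled across both rows of that group. A minimal self-contained sketch that rebuilds the question's data and checks this (the name res is only for illustration):

```r
library(dplyr)

# Rebuild the data from the question
df <- data.frame(
  d = rep(c("a", "b"), each = 5),
  l = rep(1:5, 2),
  v = (1:10)^2
)

res <- df %>%
  group_by(l) %>%
  # within each group, v and d are length-2 vectors, so the
  # difference is one value that is recycled to both rows
  mutate(e = v[d == "b"] - v[d == "a"]) %>%
  ungroup()

res$e
```

This relies on every group containing exactly one "a" row and one "b" row, as in the example data.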
Edit after comment by OP:
If you want to use a custom function, here's a possible way:
myfunc <- function(x) {
with(x, v[d == "b"] - v[d == "a"])
}
df %>%
group_by(l) %>%
do(data.frame(. , e = myfunc(.))) %>%
arrange(d, l) # <- just to get it back in the original order
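In more recent dplyr releases do() is marked as superseded. Assuming dplyr 0.8.1 or later is installed, group_modify() can play a similar role: it hands each group's sub-data.frame to a function, which is the behaviour the question expected. This is only a sketch of that alternative, not part of the original answer; the names res2, grp and key are illustrative.

```r
library(dplyr)

# Rebuild the data from the question
df <- data.frame(
  d = rep(c("a", "b"), each = 5),
  l = rep(1:5, 2),
  v = (1:10)^2
)

# Same custom function as above: takes a per-group data.frame
myfunc <- function(x) {
  with(x, v[d == "b"] - v[d == "a"])
}

res2 <- df %>%
  group_by(l) %>%
  # grp is the group's rows (without the grouping column l),
  # key is a one-row tibble holding the group's value of l
  group_modify(function(grp, key) mutate(grp, e = myfunc(grp))) %>%
  ungroup() %>%
  arrange(d, l)  # back to the original row order

res2
```

Note that the grouping column l ends up as the first column of the result, so the column order differs from the do() version.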
Edit after comment by @hadley:
As hadley commented below, it would be better in this case to define the function as
f <- function(v, d) v[d == "b"] - v[d == "a"]
and then use the custom function f inside a mutate:
df %>%
group_by(l) %>%
mutate(e = f(v, d))
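One caveat: f assumes that each group holds exactly one "a" value and one "b" value. If that may not be true for your data, a defensive variant could fail loudly instead of silently recycling or returning an empty result. The sketch below uses a hypothetical name, f_checked, which is not part of the original answer:

```r
library(dplyr)

# Hypothetical defensive variant of f: same arithmetic, plus a check
f_checked <- function(v, d) {
  a <- v[d == "a"]
  b <- v[d == "b"]
  # stop if a group does not have exactly one "a" and one "b"
  stopifnot(length(a) == 1, length(b) == 1)
  b - a
}

df <- data.frame(
  d = rep(c("a", "b"), each = 5),
  l = rep(1:5, 2),
  v = (1:10)^2
)

res3 <- df %>%
  group_by(l) %>%
  mutate(e = f_checked(v, d)) %>%
  ungroup()
```

On the example data this gives the same e column as f; on a group with a missing or duplicated "a"/"b" row it raises an error.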
Thanks @hadley for the comment.