Progressive concatenation of a column by a group
Problem description
Suppose I have the following input:
ID date_1 date_2 str
1 1 2010-07-04 2008-01-20 A
2 2 2015-07-01 2011-08-31 C
3 3 2015-03-06 2013-01-18 D
4 4 2013-01-10 2011-08-30 D
5 5 2014-06-04 2011-09-18 B
6 5 2014-06-04 2011-09-18 B
7 6 2012-11-22 2011-09-28 C
8 7 2014-06-17 2013-08-04 A
10 7 2014-06-17 2013-08-04 B
11 7 2014-06-17 2013-08-04 B
I would like to progressively concatenate the values of the str column by the group variable ID, as shown in the following output:
ID date_1 date_2 str
1 1 2010-07-04 2008-01-20 A
2 2 2015-07-01 2011-08-31 C
3 3 2015-03-06 2013-01-18 D
4 4 2013-01-10 2011-08-30 D
5 5 2014-06-04 2011-09-18 B
6 5 2014-06-04 2011-09-18 B,B
7 6 2012-11-22 2011-09-28 C
8 7 2014-06-17 2013-08-04 A
10 7 2014-06-17 2013-08-04 A,B
11 7 2014-06-17 2013-08-04 A,B,B
I tried to use the ave() function with this code:
within(table, {
  Emp_list <- ave(str, ID, FUN = function(x) paste(x, collapse = ","))
})
but it gives the following output, which is not exactly what I want:
ID date_1 date_2 str
1 1 2010-07-04 2008-01-20 A
2 2 2015-07-01 2011-08-31 C
3 3 2015-03-06 2013-01-18 D
4 4 2013-01-10 2011-08-30 D
5 5 2014-06-04 2011-09-18 B,B
6 5 2014-06-04 2011-09-18 B,B
7 6 2012-11-22 2011-09-28 C
8 7 2014-06-17 2013-08-04 A,B,B
10 7 2014-06-17 2013-08-04 A,B,B
11 7 2014-06-17 2013-08-04 A,B,B
Of course, I'd like to avoid loops, as I work on a large database.
Recommended answer
How about ave() with Reduce()? With accumulate = TRUE, Reduce() returns every intermediate result instead of only the final one. So if we run it with paste(), we can accumulate the pasted strings.
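To see what accumulate = TRUE does on its own, here is a minimal sketch on the str values of a single group (the ones for ID 7 in the question). The names x, final and steps are only illustrative and are not part of the original answer.

```r
x <- c("A", "B", "B")   # str values for ID 7

# Default: only the final folded value is returned
final <- Reduce(function(a, b) paste(a, b, sep = ", "), x)
print(final)
# "A, B, B"

# accumulate = TRUE: every intermediate value is kept, starting with x[1]
steps <- Reduce(function(a, b) paste(a, b, sep = ", "), x, accumulate = TRUE)
print(steps)
# "A"  "A, B"  "A, B, B"
```

The helper f below makes the same accumulate = TRUE call, and ave() applies it within each ID group.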
# Accumulate the pasted strings within one group
f <- function(x) {
  Reduce(function(...) paste(..., sep = ", "), x, accumulate = TRUE)
}
# Apply f within each ID group
df$str <- with(df, ave(as.character(str), ID, FUN = f))
This gives the updated data frame df:
ID date_1 date_2 str
1 1 2010-07-04 2008-01-20 A
2 2 2015-07-01 2011-08-31 C
3 3 2015-03-06 2013-01-18 D
4 4 2013-01-10 2011-08-30 D
5 5 2014-06-04 2011-09-18 B
6 5 2014-06-04 2011-09-18 B, B
7 6 2012-11-22 2011-09-28 C
8 7 2014-06-17 2013-08-04 A
10 7 2014-06-17 2013-08-04 A, B
11 7 2014-06-17 2013-08-04 A, B, B
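For anyone who wants to reproduce this, the following is a self-contained sketch. The data frame is rebuilt by hand from the question's printed input, so a few things are assumptions: the dates are stored as plain character strings, and the row names are the default 1 to 10 rather than the 1 to 8, 10, 11 shown above.

```r
# Input reconstructed from the question (date columns as character: an assumption)
df <- data.frame(
  ID     = c(1, 2, 3, 4, 5, 5, 6, 7, 7, 7),
  date_1 = c("2010-07-04", "2015-07-01", "2015-03-06", "2013-01-10", "2014-06-04",
             "2014-06-04", "2012-11-22", "2014-06-17", "2014-06-17", "2014-06-17"),
  date_2 = c("2008-01-20", "2011-08-31", "2013-01-18", "2011-08-30", "2011-09-18",
             "2011-09-18", "2011-09-28", "2013-08-04", "2013-08-04", "2013-08-04"),
  str    = c("A", "C", "D", "D", "B", "B", "C", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Cumulative paste within one group
f <- function(x) {
  Reduce(function(...) paste(..., sep = ", "), x, accumulate = TRUE)
}

# ave() splits str by ID, applies f to each piece, and puts the results
# back in the original row order
df$str <- with(df, ave(as.character(str), ID, FUN = f))
print(df)
```

Note that the separator here is ", " (comma plus space); to match the question's desired output exactly, sep = "," could be used instead.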
Note: function(...) paste(..., sep = ", ") could also be function(x, y) paste(x, y, sep = ", "). (Thanks Pierre Lafortune)
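As a quick check of that note, the two forms can be compared on a small vector. This is only a sketch; the names vals, dots_form and xy_form are made up for illustration.

```r
vals <- c("A", "B", "B")

# Variadic form, as used in f
dots_form <- Reduce(function(...) paste(..., sep = ", "), vals, accumulate = TRUE)

# Explicit two-argument form
xy_form <- Reduce(function(x, y) paste(x, y, sep = ", "), vals, accumulate = TRUE)

same <- identical(dots_form, xy_form)
print(same)
# TRUE
```

Reduce() always calls the function with exactly two arguments (the running result and the next element), which is why both spellings behave the same here.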