Replacing the last value within groups with different values

Question

My question is similar to this post, with one difference: instead of replacing the last value within each group/id with 0, I want to replace the last value within each group/id with a different value for each group.

Here is an example (borrowed from the post linked above):

          id  Time
1         1    3
2         1    10
3         1    1
4         1    0
5         1    9999
6         2    0
7         2    9
8         2    500
9         3    0
10        3    1

In the linked post, the last value within each group/id was replaced by zero, using something like:

df %>%
  group_by(id) %>%
  mutate(Time = c(Time[-n()], 0))

The output is:

          id  Time
1         1    3
2         1    10
3         1    1
4         1    0
5         1    0
6         2    0
7         2    9
8         2    0
9         3    0
10        3    0

In my case, I would like the last value within each group/id to be replaced by a different value. Originally, the last values within the groups/ids were 9999, 500, and 1. I would like 9999 to be replaced by 5, 500 by 12, and 1 by 92. The desired output is:

          id  Time
1         1    3
2         1    10
3         1    1
4         1    0
5         1    5
6         2    0
7         2    9
8         2    12
9         3    0
10        3    92

I tried:

df %>%
  group_by(id) %>%
  mutate(Time = replace(Time, n(), c(5,12,92)))

but it did not work.

Recommended answer

Another way, using data.table, is to create a second data.table that holds the replacement value for each id, and then join and update by reference in a single step.

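library(data.table) # v1.9.5+ (for 'on = ' feature)
# lookup table: one replacement value per id (5, 12, 92 as requested in the question)
replace = data.table(id = 1:3, val = c(5L, 12L, 92L)) # from @David
# join on id, keep only the last matching row of df per id, update Time in place
setDT(df)[replace, Time := val, on = "id", mult = "last"]

#     id Time
#  1:  1    3
#  2:  1   10
#  3:  1    1
#  4:  1    0
#  5:  1    5
#  6:  2    0
#  7:  2    9
#  8:  2   12
#  9:  3    0
# 10:  3   92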




In data.table, joins are treated as an extension of subsets, so it is natural to apply to joins whatever operations we apply to subsets. Both perform an operation on a selected set of rows.
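To make the parallel between subsets and joins concrete, here is a minimal sketch on a made-up four-row table. The names dt and lookup and their values are illustrative assumptions, not taken from the question.

```r
library(data.table)

# Illustrative data (not from the question)
dt <- data.table(id = c(1L, 1L, 2L, 2L), Time = c(3, 10, 0, 9))

# Subset, then update by reference: rows where id == 2 get Time = 0
dt[id == 2L, Time := 0]

# Join, then update by reference: rows of dt whose id matches lookup$id
# take the corresponding lookup$val (all matching rows, the default mult = "all")
lookup <- data.table(id = 1L, val = 99)
dt[lookup, Time := val, on = "id"]

print(dt$Time)
# 99 99  0  0
```

In both calls the first argument selects rows and the second says what to do with them; only the way the rows are selected differs.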

For each replace$id, we find the last matching row in df$id (mult = "last") and update that row's Time with the corresponding val.
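One way to see which rows mult = "last" picks is to ask the join for row numbers with which = TRUE before doing the update. The sketch below rebuilds the question's data as a data.table; the lookup table is named repl here (a hypothetical name, chosen to avoid shadowing base R's replace), and the replacement values are the ones requested in the question.

```r
library(data.table)

# Data from the question
df <- data.table(
  id   = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L),
  Time = c(3L, 10L, 1L, 0L, 9999L, 0L, 9L, 500L, 0L, 1L)
)

# One replacement value per id (5, 12, 92 as asked in the question)
repl <- data.table(id = 1:3, val = c(5L, 12L, 92L))

# Row numbers of df that are the last match for each repl$id
last_rows <- df[repl, on = "id", mult = "last", which = TRUE]
print(last_rows)
# 5  8 10

# The same join, now updating those rows by reference
df[repl, Time := val, on = "id", mult = "last"]
print(df$Time)
# 3 10  1  0  5  0  9 12  0 92
```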

Installation instructions for v1.9.5 are here. Hope this helps.
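For readers who prefer to stay with dplyr, as in the question's attempt, the following is a hedged sketch rather than part of the answer above. The likely problem with the attempt is that replace() receives all three values for a single position in every group, so each group cannot pick its own value. The sketch looks the value up by id instead; the named vector new_last is a hypothetical helper introduced for illustration.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  id   = c(1, 1, 1, 1, 1, 2, 2, 2, 3, 3),
  Time = c(3, 10, 1, 0, 9999, 0, 9, 500, 0, 1)
)

# Hypothetical lookup: replacement value keyed by id
new_last <- c(`1` = 5, `2` = 12, `3` = 92)

out <- df %>%
  group_by(id) %>%
  # within each group, replace position n() with that group's own value
  mutate(Time = replace(Time, n(), new_last[[as.character(id[1])]])) %>%
  ungroup()

print(out$Time)
# 3 10  1  0  5  0  9 12  0 92
```

Keying the lookup by id rather than by position means the result does not depend on the order in which the groups appear.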
