How to change the last value in each group by reference, in data.table
Question
For a data.table DT grouped by site and sorted by time t, I need to change the last value of a variable in each group. I assume it should be possible to do this by reference using :=, but I haven't found a way that works yet.
Sample data:
require(data.table) # using 1.8.11
DT <- data.table(site=c(rep("A",5), rep("B",4)),t=c(1:5,1:4),a=as.double(c(11:15,21:24)))
setkey(DT, site, t)
DT
# site t a
# 1: A 1 11
# 2: A 2 12
# 3: A 3 13
# 4: A 4 14
# 5: A 5 15
# 6: B 1 21
# 7: B 2 22
# 8: B 3 23
# 9: B 4 24
The desired result is to change the last value of a in each group, for example to 999, so that the result looks like:
# site t a
# 1: A 1 11
# 2: A 2 12
# 3: A 3 13
# 4: A 4 14
# 5: A 5 999
# 6: B 1 21
# 7: B 2 22
# 8: B 3 23
# 9: B 4 999
It seems like .I and/or .N should be used, but I haven't found a form that works. Using := inside the same expression as .I[.N] gives an error. The following gives me the row numbers where the assignment is to be made:
DT[, .I[.N], by=site]
# site V1
# 1: A 5
# 2: B 9
but I don't seem to be able to use this with a := assignment. None of the following works; each attempt either raises an error or returns a null data.table:
DT[.N, a:=999, by=site]
# Null data.table (0 rows and 0 cols)
DT[, .I[.N, a:=999], by=site]
# Error in `:=`(a, 999) :
# := and `:=`(...) are defined for use in j, once only and in particular ways.
# See help(":="). Check is.data.table(DT) is TRUE.
DT[.I[.N], a:=999, by=site]
# Null data.table (0 rows and 0 cols)
Is there a way to do this by reference in data.table? Or is this better done another way in R?
Recommended answer
Currently you can use:
DT[DT[, .I[.N], by = site][['V1']], a := 999]
# or, avoiding the overhead of a second call to `[.data.table`
set(DT, i = DT[,.I[.N],by='site'][['V1']], j = 'a', value = 999L)
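As a rough end-to-end check of the row-index approach, the sketch below rebuilds the sample data from the question, collects the last row number of each group with .I[.N], and then assigns to exactly those rows in a single ungrouped := call. The name last_rows is introduced here purely for illustration and is not part of data.table.

```r
library(data.table)

# Sample data from the question
DT <- data.table(site = c(rep("A", 5), rep("B", 4)),
                 t    = c(1:5, 1:4),
                 a    = as.double(c(11:15, 21:24)))
setkey(DT, site, t)

# .I holds row numbers in DT; .I[.N] picks the last one within each group.
# `last_rows` is a hypothetical helper name used only in this sketch.
last_rows <- DT[, .I[.N], by = site][["V1"]]

# Assign by reference to just those rows (no by= is needed at this point)
DT[last_rows, a := 999]

print(DT)
```

Because the grouping is only used to find the row numbers, the assignment itself touches just one row per group.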
Alternative approaches:
Use replace
...
DT[, a := replace(a, .N, 999), by = site]
or shift the replacement to the RHS, wrapped by {}, and return the full vector
DT[, a := {a[.N] <- 999L; a}, by = site]
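To compare the two grouped forms side by side, here is a minimal sketch that applies each one to its own copy of the sample data and checks that they agree. The names DT1, DT2 and same_result are assumptions made for this illustration. Note that both forms build a full per-group vector on the RHS and assign it back, so they write every row of a rather than only the last row of each group.

```r
library(data.table)

# Sample data from the question
DT <- data.table(site = c(rep("A", 5), rep("B", 4)),
                 t    = c(1:5, 1:4),
                 a    = as.double(c(11:15, 21:24)))
setkey(DT, site, t)

# Work on copies so each variant starts from the same untouched data
DT1 <- copy(DT)
DT2 <- copy(DT)

# Variant 1: replace() returns the group's vector with position .N changed
DT1[, a := replace(a, .N, 999), by = site]

# Variant 2: modify a local copy of the group's vector, then return it
DT2[, a := {a[.N] <- 999; a}, by = site]

same_result <- isTRUE(all.equal(DT1, DT2))
print(same_result)
```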
or use mult='last' and take advantage of by-without-by. This requires the data.table to be keyed by the groups of interest.
DT[unique(site), a := 999, mult = 'last']
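The keyed-join variant can be sketched on the question's data as follows. It assumes DT is keyed with site as the first key column, so that the values passed in i are matched against site and mult = "last" keeps only the last matching row for each value. Behaviour may differ between data.table versions, so treat this as an illustration rather than a guarantee.

```r
library(data.table)

# Sample data from the question
DT <- data.table(site = c(rep("A", 5), rep("B", 4)),
                 t    = c(1:5, 1:4),
                 a    = as.double(c(11:15, 21:24)))

# The join below relies on the key; site must be its first column
setkey(DT, site, t)

# Each unique site is looked up in the key; mult = "last" selects the
# final matching row per site, and := assigns to those rows only
DT[unique(site), a := 999, mult = "last"]

print(DT)
```

Since the key is (site, t), the last matching row for each site is the one with the largest t.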
There is a feature request, #2793, for
DT[, a[.N] := 999]
but this has not yet been implemented.