Is it possible to rename a "by" grouping variable in data.table in R en passant?
Question
I have noticed that in data.table, when aggregating values using the by option, the grouping variable keeps the order in which its values first appear in the dataset, akin to SQL I believe. So if 2 precedes 1 in the data, the output lists aggregate level 2 before 1. In most cases, I don't want this. I noticed one can call sort on the by variable, but the output column is then labelled sort. Is it possible to name it by its previous value (or something completely different)? Example:
mydt <- data.table(nums=1:5, lets=letters[5:1])
mydt[, .(is2=nums==2), by=sort(lets)]
gives
sort is2
1: a F
2: b T
3: c F
4: d F
5: e F
but I would like:
lets is2
1: a F
2: b T
3: c F
4: d F
5: e F
Recommended answer
The question is titled 'Is it possible to rename a "by" grouping variable in data.table in R en passant?' but the actual problem is how to sort the result of an aggregation by the grouping variables. So, there are two questions in one.
Yes, the grouping variable can be renamed, for example:
mydt[, .(is2 = nums == 2), by = .(lets = paste(lets, toupper(lets), sep = "-"))]
lets is2
1: e-E FALSE
2: d-D TRUE
3: c-C FALSE
4: b-B FALSE
5: a-A FALSE
For illustration, the grouping expression here is a completely different function (paste() rather than sort()).
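The renaming step on its own can be sketched without any transformation of the grouping values. This is a minimal sketch, assuming the same mydt as in the question; the new column name letter is a hypothetical choice made only for this illustration.

```r
library(data.table)

mydt <- data.table(nums = 1:5, lets = letters[5:1])

# Rename while grouping: the name given inside .() becomes the column name.
# `letter` is a hypothetical name chosen for this sketch.
res1 <- mydt[, .(is2 = nums == 2), by = .(letter = lets)]

# Alternative: group first, then rename the column by reference.
res2 <- mydt[, .(is2 = nums == 2), by = lets]
setnames(res2, "lets", "letter")

names(res1)
names(res2)
```

Both results keep the groups in order of first appearance; only the label changes.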
As for sorting the result, the simplest way is to use keyby =, as already mentioned by Frank.
mydt[, .(is2 = nums == 2), keyby = lets]
lets is2
1: a FALSE
2: b FALSE
3: c FALSE
4: d TRUE
5: e FALSE
help("data.table") says
"Same as by, but with an additional setkey() run on the by columns of the result, for convenience. It is common practice to use 'keyby=' routinely when you wish the result to be sorted."
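The two parts of the answer can be combined, since keyby accepts the same named .() form as by. The following is a sketch under that assumption, again with the hypothetical column name letter; key() is used only to confirm that a key was set on the grouping column.

```r
library(data.table)

mydt <- data.table(nums = 1:5, lets = letters[5:1])

# Sort by the grouping variable and rename it in one step.
# `letter` is a hypothetical name chosen for this sketch.
res <- mydt[, .(is2 = nums == 2), keyby = .(letter = lets)]

key(res)  # the key is set on the renamed grouping column
res
```

Because the key is set on the result, later joins or subsets on letter can make use of it.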
Alternatively, the result can be ordered afterwards:
mydt[, .(is2 = nums == 2), by = lets][order(lets)]
lets is2
1: a FALSE
2: b FALSE
3: c FALSE
4: d TRUE
5: e FALSE
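One caveat about the by = sort(lets) attempt from the question seems worth spelling out: sort(lets) returns a reordered copy of the column, and by then pairs those reordered values with the rows in their original positions. The group labels therefore no longer belong to their rows, which is why the question's output flags b while the outputs above flag d. A minimal sketch comparing the two (the variable names are made up for this illustration):

```r
library(data.table)

mydt <- data.table(nums = 1:5, lets = letters[5:1])

# Grouping on sort(lets) detaches the labels from their rows:
# row 2 (nums == 2, lets == "d") is grouped under the label "b".
wrong <- mydt[, .(is2 = nums == 2), by = sort(lets)]
flagged_wrong <- wrong[[1]][wrong$is2]   # "b"

# keyby groups on the real values and sorts the result afterwards.
right <- mydt[, .(is2 = nums == 2), keyby = lets]
flagged_right <- right$lets[right$is2]   # "d"
```

So keyby, or ordering after the aggregation, is preferable here not only for the column name but also for correctness.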