Get rows of unique values by group
Question
I have a data.table and want to pick the rows where the value of a variable x is unique within each group of another variable y.
It's possible to get the unique values of x, grouped by y, as a separate dataset, like this:
dt[,unique(x),by=y]
But I want to pick the rows in the original dataset where this is the case. I don't want a new data.table that holds only y and x, because I also need the other variables.
So, what do I have to add to my code to get the rows in dt for which the above is true?
dt <- data.table(y=rep(letters[1:2],each=3),x=c(1,2,2,3,2,1),z=1:6)
y x z
1: a 1 1
2: a 2 2
3: a 2 3
4: b 3 4
5: b 2 5
6: b 1 6
What I want:
y x z
1: a 1 1
2: a 2 2
3: b 3 4
4: b 2 5
5: b 1 6
Recommended answer
data.table is a bit different in how to use duplicated. Here's the approach I've seen around here somewhere before: set a key on y and x, then keep only the rows that duplicated does not flag.
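library(data.table)
dt <- data.table(y=rep(letters[1:2],each=3),x=c(1,2,2,3,2,1),z=1:6)
setkey(dt, "y", "x")  # sorts dt by y, then x
key(dt)
# [1] "y" "x"
# data.table >= 1.9.8 compares all columns by default, so pass the key columns explicitly
!duplicated(dt, by = key(dt))
# [1] TRUE TRUE FALSE TRUE TRUE TRUE
dt[!duplicated(dt, by = key(dt))]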
# y x z
# 1: a 1 1
# 2: a 2 2
# 3: b 1 6
# 4: b 2 5
# 5: b 3 4
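Setting a key reorders dt, which is why the result above is sorted by y and x rather than matching the row order asked for in the question. A minimal sketch of a key-free variant follows, assuming the `by` argument of the data.table methods for `duplicated` and `unique` is used to name the grouping columns directly; the names `res` and `res2` are only illustrative.

```r
library(data.table)

dt <- data.table(y = rep(letters[1:2], each = 3), x = c(1, 2, 2, 3, 2, 1), z = 1:6)

# Keep the first row of every (y, x) combination. No key is set,
# so the surviving rows stay in their original order.
res <- dt[!duplicated(dt, by = c("y", "x"))]

# unique() with the same `by` argument selects the same rows.
res2 <- unique(dt, by = c("y", "x"))

print(res)
```

Either form keeps every other column (here z), and for this example the rows with z equal to 1, 2, 4, 5 and 6 remain, in the order requested in the question.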