data.table cutoff row after duplicate

Published: 2020/10/15 20:35:57 · Tags: r, data.table

Let's say I have the following dataset:

library(data.table)
dt <- data.table(x = c(1, 2, 4, 5, 2, 3, 4))

> dt
   x
1: 1
2: 2
3: 4
4: 5
5: 2
6: 3
7: 4

I would like to cut the table off after the 4th row, because the first duplicate (the number 2) occurs in the row right after it.

Expected output:

   x
1: 1
2: 2
3: 4
4: 5

Needless to say, I am not looking for dt[1:4, ,][], because the real dataset is more complicated.

I experimented with shift() and .I, but it didn't work. One idea was: dt[x %in% dt$x[1:(.I - 1)], .SD, ][].

Maybe we can use duplicated():

dt[seq_len(which(duplicated(x))[1]-1)]
#   x
#1: 1
#2: 2
#3: 4
#4: 5
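
To make the one-liner above easier to follow, here is a sketch that splits it into its intermediate steps. The variable names is_dup, first_dup and result are chosen only for illustration and are not part of the original answer.

```r
library(data.table)

dt <- data.table(x = c(1, 2, 4, 5, 2, 3, 4))

# TRUE for every value that has already appeared earlier in the column
is_dup <- duplicated(dt$x)       # FALSE FALSE FALSE FALSE TRUE FALSE TRUE

# Row number of the first repeated value
first_dup <- which(is_dup)[1]    # 5

# Keep rows 1 .. (first_dup - 1), i.e. everything before the first repeat
result <- dt[seq_len(first_dup - 1)]
```

Here result holds the four rows 1, 2, 4, 5, matching the expected output.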

Or, as @lmo suggested:

dt[seq_len(which.max(duplicated(dt))-1)]
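
One thing to watch with both variants is a column that contains no repeated value at all. In that case which(duplicated(x))[1] is NA, so seq_len() stops with an error, and which.max() on an all-FALSE vector returns 1, so the result is an empty table. If the whole table should be kept in that situation, a cumulative-sum filter is one possible alternative. The sketch below is not from the original answers. The helper name cut_before_first_dup and its col argument are hypothetical.

```r
library(data.table)

# Hypothetical helper: keep only the rows that come before the first
# repeated value in column `col`; keeps every row if nothing repeats.
cut_before_first_dup <- function(d, col = "x") {
  # cumsum() stays at 0 until the first duplicate is reached
  keep <- cumsum(duplicated(d[[col]])) == 0
  d[keep]
}

dt      <- data.table(x = c(1, 2, 4, 5, 2, 3, 4))
no_dups <- data.table(x = c(1, 2, 3))

cut_dt      <- cut_before_first_dup(dt)       # rows 1, 2, 4, 5
cut_no_dups <- cut_before_first_dup(no_dups)  # all three rows kept
```

The same filter can also be written inline as dt[cumsum(duplicated(x)) == 0].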
