Select row from data.table with min value
Question
I have a data.table, and I need to compute a new value on it and select the row with the minimum value.
library(data.table)
tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
                 item_no=c(24,25,26,27,28,29,30),
                 time_no=c(100, 110, 120, 130, 140, 160, 160),
                 key="g_id")
# g_id item_no time_no
# 1: 1 24 100
# 2: 1 25 110
# 3: 1 26 120
# 4: 2 27 130
# 5: 2 28 140
# 6: 2 29 160
# 7: 3 30 160
ts <- 118
gId <- 2
tb[.(gId), list(item_no, tdiff={z=abs(time_no - ts)})]
# g_id item_no tdiff
# 1: 2 27 12
# 2: 2 28 22
# 3: 2 29 42
Now I need to get the row (actually only the item_no of this row) with the minimal tdiff:
# g_id item_no tdiff
# 1: 2 27 12
Can I do this in one operation on tb? What is the fastest way to do it (I need to perform this operation for about 500,000 rows)?
Recommended answer
You can try .SD together with chained [][] queries.
As I understand the problem, you first add a new column and then find the row with the minimal tdiff.
library(data.table)
tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
item_no=c(24,25,26,27,28,29,30),
time_no=c(100, 110, 120, 130, 140, 160, 160),
key="g_id")
ts <- 118
# My solution is quite simple: add tdiff by reference with :=,
# then keep the row with the smallest tdiff within each g_id group
tb[, tdiff := abs(time_no - ts)][, .SD[which.min(tdiff)], by = key(tb)]
I think .SD is more appropriate here. You can also update the table by reference using :=. Here is the output:
g_id item_no time_no tdiff
1: 1 26 120 2
2: 2 27 130 12
3: 3 30 160 42
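If only the item_no for a single g_id is needed, as in the question, the lookup can probably be done in one step without adding a column to tb. The following is a minimal sketch, assuming the same tb, ts and gId as in the question; the name best_item is only illustrative.

library(data.table)

tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
                 item_no=c(24,25,26,27,28,29,30),
                 time_no=c(100, 110, 120, 130, 140, 160, 160),
                 key="g_id")
ts <- 118
gId <- 2

# Keyed subset on g_id, then take the item_no at the position
# where abs(time_no - ts) is smallest (best_item is a hypothetical name)
best_item <- tb[.(gId), item_no[which.min(abs(time_no - ts))]]
best_item
# [1] 27

# The same idea, returning the whole row instead of just item_no
tb[.(gId)][which.min(abs(time_no - ts))]

Because the subset uses the key on g_id, it avoids scanning the whole table, and tb itself is left unchanged.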
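If the 500,000 operations are separate (g_id, ts) lookups, a rolling join with roll = "nearest" may be faster than repeating the query in a loop. This is a different technique from the .SD approach above, shown only as a sketch: it assumes tb is keyed on both g_id and time_no, and the queries table with its ts column is a hypothetical input made up for illustration.

library(data.table)

tb <- data.table(g_id=c(1, 1, 1, 2, 2, 2, 3),
                 item_no=c(24,25,26,27,28,29,30),
                 time_no=c(100, 110, 120, 130, 140, 160, 160))
# The rolling join rolls on the last key column, so key on g_id, then time_no
setkey(tb, g_id, time_no)

# Hypothetical batch of lookups: one (g_id, ts) pair per row
queries <- data.table(g_id=c(1, 2, 3), ts=c(118, 118, 118))

# For each query row, match g_id exactly and take the row of tb
# whose time_no is nearest to ts
nearest <- tb[queries, roll="nearest"]
nearest$item_no
# [1] 26 27 30

Note that in the joined result the time_no column holds the queried ts values rather than the matched time_no, so only item_no is read from it here.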