How do you sample random rows within each group in a data.table?
Question
How would you use data.table to efficiently take a sample of rows within each group in a data frame?
DT = data.table(a = sample(1:2), b = sample(1:1000,20))
DT
    a   b
 1: 2 562
 2: 1 183
 3: 2 180
 4: 1 874
 5: 2 533
 6: 1  21
 7: 2  57
 8: 1  20
 9: 2  39
10: 1 948
11: 2 799
12: 1 893
13: 2 993
14: 1  69
15: 2 906
16: 1 347
17: 2 969
18: 1 130
19: 2 118
20: 1 732
I was thinking of something like: DT[ , sample(??, 3), by = a]
that would return a sample of three rows for each "a" (the order of the returned rows isn't significant):
   a   b
1: 2 180
2: 2  57
3: 2 799
4: 1  69
5: 1 347
6: 1 732
I'm new to data.table and R, so any constructive guidance would be greatly appreciated.
Recommended answer
Maybe something like this?
> DT[,.SD[sample(.N,3)],by = a]
   a   b
1: 1 744
2: 1 497
3: 1 167
4: 2 888
5: 2 950
6: 2 343
(Thanks to Josh for the correction, below.)
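The one-liner above assumes every group has at least three rows; if a group is smaller, sample() stops with an error because it cannot draw more items than exist without replacement. A minimal sketch of one way to guard against that, plus a variant that samples row numbers with .I and subsets once, which is often reported to be faster on large tables with many groups. The names n, s1, s2 and idx, the fixed seed, and the use of rep() to build column a are illustrative assumptions, not part of the original answer.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed only so the sketch is repeatable
DT <- data.table(a = rep(1:2, 10), b = sample(1:1000, 20))

n <- 3  # hypothetical name for the number of rows wanted per group

# Same idea as the answer, but capped at the group size so that
# groups with fewer than n rows return all of their rows instead of failing.
s1 <- DT[, .SD[sample(.N, min(.N, n))], by = a]

# Variant: collect sampled row numbers per group with .I,
# then subset the table a single time.
idx <- DT[, .I[sample(.N, min(.N, n))], by = a]$V1
s2 <- DT[idx]
```

Both s1 and s2 hold up to n randomly chosen rows for each value of a. The rows come back grouped by a in order of first appearance, which matches the question's note that row order is not significant.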