How do you sample random rows within each group in a data.table?


Question

How would you use data.table to efficiently take a sample of rows within each group in a data frame?

library(data.table)
DT = data.table(a = sample(1:2), b = sample(1:1000, 20))
DT
    a   b
 1: 2 562
 2: 1 183
 3: 2 180
 4: 1 874
 5: 2 533
 6: 1  21
 7: 2  57
 8: 1  20
 9: 2  39
10: 1 948
11: 2 799
12: 1 893
13: 2 993
14: 1  69
15: 2 906
16: 1 347
17: 2 969
18: 1 130
19: 2 118
20: 1 732

I was thinking of something like DT[ , sample(??, 3), by = a], which would return a sample of three rows for each value of "a" (the order of the returned rows isn't significant):

    a   b
 1: 2 180
 2: 2  57
 3: 2 799
 4: 1  69
 5: 1 347
 6: 1 732

I'm new to data.table and R, so any constructive guidance would be greatly appreciated.

Recommended answer

Maybe something like this?

> DT[,.SD[sample(.N,3)],by = a]
   a   b
1: 1 744
2: 1 497
3: 1 167
4: 2 888
5: 2 950
6: 2 343

(Thanks to Josh for the correction, below.)
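
For what it's worth, here is a minimal sketch of why the one-liner above works, plus two hedged variations. Within each `by` group, `.N` is the number of rows in that group and `.SD` is the group's subset of the data, so `sample(.N, 3)` picks three distinct row positions and `.SD[...]` returns those rows. The fixed seed, the `rep(1:2, times = 10)` grouping column, and the names `n_per_group`, `res`, `res_safe`, `idx`, and `res_idx` are my own assumptions for illustration; they are not part of the original question or answer.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed only so the sketch is reproducible

# Toy data shaped like the question's: two groups of ten rows each.
DT <- data.table(a = rep(1:2, times = 10), b = sample(1:1000, 20))

# The answer's approach: draw 3 row positions out of 1..N in each group.
res <- DT[, .SD[sample(.N, 3)], by = a]

# Guarded variant: sample(.N, 3) errors when a group has fewer than 3 rows,
# so cap the sample size at the group size.
n_per_group <- 3L  # hypothetical name for the per-group sample size
res_safe <- DT[, .SD[sample(.N, min(n_per_group, .N))], by = a]

# Alternative: collect row numbers of DT with .I, then subset DT once.
# An unnamed j expression ends up in a column called V1.
idx <- DT[, .I[sample(.N, min(n_per_group, .N))], by = a]$V1
res_idx <- DT[idx]

print(res)
print(res_idx)
```

The `.I` variant is often suggested as a faster option on tables with many groups because it avoids building `.SD` for every group, but I have not benchmarked it here, so treat that as something to measure on your own data.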
