How do you extract a few random rows from a data.table on the fly

Views: 193 | Posted: 2017/3/12 10:28:55 | r data.table sample


Question

I have a large data.table (about 24,000 rows and growing). I want to subset that data.table on a couple of criteria and, from the resulting subset (about 3,000 rows), randomly sample just 4 rows. I do not want to create a named data.table of 3,000 or so rows, count its rows, and then sample by row number. How can I do it on the fly? Or should I just accept creating the table, sampling from it, and then using rm() to get rid of it?

Let me simulate my problem:

require(data.table)
random.length  <-  sample(x = 15:30, size = 1)
data.table(city=sample(c("Cape Town", "New York", "Pittsburgh", "Tel Aviv", "Amsterdam"), size=random.length, replace = TRUE), score = sample(x=1:10, size = random.length, replace=TRUE))

That makes a table of random length, which simulates the fact that, depending on my criteria and my starting table, I do not know in advance how many rows the subsetted table will have.

Now, if I just wanted the first three rows, I could do this:

data.table(city=sample(c("Cape Town", "New York", "Pittsburgh", "Tel Aviv", "Amsterdam"), size=random.length, replace = TRUE), score = sample(x=1:10, size = random.length, replace=TRUE))[1:3]


But suppose I did not want the first three rows but rather 3 random rows. Then I would want to do something like this:

data.table(city=sample(c("Cape Town", "New York", "Pittsburgh", "Tel Aviv", "Amsterdam"), size=random.length, replace = TRUE), score = sample(x=1:10, size = random.length, replace=TRUE))[sample(x = 1:<number of rows of that previous data.table>, size = 3)]

That will not work, because the row count is only a placeholder. How do I compute, on the fly, the number of rows of the initial data.table?

Recommended answer

I have just made .N work in i. New README item:

.N is now available in i, FR#724. Thanks to newbie indirectly and Farrel directly.

So this now works:

DT[...][...][sample(.N,3)]

For example:

> random.length  <-  sample(x = 15:30, size = 1)
> data.table(city = sample(c("Cape Town", "New York", "Pittsburgh", "Tel Aviv", "Amsterdam"),size=random.length, replace = TRUE), score = sample(x=1:10, size = random.length, replace=TRUE))[sample(.N, 3)] 
         city score
1:   New York     4
2: Pittsburgh     3
3:  Cape Town     9
>
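
To tie this back to the original question, here is a minimal sketch of subsetting on a couple of criteria and then sampling from the unnamed result in one chained expression. The table `DT`, the filter conditions, and the `min(4L, .N)` guard are assumptions made for illustration, not part of the original answer. The guard only avoids an error from `sample()` when the subset happens to have fewer than 4 rows.

```r
library(data.table)

set.seed(42)  # only so that this sketch is reproducible

# Hypothetical stand-in for the large table described in the question
DT <- data.table(
  city  = sample(c("Cape Town", "New York", "Pittsburgh", "Tel Aviv", "Amsterdam"),
                 size = 24000, replace = TRUE),
  score = sample(1:10, size = 24000, replace = TRUE)
)

# First [] subsets on two (made-up) criteria; second [] samples from that
# unnamed subset. Inside i, .N is the row count of the table being indexed,
# so no intermediate named table or rm() is needed.
picked <- DT[city != "Amsterdam" & score >= 5][sample(.N, min(4L, .N))]
print(picked)
```

Because `.N` is evaluated against whatever table the bracket is applied to, the same pattern should work after any number of chained subsets.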
