From a data table, randomly select one row per group

Views: 125. Published: 2017/3/12 11:13:16. Tags: r, data.table, subset, random-sample


Problem description

I'm looking for an efficient way to select rows from a data table such that I have one representative row for each unique value in a particular column.

Let me propose a simple example:

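library(data.table)

# Eight group labels, each repeated a random number of times (2 to 10)
y <- c('a','b','c','d','e','f','g','h')
x <- sample(2:10, 8, replace = TRUE)
z <- rep(y, x)
# One-column data.table whose column is named 'z'
dt <- data.table(z = z)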

My objective is to subset the data table dt by sampling one row for each letter a-h in column z.

Recommended answer

The OP's example has only a single column. Assuming the original dataset has multiple columns, we group by 'z', sample one position from each group's sequence of rows (sample(.N, 1)), use it to pick the matching row index from .I, extract the resulting index column ($V1), and use that to subset the rows of 'dt'.

dt[dt[ , .I[sample(.N,1)] , by = z]$V1]
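The one-liner above can be unpacked into two steps. The following is only a minimal sketch: the extra column `val`, the three-group table, and the fixed seed are assumptions added for illustration and do not come from the original question.

```r
library(data.table)

set.seed(42)  # assumption: fixed seed only so the sketch is repeatable

# Hypothetical table with an extra column 'val' (not in the original question)
dt <- data.table(z = rep(c("a", "b", "c"), times = c(3, 2, 4)), val = 1:9)

# Step 1: within each group, .I holds that group's row numbers in dt and
# sample(.N, 1) picks one position inside the group; $V1 extracts the
# resulting row numbers as a plain integer vector
idx <- dt[, .I[sample(.N, 1)], by = z]$V1

# Step 2: subset dt by those row numbers, keeping every column
picked <- dt[idx]

print(picked)
```

Indexing `.I` with `sample(.N, 1)`, rather than calling `sample(.I, 1)` directly, also behaves correctly for groups with a single row, because `sample(x, 1)` with a single number `x` draws from `1:x` instead of returning `x`.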
