Extracting a random sample of rows in a data.frame with a nested conditional


Question

This question builds on the SO post found here and uses code modified from a post on the R-help mailing list, which can be seen here.

I am trying to extract a random sample of rows from a data frame, subject to a condition. I am using the R iris data, which looks like:

> head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa 

To take a simple random sample, the code below works fine and returns a sample of 2 rows.

iris[sample(nrow(iris), 2), ]

However, I am unsure how to condition on the Species field. For example, how can I take the random sample as above, but only from rows where Species != "setosa"?

iris$Species

> summary(iris$Species)
    setosa versicolor  virginica 
        50         50         50

I am unsure how to nest the condition correctly. One of my earlier attempts is below, along with its obviously incorrect result:

> iris[sample(nrow(iris)[iris$Species != "setosa"], 2), ]
     Sepal.Length Sepal.Width Petal.Length Petal.Width Species
NA             NA          NA           NA          NA    <NA>
NA.1           NA          NA           NA          NA    <NA>

Thanks

Recommended answer

I'd use which() to get the vector of row numbers that satisfy your condition, and then sample from that vector:

iris[ sample( which( iris$Species != "setosa" ) , 2 ) , ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
#59           6.6         2.9          4.6         1.3 versicolor
#133          6.4         2.8          5.6         2.2  virginica
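
For context, here is a short sketch of why the attempt in the question returns NA rows, plus two equivalent ways to write the conditional sample. The seed value and the helper name sample_rows are my own choices for illustration and are not part of the original question or answer.

```r
# nrow(iris) is the single number 150. Indexing that one-element vector
# with a length-150 logical vector yields NA for every TRUE position
# past the first element, so sample() ends up drawing from NAs.
bad <- nrow(iris)[iris$Species != "setosa"]
length(bad)      # 100
all(is.na(bad))  # TRUE

# which() returns the row numbers where the condition holds instead.
keep <- which(iris$Species != "setosa")
range(keep)      # 51 150

set.seed(42)     # arbitrary seed, only to make the draw repeatable
picked <- iris[sample(keep, 2), ]

# Alternative: subset first, then sample from the smaller data frame.
non_setosa <- iris[iris$Species != "setosa", ]
picked2 <- non_setosa[sample(nrow(non_setosa), 2), ]

# Hypothetical helper (the name is an assumption). It indexes `keep` via
# sample.int() so that a single matching row is not misread by sample()
# as "sample from 1:n".
sample_rows <- function(df, condition, n) {
  keep <- which(condition)
  df[keep[sample.int(length(keep), n)], , drop = FALSE]
}
picked3 <- sample_rows(iris, iris$Species == "virginica", 3)
```

The sample.int() form matters only in edge cases: when exactly one row matches, sample(keep, 1) would draw from 1:keep rather than return that row. For the iris example, where 100 rows match, all three approaches behave the same.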
