Random sample of rows from subset of an R dataframe
Is there a good way of getting a sample of rows from part of a dataframe?
If I just have data such as
gender <- c("F", "M", "M", "F", "F", "M", "F", "F")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
then I can easily sample the ages of three of the Fs with
sample(age[gender == "F"], 3)
and get something like
[1] 31 35 29
but if I turn this data into a dataframe
mydf <- data.frame(gender, age)
I cannot use the obvious
sample(mydf[mydf$gender == "F", ], 3)
though I can concoct something convoluted with an absurd number of brackets like
mydf[sample((1:nrow(mydf))[mydf$gender == "F"], 3), ]
and get what I want which is something like
  gender age
7      F  35
4      F  29
1      F  23
Is there a better way that takes me less time to work out how to write?
Your convoluted way is pretty much how to do it - I think all the answers will be variations on that theme.
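For context on why the obvious call fails: a data frame is a list of columns, so sample() treats the columns, not the rows, as the items to draw from. Here is a small sketch of this using the data from the question. The tryCatch() wrapper and the variable names females, result and shuffled are additions here, used only to capture and show the behaviour.

```r
# Data from the question
mydf <- data.frame(
  gender = c("F", "M", "M", "F", "F", "M", "F", "F"),
  age = c(23, 25, 27, 29, 31, 33, 35, 37)
)
females <- mydf[mydf$gender == "F", ]

length(females)  # counts columns, not rows

# Asking for 3 items out of 2 columns without replacement is an error;
# tryCatch() turns the error into its message so the script keeps running
result <- tryCatch(
  sample(females, 3),
  error = function(e) conditionMessage(e)
)
print(result)

# With a size no larger than the number of columns, sample() returns
# the columns in random order, with every row still present
shuffled <- sample(females, 2)
names(shuffled)
```

So the row numbers have to be sampled explicitly, which is what the bracket-heavy version in the question already does.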
For example, I like to generate the mydf$gender=="F" indices first:
idx <- which(mydf$gender=="F")
Then I sample from that:
mydf[ sample(idx,3), ]
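Put together with the data from the question, the two steps can be run as below. This is only a minimal sketch: the set.seed() call and the variable name picked are additions here, and the seed value is arbitrary, chosen only so that repeated runs give the same rows.

```r
# Data from the question
gender <- c("F", "M", "M", "F", "F", "M", "F", "F")
age <- c(23, 25, 27, 29, 31, 33, 35, 37)
mydf <- data.frame(gender, age)

set.seed(1)  # assumption: arbitrary seed, only for reproducibility

# Row numbers of the rows where gender is "F"
idx <- which(mydf$gender == "F")

# Draw 3 of those row numbers without replacement and keep those rows
picked <- mydf[sample(idx, 3), ]
print(picked)
```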
So in one line (although you reduce the absurd number of brackets, and possibly make your code easier to understand, by using multiple lines):
mydf[ sample( which(mydf$gender=='F'), 3 ), ]
While the "wheee I'm a hacker!" part of me prefers the one-liner, the sensible part of me says that even though the two-liner is two lines, it is much more understandable - it's just your choice.
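If this comes up often, the two-liner could be wrapped in a small function. The sketch below is one possible way to do that. The name sample_rows and its arguments are hypothetical, not something from base R or from the answer above. It indexes with sample.int() rather than calling sample(idx, n) directly, because sample() treats a single number such as 5 as "sample from 1:5", which would give wrong rows when only one row matches.

```r
# Hypothetical helper: sample n rows of df among those where keep is TRUE
sample_rows <- function(df, keep, n) {
  idx <- which(keep)  # row numbers of matching rows; NA entries are dropped
  if (n > length(idx)) {
    stop("asked for more rows than match the condition")
  }
  # sample.int() picks positions within idx, so a lone row number
  # is never misread as the upper end of a 1:x range
  df[idx[sample.int(length(idx), n)], , drop = FALSE]
}

# Data from the question
mydf <- data.frame(
  gender = c("F", "M", "M", "F", "F", "M", "F", "F"),
  age = c(23, 25, 27, 29, 31, 33, 35, 37)
)

three_f <- sample_rows(mydf, mydf$gender == "F", 3)
print(three_f)
```

The call then reads close to the original wish: which data frame, which subset, how many rows.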