Reason for unexpected output in subsetting data frame - R

Question

I have a data frame "a" with a variable called "VAL". I want to count the rows where the value of VAL is 23 or 24.

I tried two approaches, and both worked:

nrow(subset(a,VAL==23|VAL==24))
nrow(subset(a,VAL %in% c(23,24)))

However, I also tried the following code, which gives an unexpected output, and I don't know why.

nrow(subset(a,VAL ==c(23,24)))

If I swap the order of 23 and 24, it gives a different unexpected output.

nrow(subset(a,VAL ==c(24,23)))

Why are these two calls incorrect? What are they actually doing?

Recommended answer

Working through an example shows where it goes wrong:

a <- data.frame(VAL=c(1,1,1,23,24))
a
#  VAL
#1   1
#2   1
#3   1
#4  23
#5  24

These work:

a$VAL %in% c(23,24)
#[1] FALSE FALSE FALSE  TRUE  TRUE
a$VAL==23 | a$VAL==24
#[1] FALSE FALSE FALSE  TRUE  TRUE
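
Since the goal in the question is a count, the logical vector can also be summed directly. This is a minimal sketch that reuses the example data; the name n_match is only illustrative.

```r
# Same example data as in the answer.
a <- data.frame(VAL = c(1, 1, 1, 23, 24))

# TRUE counts as 1 and FALSE as 0 when summed,
# so this counts the rows where VAL is 23 or 24.
n_match <- sum(a$VAL %in% c(23, 24))
n_match
# [1] 2

# It agrees with the subset-based count from the question.
n_match == nrow(subset(a, VAL %in% c(23, 24)))
# [1] TRUE
```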

The following does not work because of vector recycling in the comparison. Note the warning message:

a$VAL ==c(23,24)
#[1] FALSE FALSE FALSE FALSE FALSE
#Warning message:
#In a$VAL == c(23, 24) :
#  longer object length is not a multiple of shorter object length

This last comparison recycles the vector you are testing against, so it is effectively comparing:

c( 1,  1,  1, 23, 24) #to
c(23, 24, 23, 24, 23)

...so no rows are returned. Changing the order gives you

c( 1,  1,  1, 23, 24) #to
c(24, 23, 24, 23, 24)

...and two rows are returned. This gives the intended result purely by luck, so it should not be relied on.
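
The recycling described above can be made explicit with rep_len(), which is a rough way to see what == is comparing against. The sketch below reuses the example data; the second data frame b is a hypothetical addition, included to show that when the longer length is a multiple of the shorter one, R recycles without any warning.

```r
# Same example data as in the answer.
a <- data.frame(VAL = c(1, 1, 1, 23, 24))

# Spell out the recycling that `==` performs implicitly.
recycled <- rep_len(c(23, 24), nrow(a))
recycled
# [1] 23 24 23 24 23

# The implicit comparison matches the explicit, element-wise one.
implicit <- suppressWarnings(a$VAL == c(23, 24))
identical(implicit, a$VAL == recycled)
# [1] TRUE

# Hypothetical six-row data: 6 is a multiple of 2, so no warning is raised,
# yet the result is still a position-by-position comparison.
b <- data.frame(VAL = c(23, 23, 24, 24, 1, 1))
sum(b$VAL == c(23, 24))     # 2: only rows 1 and 4 happen to line up
sum(b$VAL %in% c(23, 24))   # 4: the count that was actually wanted
```

Because the warning disappears whenever the lengths divide evenly, %in% (or the explicit | form) is the safer choice for membership tests.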
