Logical condition while subsetting not giving correct values
Problem description
I wanted to subset the data frame project that I was working with, using a logical condition, but I am getting a paradoxical result. The part of the condition preceding the ROLL.NO. test is irrelevant to the question. Sorry, I could not give a reproducible example; do let me know how I can make this question reproducible without having to show all 393 entries of the relevant columns in my data frame. D14 and DC31 are simple integer values, with some values being NA.
> culprits<-project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)]
> culprits
[1] 3138 3129 3129 3135 3135 3136 3120 3126 3133 3125 3125 3125 3132 3132 3123 3123 3131
> project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3131]
[1] "14/132" "14/176" "16/133" "14/111" "14/252"
> project$HOUSE.NO[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3129]
[1] "14/132" "15/162" "14/176" "16/133" "14/111"
> project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3136]
[1] 3129 3136 3120 3123 3123
> project$ROLL.NO.[(project$DC31==1&project$D14==2)|(project$DC31==2&project$D14==1)&!is.na(project$DC31)&!is.na(project$D14)&project$ROLL.NO.==3125]
[1] 3129 3120 3125 3125 3125 3123 3123
> project$ROLL.NO.[project$ROLL.NO.==3136]
[1] 3136 3136 3136 3136 3136 3136 3136 3136 3136
I tried to understand what was going wrong in my code, and I have included the results of those queries above. Since project$ROLL.NO.==3136 is FALSE for every other ROLL.NO., I fail to see why other ROLL.NO. values are returned when that test is combined with the other conditions using &. Moreover, the same three entries erroneously appear alongside whichever ROLL.NO. is requested. There are no NA values in the ROLL.NO. column, and the logical vectors in each of the conditions have the same length, so no recycling takes place. Do let me know if additional information is needed.
Appendix
project <- structure(list(ROLL.NO. = c(3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L,
3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3138L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L, 3129L,
3129L, 3129L, 3129L, 3121L, 3121L, 3121L, 3121L, 3121L, 3121L
), DC31 = c(2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 2L,
1L, 2L, 2L, 2L, 3L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L,
2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L,
1L, 2L, 2L, 2L, 2L), D14 = c(2L, 2L, 1L, 2L, 2L, 1L, 1L, 2L,
1L, 2L, 1L, 2L, 0L, 1L, 2L, 2L, 0L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L), HOUSE.NO = c("14/274",
"14/259", "14/217", "14/258", "14/306", "14/300", "14/96", "14/166",
"14/69", "14/68", "14/16", "14/93", "14/130", "14/321", "14/324",
"14/139", "14/314", "14/323", "14/208", "14/78", "14/150", "14/155",
"14/102", "14/132", "14/159", "14/163", "14/165", "14/146", "14/148",
"14/104", "14/56", "14/53", "14/99", "14/48", "15/164", "15/148",
"15/158", "15/107", "15/160", "15/162", "15/243", "15/66", "15/249",
"15/86", "14/388", "14/396", "14/431", "14/401", "14/103", "15/36"
)), .Names = c("ROLL.NO.", "DC31", "D14", "HOUSE.NO"), row.names = c(NA,
50L), class = "data.frame")
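A structure(...) call like the one above is what dput() prints, and it is one way to share a small, reproducible slice of a data frame without posting every row. The following is a minimal sketch under stated assumptions: the data frame full and its OTHER column are hypothetical stand-ins with made-up values, not the asker's real data.

```r
# Hypothetical stand-in for the full data frame; the values are made up.
full <- data.frame(
  ROLL.NO. = c(3138L, 3129L, 3121L, 3121L),
  DC31     = c(2L, 1L, NA, 2L),
  D14      = c(1L, 2L, 2L, NA),
  OTHER    = c("a", "b", "c", "d")
)

# Keep only the columns the question is about, and only the first few rows.
snippet <- head(full[, c("ROLL.NO.", "DC31", "D14")], 3)

# dput() prints R code that rebuilds `snippet`; that text can be pasted
# into the question.
dput(snippet)

# Sanity check: evaluating the deparsed code recreates an equal object.
rebuilt <- eval(parse(text = deparse(snippet)))
all.equal(rebuilt, snippet)
```

Choosing rows that actually trigger the unexpected behaviour, rather than simply the first few, usually makes such a snippet more useful.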
Recommended answer
From ?base::Logic, help('&'), help('|'), etc.:
See Syntax for the precedence of these operators: unlike many other languages (including S) the AND and OR operators do not have the same precedence (the AND operators have higher precedence than the OR operators).
This explains why
TRUE | TRUE & FALSE
# [1] TRUE
is essentially
TRUE | (TRUE & FALSE)
which is also TRUE, and is a simplification of what you are doing here:
(project$DC31==1&project$D14==2) |
(project$DC31==2&project$D14==1) &
!is.na(project$DC31) &
!is.na(project$D14) &
project$ROLL.NO. == 3131
I assume you expect the result to contain only rows where project$ROLL.NO. == 3131. Because & binds more tightly than |, however, that test (like the two !is.na() checks) applies only to the second alternative. Any row for which the first alternative, project$DC31==1&project$D14==2, is TRUE is therefore selected whatever its ROLL.NO. is, so you get some entries whose ROLL.NO. is not 3131.
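One way to get the intended behaviour is to wrap the whole OR in its own pair of parentheses, so that every later & applies to both alternatives. The sketch below assumes a hypothetical four-row data frame toy with illustrative values; it is not the asker's data, only a small case that shows the difference.

```r
# Hypothetical four-row stand-in for `project`; values are illustrative only.
toy <- data.frame(
  ROLL.NO. = c(3131L, 3129L, 3131L, 3129L),
  DC31     = c(1L, 1L, 2L, NA),
  D14      = c(2L, 2L, 1L, 1L)
)

# Original form: & binds tighter than |, so the NA checks and the ROLL.NO.
# test only restrict the second alternative.
unparenthesised <- (toy$DC31 == 1 & toy$D14 == 2) |
  (toy$DC31 == 2 & toy$D14 == 1) &
  !is.na(toy$DC31) & !is.na(toy$D14) &
  toy$ROLL.NO. == 3131

# Grouped form: the OR is evaluated first, then every filter applies to it.
mismatch <- (toy$DC31 == 1 & toy$D14 == 2) | (toy$DC31 == 2 & toy$D14 == 1)
parenthesised <- mismatch &
  !is.na(toy$DC31) & !is.na(toy$D14) &
  toy$ROLL.NO. == 3131

toy$ROLL.NO.[unparenthesised]
# [1] 3131 3129 3131
toy$ROLL.NO.[parenthesised]
# [1] 3131 3131
```

Naming the grouped condition (mismatch here, a name chosen for this sketch) also keeps the long expression readable when the same test is reused for several ROLL.NO. values.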
Also note that ! has higher precedence than the logical operators & and |.
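A short illustration of how ! groups may help here. The vector x is a hypothetical example; the grouping shown in the comments follows the operator order documented in ?Syntax, where ! sits below the comparison operators and above & and |.

```r
# `!` is applied before `&`, so this is (!TRUE) & FALSE.
a <- !TRUE & FALSE
a
# [1] FALSE

# Explicit parentheses negate the whole conjunction instead.
b <- !(TRUE & FALSE)
b
# [1] TRUE

# Hypothetical vector with a missing value.
x <- c(1L, NA, 2L)

# Parsed as (!is.na(x)) & (x == 2), which is why the !is.na() checks in
# the question need no extra parentheses of their own.
keep <- !is.na(x) & x == 2
keep
# [1] FALSE FALSE  TRUE
```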