Using any() vs | in dplyr::mutate
Question
Why should I use | vs any() when I'm comparing columns in dplyr::mutate()?
And why do they return different answers?
For example:
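library(tidyverse)
# tibble() replaces the deprecated data_frame(); TRUE/FALSE are safer than T/F
df <- tibble(
  x = rep(c(TRUE, FALSE, TRUE), 4),
  y = rep(c(TRUE, FALSE, TRUE, FALSE), 3),
  allF = FALSE,
  allT = TRUE
)
df %>%
  mutate(
    withpipe = x | y,          # returns expected results by row
    usingany = any(c(x, y))    # returns TRUE for every row
  )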
What's going on here and why should I use one way of comparing values over another?
Recommended answer
The difference between the two is how the answer is calculated:
- For |, elements are compared row-wise and Boolean logic is applied to each pair. In the example above, each x and y pair is compared and a logical value is returned for each pair, giving 12 results, one for each row of the data frame.
- any(), on the other hand, looks at the entire vector and returns a single value. In the example above, the mutate line that calculates the new usingany column is basically doing any(c(df$x, df$y)), which returns TRUE because there's at least one TRUE value in either df$x or df$y. That single value is then assigned to every row of the data frame.
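The same difference shows up outside of mutate(). The following is a minimal base R sketch that uses the x and y vectors from the question; the names elementwise, collapsed, and usingany are only illustrative, and the rep() call is an approximation of how mutate() recycles a length-1 result, not its actual implementation.

```r
# Same x and y as in the question, as plain logical vectors (no dplyr needed).
x <- rep(c(TRUE, FALSE, TRUE), 4)
y <- rep(c(TRUE, FALSE, TRUE, FALSE), 3)

# | is vectorised: it compares x[i] with y[i] and returns one value per pair.
elementwise <- x | y

# any() collapses everything it is given into a single TRUE or FALSE.
collapsed <- any(c(x, y))

length(elementwise)  # 12, one per row
length(collapsed)    # 1

# Assumption: mutate() recycles a length-1 result to fill the column,
# which is roughly what rep() does here.
usingany <- rep(collapsed, length(x))
```

Here elementwise is FALSE only where both x and y are FALSE, while usingany is TRUE in every position.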
You can see this in action using the other columns in your data frame:
df %>%
mutate(
usingany = any(c(x,y)) # returns all TRUE
, allfany = any(allF) # returns all FALSE because every value in df$allF is FALSE
)
To answer when you should use which: use | when you want to compare elements row-wise. Use any() when you want a single answer about entire columns of the data frame.
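Because any() produces one value per input, it tends to fit summarise() more naturally than mutate(). The sketch below is one possible illustration, assuming the dplyr package is installed; the output column names any_x and any_allF are made up for this example.

```r
library(dplyr)

# Same data as in the question, built with tibble().
df <- tibble(
  x = rep(c(TRUE, FALSE, TRUE), 4),
  y = rep(c(TRUE, FALSE, TRUE, FALSE), 3),
  allF = FALSE,
  allT = TRUE
)

# summarise() keeps the single value any() returns instead of
# recycling it across all 12 rows.
res <- df %>%
  summarise(
    any_x    = any(x),     # TRUE: x contains at least one TRUE
    any_allF = any(allF)   # FALSE: allF contains no TRUE
  )

res
```

The result is a one-row data frame, which makes it clear that any() is answering a question about whole columns rather than about individual rows.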
TLDR: when using dplyr::mutate(), you're usually going to want to use |.