What does the dplyr period character "." reference?


Question

What does the period . refer to in the following dplyr code?

(df <- as.data.frame(matrix(rep(1:5, 5), ncol=5)))
#    V1 V2 V3 V4 V5
#  1  1  1  1  1  1
#  2  2  2  2  2  2
#  3  3  3  3  3  3
#  4  4  4  4  4  4
#  5  5  5  5  5  5

dplyr::mutate_each(df, funs(. == 5))
#       V1    V2    V3    V4    V5
#  1 FALSE FALSE FALSE FALSE FALSE
#  2 FALSE FALSE FALSE FALSE FALSE
#  3 FALSE FALSE FALSE FALSE FALSE
#  4 FALSE FALSE FALSE FALSE FALSE
#  5  TRUE  TRUE  TRUE  TRUE  TRUE

Is this shorthand for "all columns"? Is this use of . specific to dplyr, or is it general R syntax (as discussed here)?

Also, why does the following code result in an error?

dplyr::filter(df, . == 5)
#  Error: object '.' not found

Recommended answer

The dot is used within dplyr mainly (but not exclusively) in mutate_each, summarise_each and do. In the first two (and their SE counterparts) it stands for each of the columns to which the functions in funs are applied. In do it refers to the (potentially grouped) data.frame, so you can reference a single column with .$xyz, where "xyz" is the column name.
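As a hedged illustration of the two meanings described above: mutate_each and funs are deprecated in current dplyr releases, so this sketch assumes a recent dplyr (1.0 or later) and swaps in across() with a formula-style lambda, where .x (equivalently .) stands for the column being processed. The do() call shows the other meaning, where . is the data frame of the current group. The names flagged and mean_by_cyl are illustrative only.

```r
library(dplyr)

df <- as.data.frame(matrix(rep(1:5, 5), ncol = 5))

# Column-wise use: the lambda is applied to each column in turn,
# and `.x` is that column (assumes dplyr >= 1.0 for across())
flagged <- mutate(df, across(everything(), ~ .x == 5))

# do(): here `.` is the data frame of the current group,
# so `.$mpg` is the mpg column of that group
mean_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  do(data.frame(mean_mpg = mean(.$mpg)))
```

In both cases the dot is only meaningful because the surrounding function or operator defines it; it is not a general shorthand for "all columns".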

The reason you cannot run

filter(df, . == 5)

is that a) filter is not designed to work on multiple columns at once the way mutate_each, for example, is, and b) the dot is only defined there when you use the pipe operator %>% (originally from magrittr).

However, when combined with the pipe operator %>%, you can use it inside filter with a function like rowSums:

> filter(mtcars, rowSums(. > 5) > 4)
Error: object '.' not found

> mtcars %>% filter(rowSums(. > 5) > 4) %>% head()
   mpg cyl disp  hp drat    wt  qsec vs am gear carb
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
4 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
5 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
6 14.3   8  360 245 3.21 3.570 15.84  0  0    3    4
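
To make the role of the pipe explicit, here is a minimal sketch (assuming dplyr is installed; the names piped and explicit are illustrative) comparing the piped call with an equivalent call that names the data frame directly instead of using the dot.

```r
library(dplyr)

# With the pipe, `.` is bound to mtcars, and because the dot only appears
# inside a nested call, mtcars is also passed as filter()'s first argument
piped <- mtcars %>% filter(rowSums(. > 5) > 4)

# Without the pipe there is no `.`, so the data frame must be named
explicit <- filter(mtcars, rowSums(mtcars > 5) > 4)

# Both calls should select the same rows
identical(piped, explicit)
```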

You should also take a look at the magrittr help files:

library(magrittr)
help("%>%")

From the help page:

Placing lhs elsewhere in rhs call

Often you will want lhs to the rhs call at another position than the first. For this purpose you can use the dot (.) as placeholder. For example, y %>% f(x, .) is equivalent to f(x, y) and z %>% f(x, y, arg = .) is equivalent to f(x, y, arg = z).

Using the dot for secondary purposes

Often, some attribute or property of lhs is desired in the rhs call in addition to the value of lhs itself, e.g. the number of rows or columns. It is perfectly valid to use the dot placeholder several times in the rhs call, but by design the behavior is slightly different when using it inside nested function calls. In particular, if the placeholder is only used in a nested function call, lhs will also be placed as the first argument! The reason for this is that in most use-cases this produces the most readable code. For example, iris %>% subset(1:nrow(.) %% 2 == 0) is equivalent to iris %>% subset(., 1:nrow(.) %% 2 == 0) but slightly more compact. It is possible to overrule this behavior by enclosing the rhs in braces. For example, 1:10 %>% {c(min(.), max(.))} is equivalent to c(min(1:10), max(1:10)).
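
The placeholder rules quoted above can be tried out with a short sketch. It assumes magrittr is installed; the helper f and the variable names are hypothetical and exist only for this illustration.

```r
library(magrittr)

# Hypothetical two-argument helper, used only to show argument positions
f <- function(a, b) paste(a, b)

# Dot as a direct argument: lhs goes exactly where the dot is
placed <- "y" %>% f("x", .)          # same as f("x", "y")

# Dot only inside a nested call: lhs is ALSO inserted as first argument
evens_piped <- iris %>% subset(1:nrow(.) %% 2 == 0)
evens_explicit <- subset(iris, 1:nrow(iris) %% 2 == 0)

# Braces switch off the first-argument rule
rng <- 1:10 %>% {c(min(.), max(.))}  # same as c(min(1:10), max(1:10))
```

Comparing evens_piped with evens_explicit, and rng with c(min(1:10), max(1:10)), is a quick way to check each rule.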
