How to use the %.% operator in R (EDIT: operator deprecated in 2014)


Question

EDIT: The %.% operator is now deprecated. Use %>% from magrittr instead.

ORIGINAL QUESTION: What does the %.% operator do? I've seen it used a lot with the dplyr package, but I can't seem to find any supporting documentation on what it is or how it works.

It seems to chain commands together, but that's as far as I can tell. While I'm at it, can anyone explain what the whole gamut of special operators wrapped in % signs do, and when it is technically the right time to use them to write better code?

Recommended answer

I think Hadley would be the best person to explain this to you, but I will give it a shot.

%.% is a binary operator called the chain operator. In R you can define pretty much any binary operator of your own by wrapping its name in the special character % (for example %op%). From what I have seen, such operators are mostly used to make "chainable" syntax easier (like x + y, which is much nicer than sum(x, y)). You can do really cool stuff with them; see this cool example here.
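
As a minimal sketch of what defining your own binary operator looks like, the snippet below creates a made-up operator. The name %paste% is purely hypothetical and chosen for illustration; it is not part of dplyr or any other package.

```r
# A user-defined infix operator: its name must start and end with `%`.
# `%paste%` is a hypothetical name used only for this illustration.
`%paste%` <- function(lhs, rhs) {
  paste(lhs, rhs)
}

# Infix use, just like a built-in operator such as `+`.
greeting <- "Hello" %paste% "world"
print(greeting)

# Infix operators are ordinary functions, so the prefix form works too.
same <- `%paste%`("Hello", "world")
```

Because the operator is an ordinary two-argument function, anything a function can do (including inspecting its unevaluated arguments, as %.% does) is available to it.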

What is the purpose of %.% in dplyr? To make it easier for you to express yourself, reducing the gap between what you want to do and how you express it.

Taking the example from the introduction to dplyr, let's suppose you want to group flights by year, month and day, select those variables plus the arrival and departure delays, summarise them by their means, and then keep only the days with a mean delay over 30. If there were no %.%, you would have to write it like this:

library(dplyr)
library(hflights)  # provides the hflights data set

filter(
  summarise(
    select(
      group_by(hflights, Year, Month, DayofMonth),
      Year:DayofMonth, ArrDelay, DepDelay
    ),
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ),
  arr > 30 | dep > 30
)

It does the job, but it is pretty hard to write and to read, because the steps appear inside out. Now, you can write the same thing with a friendlier syntax using the chain operator %.%:

hflights %.%
  group_by(Year, Month, DayofMonth) %.%
  select(Year:DayofMonth, ArrDelay, DepDelay) %.%
  summarise(
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ) %.%
  filter(arr > 30 | dep > 30)

Much easier to write and to read!

How does this work?

Let's take a look at the definitions. First for %.%:

function (x, y) 
{
    chain_q(list(substitute(x), substitute(y)), env = parent.frame())
}

It uses another function called chain_q. So let's look at it:

function (calls, env = parent.frame()) 
{
    if (length(calls) == 0) 
        return()
    if (length(calls) == 1) 
        return(eval(calls[[1]], env))
    e <- new.env(parent = env)
    e$`__prev` <- eval(calls[[1]], env)
    for (call in calls[-1]) {
        new_call <- as.call(c(call[[1]], quote(`__prev`), as.list(call[-1])))
        e$`__prev` <- eval(new_call, e)
    }
    e$`__prev`
}

What is this doing?

To simplify things, let's assume you called: group_by(hflights, Year, Month, DayofMonth) %.% select(Year:DayofMonth, ArrDelay, DepDelay).

Your calls x and y are then group_by(hflights, Year, Month, DayofMonth) and select(Year:DayofMonth, ArrDelay, DepDelay), respectively. The function creates a new environment called e (e <- new.env(parent = env)) and stores in it an object called __prev holding the result of evaluating the first call (e$`__prev` <- eval(calls[[1]], env)). Then, for each remaining call, it builds a new call whose first argument is the previous result, that is, __prev. In our case that would be select(`__prev`, Year:DayofMonth, ArrDelay, DepDelay). This is how it "chains" the calls inside the loop.
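
To make the mechanism concrete, here is a simplified sketch of the same idea in base R. The operator name %chain% is hypothetical (it is not the dplyr implementation), and it handles only a right-hand side written as a function call, such as head(n = 2):

```r
# Simplified illustration of the chaining idea; `%chain%` is a made-up name.
`%chain%` <- function(x, y) {
  env <- parent.frame()
  rhs <- substitute(y)          # the unevaluated call, e.g. head(n = 2)
  e <- new.env(parent = env)
  # Evaluate the left-hand side and store the result as `__prev`.
  e$`__prev` <- eval(substitute(x), env)
  # Rebuild the right-hand call with `__prev` inserted as its first argument:
  # head(n = 2) becomes head(`__prev`, n = 2).
  new_call <- as.call(c(rhs[[1]], quote(`__prev`), as.list(rhs[-1])))
  eval(new_call, e)
}

# Take the first two values, then add them up together with 10.
result <- c(1, 4, 9) %chain% head(n = 2) %chain% sum(10)
print(result)
```

The two-step chain is evaluated as sum(head(c(1, 4, 9), n = 2), 10), so each step receives the previous result as its first argument.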

Since you can apply binary operators one after another, you can actually use this syntax to express very complex manipulations in a very readable way.
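
Operators can be applied one after another because R parses %op%-style operators from left to right. The sketch below only inspects how an expression is parsed, so nothing is evaluated and the placeholder names a, b and c do not need to exist:

```r
# quote() captures the expression without evaluating it, so neither %.%
# nor the placeholder names a, b and c have to be defined.
expr <- quote(a %.% b() %.% c())

op  <- expr[[1]]  # the outermost operator: the symbol %.%
lhs <- expr[[2]]  # its left operand: the whole earlier chain, a %.% b()
rhs <- expr[[3]]  # its right operand: only the last step, c()

print(lhs)
print(rhs)
```

So a long chain is a nest of two-argument calls in which each operator receives everything to its left as x and the next step as y.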
