How to use the %.% operator in R (EDIT: operator deprecated in 2014)

Question

The %.% operator is now deprecated. Use %>% from magrittr instead.

ORIGINAL QUESTION: What does the %.% operator do? I've seen it used a lot with the dplyr package, but I can't find any supporting documentation on what it is or how it works.

It seems to chain commands together, but that's as far as I can tell. While I'm at it, can anyone explain what the whole gamut of special operators written with % signs do, and when it is technically appropriate to use them to write better code?

Recommended answer

I think Hadley would be the best person to explain this, but I'll give it a shot.

%.% is a binary operator called the chain operator. In R, you can define pretty much any binary operator of your own by wrapping its name in the special character %. From what I have seen, this is mostly used to build easier "chainable" syntaxes (like x + y, which reads much better than sum(x, y)). You can do really cool stuff with them; see this cool example here.
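As a minimal sketch of that point, here is a user-defined infix operator in base R. The name %+% and its behaviour (joining two strings with a space) are made up for illustration and are not part of dplyr:

```r
# Any function whose name is wrapped in % signs can be called in infix form,
# i.e. as `lhs %name% rhs`. The name %+% here is hypothetical.
`%+%` <- function(lhs, rhs) paste(lhs, rhs)

# Infix use
joined <- "chain" %+% "operator"
print(joined)

# The same call written in ordinary prefix form
same <- `%+%`("chain", "operator")
print(identical(joined, same))
```

The backticks are only needed when defining the operator or calling it in prefix form; in infix position you write it bare.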

What is the purpose of %.% in dplyr? To make it easier to express yourself, by reducing the gap between what you want to do and how you write it.

Taking the example from the introduction to dplyr, suppose you want to group flights by year, month and day, select those variables plus the arrival and departure delays, summarise the delays by their mean, and then keep only the rows where either mean delay is over 30. Without %.%, you would have to write it like this:

filter(
  summarise(
    select(
      group_by(hflights, Year, Month, DayofMonth),
      Year:DayofMonth, ArrDelay, DepDelay
    ),
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ),
  arr > 30 | dep > 30
)

It does the job, but it is hard to write and hard to read: the operations appear inside-out, and each function's arguments end up far from its name. With the chain operator %.%, you can write the same thing in a friendlier syntax:

hflights %.%
  group_by(Year, Month, DayofMonth) %.%
  select(Year:DayofMonth, ArrDelay, DepDelay) %.%
  summarise(
    arr = mean(ArrDelay, na.rm = TRUE),
    dep = mean(DepDelay, na.rm = TRUE)
  ) %.%
  filter(arr > 30 | dep > 30)

Easier to write and to read!

How does this work?

Let's take a look at the definitions. First, for %.%:

function (x, y) 
{
    chain_q(list(substitute(x), substitute(y)), env = parent.frame())
}
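The definition above relies on substitute(), which returns the expression supplied as an argument rather than its value. A small sketch of that behaviour, using a hypothetical helper named capture_both that is not part of dplyr:

```r
# Hypothetical helper, for illustration only: returns its two arguments
# as unevaluated expressions, the way the body of %.% does before chaining.
capture_both <- function(x, y) {
  list(substitute(x), substitute(y))
}

captured <- capture_both(1 + 2, sqrt(16))

# The first element is the call `1 + 2` itself, not the number 3
print(class(captured[[1]]))

# The second element can be turned back into source text
print(deparse(captured[[2]]))

# Evaluation only happens when explicitly requested
print(eval(captured[[1]]))
```

Because the calls arrive unevaluated, the chaining function is free to rewrite them before running them.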

It uses another function called chain_q, so let's look at that:

function (calls, env = parent.frame()) 
{
    if (length(calls) == 0) 
        return()
    if (length(calls) == 1) 
        return(eval(calls[[1]], env))
    e <- new.env(parent = env)
    e$`__prev` <- eval(calls[[1]], env)
    for (call in calls[-1]) {
        new_call <- as.call(c(call[[1]], quote(`__prev`), as.list(call[-1])))
        e$`__prev` <- eval(new_call, e)
    }
    e$`__prev`
}

What does it do?

To simplify things, let's assume you called: group_by(hflights, Year, Month, DayofMonth) %.% select(Year:DayofMonth, ArrDelay, DepDelay).

Your calls x and y are then group_by(hflights, Year, Month, DayofMonth) and select(Year:DayofMonth, ArrDelay, DepDelay), respectively. The function creates a new environment called e (e <- new.env(parent = env)) and stores in it an object called __prev holding the result of evaluating the first call (e$`__prev` <- eval(calls[[1]], env)). Then, for each remaining call, it builds a new call whose first argument is the result of the previous call, that is __prev. In our case that would be select(`__prev`, Year:DayofMonth, ArrDelay, DepDelay). This is how it "chains" the calls inside the loop.
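To see the mechanism in isolation, here is a base-R sketch that mirrors the two definitions quoted above, without needing dplyr or hflights. The names chain_sketch and %then% are hypothetical, chosen to avoid clashing with anything in dplyr, and the edge-case checks from chain_q are left out:

```r
# Simplified version of the chaining function: evaluates the first call,
# then feeds each result into the next call as its first argument.
chain_sketch <- function(calls, env = parent.frame()) {
  e <- new.env(parent = env)
  e$`__prev` <- eval(calls[[1]], env)
  for (call in calls[-1]) {
    # Rebuild f(a, b) as f(`__prev`, a, b)
    new_call <- as.call(c(call[[1]], quote(`__prev`), as.list(call[-1])))
    e$`__prev` <- eval(new_call, e)
  }
  e$`__prev`
}

# Hypothetical infix wrapper: captures both sides unevaluated
`%then%` <- function(x, y) {
  chain_sketch(list(substitute(x), substitute(y)), env = parent.frame())
}

# head(3) becomes head(`__prev`, 3); paste(collapse = "-") becomes
# paste(`__prev`, collapse = "-")
result <- letters %then% head(3) %then% paste(collapse = "-")
print(result)
```

Longer chains work because the operator is left-associative: the left-hand side of each %then% is itself a chain that gets evaluated first.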

Since binary operators can be applied one after another, you can use this syntax to express very complex manipulations in a very readable way.
