Dplyr join on by=(a = b), where a and b are variables containing strings?


Problem description

I am trying to perform an inner join of two tables using dplyr, and I think I'm getting tripped up by non-standard evaluation rules. When using the by = c("a" = "b") argument, everything works as expected when "a" and "b" are actual strings. Here's a toy example that works:

library(dplyr)
data(iris)

inner_join(iris, iris, by=c("Sepal.Length" = "Sepal.Width"))

But let's say I want to put inner_join inside a function:

library(dplyr)
data(iris)

myfn <- function(xname, yname) {
    data(iris)
    inner_join(iris, iris, by=c(xname = yname))
}

myfn("Sepal.Length", "Sepal.Width")

This returns the following error:

Error: cannot join on columns 'xname' x 'Sepal.Width': index out of bounds

I suspect there is some fancy expression, deparsing, quoting, or unquoting that I could do to make this work, but I'm a bit murky on those details.

Solution

You can use

myfn <- function(xname, yname) {
    data(iris)
    inner_join(iris, iris, by=setNames(yname, xname))
}
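As a rough usage sketch of the same idea, here is a variant that takes the data frames as arguments. The names join_on, left and right are hypothetical and only serve the illustration; the only calls used are dplyr's inner_join() and base R's setNames().

```r
library(dplyr)

# Hypothetical generalisation of myfn(): the tables are passed in
# instead of being hard-coded to iris.
join_on <- function(x, y, xname, yname) {
  # setNames(yname, xname) builds c("<value of xname>" = "<value of yname>")
  inner_join(x, y, by = setNames(yname, xname))
}

# Small made-up tables, for illustration only
left  <- data.frame(id  = c(1, 2, 3), a = c("x", "y", "z"))
right <- data.frame(key = c(2, 3, 4), b = c("p", "q", "r"))

# Join left$id against right$key; the column names arrive as strings
res <- join_on(left, right, "id", "key")
res
```

Only the rows whose id matches a key (2 and 3) should survive, and the join column keeps its name from the left table.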

The suggested syntax in the ?inner_join documentation of

by = c("a"="b")   # same as by = c(a="b")

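is slightly misleading because the two sides of the equals sign are not treated the same way. You're actually creating a named character vector: the right-hand side is an ordinary value and is evaluated, while the left-hand side is taken literally as the element's name (quoted or not), so a variable placed there is never looked up. That is why c(xname = yname) ends up asking for a column called "xname". You can use setNames() to set the names of the vector dynamically.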
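The difference can be checked in base R without dplyr at all. This is a small sketch using the same variable names as the question; literal and dynamic are just illustrative names.

```r
xname <- "Sepal.Length"
yname <- "Sepal.Width"

# c() uses the text to the left of `=` as the name; the variable
# xname is never evaluated here
literal <- c(xname = yname)
names(literal)    # "xname"

# setNames() is an ordinary function, so both arguments are evaluated
dynamic <- setNames(yname, xname)
names(dynamic)    # "Sepal.Length"

# The dynamic version matches the hand-written named vector
identical(dynamic, c("Sepal.Length" = "Sepal.Width"))    # TRUE
```

Passing literal as by tells inner_join to look for a column named xname, which is where the error in the question comes from.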
