When to use 'with' function and why is it good?

Question

What are the benefits of using with()? In the help file it mentions it evaluates the expression in an environment it creates from the data. What are the benefits of this? Is it faster to create an environment and evaluate it in there as opposed to just evaluating it in the global environment? Or is there something else I'm missing?

Recommended answer

with is a wrapper for functions with no data argument

There are many functions that work on data frames and take a data argument so that you don't need to retype the name of the data frame every time you reference a column. lm, plot.formula, subset, and transform are just a few examples.

with is a general purpose wrapper to let you use any function as if it had a data argument.

Using the mtcars data set, we could fit a model with or without using the data argument:

# this is obviously annoying
mod = lm(mtcars$mpg ~ mtcars$cyl + mtcars$disp + mtcars$wt)

# this is nicer
mod = lm(mpg ~ cyl + disp + wt, data = mtcars)

However, if (for some strange reason) we wanted to find the mean of cyl + disp + wt, there is a problem because mean doesn't have a data argument like lm does. This is the issue that with addresses:

# without with(), we would be stuck here:
z = mean(mtcars$cyl + mtcars$disp + mtcars$wt)

# using with(), we can clean this up:
z = with(mtcars, mean(cyl + disp + wt))

Wrapping foo() in with(data, foo(...)) lets us use any function foo as if it had a data argument - which is to say we can use unquoted column names, preventing repetitive data_name$column_name or data_name[, "column_name"].

Use with whenever you like interactively (R console) and in R scripts to save typing and make your code clearer. The more frequently you would need to re-type your data frame name for a single command (and the longer your data frame name is!), the greater the benefit of using with.

Also note that with isn't limited to data frames. From ?with:

For the default with method this may be an environment, a list, a data frame, or an integer as in sys.call.

I don't often work with environments, but when I do I find with very handy.
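
As a small sketch of the non-data-frame case (the object names and values here are made up for illustration), a plain list or an environment can be handed to with in the same way as a data frame:

```r
# A plain list: its elements are visible by name inside with()
settings <- list(rate = 0.05, years = 10)   # hypothetical values
growth <- with(settings, (1 + rate)^years)

# An environment works the same way
env <- new.env()
assign("a", 2, envir = env)
assign("b", 3, envir = env)
product <- with(env, a * b)
product
# → 6
```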

As @Rich Scriven suggests in comments, with can be very useful when you need to use the results of something like rle. If you only need the results once, then his example with(rle(data), lengths[values > 1]) lets you use the rle(data) results anonymously.
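
To make the rle case concrete, here is a minimal sketch with made-up data (the vector flips is hypothetical). rle() returns a list with components lengths and values, and with lets us refer to both without storing the intermediate result:

```r
flips <- c("H", "H", "T", "H", "H", "H", "T", "T")  # made-up data

# Runs are H x2, T x1, H x3, T x2; pick the longest run of "H"
longest_heads <- with(rle(flips), max(lengths[values == "H"]))
longest_heads
# → 3
```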

Many functions that have a data argument use it for more than just easier syntax when you call it. Most modeling functions (like lm), and many others too (ggplot!) do a lot with the provided data. If you use with instead of a data argument, you'll limit the features available to you. If there is a data argument, use the data argument, not with.
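
One visible difference, sketched below with lm and mtcars: the coefficients come out the same either way, but only the model fitted with the data argument records where its data came from. This is just one illustration I can show with certainty, not necessarily the specific feature you would miss in a given workflow.

```r
mod_data <- lm(mpg ~ wt, data = mtcars)
mod_with <- with(mtcars, lm(mpg ~ wt))

# Same fitted coefficients either way
all.equal(coef(mod_data), coef(mod_with))

# But the stored calls differ: the second has no record of mtcars
mod_data$call   # lm(formula = mpg ~ wt, data = mtcars)
mod_with$call   # lm(formula = mpg ~ wt)
```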

In my example above, the result was assigned to the global environment (z = with(...)). To make an assignment inside the list/environment/data, you can use within. (In the case of data.frames, transform is also good.)
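
A short sketch of the difference, using mtcars (the new column name wt_lb and the copies cars2 and cars3 are just illustrative names):

```r
# within() returns a modified copy of the data with the new column added
cars2 <- within(mtcars, {
  wt_lb <- wt * 1000      # wt is recorded in 1000s of lbs
})

# transform() does the same for data frames
cars3 <- transform(mtcars, wt_lb = wt * 1000)

# with() only returns the computed value; mtcars itself is untouched
wt_lb <- with(mtcars, wt * 1000)
```

Note that within and transform return a new object, so you still need to assign the result if you want to keep it.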

Don't use with in R packages. There is a warning in help(subset) that could apply just about as well to with:

Warning This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.

If you build an R package using with, when you check it you will probably get warnings or notes about using variables without a visible binding. This will make the package unacceptable to CRAN.
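
For package code, a sketch of the standard-subsetting alternative might look like the following. The helper mean_of_sum is hypothetical, written only to mirror the mean(cyl + disp + wt) example; columns are passed as strings, so no bare column symbols appear in the function body.

```r
# Hypothetical helper: columns are selected by name with [ ],
# so the function body contains no unbound symbols such as `cyl`.
mean_of_sum <- function(df, cols) {
  mean(rowSums(df[cols]))
}

mean_of_sum(mtcars, c("cyl", "disp", "wt"))
```

Interactively, with(mtcars, mean(cyl + disp + wt)) gives the same number with less typing, which is the trade-off this section describes.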

Many (mostly dated) R tutorials use attach to avoid re-typing data frame names by making columns accessible to the global environment. attach is widely considered to be bad practice and should be avoided. One of the main dangers of attach is that data columns can become out of sync if they are modified individually. with avoids this pitfall because it is invoked one expression at a time. There are many, many questions on Stack Overflow where new users are following an old tutorial and run into problems because of attach. The easy solution is always: don't use attach.
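
Here is a minimal sketch of one way things get out of sync, using a made-up data frame called scores. It assumes a fresh session with no other object named x in the workspace.

```r
scores <- data.frame(x = 1:3)   # hypothetical toy data
attach(scores)                  # puts a copy of the columns on the search path
scores$x <- scores$x * 10       # modify the data frame afterwards

attached_x <- x                 # still the old attached copy: 1 2 3
current_x <- with(scores, x)    # reads the data frame as it is now: 10 20 30

detach(scores)                  # clean up the search path
```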

If you are doing many steps of data manipulation, you may find yourself beginning every line of code with with(my_data, .... You might think this repetition is almost as bad as not using with. Both the data.table and dplyr packages offer efficient data manipulation with non-repetitive syntax. I'd encourage you to learn to use one of them. Both have excellent documentation.
