Why doesn't subset mind a missing subset argument for data frames?


Question

Normally I wonder where mysterious errors come from but now my question is where a mysterious lack of error comes from.

numbers <- c(1, 2, 3)
frame <- as.data.frame(numbers)

If I type

subset(numbers, )

(so I want to take some subset but forget to specify the subset-argument of the subset function) then R reminds me (as it should):


Error in subset.default(numbers, ) :
argument "subset" is missing, with no default

But when I type

subset(frame,)

(so the same thing with a data.frame instead of a vector), it doesn't give an error but instead just returns the (full) dataframe.

What is going on here? Why don't I get my well deserved error message?

Recommended answer

tl;dr: The subset function calls different functions (has different methods) depending on the type of object it is fed. In the example above, subset(numbers, ) uses subset.default while subset(frame, ) uses subset.data.frame.

R has a couple of object-oriented systems built in. The simplest and most common is called S3. This OO programming style implements what Wickham calls a "generic-function OO." Under this style of OO, a function called a generic function looks at the class of an object and then applies the proper method to the object. If no method exists for that class, the generic falls back to the default method, when one is defined.
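As a rough illustration of generic-function dispatch, the sketch below defines a toy generic. The names describe, describe.data.frame, and describe.default are hypothetical and exist only for this example; they are not part of base R.

```r
# Hypothetical generic `describe`, for illustration only (not a base R function).
describe <- function(x, ...) UseMethod("describe")

# Method chosen for objects whose class vector contains "data.frame".
describe.data.frame <- function(x, ...) "data.frame method"

# Fallback chosen when no class-specific method is found.
describe.default <- function(x, ...) "default method"

res_frame  <- describe(data.frame(a = 1))  # dispatches to describe.data.frame
res_vector <- describe(c(1, 2, 3))         # no numeric method, so describe.default

print(res_frame)
print(res_vector)
```

The same pattern is at work in subset: the generic only picks a method, and each method decides for itself how to treat its arguments.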

To get a better idea of how S3 and the other OO systems work, you might check out the relevant portion of the Advanced R site. The procedure of finding the proper method for an object is referred to as method dispatch. You can read more about this in the help file ?UseMethod.

As noted in the Details section of ?subset, the subset function "is a generic function." This means that subset examines the class of the object in the first argument and then uses method dispatch to apply the appropriate method to the object.
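One way to check which method each object from the question would reach is to look at its class and ask R whether a matching method exists. This is a minimal sketch using utils::getS3method; the variable names has_numeric_method and has_df_method are introduced here purely for illustration.

```r
numbers <- c(1, 2, 3)
frame <- as.data.frame(numbers)

print(class(numbers))  # "numeric"
print(class(frame))    # "data.frame"

# getS3method() looks up the method for a generic/class pair.
# With optional = TRUE it returns NULL instead of raising an error
# when no such method exists.
has_numeric_method <- !is.null(utils::getS3method("subset", "numeric", optional = TRUE))
has_df_method <- is.function(utils::getS3method("subset", "data.frame"))

print(has_numeric_method)  # FALSE: vectors fall through to subset.default
print(has_df_method)       # TRUE
```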

Methods for generic functions are named as

<generic function name>.<class name>

and can be found using methods(<generic function name>). For subset, we get

methods(subset)
[1] subset.data.frame subset.default    subset.matrix    
see '?methods' for accessing help and source code

which indicates that if the object has a data.frame class, then subset calls the subset.data.frame method (function). It is defined as below:

subset.data.frame
function (x, subset, select, drop = FALSE, ...) 
{
    r <- if (missing(subset)) 
        rep_len(TRUE, nrow(x))
    else {
        e <- substitute(subset)
        r <- eval(e, x, parent.frame())
        if (!is.logical(r)) 
            stop("'subset' must be logical")
        r & !is.na(r)
    }
    vars <- if (missing(select)) 
        TRUE
    else {
        nl <- as.list(seq_along(x))
        names(nl) <- names(x)
        eval(substitute(select), nl, parent.frame())
    }
    x[r, vars, drop = drop]
}

Note that if the subset argument is missing, the first lines

    r <- if (missing(subset)) 
        rep_len(TRUE, nrow(x))

produce a vector of TRUE values with one element per row of the data.frame, and the last line

    x[r, vars, drop = drop]

feeds this vector into the row argument which means that if you did not include a subset argument, then the subset function will return all of the rows of the data.frame.
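The effect of the missing() check can be reproduced in isolation. The helper row_selector below is a hypothetical function that mirrors only the row-selection branch of subset.data.frame; it is a sketch for illustration, not the actual implementation.

```r
# Hypothetical helper mirroring only the row-selection logic of subset.data.frame.
row_selector <- function(x, subset) {
  if (missing(subset)) {
    rep_len(TRUE, nrow(x))  # no condition supplied: keep every row
  } else {
    r <- eval(substitute(subset), x, parent.frame())
    r & !is.na(r)           # treat NA as "do not keep"
  }
}

frame <- data.frame(numbers = c(1, 2, 3))

all_rows  <- row_selector(frame, )             # TRUE TRUE TRUE
some_rows <- row_selector(frame, numbers > 1)  # FALSE TRUE TRUE

# subset() with an empty subset argument keeps all rows of the data frame.
full <- subset(frame, )
print(nrow(full) == nrow(frame))  # TRUE
```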

As we can see from the output of the methods call, subset does not have methods for atomic vectors. This means, as your error

Error in subset.default(numbers, )

indicates, that when you apply subset to a vector, R calls the subset.default method, which is defined as

subset.default
function (x, subset, ...) 
{
    if (!is.logical(subset)) 
        stop("'subset' must be logical")
    x[subset & !is.na(subset)]
}

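The subset.default function has no missing() check, so when the subset argument is absent, evaluating is.logical(subset) raises the error shown in the question.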
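To inspect the vector case without halting a script, the call can be wrapped in tryCatch. This is a minimal sketch; the variable names msg and kept are introduced here for illustration, and the exact wording of the message may depend on the R session's language settings.

```r
numbers <- c(1, 2, 3)

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  subset(numbers, ),
  error = function(e) conditionMessage(e)
)
print(msg)  # the "missing, with no default" message from the question

# Supplying a logical condition works as expected.
kept <- subset(numbers, numbers > 1)
print(kept)  # 2 3
```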
