Return a dataframe from a function and store it in the workspace

Question

This is my first week working with R, and there is one thing about functions I cannot seem to manage.

df <- data.frame(a = c(1:10),
             b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

testF = function(select) {
dum = subset(df, b == select)
}

lapply(unique(df$b), testF)

This function now just prints the data sets on screen, but I would like to store the results as separate data frames in my workspace. In this example that would give three data frames: a, b and c.

Thanks for your help.

Recommended answer

Roland has the correct solution for the specific problem: nothing more than split() is needed. Just to make sure: split() returns a list. To get separate data frames in your workspace, you do:

list2env(split(df,df$b),.GlobalEnv)

Or, using assign():

tmp <- split(df,df$b)
for(i in names(tmp)) assign(i,tmp[[i]])
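
One caveat, sketched here rather than taken from the original answer: the loop above works at top level because assign() writes into the current environment by default. Inside a function, the current environment is the function's own, so the target has to be named explicitly. The helper name split_to_env is hypothetical, and a fresh environment stands in for .GlobalEnv to keep the example free of side effects.

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: split `data` by the column named `by` and store
# each piece under its group name in the environment `envir`.
split_to_env <- function(data, by, envir) {
  pieces <- split(data, data[[by]])
  for (nm in names(pieces)) {
    # Without envir =, the objects would be created inside this function
    # and vanish when it returns.
    assign(nm, pieces[[nm]], envir = envir)
  }
  invisible(names(pieces))
}

target <- new.env()
created <- split_to_env(df, "b", target)
ls(target)
```

Passing .GlobalEnv as envir should have the same effect as the list2env() call above.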

A word on subset()

That said, here is some more detail on why your function is plain wrong. First of all, in ?subset you read:

Warning

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.

Translates to: Never ever in your life use subset() within a function again.
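
To illustrate the kind of surprise the warning refers to, here is a minimal sketch. The function names with_subset and with_bracket are made up for this example, and the argument is deliberately given the same name as a column.

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"),
                 stringsAsFactors = FALSE)

# Hypothetical function whose argument shares its name with a column.
with_subset <- function(b) {
  # Non-standard evaluation: both `b`s are looked up in df first,
  # so the condition compares the column with itself and is always TRUE.
  subset(df, b == b)
}

with_bracket <- function(b) {
  # df$b is unambiguously the column; the bare b is the argument.
  df[df$b == b, ]
}

nrow(with_subset("a"))   # 10: every row is kept
nrow(with_bracket("a"))  # 3: only the rows where b is "a"
```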

A word on returning values from a function

Beyond that, a function always returns a result:

  • if a return() statement is used, it returns whatever is given as the argument to return().
  • otherwise it returns the value of the last expression evaluated.

In your case, the last line contains an assignment. Now an assignment also returns a value, but you don't see it. It's returned invisibly. You can see it by wrapping it in parentheses, for example:

> x <- 10
> (x <- 20)
[1] 20

The assignment in your function is unnecessary, but its invisible return value explains what you see: the function works inside lapply() (which collects return values whether or not they are visible), yet gives you no visible output when called at the command line. You can capture the value, though:

> testF("b")
> x <- testF("b")
> x
    a b
3   3 b
6   6 b
10 10 b
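
The visibility of a return value can also be inspected directly with withVisible() from base R. The two functions below are hypothetical and exist only to contrast an assignment with a plain value as the last expression.

```r
f_assign <- function() {
  y <- 42   # last expression is an assignment: returned invisibly
}

f_value <- function() {
  y <- 42
  y         # last expression is the value itself: returned visibly
}

withVisible(f_assign())$visible  # FALSE
withVisible(f_value())$visible   # TRUE
withVisible(f_assign())$value    # 42, the value is there either way
```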

The assignment in your function doesn't make sense: either return dum explicitly, or drop the assignment altogether.

Correcting your function

So, given that this is just an example and the real problem wouldn't be solved by simply using split(), your function would be:

testF <- function(select) {
    # == compares; a single = here is a syntax error
    dum <- df[df$b == select, ]
    return(dum)
}

Or simply:

testF <- function(select) {
    df[df$b == select, ]
}
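
As a sketch of how the corrected function could be combined with the lapply() call from the question: the results are named after their groups, so the list can then be handed to list2env() if separate objects are really wanted. The variable names groups and pieces are introduced here for illustration only.

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"),
                 stringsAsFactors = FALSE)

testF <- function(select) {
  df[df$b == select, ]
}

groups <- unique(df$b)            # "a" "b" "c", in order of first appearance
pieces <- lapply(groups, testF)   # one data frame per group
names(pieces) <- groups           # name each element after its group

pieces$b

# To create a, b and c as separate objects in the workspace:
# list2env(pieces, envir = .GlobalEnv)
```

Keeping the data frames together in a named list is often easier to work with afterwards than scattering them across the workspace.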
