Return a dataframe from a function and store it in the workspace


Problem description


This is my first week working with R, and there is one thing about functions I cannot seem to manage.

df <- data.frame(a = c(1:10),
             b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

testF = function(select) {
    dum = subset(df, b == select)
}

lapply(unique(df$b), testF)

This function currently just prints the data sets on screen, but I would like to store the results as separate data frames in my workspace. In this example that would give three data frames: a, b and c.

Thanks for the help.

Solution

Roland has the correct solution for the specific problem: nothing more than split() is needed. Just to make sure: split() returns a list. To get separate data frames in your workspace, you do:

list2env(split(df, df$b), .GlobalEnv)

Or, using assign:

tmp <- split(df, df$b)
for (i in names(tmp)) assign(i, tmp[[i]])
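To make the list-to-workspace step concrete, here is a minimal sketch. It rebuilds the question's df and, instead of writing straight into .GlobalEnv, unpacks the pieces into a separate environment. The names pieces and scratch are hypothetical and used only for illustration.

```r
# The question's example data
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

# split() returns a named list with one data frame per value of b
pieces <- split(df, df$b)
names(pieces)                      # "a" "b" "c"

# 'scratch' is a hypothetical environment, used here so the example
# does not overwrite anything in the global workspace
scratch <- new.env()
invisible(list2env(pieces, envir = scratch))

sort(ls(scratch))                  # "a" "b" "c"
nrow(get("b", envir = scratch))    # 3
```

Passing .GlobalEnv as the envir argument instead of scratch would give the behaviour asked for in the question.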


A word on subset

That said, here is some more detail on why your function is plain wrong. First of all, in ?subset you read:

Warning

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.

Translates to: Never ever in your life use subset() within a function again.
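One way the non-standard evaluation can bite is sketched below. The functions pick and pick_std are hypothetical, and the clash between the argument name and the column name is an assumption chosen to trigger the problem.

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

# Hypothetical function whose argument shares its name with a column.
# Inside subset(), both sides of b == b resolve to the column df$b,
# so the condition is TRUE for every row.
pick <- function(b) subset(df, b == b)
nrow(pick("a"))        # 10, not the 3 rows you might expect

# Standard subsetting with [ keeps the column and the argument apart
pick_std <- function(b) df[df$b == b, ]
nrow(pick_std("a"))    # 3
```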


A word on returning values from a function

Besides that, a function always returns a result:

  • if a return() statement is used, it returns whatever is given as an argument to return().
  • otherwise it returns the value of the last expression evaluated.
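The two cases in the list above can be sketched with a pair of hypothetical functions, with_return and last_expr, which are not part of the question:

```r
# Explicit return(): the argument of return() is the result
with_return <- function(x) {
  y <- x * 2
  return(y)
}

# No return(): the value of the last evaluated expression is the result
last_expr <- function(x) {
  x * 2
}

with_return(5)                             # 10
last_expr(5)                               # 10
identical(with_return(5), last_expr(5))    # TRUE
```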

In your case, the last line contains an assignment. Now an assignment also returns a value, but you don't see it. It's returned invisibly. You can see it by wrapping it in parentheses, for example:

> x <- 10
> (x <- 20)
[1] 20

This is absolutely unnecessary. The invisible return value is the reason why your function works when used in lapply() (lapply collects the return values whether or not they are visible), but won't give you any visible output when called at the command line. You can capture the value, though:

> testF("b")
> x <- testF("b")
> x
    a b
3   3 b
6   6 b
10 10 b

The assignment in your function doesn't make sense: either you return dum explicitly, or you just drop the assignment altogether.
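If you want to check the visibility yourself, base R's withVisible() reports it. The sketch below compares the question's testF with a hypothetical variant, fixed, that drops the assignment:

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

# The function from the question: the last expression is an assignment
testF = function(select) {
  dum = subset(df, b == select)
}
withVisible(testF("b"))$visible    # FALSE: the value is there but not printed

# Hypothetical variant without the assignment
fixed <- function(select) df[df$b == select, ]
withVisible(fixed("b"))$visible    # TRUE: the value prints at the console
```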


Correcting your function

So, given that this is just an example and the real problem wouldn't be solved by simply using split(), your function would be:

testF <- function(select) {
    # compare with ==; a single = inside [ ] is not a comparison
    dum <- df[df$b == select, ]
    return(dum)
}

or simply:

testF <- function(select) {
    df[df$b == select, ]
}
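As a rough usage sketch, the corrected function can be combined with lapply() to build a named list of data frames. The names groups and res are hypothetical, and as.character() is only there so the code behaves the same if b happens to be stored as a factor.

```r
df <- data.frame(a = 1:10,
                 b = c("a", "a", "b", "c", "c", "b", "a", "c", "c", "b"))

testF <- function(select) {
  df[df$b == select, ]
}

# One call per distinct value of b, in order of first appearance
groups <- as.character(unique(df$b))
res <- setNames(lapply(groups, testF), groups)

names(res)     # "a" "b" "c"
res[["b"]]     # rows 3, 6 and 10 of df
```

A named list like res can then be handed to list2env() in the same way as the result of split().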
