Return a data frame from function


Problem description

I have the following code inside a function

Myfunc<- function(directory, MyFiles, id = 1:332) {
# uncomment the 3 lines below for testing
#directory<-"local"
#id=c(2, 4)
#MyFiles<-c(f2.csv,f4.csv)
idd<-id

df2 <- data.frame()

for(i in 1:length(idd)) {
  EmptyVector <- read.csv(MyFiles[i])  
  comp_cases[i]<-sum(complete.cases(EmptyVector))
  print(comp_cases[[i]])
  id=idd[i]
  ret2=comp_cases[[i]]
  df2<-rbind(df2,data.frame(id,ret2))
 }
print(df2)
return(df2)
}

This works when I run it in R by selecting the code inside the function and commenting out the return. The print statement gives me a nice data frame like this:

> df2
 id ret2
1 2  994
2 4  7112

However, when I try to return the data frame df2 from the function, it only returns the 1st row and ignores all other values. My problem is that the code works inside the function for the various values I have tried (opening multiple files in various combinations), but not when I try to return the data frame. Can someone help, please? Thanks a lot in advance.

Solution

If I understand you correctly, you are trying to create a data frame with the number of complete cases for each id. Supposing your files are named with the id numbers as you specified (e.g. f2.csv), you can simplify your function as follows:

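myfunc <- function(directory, id = 1:332) {
  # empty vector that collects one count per id
  y <- vector()
  for(i in 1:length(id)){
    # the ids become the first column of the result
    x <- id
    # read file f<id>.csv and append its number of complete rows to y
    y <- c(y, sum(complete.cases(
      read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))))))
  }
  # combine ids and counts, then name the columns
  df <- data.frame(x, y)
  colnames(df) <- c("id","ret2")
  return(df)
}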

You can call this function like this:

myfunc("name-of-your-directory",25:87)


An explanation of the above code. You have to break down your problem into steps:

  1. You need a vector of the ids. That's done by x <- id.
  2. For each id you want the number of complete cases. In order to get that, you have to read the file first. That's done by read.csv(as.character(paste0(directory,"/","f",id[i],".csv"))). To get the number of complete cases for that file, you have to wrap the read.csv code inside sum and complete.cases.
  3. Now you want to add that number to a vector. Therefore you need an empty vector (y <- vector()) to which you can add the number of complete cases from step 2. That's done by wrapping the code from step 2 inside y <- c(y, "code step 2"). With this you add the number of complete cases for each id to the vector y.
  4. The final step is to combine these two vectors into a data frame with df <- data.frame(x, y) and assign some meaningful colnames.
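To make step 2 concrete, here is a small sketch of what sum(complete.cases(...)) computes. The toy data frame and the names toy, cc and n_complete are made up for illustration and are not part of the original question.

```r
# Hypothetical toy data: rows 2 and 4 each contain one NA
toy <- data.frame(
  a = c(1, NA, 3, 4),
  b = c("w", "x", "y", NA)
)

# complete.cases() returns one logical per row: TRUE if the row has no NA
cc <- complete.cases(toy)

# sum() counts the TRUE values, i.e. the number of complete rows
n_complete <- sum(cc)

print(cc)
print(n_complete)
```

Here two of the four rows have no missing values, so n_complete is 2.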

By including the steps 1, 2 and 3 (except the y <- vector() part) in a for-loop, you can iterate over the list of specified id's. Creating the empty vector with y <- vector() has to be done before the for-loop, so that the for-loop can add values to y.
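As a possible alternative, the same result can be built without growing y inside a loop. The sketch below uses vapply; the function name count_complete and the temporary demo files are assumptions for illustration only, while the f<id>.csv naming follows the answer above.

```r
# Hypothetical alternative: one count per file via vapply() instead of c(y, ...)
count_complete <- function(directory, id = 1:332) {
  # build all file paths at once, e.g. <directory>/f2.csv
  files <- file.path(directory, paste0("f", id, ".csv"))
  # for each file, read it and count the rows without any NA
  n <- vapply(
    files,
    function(f) sum(complete.cases(read.csv(f))),
    integer(1),
    USE.NAMES = FALSE
  )
  data.frame(id = id, ret2 = n)
}

# Demo with two made-up files in a temporary directory
demo_dir <- tempfile("demo")
dir.create(demo_dir)
write.csv(data.frame(v = c(1, NA, 3)), file.path(demo_dir, "f2.csv"),
          row.names = FALSE)
write.csv(data.frame(v = c(1, 2, 3)), file.path(demo_dir, "f4.csv"),
          row.names = FALSE)

result <- count_complete(demo_dir, id = c(2, 4))
print(result)
```

In this demo f2.csv has one row with a missing value and f4.csv has none, so the returned data frame has two rows with ret2 equal to 2 and 3.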
