Splitting data.frame inside and outside an R function


Question

I have 3 data.frames (A, B1 and B2). I split each by the variable study.name and get my desired output, shown as out1, out2 and out3:

J <- split(A, A$study.name);      out1 <- do.call(rbind, c(J, make.row.names = FALSE))
M <- split(B1, B1$study.name);    out2 <- do.call(rbind, c(M, make.row.names = FALSE))
N <- split(B2, B2$study.name);    out3 <- do.call(rbind, c(N, make.row.names = FALSE))
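
To make the split/rbind round trip concrete, here is a minimal sketch on a made-up data frame. The name `toy` and its values are hypothetical and are not taken from the CSV files. `split` groups the rows by the values of `study.name`, and `rbind` stacks the groups back together, so the rows come out grouped by study.

```r
# Hypothetical toy data; not the real A, B1 or B2.
toy <- data.frame(
  study.name = c("b", "a", "b", "a"),
  outcome    = 1:4,
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per study.
pieces <- split(toy, toy$study.name)

# Stacking the pieces back groups the rows by study (here: a, a, b, b).
out <- do.call(rbind, c(pieces, make.row.names = FALSE))
```

Because `study.name` is a character column here, `split` orders the groups alphabetically, which is why the "a" rows come first.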

But I'm wondering why I can't get the same output from my function foo (see below)?

A  <- read.csv("https://raw.githubusercontent.com/izeh/m/master/irr.csv",  header = TRUE)  ## data A
B1 <- read.csv("https://raw.githubusercontent.com/izeh/m/master/irr2.csv", header = TRUE)  ## data B1
B2 <- read.csv("https://raw.githubusercontent.com/izeh/m/master/irr4.csv", header = TRUE)  ## data B2

 foo <- function(...){      ## The unsuccessful function `foo`

    r <- list(...)

 ## r <- Can we HERE delete rows and columns that are ALL `NA` or EMPTY in `r`?

    J <- unlist(lapply(seq_along(r), function(i) split(r[[i]], r[[i]]$study.name)), recursive = FALSE)

    lapply(seq_along(J), function(i)do.call(rbind, c(J[[i]], make.row.names = FALSE)) )
}

foo(B1, B2) # Example without success

Recommended answer

We can drop the rows and columns that are entirely NA or empty before doing the split:

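foo <- function(...) {
  r <- list(...)

  lapply(r, function(dat) {
    # TRUE where a cell is NA or an empty string
    m1 <- is.na(dat) | dat == ""
    # keep only rows and columns that are not entirely NA/empty
    i1 <- rowSums(m1) < ncol(m1)
    j1 <- colSums(m1) < nrow(m1)
    dat1 <- dat[i1, j1, drop = FALSE]

    # convert factor columns to character
    facColumns <- sapply(dat1, is.factor)
    dat1[facColumns] <- lapply(dat1[facColumns], as.character)

    # fix the level order so split() keeps studies in order of appearance
    dat1$study.name <- factor(dat1$study.name, levels = unique(dat1$study.name))
    l1 <- split(dat1, dat1$study.name)

    do.call(rbind, c(l1, make.row.names = FALSE))
  })
}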
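
The filtering step inside `foo` can be checked in isolation. The following is a minimal sketch on a hypothetical data frame `d` (made up for illustration) with one row that is entirely NA or empty and one column that is entirely NA. The names `keep_rows`, `keep_cols` and `clean` are also illustrative.

```r
# Hypothetical data: row 3 is entirely NA/empty, column `junk` is entirely NA.
d <- data.frame(
  study.name = c("a", "b", ""),
  outcome    = c(1, NA, NA),
  junk       = c(NA, NA, NA),
  stringsAsFactors = FALSE
)

m <- is.na(d) | d == ""            # TRUE where a cell is NA or an empty string
keep_rows <- rowSums(m) < ncol(m)  # FALSE only for rows that are all NA/empty
keep_cols <- colSums(m) < nrow(m)  # FALSE only for columns that are all NA/empty

# drop = FALSE keeps a data frame even if a single column survives.
clean <- d[keep_rows, keep_cols, drop = FALSE]
```

Here `clean` keeps the first two rows and the `study.name` and `outcome` columns. A row that is only partly missing, such as row 2, is retained.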

lapply(foo(B1, B2), head, 2)
#[[1]]
#  study.name group.name outcome ESL prof scope type
#1 Shin.Ellis   ME.short       1   1    2     1    1
#2 Shin.Ellis    ME.long       1   1    2     1    1

#[[2]]
#  study.name group.name outcome ESL prof scope type
#1 Shin.Ellis   ME.short       1   1    2     1    1
#2 Shin.Ellis    ME.long       1   1    2     1    1

Or using a single object as the argument:

lapply(foo(A), head, 2)
#[[1]]
#  study.name group.name outcome ESL prof scope type ESL.1 prof.1 scope.1 type.1
#1 Shin.Ellis   ME.short       1   1    2     1    1     1      2       1      1
#2 Shin.Ellis    ME.long       1   1    2     1    1     1      2       1      1
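
As for why the original `foo` in the question does not reproduce `out2` and `out3`: the `unlist(..., recursive = FALSE)` call flattens the per-input split lists into one list of per-study data frames, so each `J[[i]]` is a single data frame rather than a list of them. `c()` then appends the flag to that data frame's columns, and `rbind` stacks the columns as rows of a matrix. A minimal sketch on hypothetical toy inputs (`d1` and `d2` are made up and only stand in for B1 and B2):

```r
# Hypothetical toy inputs standing in for B1 and B2.
d1 <- data.frame(study.name = c("b", "a", "b"), outcome = 1:3,
                 stringsAsFactors = FALSE)
d2 <- data.frame(study.name = c("y", "x"), outcome = 4:5,
                 stringsAsFactors = FALSE)
r <- list(d1, d2)

# As in the question: flattening loses the grouping by input data frame.
J <- unlist(lapply(r, function(d) split(d, d$study.name)), recursive = FALSE)
length(J)  # 4 per-study pieces, not 2 per-input lists

# J[[1]] is a single data frame, so c() appends the flag to its columns
# and rbind() stacks those columns as rows of a matrix.
bad <- do.call(rbind, c(J[[1]], make.row.names = FALSE))
is.data.frame(bad)  # FALSE

# Keeping one split list per input and binding each list gives one
# data frame per input, which is the structure the question asks for.
good <- lapply(r, function(d) {
  pieces <- split(d, d$study.name)
  do.call(rbind, c(pieces, make.row.names = FALSE))
})
```

The answer's `foo` follows the same shape as `good`: the split and the `rbind` both happen inside a single `lapply` over the inputs, so nothing needs to be flattened.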
