R sapply vs apply vs lapply + as.data.frame


Question

I'm working with some Date columns and trying to clean out obviously incorrect dates. I've written a function using the safe.ifelse function mentioned here.

Here's my toy data set:

df1 <- data.frame(id = 1:25
    , month1 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , month2 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , month3 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , letter1 = letters[1:25]
    )

This works fine for a single column:

df1$month1 <- safe.ifelse(df1$month1 > as.Date('2013-10-01'), as.Date('2013-10-01'), df1$month1)

As I have multiple columns, I'd like to wrap this in a function and use apply to take care of all Date columns at once:

capDate <- function(x){
    today1 <- Sys.Date()
    # ifelse() drops attributes, so restore the class of `yes` afterwards
    safe.ifelse <- function(cond, yes, no){
        class.y <- class(yes)
        X <- ifelse(cond, yes, no)
        class(X) <- class.y
        return(X)
    }

    # Cap any date later than today at today's date
    safe.ifelse(as.Date(x) > as.Date(today1), as.Date(today1), as.Date(x))
}

However, when I try using sapply()

df1[,dateCols1] <- sapply(df1[,dateCols1], capDate)

or apply()

df1[,dateCols1] <- apply(df1[,dateCols1], 2, capDate)

the Date columns lose their Date formatting. The only way I've found to get around this is by using lapply() and then converting back to a data.frame(). Can anyone explain this?

df1[,dateCols1] <- as.data.frame(lapply(df1[,dateCols1], capDate))

Recommended answer

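Both sapply and apply convert the result to matrices: sapply simplifies the list of per-column results into a matrix, and apply coerces the data frame to a matrix before it even calls the function. A matrix cannot carry a per-column class such as Date, so the class is dropped. lapply returns a plain list with each element's class intact, which makes as.data.frame(lapply(...)) a safe way to loop over data frame columns.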
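To see the difference directly, here is a minimal sketch on a made-up two-column data frame. The names d, keep, via_sapply, via_apply and via_lapply are illustrative only and do not come from the question; keep simply returns each column unchanged, so any change in class is caused by the looping function itself.

```r
# Toy data: two Date columns (illustrative names, not from the question)
d <- data.frame(
  a = as.Date(c("2012-01-01", "2012-02-01")),
  b = as.Date(c("2013-01-01", "2013-02-01"))
)

# Returns its input unchanged, so any class change comes from the looping function
keep <- function(x) x

via_sapply <- sapply(d, keep)    # per-column results are simplified to a matrix
via_apply  <- apply(d, 2, keep)  # d is coerced with as.matrix() before keep runs
via_lapply <- lapply(d, keep)    # a plain list, one element per column

is.matrix(via_sapply)            # a matrix, no longer Date
is.matrix(via_apply)             # a matrix, no longer Date
sapply(via_lapply, class)        # each list element is still a Date vector

# Converting the list back gives a data frame with Date columns
restored <- as.data.frame(via_lapply)
sapply(restored, class)
```

Only the lapply result keeps Date vectors, which is why wrapping it in as.data.frame() round-trips cleanly.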

as.data.frame(
  lapply(
    df1, 
    function(column) 
    {
      if(inherits(column, "Date")) 
      {
        pmin(column, Sys.Date())
      } else column
    }
  )
)
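If only the Date columns should be touched, a possible variation is to assign the lapply result straight back into those columns, which avoids rebuilding the whole data frame. The sketch below uses a smaller stand-in for df1 and assumes dateCols1 holds the names of the Date columns; the question never shows how dateCols1 is defined, so the definition here is a guess.

```r
# Smaller stand-in for the question's df1, with one date far in the future
df1 <- data.frame(
  id = 1:3,
  month1 = as.Date(c("2012-01-01", "2013-06-01", "2999-01-01")),
  letter1 = letters[1:3]
)

# Assumed definition: names of all columns that inherit from "Date"
dateCols1 <- names(df1)[vapply(df1, inherits, logical(1), what = "Date")]

# Assigning a list to a set of columns replaces them one by one,
# so each column keeps the class returned by the function
df1[dateCols1] <- lapply(
  df1[dateCols1],
  function(column) pmin(column, Sys.Date())
)

class(df1$month1)  # still a Date column
```

Because the right-hand side is a list rather than a matrix, no simplification step gets a chance to strip the Date class.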


It's a little cleaner with ddply from plyr.

library(plyr)
ddply(
  df1, 
  .(id), 
  colwise(
    function(column) 
    {
      if(inherits(column, "Date")) 
      { 
        pmin(column, Sys.Date()) 
      } else column
    }
  )
)
