R sapply vs apply vs lapply + as.data.frame

Question

I'm working with some Date columns and trying to cleanse for obviously incorrect dates. I've written a function using the safe.ifelse function mentioned here.

Here is my toy dataset:

df1 <- data.frame(id = 1:25
    , month1 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , month2 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , month3 = seq(as.Date('2012-01-01'), as.Date('2014-01-01'), by = 'month'  )
    , letter1 = letters[1:25]
    )

This works for a single column:

df1$month1 <- safe.ifelse(df1$month1 > as.Date('2013-10-01'), as.Date('2013-10-01'), df1$month1)

As I have multiple columns I'd like to use a function and apply to take care of all Date columns at once:

capDate <- function(x) {
    today1 <- Sys.Date()
    # ifelse() drops attributes, so restore the class of `yes` afterwards
    safe.ifelse <- function(cond, yes, no) {
        class.y <- class(yes)
        X <- ifelse(cond, yes, no)
        class(X) <- class.y
        X
    }
    # Cap any date later than today at today
    safe.ifelse(as.Date(x) > today1, today1, as.Date(x))
}

But when I try to use sapply()

df1[,dateCols1] <- sapply(df1[,dateCols1], capDate)

or apply()

df1[,dateCols1] <- apply(df1[,dateCols1], 2, capDate)

the Date columns lose their Date formatting. The only way I've found to get around this is by using lapply() and then converting back to a data.frame(). Can anyone explain this?

df1[,dateCols1] <- as.data.frame(lapply(df1[,dateCols1], capDate))

Recommended answer

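Both sapply and apply return a matrix here, and a matrix cannot keep a per-column Date class, so the dates are reduced to bare values. as.data.frame(lapply(...)) is a safe way to loop over data frame columns, because lapply returns a list in which each element keeps its own class.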

as.data.frame(
  lapply(
    df1, 
    function(column) 
    {
      if(inherits(column, "Date")) 
      {
        pmin(column, Sys.Date())
      } else column
    }
  )
)
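
To make the difference concrete, here is a minimal sketch on a hypothetical two-column data frame (`demo` and its column names are made up for illustration and are not part of the question's data). It applies an identity function with sapply and with lapply and compares what comes back.

```r
# Hypothetical two-column data frame, used only for illustration
demo <- data.frame(
  d1 = as.Date(c("2012-01-01", "2012-02-01")),
  d2 = as.Date(c("2013-01-01", "2013-02-01"))
)

via_sapply <- sapply(demo, function(column) column)  # simplified to a matrix
via_lapply <- lapply(demo, function(column) column)  # list, one element per column

is.matrix(via_sapply)              # TRUE: the per-column Date class is gone
is.numeric(via_sapply)             # TRUE: only the underlying day counts remain
inherits(via_lapply$d1, "Date")    # TRUE: each list element is still a Date

# Assigning into demo[] keeps the data frame structure,
# so no as.data.frame() call is needed afterwards.
demo[] <- lapply(demo, function(column) column)
```

Assigning the lapply result into `demo[]` is a common base R idiom that should be equivalent to wrapping it in as.data.frame() here, while leaving the original column names and row names untouched.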

It's a little cleaner with ddply from plyr.

library(plyr)
ddply(
  df1, 
  .(id), 
  colwise(
    function(column) 
    {
      if(inherits(column, "Date")) 
      { 
        pmin(column, Sys.Date()) 
      } else column
    }
  )
)
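
If only the Date columns should be touched, a base R variant is to find them first and assign the lapply result back into that subset. This is a sketch under assumptions, not part of the original answer: the data frame `df`, the fixed cap date `cap`, and the helper vector `is_date` are hypothetical names, and a fixed date stands in for Sys.Date() so the result is reproducible.

```r
# Hypothetical small data frame with one Date column and two other columns
df <- data.frame(
  id = 1:3,
  month1 = as.Date(c("2013-08-01", "2013-10-01", "2013-12-01")),
  letter1 = letters[1:3]
)

# Fixed cap for reproducibility; the question caps at Sys.Date() instead
cap <- as.Date("2013-10-01")

# Logical vector marking which columns inherit from "Date"
is_date <- vapply(df, inherits, logical(1), what = "Date")

# pmin() keeps the Date class, and assigning a list into df[is_date]
# replaces those columns without converting anything to a matrix
df[is_date] <- lapply(df[is_date], function(column) pmin(column, cap))
```

After this runs, `month1` is still a Date column and the December value has been capped at `cap`, while `id` and `letter1` are left as they were.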
