How to use apply, sapply, or lapply with ffdf?

Question

Is there a way to use an apply-type construct directly on the columns of an ffdf object? I am trying to count the NAs in each column without converting it to a standard data frame. I can get the NA count for an individual column with:

sum(is.na(ffdf$columnname))

But is there a way to do this for all the columns in the data frame at once, something like:

lapply(ffdf, function(x){sum(is.na(x))})

When I run this I get:

$virtual
[1] 0

$physical
[1] 0

$row.names
[1] 0

I have not been able to find a special version of lapply or sapply in the ff documentation. Also, is there a simple way to count the NAs over the entire ffdf in one go?

Recommended answer

An ffdf is basically a list with the elements "virtual", "physical", and "row.names", which is why lapply over the ffdf itself iterates over those three elements rather than over the columns. If you lapply over the physical element instead, you get what you want.

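library(ffbase)  # ffbase depends on ff, so ff is attached as well
myffdf <- as.ffdf(iris)
# physical() returns the list of ff vectors that back the columns
lapply(physical(myffdf), FUN = function(x) sum(is.na(x)))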

Because is.na and sum are generic, this dispatches to is.na.ff and sum.ff from the ffbase package, so the data is loaded into RAM chunk by chunk, in pieces sized to what your machine can handle.
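The question also asks for a single NA count over the whole ffdf. One possible way, sketched here rather than taken from the original answer, is to collect the per-column counts with sapply and add them up. The sketch assumes that every physical component corresponds to exactly one column, as it does for an ffdf built with as.ffdf from a plain data frame. The names iris_na, na_per_column, and total_na are made up for illustration, and three NAs are injected so the counts are not all zero.

```r
library(ffbase)  # ffbase depends on ff, so ff is attached as well

# Illustrative data: a copy of iris with three missing values injected
iris_na <- iris
iris_na$Sepal.Length[1:3] <- NA
myffdf <- as.ffdf(iris_na)

# Per-column NA counts, returned as a named vector instead of a list
na_per_column <- sapply(physical(myffdf), function(x) sum(is.na(x)))

# Total NA count over the whole ffdf
total_na <- sum(na_per_column)

print(na_per_column)
print(total_na)
```

Only the small vector of per-column counts is summed in ordinary R memory. The column data itself is still processed chunk by chunk by is.na.ff and sum.ff.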
