Convert data.frame to ff

Problem description

I would like to convert a data.frame to an ff object using as.ffdf, as described here:

df.apr=as.data.frame(df.apr) # from data.table to data.frame
cols=df.apr[1,] 
cols=sapply(cols,class)
df_apr=as.ffdf(df.apr,vmode=cols)

This gives the following error:

Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered,
: vmode 'numeric' not implemented

Without the 'vmode' argument, the following error is given:

Error in ff(initdata = initdata, length = length, levels = levels, ordered = ordered, 
: vmode 'character' not implemented

Writing the data out to a table and then reading it directly into ff does work, however:

# write.table emits a header row by default, so read it back as a header
write.table(df.apr,file='df_apr.txt',sep='\t',row.names=FALSE)
df.apr.ff=read.table.ffdf(file='df_apr.txt',header=TRUE,VERBOSE=TRUE)

But this is time-consuming (and clumsy). Is there a better way?

Recommended answer

To see all the vmodes that ff supports, type the following at the console:

require(ff)
.vimplemented

You'll see that 'numeric' and 'character' are not among them. In ff, numerics are stored as doubles and characters as factors. So in your case you really don't need to specify the vmodes yourself: as long as the character columns are coded as factors, you can call as.ffdf directly on your data.frame. So this will work:

df.apr=as.data.frame(df.apr, stringsAsFactors=TRUE)
df_apr=as.ffdf(df.apr)
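
If some columns are still of class character after the call above (whether stringsAsFactors re-encodes the columns of an existing data.frame or data.table is an assumption worth checking with sapply(df.apr, class)), they can be converted explicitly before calling as.ffdf. The following is a minimal sketch, not the answerer's code: chars_to_factors is a hypothetical helper and the small df.apr is made-up stand-in data.

```r
# Hypothetical helper (not part of ff): convert every character column
# of a data.frame to a factor and leave all other columns untouched.
chars_to_factors <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  df
}

# Made-up example data standing in for the real df.apr
df.apr <- data.frame(id = 1:3,
                     value = c(1.5, 2.5, 3.5),
                     label = c("a", "b", "a"),
                     stringsAsFactors = FALSE)

df.apr <- chars_to_factors(df.apr)
sapply(df.apr, class)  # check that no column is still "character"

# Attempt the ff conversion only when the ff package is installed
if (requireNamespace("ff", quietly = TRUE)) {
  library(ff)
  df_apr <- as.ffdf(df.apr)
}
```

Checking the column classes first makes it easier to see which column triggers a "vmode ... not implemented" error if the conversion still fails.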

FYI: if your data comes from flat files, consider using read.table.ffdf. If it comes from an SQL data source, you can use read.dbi.ffdf or read.odbc.ffdf from the ETLUtils package. If it comes from Hadoop through Hive, you can use read.jdbc.ffdf from the ETLUtils package.
