How to subset a large data frame (ffdf) in R by date?
Problem description
I am trying to subset an ffdf by date. Below, I successfully create such a subset using a normal data frame, but I need help applying the same approach to an ffdf. My attempt, along with the error message, is shown in the code comments. Many thanks in advance!
#Create a normal data frame (in production this is read directly into an ffdf
#through a csv file)
start <- c("01/01/2010", "01/01/2011", "01/01/2012", "01/01/2012", "01/01/2012")
end <- c("31/12/2010", "31/12/2011", "31/12/2012", "31/12/2012", "31/12/2012")
amount <- c(10,20,30,40,50)
df <- data.frame(start,end,amount)
#Ensure subsetting works on a normal data frame
#convert type to proper date (this has to be done in production after csv file
#has been read in)
df$start <- as.Date(df$start, format="%d/%m/%Y")
df$end <- as.Date(df$end, format="%d/%m/%Y")
#Subset
df <- subset(df, start == as.Date("2012-01-01",format="%Y-%m-%d"))
#Works :) Now let's try with ffdf
library(ffbase)  # provides subset.ff; also attaches ff, which provides as.ffdf
ffdf <- as.ffdf(df)
#Type conversion for dates (again, applied in production after mammoth csv has
#been read in)
ffdf$start <- as.Date(ffdf$start, format="%m/%d/%Y")
ffdf$end <- as.Date(ffdf$end, format="%m/%d/%Y")
#Subset
ffdf <- subset.ff(ffdf, start==as.Date("2012-01-01",format="%Y-%m-%d"))
#ERROR: Error in ffdf(x = x) : ffdf components must be atomic ff objects
Recommended answer
Use subset.ffdf from the ffbase package. subset is a generic function in R, and ffbase implements a method for ffdf objects, so you can call subset() on an ffdf just as you would on a regular data frame; there is no need to call subset.ff directly.
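library(ffbase)  # also attaches ff
df <- data.frame(
  start  = c("01/01/2010", "01/01/2011", "01/01/2012", "01/01/2012", "01/01/2012"),
  end    = c("31/12/2010", "31/12/2011", "31/12/2012", "31/12/2012", "31/12/2012"),
  amount = c(10, 20, 30, 40, 50)
)
# Convert the character columns to Date before building the ffdf
df$start <- as.Date(df$start, format = "%d/%m/%Y")
df$end   <- as.Date(df$end, format = "%d/%m/%Y")
myffdf <- as.ffdf(df)
# Call the generic; R dispatches to subset.ffdf because myffdf is an ffdf
test <- subset(myffdf, start == as.Date("2012-01-01", format = "%Y-%m-%d"))
test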
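Why calling the generic matters can be illustrated with a minimal base R sketch of S3 dispatch. The generic `describe` and the classes `vec` and `tbl` are hypothetical names made up for this illustration; they are not part of ff or ffbase. The sketch only shows the general mechanism: calling a method by its full name skips dispatch, which is presumably what happened when subset.ff was applied to an ffdf in the question.

```r
# Toy S3 generic with two methods (hypothetical names, not from ff/ffbase)
describe <- function(x, ...) UseMethod("describe")
describe.vec <- function(x, ...) "vector method"
describe.tbl <- function(x, ...) "table method"

# An object whose class is "tbl"
obj <- structure(list(a = 1:3), class = "tbl")

# Calling the generic lets R pick the method that matches class(obj)
via_generic <- describe(obj)

# Calling a method by its full name bypasses dispatch entirely,
# so the "vec" method runs even though obj is a "tbl"
via_direct <- describe.vec(obj)

print(via_generic)
print(via_direct)
```

In the toy case the wrong method merely returns a different string; with real methods that assume a particular object layout, the same mistake can surface as an error from deep inside the method.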