Add selection criteria to read.table


Question

Let's take the following simplified version of a dataset that I import using read.table:

df <- data.frame(Sex = c("M", "M", "F", "F", "F"),
                 Age = c(25, 22, 33, 17, 18))

In reality, my dataset is extremely large and I'm only interested in a small proportion of the data, i.e. the rows concerning females aged 18 or under. In the example above, this would be just the last two observations.

My question is: can I import just these observations directly, without importing the rest of the data and then using subset to refine it? My computer's capacity is limited, so I have been using scan to import my data in chunks, but this is extremely time-consuming.
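For context, the chunked import described here might look roughly like the sketch below. It is only an illustration under assumptions: it uses readLines on an open connection rather than scan, the helper name read_filtered and its parameters (keep, col_classes, chunk_size) are hypothetical, and it assumes a non-empty comma-separated file with a header line and no embedded newlines or commas inside fields.

```r
# Hypothetical helper: read a CSV chunk by chunk, keeping only matching rows,
# so that the full file never has to sit in memory at once.
read_filtered <- function(path, keep, col_classes, chunk_size = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # The first line holds the column names (strip quotes if present).
  header <- strsplit(readLines(con, n = 1L), ",", fixed = TRUE)[[1]]
  header <- gsub('"', "", header, fixed = TRUE)
  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    # Explicit colClasses matter: a chunk whose Sex values are all "F"
    # would otherwise be parsed as logical FALSE.
    chunk <- read.csv(text = lines, header = FALSE, col.names = header,
                      colClasses = col_classes)
    pieces[[length(pieces) + 1L]] <- chunk[keep(chunk), , drop = FALSE]
  }
  result <- do.call(rbind, pieces)
  if (!is.null(result)) rownames(result) <- NULL
  result
}

# Usage with the toy data above, written to a temporary file.
toy <- data.frame(Sex = c("M", "M", "F", "F", "F"),
                  Age = c(25, 22, 33, 17, 18))
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

young_females <- read_filtered(
  toy_path,
  keep = function(d) d$Sex == "F" & d$Age <= 18,
  col_classes = c("character", "numeric"),
  chunk_size = 2L
)
```

Only the filtered rows of each chunk are retained, which keeps memory use low, but every line of the file is still parsed in R.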

Is there a better solution?

Recommended answer

This is almost the same as @Drew75's answer, but I'm including it to illustrate some gotchas with SQLite:

# example: large-ish data.frame
# (no seed is set, so the exact row counts shown below vary between runs)
df <- data.frame(Sex=sample(c("M","F"),1e6,replace=TRUE),
                 Age=sample(18:75,1e6,replace=TRUE))
write.csv(df, "myData.csv", quote=FALSE, row.names=FALSE)  # note: non-quoted strings

library(sqldf)
myData <- read.csv.sql(file="myData.csv",       # looks for char M (no quotes)
                       sql="select * from file where Sex='M'", eol = "\n")
nrow(myData)
# [1] 500127

write.csv(df, "myData.csv", row.names=FALSE)    # quoted strings...
myData <- read.csv.sql(file="myData.csv",       # this fails
                       sql="select * from file where Sex='M'", eol = "\n")
nrow(myData)
# [1] 0
myData <- read.csv.sql(file="myData.csv",       # need quotes in the char literal
                       sql="select * from file where Sex='\"M\"'", eol = "\n")
nrow(myData)
# [1] 500127
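Applied to the criteria in the question (females aged 18 or under), the same approach might look like the sketch below. It assumes the file was written without quotes, as in the first case above, so the literal 'F' matches directly. The temporary file path and the variable names csv_path and young_f are made up for illustration.

```r
library(sqldf)

# Toy data from the question, written without quotes around the strings.
df <- data.frame(Sex = c("M", "M", "F", "F", "F"),
                 Age = c(25, 22, 33, 17, 18))
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, quote = FALSE, row.names = FALSE)

# The filter runs inside SQLite, so only matching rows are returned to R.
young_f <- read.csv.sql(file = csv_path,
                        sql = "select * from file where Sex = 'F' and Age <= 18",
                        eol = "\n")
nrow(young_f)
```

If the file had been written with quoted strings, the literal would need the embedded quotes, as in the last case above.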
