How can I read a part of a dBase file


Question

I have a very large dBase file (1.64 GB). Loading the whole file into R with the standard foreign::read.dbf() function takes a very long time. I would like to load only a few of the variables in the dataset. Does anyone have a solution?

Recommended answer

I think the read.dbf(...) function in package foreign was intended for reading the *.dbf part of a shapefile, in which case reading only part of the file doesn't really make sense. You seem to want to do something different.

Using RODBC might work, depending on how your system is configured. If you're running Windows and have the dBASE ODBC drivers installed, this will probably work for you. (Note: when you install MS Office, it sets up a user DSN called "dBASE Files", which should be accessible from RODBC. So if you have MS Office installed, this should work.)

Important note: this will only work if you are running a 32-bit version of R, because there are no 64-bit dBASE ODBC drivers and a 64-bit R session cannot use a 32-bit driver. Generally, when you download 64-bit R for Windows you get both the 32- and 64-bit versions, so it's just a matter of switching between them.
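To confirm which build you are in before trying to connect, a quick check like the one below may help. It is only a sketch: the helper name r_session_bits is made up for illustration, and it relies on .Machine$sizeof.pointer being 4 in a 32-bit build and 8 in a 64-bit build.

```r
# Report whether the current R session is a 32-bit or a 64-bit build.
# r_session_bits() is a hypothetical helper, not part of RODBC or base R.
r_session_bits <- function() {
  # A pointer is 4 bytes in a 32-bit build and 8 bytes in a 64-bit build.
  8L * .Machine$sizeof.pointer
}

bits <- r_session_bits()
message("This R session is ", bits, "-bit (arch: ", R.version$arch, ")")

if (bits != 32L) {
  message("A 32-bit-only ODBC driver will not be usable from this session.")
}
```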

library(RODBC)
# setwd("< directory with your files >")
conn <- odbcConnect(dsn="dBASE Files")
df   <- sqlFetch(conn,"myTable",max=10)   # grab first ten rows
head(df)
#       LENGTH COASTLN010
# 1 0.02482170          1
# 2 0.01832134          2
# 3 0.03117752          3
# 4 0.04269755          4
# 5 0.02696307          5
# 6 0.05047828          6

sqlQuery(conn,"select * from myTable where LENGTH<0.008")
#       LENGTH COASTLN010
# 1 0.00625200        186
# 2 0.00634897        379
# 3 0.00733319       1583
# 4 0.00369786       1617
# 5 0.00722233       1618
# 6 0.00524176       1636

The example above is just meant to give you an idea of how to use RODBC. In this example, I have a file, myTable.dbf, in the "directory with your files", and this dbf has two columns, LENGTH and COASTLN010 (the file is actually part of a coastline shapefile, but that is irrelevant).
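Since the goal in the question is to load only a few variables, the same sqlQuery() call can name the wanted columns instead of using select *. The sketch below assumes the myTable, LENGTH and COASTLN010 names from the example; build_select is a hypothetical helper that only assembles the SQL string, and it does no quoting or escaping, so it should only be given trusted column names.

```r
# Build a SELECT statement that names only the wanted columns.
# build_select() is a hypothetical helper for illustration; it does not
# quote or escape identifiers.
build_select <- function(table, columns, where = NULL) {
  stopifnot(is.character(table), length(table) == 1L,
            is.character(columns), length(columns) >= 1L)
  query <- paste("SELECT", paste(columns, collapse = ", "), "FROM", table)
  if (!is.null(where)) {
    query <- paste(query, "WHERE", where)
  }
  query
}

query <- build_select("myTable", "LENGTH", where = "LENGTH < 0.008")
print(query)

# With an open RODBC connection `conn` (as created above), only the named
# column would then be transferred:
# df <- RODBC::sqlQuery(conn, query)
```

Only the columns named in the query are returned to R, which is what keeps the result small for a very wide table.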

If this doesn't work, try:

 conn <- odbcConnectDbase("myTable.dbf")
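Putting the pieces together, a wrapper along these lines could open the file, fetch a few columns and close the connection again. This is a sketch under assumptions rather than a tested recipe: read_dbf_columns and dbf_table_name are hypothetical helpers, the table name is assumed to be the file name without its extension (as with myTable.dbf and myTable above), and odbcConnectDbase() is only available in RODBC on Windows.

```r
# Derive the table name from a .dbf path: the file name without extension.
# (Assumption: this matches how the table is exposed through the driver.)
dbf_table_name <- function(dbf_path) {
  tools::file_path_sans_ext(basename(dbf_path))
}

# Hypothetical wrapper: fetch selected columns from a dBASE file via RODBC.
# max_rows = 0 means "no limit", following the `max` argument used earlier.
read_dbf_columns <- function(dbf_path, columns, max_rows = 0) {
  conn <- RODBC::odbcConnectDbase(dbf_path)
  if (!inherits(conn, "RODBC")) {
    stop("Could not open an ODBC connection for ", dbf_path)
  }
  # Make sure the connection is closed even if the query fails.
  on.exit(RODBC::odbcClose(conn), add = TRUE)

  query <- paste("SELECT", paste(columns, collapse = ", "),
                 "FROM", dbf_table_name(dbf_path))
  RODBC::sqlQuery(conn, query, max = max_rows)
}

# Example call (requires Windows, 32-bit R and the dBASE driver):
# df <- read_dbf_columns("myTable.dbf", c("LENGTH", "COASTLN010"), max_rows = 10)
```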
