Large csv file fails to fully read in to R data.frame


Problem description

I am trying to load a fairly large csv file into R. It has about 50 columns and 2 million rows.

My code is pretty basic; I have used it to open files before, but none this large.

mydata <- read.csv('file.csv', header = FALSE, sep=",", stringsAsFactors = FALSE)

The result is that it reads in the data but stops after 1,080,000 rows or so. This is roughly where Excel stops as well. Is there a way to get R to read the whole file in? Why is it stopping around halfway?

Update (11/30/14): After speaking with the provider of the data, it was discovered that there may have been a corruption issue with the file. A new file was provided, which is also smaller and loads into R easily.

Recommended answer

Since read.csv() read up to 1,080,000 rows, fread from library(data.table) should read it with ease. If not, there are two other options: either try library(h2o), or with fread use the select option to read only the required columns (or read the file in two halves, do some cleaning, and merge them back).
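The suggestions above can be sketched as follows. This is not runnable as-is (it assumes the data.table package is installed and that `file.csv` from the question exists); the row counts in the two-halves example are illustrative.

```r
library(data.table)

# fread is much faster than read.csv on large files and reports
# parsing problems rather than silently stopping
mydata <- fread("file.csv", header = FALSE, stringsAsFactors = FALSE)

# Alternative 1: read only the columns you actually need
subset_cols <- fread("file.csv", select = c(1, 2, 5))

# Alternative 2: read the file in two halves, then merge them back
first_half  <- fread("file.csv", nrows = 1000000)
second_half <- fread("file.csv", skip = 1000000, header = FALSE)
mydata <- rbind(first_half, second_half, use.names = FALSE)
```

The select, nrows, and skip arguments let fread work around a bad region of the file, which is useful when a single corrupted line is what makes the full read fail.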

