Comparing speed of fread vs. read.table for reading the first 1M rows out of 100M

Question

I have a 14GB data.txt file. I was comparing the speed of fread and read.table by reading the first 1M rows. fread appears to be much slower, although it is not supposed to be. It also takes some time before the percentage counter shows up.

What could be the reason? I thought it was supposed to be super fast... I am using a Windows OS computer.
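For anyone wanting to reproduce this kind of comparison, here is a minimal sketch. It assumes the data.table package is installed, and it uses a small, hypothetical tab-separated temporary file as a stand-in for the 14GB data.txt, so the timings it prints will not resemble those seen on the real file. The column names and row counts are made up for illustration.

```r
library(data.table)

# Hypothetical small stand-in for the large data.txt (assumption: tab-separated with a header)
path <- tempfile(fileext = ".txt")
n_total <- 200000L
fwrite(data.table(id = seq_len(n_total), x = rnorm(n_total)), path, sep = "\t")

# Read only the first n_head rows with each function and time it
n_head <- 10000L
t_fread <- system.time(dt <- fread(path, nrows = n_head))
t_table <- system.time(
  df <- read.table(path, header = TRUE, sep = "\t", nrows = n_head)
)

print(t_fread)
print(t_table)

unlink(path)  # remove the temporary file
```

On a file this small the difference may be negligible; the effect described in the question is about a file far larger than the portion being read.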

Recommended answer

fread mmaps the file. This takes some time, because it maps the whole file, but it means subsequent "read-ins" will be faster.

read.table does not mmap the whole file. It can read in the file line by line [and stop at line 1000000].
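The idea of reading sequentially and stopping early can be sketched in base R with a connection. This is only an illustration of the access pattern, not read.table's actual internals, and the temporary file and line counts are hypothetical.

```r
# Hypothetical stand-in file with 50000 tab-separated lines
path <- tempfile(fileext = ".txt")
writeLines(sprintf("%d\t%f", 1:50000, runif(50000)), path)

# Open a connection and pull only the first 1000 lines;
# the rest of the file is never requested from this connection
con <- file(path, open = "r")
first_lines <- readLines(con, n = 1000L)
close(con)

length(first_lines)

unlink(path)  # remove the temporary file
```

With this pattern the cost depends on how many lines are requested rather than on the total file size, which is why stopping at line 1000000 of a much larger file can be cheap.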

For some background on mmap, see "mmap() vs. reading blocks".

The examples in the help for fread highlight this behaviour.
