Comparing speed of fread vs. read.table for reading the first 1M rows out of 100M
Question
I have a 14GB data.txt file. I was comparing the speed of fread and read.table by reading the first 1M rows. fread looks much slower, although it is not supposed to be. It also takes some time before the percentage counter shows up.
What could be the reason? I thought it was supposed to be super fast... I am using a Windows computer.
Recommended answer
fread mmaps the file. This takes some time, and it maps the whole file, not just the rows you asked for. In return, subsequent "read-ins" will be faster.
read.table does not mmap the whole file. It can read the file line by line and stop at line 1000000.
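
The comparison described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: the file is a small generated stand-in (not the 14GB data.txt), the column names and row counts are made up, and on a file this small the cost of mapping the whole file is unlikely to be visible. The nrows argument is what limits both readers to the first rows.

```r
library(data.table)

# Hypothetical stand-in for data.txt: a small tab-separated temp file.
path <- tempfile(fileext = ".txt")
n_total <- 200000L   # assumed total rows (the real file has ~100M)
n_read  <- 10000L    # assumed number of leading rows to read (1M in the question)

dt <- data.table(
  id = seq_len(n_total),
  x  = rnorm(n_total),
  y  = sample(letters, n_total, replace = TRUE)
)
fwrite(dt, path, sep = "\t")

# Time fread reading only the first n_read rows.
t_fread <- system.time(
  a <- fread(path, nrows = n_read)
)

# Time read.table reading only the first n_read rows.
t_table <- system.time(
  b <- read.table(path, header = TRUE, sep = "\t",
                  nrows = n_read, stringsAsFactors = FALSE)
)

print(t_fread)
print(t_table)
```

To see the effect described above, the same two calls would have to be pointed at the real large file; repeating the fread call a second time is one way to check whether later reads get faster.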
You can see some background on mmap at "mmap() vs. reading blocks".
The examples in the help for fread highlight this behaviour.