Comparing speed of fread vs. read.table for reading the first 1M rows out of 100M

Question

I have a 14GB data.txt file. I was comparing the speed of fread and read.table by reading the first 1M rows. It looks like fread is much slower although it is not supposed to be. It takes some time until the percentage counts show up.

What could be the reason? I thought it was supposed to be super fast... I am using a Windows OS computer.

Recommended answer

fread mmaps the file. This takes some time, and will map the whole file. This means subsequent "read-ins" will be faster.

read.table does not mmap the whole file. It can read in the file line by line [and stop at line 1000000].
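
The comparison described in the question can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the temporary file, the row counts, and the tab separator are assumptions standing in for the 14GB data.txt, and on a file this small the timings will not show the mapping overhead discussed above. Behaviour may also differ between data.table versions.

```r
library(data.table)  # provides fread() and fwrite()

# Hypothetical stand-in for the large data.txt: a small temporary file.
path <- tempfile(fileext = ".txt")
n_total <- 200000L  # assumed total number of rows in the file
n_head <- 10000L    # assumed number of leading rows to read
fwrite(data.table(id = seq_len(n_total), x = rnorm(n_total)), path, sep = "\t")

# Time reading only the first n_head rows with each reader.
t_fread <- system.time(
  dt <- fread(path, nrows = n_head, showProgress = FALSE)
)
t_base <- system.time(
  df <- read.table(path, header = TRUE, sep = "\t", nrows = n_head)
)

# Elapsed wall-clock seconds for each approach.
print(c(fread = t_fread[["elapsed"]], read.table = t_base[["elapsed"]]))
```

To reproduce the situation in the question, point `path` at the real file and raise `n_head` to 1e6. Elapsed time is the figure to compare, since any delay before parsing starts shows up there.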

For some background on mmap, see "mmap() vs. reading blocks".

The examples in the help for fread highlight this behaviour.
