Is there a really efficient (FAST) way to read large text files in python?

Problem Description

I am looking to open and fetch data from a large text file in Python as fast as possible (the file has almost 62,603,143 lines and is about 550 MB in size). As I don't want to stress my computer, I am doing it in the following way:

import time

start = time.time()
for line in open(filePath):
    # data (the string being searched for) and do_something() are
    # assumed to be defined elsewhere; the match sits near the end
    # of the file
    if data in line:
        do_something(data)
end = time.time()
print "processing time = %s seconds" % (end - start)

But with the above method it takes almost 18 seconds to read the full file (my computer has an Intel i3 processor and 4 GB of RAM). Likewise, a larger file takes even more time, which is very long from the user's point of view. I have read a lot of opinions on forums and referred to multiple Stack Overflow questions, but didn't find a fast and efficient way to read and fetch data from large files. Is there really any way in Python to read large text files in a few seconds?

Recommended Answer

No, there is no faster way of processing a file line by line, not from Python.

Your bottleneck is your hardware, not how you read the file. Python is already doing everything it can (using a buffer to read the file in larger chunks before splitting it on newlines).

I suggest upgrading your disk to an SSD.
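
As a quick sanity check on the hardware claim: 550 MB read in 18 seconds works out to roughly 30 MB/s, which is in the range of a busy laptop hard drive, while even a modest SSD sustains several hundred MB/s. A small sketch like the following (again with a placeholder path) measures the raw sequential read speed of the disk, taking line splitting out of the picture entirely:

import os
import time

path = "large_file.txt"  # placeholder: point this at the real file
size_mb = os.path.getsize(path) / (1024.0 * 1024.0)

start = time.time()
with open(path, 'rb') as f:
    # read raw 1 MB chunks and discard them; this approximates the
    # disk's sequential throughput with no per-line work at all
    while f.read(1024 * 1024):
        pass
elapsed = time.time() - start

print("%.0f MB in %.1f s -> %.0f MB/s" % (size_mb, elapsed, size_mb / elapsed))

Note that a repeated run may be much faster because the operating system caches the file in RAM, so measure on a cold cache to get a fair disk number.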
