Python Does Not Read Entire Text File


Question




I'm running into a problem that I haven't seen anyone on StackOverflow encounter, and that I couldn't find on Google either.

My main goal is to replace occurrences of a string in a file with another string. Is there a way to access all of the lines in the file?

The problem is that when I try to read in a large text file (1-2 GB), Python only reads a subset of it.

For example, I'll run a really simple piece of code such as:

newfile = open("newfile.txt","w")
f = open("filename.txt","r")
for line in f:
    replaced = line.replace("string1", "string2")
    newfile.write(replaced)

And it only writes the first 382 MB of the original file. Has anyone encountered this problem previously?

I tried a few different solutions such as using:

import fileinput
import sys
for i, line in enumerate(fileinput.input("filename.txt", inplace=1)):
    sys.stdout.write(line.replace("string1", "string2"))

But it has the same effect, as does reading the file in chunks, for example with:

f.read(10000)

I've narrowed it down to most likely being a problem with reading rather than writing, because it also happens when I simply print out the lines. I know that there are more lines: when I open the file in a full text editor such as Vim, I can see what the last line should be, and it is not the last line that Python prints.

Can anyone offer any advice or things to try?

I'm currently using a 32-bit version of Windows XP with 3.25 GB of RAM, and running Python 2.7.

Edit: Solution found (thanks, Lattyware), using an iterator:

def read_in_chunks(file, chunk_size=1000):
    # Yield successive chunks until read() returns nothing (end of file).
    while True:
        data = file.read(chunk_size)
        if not data:
            break
        yield data
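
For the stated goal of replacing a string, chunks cannot simply be processed one by one, because a match may straddle two chunks. The following is only a sketch of one way to handle that, not part of the original post: `replace_in_chunks` is a hypothetical helper, it assumes the files are opened in binary mode and that `old` is non-empty, and it holds back the last `len(old) - 1` bytes of each chunk until the next chunk arrives.

```python
import io


def read_in_chunks(file, chunk_size=1000):
    """Yield successive chunks until read() returns nothing (end of file)."""
    while True:
        data = file.read(chunk_size)
        if not data:
            break
        yield data


def replace_in_chunks(src, dst, old, new, chunk_size=1000):
    """Copy src to dst, replacing old with new (hypothetical helper).

    Assumes binary file objects and a non-empty `old`.
    """
    keep = len(old) - 1  # longest partial match that can end a chunk
    carry = b""
    for chunk in read_in_chunks(src, chunk_size):
        parts = (carry + chunk).split(old)
        rest = parts.pop()  # data after the last complete match
        for part in parts:
            dst.write(part)
            dst.write(new)
        # Hold back a tail that could be the start of a match.
        cut = max(len(rest) - keep, 0)
        dst.write(rest[:cut])
        carry = rest[cut:]
    dst.write(carry)  # no more data, so the tail cannot be a match


# In-memory demo; real use would pass files opened with "rb" and "wb".
src = io.BytesIO(b"a string1 b string1 c")
dst = io.BytesIO()
replace_in_chunks(src, dst, b"string1", b"string2", chunk_size=4)
print(dst.getvalue())
```

The held-back tail is what makes the result match a plain `replace` on the whole file, at the cost of a little extra bookkeeping per chunk.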

Solution

Try:

f = open("filename.txt", "rb")

On Windows, "rb" opens the file in binary mode. According to the docs, text mode vs. binary mode only affects end-of-line characters. But (if I remember correctly) opening a file in text mode on Windows also treats the byte 0x1A (Ctrl-Z) as an end-of-file marker, so reading can stop at the first such byte even though more data follows.
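
One way to check whether this explains a truncated read is to look for a 0x1A byte in the file while reading it in binary mode. The sketch below is illustrative only: `find_ctrl_z` is a hypothetical helper name, and the 1 MB chunk size is an arbitrary choice. If the offset it reports is close to the point where the text-mode read stopped, the EOF byte is the likely cause.

```python
import os
import tempfile


def find_ctrl_z(path, chunk_size=1 << 20):
    """Return the byte offset of the first 0x1A in the file, or -1 if none."""
    offset = 0
    with open(path, "rb") as f:  # binary mode: no EOF-byte handling
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1
            pos = chunk.find(b"\x1a")
            if pos != -1:
                return offset + pos
            offset += len(chunk)


# Demo on a small temporary file with a Ctrl-Z byte in the middle.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"first line\n\x1asecond line\n")
print(find_ctrl_z(demo_path))
os.remove(demo_path)
```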

You can also specify the mode when using fileinput:

fileinput.input("filename.txt", inplace=1, mode="rb")
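
Putting the suggestion together with the loop from the question, a minimal sketch might look like the following. The function name `replace_in_file` is made up for illustration, and the search and replacement strings are passed as byte strings because the files are read and written in binary mode. The output file is opened with "wb" as well, so that existing "\r\n" line endings are written back unchanged.

```python
import os
import tempfile


def replace_in_file(src_path, dst_path, old, new):
    """Copy src_path to dst_path, replacing old with new on every line."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:  # lines are split on b"\n" in binary mode
            dst.write(line.replace(old, new))


# Demo with a temporary input containing a Ctrl-Z byte and CRLF endings.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "filename.txt")
dst_path = os.path.join(tmp_dir, "newfile.txt")
with open(src_path, "wb") as f:
    f.write(b"string1 here\r\n\x1astring1 after Ctrl-Z\r\n")

replace_in_file(src_path, dst_path, b"string1", b"string2")

with open(dst_path, "rb") as f:
    print(f.read())
```

Because the replacement runs line by line, a match that spans a line break would not be found; for the single-line strings in the question that should not matter.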
