Python looping through input file


Problem description

My question is related to file-input in Python, using open(). I have a text file mytext.txt with 3 lines. I am trying to do two things with this file: print the lines, and print the number of lines.

I tried the following code:

input_file = open('mytext.txt', 'r')
count_lines = 0
for line in input_file:
    print line
for line in input_file:
    count_lines += 1
print 'number of lines:', count_lines

Result: it prints the 3 lines correctly, but prints "number of lines: 0" (instead of 3)


I found two ways to solve it, and get it to print 3:

1) I use one loop instead of two

input_file = open('mytext.txt', 'r')
count_lines = 0
for line in input_file:
    print line
    count_lines += 1
print 'number of lines:', count_lines

2) after the first loop, I define input_file again

input_file = open('mytext.txt', 'r')
count_lines = 0
for line in input_file:
    print line
input_file = open('mytext.txt', 'r')
for line in input_file:
    count_lines += 1
print 'number of lines:', count_lines


To me, it seems as if the definition input_file = ... is valid for only one loop, as if it were deleted after I use it in a loop. But I don't understand why; it is probably not yet 100% clear to me how variable = open(filename) is treated in Python.

By the way, I see that in this case it is better to use only one loop. However, I feel I have to get this question clear, since there might be cases when I can/must make use of it.

Solution

The file handle is its own iterator. After you iterate over the file, the file pointer sits at EOF (end of file) and the iterator raises StopIteration, which ends the loop. Looping again over a file whose pointer is at EOF raises StopIteration immediately, which is why the second loop counts zero. You can rewind the file pointer with input_file.seek(0) without reopening the file.
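To make the exhaustion and rewind behaviour concrete, here is a minimal sketch. It is written for Python 3 (so print is a function, unlike the Python 2 snippets in the question), and the helper name count_twice and the temporary sample file are assumptions made only for illustration.

```python
import os
import tempfile


def count_twice(path):
    """Loop over one file object three times, rewinding before the last pass."""
    with open(path) as input_file:
        first_pass = sum(1 for _ in input_file)
        # The file position is now at EOF, so this pass sees no lines.
        exhausted_pass = sum(1 for _ in input_file)
        input_file.seek(0)  # move the position back to the start
        rewound_pass = sum(1 for _ in input_file)
    return first_pass, exhausted_pass, rewound_pass


# Build a throwaway 3-line file so the sketch is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\ntwo\nthree\n')
    sample_path = tmp.name

print(count_twice(sample_path))  # (3, 0, 3)
os.remove(sample_path)
```

The middle value is the "0" from the question: the variable input_file still refers to the same open file, but its position has already reached the end.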

That said, counting lines in the same loop is more I/O efficient; otherwise you have to read the whole file from disk a second time just to count the lines. This is a very common pattern:

with open('filename.ext') as input_file:
    for i, line in enumerate(input_file):
        print line,
print "{0} line(s) printed".format(i+1)
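One edge case of this pattern: if the file is empty, the loop body never runs, so i is never bound and the final print raises NameError. A possible Python 3 variant that avoids this is sketched below; the function name print_and_count and the temporary sample file are hypothetical, chosen only for the example.

```python
import os
import tempfile


def print_and_count(path):
    """Print every line of the file at `path` and return how many there were."""
    count = 0  # stays 0 if the file has no lines
    with open(path) as input_file:
        # start=1 makes `count` the number of lines seen so far.
        for count, line in enumerate(input_file, start=1):
            print(line, end='')  # the line already ends with a newline
    return count


# Throwaway sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\ntwo\nthree\n')
    sample_path = tmp.name

print("{0} line(s) printed".format(print_and_count(sample_path)))
os.remove(sample_path)
```

Starting enumerate at 1 also removes the need for the i+1 adjustment in the final message.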

Since Python 2.5, file objects provide __enter__ and __exit__ to support the with statement interface. The with block is syntactic sugar for something like:

input_file = open('filename.ext')
try:
    for i, line in enumerate(input_file):
        print line,
finally:
    input_file.close()
print "{0} line(s) printed".format(i+1)

I think CPython will close file handles when they get garbage collected, but I'm not sure this holds true for every implementation. IMHO it is better practice to explicitly close resource handles.
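As a small check of the explicit-closing point, the sketch below inspects the file object's closed attribute inside and after a with block, including when the block exits through an exception. It targets Python 3; the helper names and the temporary file are assumptions for illustration only.

```python
import os
import tempfile


def closed_after_with(path):
    """Return the file's `closed` flag inside and after a with block."""
    with open(path) as input_file:
        inside = input_file.closed
    return inside, input_file.closed


def closed_after_error(path):
    """Raise inside a with block and report whether the file got closed."""
    try:
        with open(path) as input_file:
            raise ValueError('simulated failure while reading')
    except ValueError:
        pass
    return input_file.closed


# Throwaway sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('one\n')
    sample_path = tmp.name

print(closed_after_with(sample_path))   # (False, True)
print(closed_after_error(sample_path))  # True
os.remove(sample_path)
```

In both cases the file is closed as soon as the block is left, without relying on when the garbage collector runs.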
