Is there a way to go back when reading a file using seek and calls to next()?

Question

I'm writing a Python script to read a file, and when I arrive at a section of the file, the final way to read those lines in the section depends on information that's given also in that section. So I found here that I could use something like

fp = open('myfile')
last_pos = fp.tell()
line = fp.readline()
while line != '':
    if line == 'SPECIAL':
        fp.seek(last_pos)
        other_function(fp)
        break
    last_pos = fp.tell()
    line = fp.readline()

Yet, the structure of my current code is something like the following:

import itertools

fh = open(filename)

# get generator function and attach None at the end to stop iteration
items = itertools.chain(((lino,line) for lino, line in enumerate(fh, start=1)), (None,))
item = True

while item:
  lino, line = next(items)

  # handle special section
  if line.startswith('SPECIAL'):

    start = fh.tell()

    for i in range(specialLines):
      lino, eline = next(items)
      # etc. get the special data I need here

    # try to set the pointer to start to reread the special section
    fh.seek(start)

    # then reread the special section

But this approach gives the following error:

telling position disabled by next() call

Is there a way to prevent this?

Recommended answer

Using the file as an iterator (such as calling next() on it or using it in a for loop) uses an internal buffer; the actual file read position is further along the file and using .tell() will not give you the position of the next line to yield.
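
As a minimal sketch of that behaviour on Python 3 (the temporary file and its contents are made up for illustration), calling next() on a text-mode file object and then asking for the position raises OSError instead of returning a misleading offset:

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('foo spam eggs\nSPECIAL\nbar\n')
    path = tmp.name

with open(path) as fh:
    first = next(fh)            # iterating the file switches on read-ahead
    try:
        fh.tell()
        tell_error = None
    except OSError as exc:      # text-mode files refuse to report a position here
        tell_error = str(exc)

os.remove(path)
print(repr(first), tell_error)
```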

If you need to seek back and forth, the solution is not to use next() directly on the file object but use file.readline() only. You can still use an iterator for that, use the two-argument version of iter():

fileobj = open(filename)
fh = iter(fileobj.readline, '')

Calling next() on fh will invoke fileobj.readline() until that function returns an empty string. In effect, this creates a file iterator that doesn't use the internal buffer.

Demo:

>>> fh = open('example.txt')
>>> fhiter = iter(fh.readline, '')
>>> next(fhiter)
'foo spam eggs\n'
>>> fh.tell()
14
>>> fh.seek(0)
0
>>> next(fhiter)
'foo spam eggs\n'

Note that your enumerate chain can be simplified to:

items = itertools.chain(enumerate(fh, start=1), (None,))

although I am in the dark why you think a (None,) sentinel is needed here; StopIteration will still be raised, albeit one more next() call later.

To read the next specialLines lines, use itertools.islice():

for lino, eline in islice(items, specialLines):
    # etc. get the special data I need here
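
A small self-contained sketch of how islice() behaves here, using a made-up list of lines in place of the real file and a hypothetical special_lines count: it consumes exactly that many items from the shared iterator, and iteration then resumes right after the section.

```python
from itertools import islice

# Stand-in for the enumerated line iterator; the data is invented.
items = enumerate(['SPECIAL\n', 'a\n', 'b\n', 'c\n'], start=1)
lino, line = next(items)       # the 'SPECIAL' marker line
special_lines = 2              # hypothetical count; real code would know this

# islice pulls exactly special_lines items from the shared iterator.
section = list(islice(items, special_lines))
# Whatever is left continues after the special section.
rest = list(items)

print(section, rest)
```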

You can just loop directly over fh instead of using an infinite loop and next() calls here too:

from itertools import islice

with open(filename) as fh:
    # readline-based iterator, so fh.tell() and fh.seek() keep working
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        # handle special section
        if line.startswith('SPECIAL'):
            start = fh.tell()

            for lino, eline in islice(enumerated, specialLines):
                # etc. get the special data I need here
                pass

            fh.seek(start)

but do note that your line numbers will still increment even when you seek back!
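
If the line numbers matter, one possible workaround (a sketch, not part of the original answer) is to remember the line number alongside the file position and rebuild the enumerate() counter after seeking back. The file contents, the `rewound` flag, and the single-line "first pass" below are all assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this illustration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('header\nSPECIAL\nx\ny\ntail\n')
    path = tmp.name

seen = []
with open(path) as fh:
    lines = enumerate(iter(fh.readline, ''), start=1)
    rewound = False
    while True:
        item = next(lines, None)
        if item is None:                      # readline() returned '' (end of file)
            break
        lino, line = item
        seen.append((lino, line.rstrip('\n')))
        if line.startswith('SPECIAL') and not rewound:
            start, start_lino = fh.tell(), lino
            next(lines)                       # first pass: look ahead one line
            fh.seek(start)                    # go back to just after the marker
            # Restart the counter so numbering continues from the marker line.
            lines = enumerate(iter(fh.readline, ''), start=start_lino + 1)
            rewound = True

os.remove(path)
print(seen)
```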

You probably want to refactor your code to not need to re-read sections of your file, however.
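
One way such a refactoring could look, sketched under assumptions: read the special section into a list once, then make as many passes over that list as needed, with no seek() at all. The `read_special` helper, the `sep=` setting, and the sample lines are hypothetical and only illustrate the idea.

```python
from itertools import islice

def read_special(lines, count):
    """Collect `count` lines once so they can be inspected repeatedly."""
    return [line.rstrip('\n') for line in islice(lines, count)]

# Made-up input: a setting inside the section decides how its rows are parsed.
lines = iter(['SPECIAL\n', 'sep=;\n', '1;2\n', '3;4\n', 'tail\n'])
next(lines)                                   # skip the 'SPECIAL' marker line
section = read_special(lines, 3)

# First pass over the buffered lines: find the parsing information.
sep = next(l.split('=', 1)[1] for l in section if l.startswith('sep='))
# Second pass over the same lines: parse the data rows with it.
rows = [l.split(sep) for l in section if not l.startswith('sep=')]

after_section = next(lines)                   # iteration resumes after the section
print(sep, rows, repr(after_section))
```

Because the section lives in memory, this also works with plain file iteration, where tell() and seek() are not available.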
