Why isn't truncate defaulting properly to the current position for files?


Problem description

In an answer to another question, an odd behavior was observed, specific to Python 3. The documentation for the truncate command states (emphasis mine):

Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn’t changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.

However...

>>> open('temp.txt', 'w').write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')
32
>>> f = open('temp.txt', 'r+')
>>> f.readline()
'ABCDE\n'
>>> f.tell()
6                   # As expected, current position is 6 after the readline
>>> f.truncate()
32                  # ?!

Instead of truncating at the current position (6), it truncated at the end of the file (i.e. not at all). This was verified by checking the file on disk.

This process works as expected (file truncated to 6 bytes) in Python 2, and also in Python 3 using a StringIO instead of a file. Why is it not working as expected with files in Python 3? Is this a bug?
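For comparison, here is a minimal sketch of the in-memory case mentioned above. The function name `truncate_stringio_after_first_line` is made up for illustration; only `io.StringIO` and its standard methods are used.

```python
import io

DATA = "ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n"  # same 32 characters as temp.txt


def truncate_stringio_after_first_line():
    """Read one line from a StringIO, then truncate with no explicit size."""
    buf = io.StringIO(DATA)
    buf.readline()              # consumes 'ABCDE\n', position is now 6
    new_size = buf.truncate()   # no size given: uses the current position
    return new_size, buf.getvalue()


if __name__ == "__main__":
    print(truncate_stringio_after_first_line())
```

A StringIO has no separate read-ahead buffer underneath it, so the position it truncates at is the same one that tell() reports.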

(Edit: it also works properly if an explicit f.seek(6) is given right before the truncate.)
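A variant of the same idea, sketched as a script: rather than relying on the default, pass the size explicitly with f.truncate(f.tell()). The helper names `truncate_after_first_line` and `demo` are hypothetical, and the file is written in binary mode with a fixed newline setting only so that the byte count is 32 on every platform. What the bare truncate() leaves on disk may depend on the Python version, so the script reports it rather than assuming it.

```python
import os
import tempfile

DATA = b"ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n"  # 32 bytes, as in the question


def truncate_after_first_line(path, explicit):
    """Read one line in text mode, truncate, and return the on-disk size."""
    with open(path, "wb") as f:
        f.write(DATA)
    with open(path, "r+", encoding="ascii", newline="\n") as f:
        f.readline()
        if explicit:
            f.truncate(f.tell())  # size passed explicitly
        else:
            f.truncate()          # relies on the documented default
    return os.path.getsize(path)


def demo():
    """Return (size after bare truncate, size after explicit truncate)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "temp.txt")
        return (truncate_after_first_line(path, explicit=False),
                truncate_after_first_line(path, explicit=True))


if __name__ == "__main__":
    default_size, explicit_size = demo()
    print("bare truncate():   ", default_size)
    print("truncate(f.tell()):", explicit_size)
```

Passing the size explicitly does not depend on which layer's notion of "current position" the default ends up using.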

Solution

>>> open('temp.txt', 'w').write('ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n')
32
>>> f = open('temp.txt', 'r+')
>>> f.readline()
'ABCDE\n'
>>> f.seek(6) 
>>> f.truncate()

If nothing else, this fixes the issue. I have no idea why it happens, but it would be a good thing to report it upstream if it hasn't been reported already.

These are the only textual differences in the truncate() functions between Python 3 and Python 2 that I could find (apart from the related function calls within the truncate function itself, obviously):

33,34c33,34
<             except AttributeError as err:
<                 raise TypeError("an integer is required") from err
---
>             except AttributeError:
>                 raise TypeError("an integer is required")
54c54
<         """Truncate size to pos, where pos is an int."""
---
>         """Truncate size to pos."""

I'm sure someone will slap my fingers on the subject, but I think it's more related to the flush() calls and how the buffer is handled once you call flush. It's almost as if it doesn't reset to its previous position after flushing all the I/O. It's a wild assumption with nothing technical to back it up yet, but it would be my first guess.
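One way to poke at that guess is to compare the position reported by the text layer with the position of the buffered object underneath it. This is only a sketch, and the helper name `positions_after_readline` is made up; `f.buffer` is the standard attribute exposing the underlying binary buffer of a text file. If the buffer has read ahead past the first line, the two numbers will differ, which would fit a truncate that ends up using the buffer's position instead of the text layer's.

```python
import os
import tempfile

DATA = b"ABCDE\nFGHIJ\nKLMNO\nPQRST\nUVWXY\nZ\n"  # 32 bytes


def positions_after_readline():
    """Return (text-layer position, underlying buffer position) after one readline."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "temp.txt")
        with open(path, "wb") as f:
            f.write(DATA)
        with open(path, "r+", encoding="ascii", newline="\n") as f:
            f.readline()
            text_pos = f.tell()           # logical position seen by the caller
            buffer_pos = f.buffer.tell()  # how far the binary layer has read
        return text_pos, buffer_pos


if __name__ == "__main__":
    text_pos, buffer_pos = positions_after_readline()
    print("text layer:  ", text_pos)
    print("buffer layer:", buffer_pos)
```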

I looked into the flush() situation. Here is the only difference between the two: Python 2 performs the following operation, which Python 3 does not (it even lacks the source code for it):

def _flush_unlocked(self):
    if self.closed:
        raise ValueError("flush of closed file")
    while self._write_buf:
        try:
            n = self.raw.write(self._write_buf)
        except BlockingIOError:
            raise RuntimeError("self.raw should implement RawIOBase: it "
                               "should not raise BlockingIOError")
        except IOError as e:
            if e.errno != EINTR:
                raise
            continue
        if n is None:
            raise BlockingIOError(
                errno.EAGAIN,
                "write could not complete without blocking", 0)
        if n > len(self._write_buf) or n < 0:
            raise IOError("write() returned incorrect number of bytes")
        del self._write_buf[:n]

It's a method of BufferedWriter, which appears to be used in this I/O operation.

Now I'm late for a date so I've gotta dash. It will be interesting to see what you guys find in the meantime!
