readinto() replacement?


Question

A straightforward way to copy a file in Python typically looks like this:

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)

(This code snippet is from shutil.py, by the way).

Unfortunately, this has drawbacks in my special use-case (involving threading and very large buffers) [Italics part added later]. First, it means that with each call of read() a new memory chunk is allocated and when buf is overwritten in the next iteration this memory is freed, only to allocate new memory again for the same purpose. This can slow down the whole process and put unnecessary load on the host.

To avoid this I'm using the file.readinto() method which, unfortunately, is documented as deprecated and "don't use":

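import array

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    # Relies on Python 2's file.readinto() and the 'c' array typecode
    buffer = array.array('c')
    buffer.fromstring('-' * length)  # preallocate the buffer once
    while True:
        count = fsrc.readinto(buffer)  # fills the existing buffer in place
        if count == 0:
            break
        if count != len(buffer):
            # short (last) block: tostring() makes a copy, the slice trims it
            fdst.write(buffer.tostring()[:count])
        else:
            buffer.tofile(fdst)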

My solution works, but there are two drawbacks as well: First, readinto() is not to be used. It might go away (says the documentation). Second, with readinto() I cannot decide how many bytes I want to read into the buffer and with buffer.tofile() I cannot decide how many I want to write, hence the cumbersome special case for the last block (which also is unnecessarily expensive).

I've looked at array.array.fromfile(), but it cannot be used to read "all there is" (reads, then throws EOFError and doesn't hand out the number of processed items). Also it is no solution for the ending special-case problem.

Is there a proper way to do what I want to do? Maybe I'm just overlooking a simple buffer class or similar which does what I want.
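A note for readers on Python 3: there, readinto() is a regular, supported method of binary file objects, and a bytearray wrapped in a memoryview can serve as the reusable buffer. The following is only a minimal sketch under that assumption, not code from the original post; the name copyfileobj_reuse and the returned byte count are illustrative additions. Slicing the memoryview does not copy, so the short last block needs no special case.

```python
import io


def copyfileobj_reuse(fsrc, fdst, length=16 * 1024):
    """Copy binary file-like fsrc to fdst through one preallocated buffer.

    Hypothetical helper for illustration; returns the number of bytes copied.
    """
    buf = bytearray(length)   # allocated once, reused for every chunk
    view = memoryview(buf)    # slices of a memoryview share memory, no copy
    total = 0
    while True:
        count = fsrc.readinto(view)  # fills the buffer in place
        if not count:                # 0 means end of file
            break
        fdst.write(view[:count])     # write only the bytes actually read
        total += count
    return total


if __name__ == "__main__":
    src = io.BytesIO(b"x" * 40000 + b"tail")
    dst = io.BytesIO()
    copied = copyfileobj_reuse(src, dst)
    print(copied, dst.getvalue() == src.getvalue())
```

Both files must be opened in binary mode ('rb' and 'wb') for this to work with real files; text-mode file objects do not provide readinto().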

Recommended answer

Normal Python code would not need tweaks like this. However, if you really need that much performance tuning to read files from inside Python code (as in, you are rewriting some server code that you wrote and that already works, for the sake of performance or memory usage), I'd rather call the OS directly using ctypes, thus having the copy performed at as low a level as I want.

It may even be that simply calling the "cp" executable as an external process is less of a hurdle in your case (and it would take full advantage of all OS and filesystem level optimizations for you).
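As a rough illustration of the "cp" suggestion, the sketch below delegates the copy to the external program through the standard subprocess module. It assumes Python 3 and a Unix-like system with cp on the PATH; the helper name copy_with_cp is made up for this example.

```python
import subprocess


def copy_with_cp(src_path, dst_path):
    """Copy a file by running the external cp program (hypothetical helper)."""
    # "--" ends option parsing, so paths starting with "-" are treated as files.
    # check=True raises CalledProcessError if cp exits with a non-zero status.
    subprocess.run(["cp", "--", src_path, dst_path], check=True)
```

If spawning a process is undesirable, shutil.copyfile() from the standard library is the in-process alternative; on recent Python 3 versions it can use platform-specific fast-copy system calls where they are available.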
