NamedTemporaryFile speed underwhelming

Question

I'm trying to use a NamedTemporaryFile and pass it to an external program, then collect that program's output using Popen. My hope was that this would be quicker than creating a real file on the hard disk and would avoid as much IO as possible. The temp files I am creating are small, on the order of a KB or so, and I am finding that creating a temp file to work with is actually slower than using a normal file for reading/writing. Is there a trick I am missing here? What is going on behind the scenes when I use a NamedTemporaryFile?
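For readers unfamiliar with the workflow described above, here is a minimal sketch of it. It is an illustration only: the helper name run_on_tempfile is hypothetical, and a Python one-liner that counts lines stands in for the real external program, which the question does not name.

```python
import os
import subprocess
import sys
import tempfile


def run_on_tempfile(records):
    """Write records to a named temp file, run a child process on it, return its output."""
    # mode="w" so that str can be written; delete=False so the file still
    # exists after the with-block, when the child process opens it by name.
    with tempfile.NamedTemporaryFile(mode="w", suffix=".fa", delete=False) as temp:
        for idx, seq in enumerate(records):
            temp.write(">{}\n{}\n".format(idx, seq))
        path = temp.name
    try:
        # Stand-in for the external program (an assumption): count the lines in the file.
        child_code = "import sys; print(sum(1 for _ in open(sys.argv[1])))"
        proc = subprocess.Popen(
            [sys.executable, "-c", child_code, path],
            stdout=subprocess.PIPE,
        )
        out, _ = proc.communicate()
        return out.decode().strip()
    finally:
        os.remove(path)  # delete=False means cleanup is our responsibility
```

Closing the file before launching the child process ensures the data has been flushed to disk and avoids sharing problems on platforms that do not allow a file to be opened twice.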

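import tempfile

# Using a named temp file
# delete=False keeps the file on disk after the with-block, so it can be passed to process calls by name
with tempfile.NamedTemporaryFile(mode="w", delete=False) as temp:
    for idx, item in enumerate(r):
        temp.write(">{}\n{}\n".format(idx, item[1]))
name = temp.name  # path of the temp file, used for the read test
>>> 8.435 ms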

# Using normal file io
with open("test.fa", "w") as temp:
    for idx, item in enumerate(r):
        temp.write(">{}\n{}\n".format(idx, item[1]))
>>> 0.506 ms

#--------

# Read using temp file
[i for i in open(name, "r")]
>>> 1.167 ms

# Read using normal file
[i for i in open("test.fa", "r")]
>>> 0.765 ms

After a bit of profiling, it seems almost all of the time is spent creating the temp object. The call to tempfile.NamedTemporaryFile(delete=False) alone takes over 8 ms in this example.
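One way to isolate the creation cost from the write cost is to time file creation alone. The sketch below is an assumption about how such a measurement could be set up, not the asker's actual profiling code; the function names are hypothetical and the numbers will vary by machine and by the state of the temp directory.

```python
import os
import tempfile
import timeit


def time_named_tempfile(repeats=50):
    """Average seconds to create, close, and remove one NamedTemporaryFile."""
    def create():
        with tempfile.NamedTemporaryFile(mode="w", delete=False) as temp:
            path = temp.name
        os.remove(path)
    return timeit.timeit(create, number=repeats) / repeats


def time_plain_open(repeats=50):
    """Average seconds to create, close, and remove a regular file with a fixed name."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "test.fa")

    def create():
        with open(path, "w"):
            pass
        os.remove(path)

    try:
        return timeit.timeit(create, number=repeats) / repeats
    finally:
        os.rmdir(workdir)  # directory is empty again after the last remove


if __name__ == "__main__":
    print("NamedTemporaryFile: {:.3f} ms".format(time_named_tempfile() * 1e3))
    print("open():             {:.3f} ms".format(time_plain_open() * 1e3))
```

If the first figure is much larger than the second, the overhead lies in creating the temp file rather than in writing to it.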

Recommended answer

I will try to answer your question, although I am not very experienced with Python runtime efficiency.

Digging into the code of Python's tempfile.py gives a clue about what might take the time. The _mkstemp_inner function may attempt to open several candidate file names, and each attempt that hits an existing name raises an OSError that is caught before the next name is tried. The more temp files your directory contains, the more file name collisions you get and the longer this takes. Try emptying your temp directory.

def _mkstemp_inner(dir, pre, suf, flags):
    """Code common to mkstemp, TemporaryFile, and NamedTemporaryFile."""

    names = _get_candidate_names()

    for seq in range(TMP_MAX):
        name = next(names)
        file = _os.path.join(dir, pre + name + suf)
        try:
            fd = _os.open(file, flags, 0o600)
            _set_cloexec(fd)
            return (fd, _os.path.abspath(file))
        except OSError as e:
            if e.errno == _errno.EEXIST:
                continue # try again
            raise

    raise IOError(_errno.EEXIST, "No usable temporary file name found")
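The retry-on-collision behaviour of the loop above can be reproduced in isolation. This is a simplified sketch, not the standard library's implementation: create_unique and demo are hypothetical helpers, and the candidate names are fixed rather than random so that the collisions are predictable.

```python
import errno
import os
import tempfile


def create_unique(directory, candidates):
    """Try candidate names in order; return (path, number_of_collisions)."""
    # O_EXCL makes os.open fail with EEXIST if the name is already taken.
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    collisions = 0
    for name in candidates:
        path = os.path.join(directory, name)
        try:
            fd = os.open(path, flags, 0o600)
        except OSError as e:
            if e.errno == errno.EEXIST:
                collisions += 1  # name taken: pay for one failed open, then retry
                continue
            raise
        os.close(fd)
        return path, collisions
    raise IOError(errno.EEXIST, "No usable temporary file name found")


def demo():
    """Pre-create two of three candidate names and report what happens."""
    workdir = tempfile.mkdtemp()
    try:
        names = ["tmp_a", "tmp_b", "tmp_c"]
        for taken in names[:2]:
            open(os.path.join(workdir, taken), "w").close()
        path, collisions = create_unique(workdir, names)
        return os.path.basename(path), collisions
    finally:
        for leftover in os.listdir(workdir):
            os.remove(os.path.join(workdir, leftover))
        os.rmdir(workdir)
```

Here demo() returns ("tmp_c", 2): two failed opens precede the successful one. Each collision costs one failed system call plus one raised and caught exception, which is the overhead the answer is pointing at.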

Hope that helps.
