Python Subprocess: how/when do they close files?


Problem description

I wonder why subprocesses keep so many files open. I have an example in which some files seem to remain open forever (after the subprocess finishes and even after the program crashes).

Consider the following code:

import aiofiles
import tempfile

async def main():
    return [await fds_test(i) for i in range(2000)]

async def fds_test(index):
    print(f"Writing {index}")
    handle, temp_filename = tempfile.mkstemp(suffix='.dat', text=True)
    async with aiofiles.open(temp_filename, mode='w') as fp:
        await fp.write('stuff')
        await fp.write('other stuff')
        await fp.write('EOF\n')

    print(f"Reading {index}")
    bash_cmd = 'cat {}'.format(temp_filename)
    process = await asyncio.create_subprocess_exec(*bash_cmd.split(), stdout=asyncio.subprocess.DEVNULL, close_fds=True)
    await process.wait()
    print(f"Process terminated {index}")

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

This spawns processes one after the other (sequentially), so I expect the number of files simultaneously opened by this code to also be one. But that's not the case, and at some point I get the following error:

/Users/cglacet/.pyenv/versions/3.8.0/lib/python3.8/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
   1410             # Data format: "exception name:hex errno:description"
   1411             # Pickle is not used; it is complex and involves memory allocation.
-> 1412             errpipe_read, errpipe_write = os.pipe()
   1413             # errpipe_write must not be in the standard io 0, 1, or 2 fd range.
   1414             low_fds_to_close = []

OSError: [Errno 24] Too many open files

I tried running the same code without the stdout=asyncio.subprocess.DEVNULL option, but it still crashes. This answer suggested that option might be where the problem comes from, and the error also points at the line errpipe_read, errpipe_write = os.pipe(). But that doesn't seem to be the problem (running without the option gives the same error).

In case you need more information, here is an overview from the output of lsof | grep python:

python3.8 19529 cglacet    7u      REG                1,5        138 12918796819 /private/var/folders/sn/_pq5fxn96kj3m135j_b76sb80000gp/T/tmpuxu_o4mf.dat
# ... 
# ~ 2000 entries later : 
python3.8 19529 cglacet 2002u      REG                1,5        848 12918802386 /private/var/folders/sn/_pq5fxn96kj3m135j_b76sb80000gp/T/tmpcaakgz3f.dat

These are the temporary files that my subprocesses are reading. The rest of the output from lsof looks legitimate (opened libraries like pandas/numpy/scipy/etc.).
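
To keep an eye on the descriptor count from inside the program instead of running lsof by hand, something like this can help (a sketch that assumes the third-party psutil package is installed; num_fds() is Unix-only):

import psutil

proc = psutil.Process()                    # current process
print(f"open file descriptors: {proc.num_fds()}")
for f in proc.open_files():                # regular files only, roughly the REG entries lsof shows
    print(f.path)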

Now I have a doubt: maybe the problem comes from the aiofiles asynchronous context manager? Maybe it's the one not closing the files, rather than create_subprocess_exec?

There is a similar question here (Python Subprocess: Too Many Open Files), but nobody really tried to explain/solve the problem (the only suggestion is to increase the limit). I would really like to understand what is going on; my first goal is not necessarily to work around the problem temporarily (in the future I want to be able to run fds_test as many times as needed). My goal is to have a function that behaves as expected. I probably have to change either my expectation or my code, which is why I'm asking this question.
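
For reference, the "just increase the limit" workaround suggested there boils down to something like this (a sketch using the standard library resource module, Unix-only; it raises the per-process soft limit but does nothing about the leak itself):

import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft}, hard limit: {hard}")
# Raise the soft limit (capped by the hard limit); roughly what `ulimit -n` does in a shell.
resource.setrlimit(resource.RLIMIT_NOFILE, (min(4096, hard), hard))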

As suggested in the comments here, I also tried running python -m test test_subprocess -m test_close_fds -v, which gives:

== CPython 3.8.0 (default, Nov 28 2019, 20:06:13) [Clang 11.0.0 (clang-1100.0.33.12)]
== macOS-10.14.6-x86_64-i386-64bit little-endian
== cwd: /private/var/folders/sn/_pq5fxn96kj3m135j_b76sb80000gp/T/test_python_52961
== CPU count: 8
== encodings: locale=UTF-8, FS=utf-8
0:00:00 load avg: 5.29 Run tests sequentially
0:00:00 load avg: 5.29 [1/1] test_subprocess
test_close_fds (test.test_subprocess.POSIXProcessTestCase) ... ok
test_close_fds (test.test_subprocess.Win32ProcessTestCase) ... skipped 'Windows specific tests'

----------------------------------------------------------------------

Ran 2 tests in 0.142s

OK (skipped=1)

== Tests result: SUCCESS ==

1 test OK.

Total duration: 224 ms
Tests result: SUCCESS

So it seems files should be correctly closed; I'm a bit lost here.

Recommended answer

The problem doesn't come from create_subprocess_exec. The problem in this code is that tempfile.mkstemp() actually opens the file:

mkstemp() returns a tuple containing an OS-level handle to an open file (as would be returned by os.open()) …
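
A minimal sketch (my addition, not from the original answer) that makes this visible: the first element of the returned tuple is an already-open descriptor, and it stays open until it is closed explicitly:

import os
import tempfile

handle, path = tempfile.mkstemp(suffix='.dat', text=True)
print(os.fstat(handle))        # works: the descriptor is already open
os.write(handle, b'hello\n')   # it can even be written to directly
os.close(handle)               # without this, the fd stays open for the life of the process
os.remove(path)                # clean up the temporary file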

I thought it would only create the file. To solve my problem I simply added a call to os.close(handle), which removes the error but is a bit weird (the file gets opened twice). So I rewrote it as:

import aiofiles
import tempfile
import uuid


async def main():
    await asyncio.gather(*[fds_test(i) for i in range(10)])

async def fds_test(index):
    dir_name = tempfile.gettempdir()
    file_id = f"{tempfile.gettempprefix()}{uuid.uuid4()}"
    temp_filename = f"{dir_name}/{file_id}.dat"

    async with aiofiles.open(temp_filename, mode='w') as fp:
        await fp.write('stuff')

    bash_cmd = 'cat {}'.format(temp_filename)
    process = await asyncio.create_subprocess_exec(*bash_cmd.split(), close_fds=True)
    await process.wait()


if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

现在,我想知道为什么错误是由subprocess而不是tempfile.mkstemp引起的,也许是因为它的子进程打开了太多文件,以致于临时文件的创建不可能突破极限……

Now I wonder why the error was raised by subprocess and not tempfile.mkstemp, maybe because it subprocess opens so much more files that it makes it unlikely that the temporary file creation is what breaks the limit …
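
As an alternative sketch (again my addition, not part of the original answer), tempfile.mkstemp() can be kept if the descriptor it already opened is wrapped with os.fdopen(), so the file is opened exactly once and closed deterministically (plain synchronous I/O here for brevity):

import asyncio
import os
import tempfile

async def fds_test_fdopen(index):
    handle, temp_filename = tempfile.mkstemp(suffix='.dat', text=True)
    with os.fdopen(handle, 'w') as fp:   # takes ownership of the fd and closes it on exit
        fp.write('stuff\n')

    process = await asyncio.create_subprocess_exec(
        'cat', temp_filename, stdout=asyncio.subprocess.DEVNULL)
    await process.wait()
    os.remove(temp_filename)             # remove the temporary file once the subprocess is done

if __name__ == "__main__":
    asyncio.run(fds_test_fdopen(0))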
