ProcessPoolExecutor logging fails to log inside function on Windows but not on Unix / Mac
Question
When I run the following script on a Windows computer, I do not see any of the log messages from the log_pid function, but I do when I run it on Unix / Mac. I've read before that multiprocessing behaves differently on Windows than on Mac, but it's not clear to me what changes I should make to get this script to work on Windows. I'm running Python 3.6.
import logging
import sys
from concurrent.futures import ProcessPoolExecutor
import os

def log_pid(x):
    logger.info('Executing on process: %s' % os.getpid())

def do_stuff():
    logger.info('this is the do stuff function.')
    with ProcessPoolExecutor(max_workers=4) as executor:
        executor.map(log_pid, range(0, 10))

def main():
    logger.info('this is the main function.')
    do_stuff()

if __name__ == '__main__':
    logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
    logger = logging.getLogger(__name__)
    logger.info('Start of script ...')
    main()
    logger.info('End of script ...')
Recommended answer
On Unix, processes are created via the fork strategy: the child is cloned from the parent and continues execution from the point at which the parent forked. The child therefore inherits everything the parent had already defined, including the logger object.
On Windows it is quite different: a blank process is created and a new Python interpreter is launched. The interpreter then imports the module where the log_pid function is defined and executes the function.
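To check which of these strategies a given machine uses, the multiprocessing module can report its start methods. The following is a minimal sketch; the output depends on the platform and Python version. On Windows only spawn is available, and since Python 3.8 macOS also defaults to spawn, so the behavior described for Windows can appear there as well.

```python
import multiprocessing
import sys

# Start methods this platform supports (Windows offers only 'spawn').
print(sys.platform, multiprocessing.get_all_start_methods())

# The method used by default when none is requested explicitly.
print(multiprocessing.get_start_method())
```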
This means the if __name__ == '__main__' section is not executed by the newly spawned child process. Hence the logger object is never created there, and the log_pid function fails with a NameError. You don't see the error because executor.map returns its results lazily and the script never reads them, so the exception is never re-raised in the parent. Try modifying the logic as follows.
def do_stuff():
    logger.info('this is the do stuff function.')
    with ProcessPoolExecutor(max_workers=4) as executor:
        iterator = executor.map(log_pid, range(0, 10))
        list(iterator)  # collect the results in a list
The problem will then become evident:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Python36-32\lib\concurrent\futures\process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "C:\Program Files (x86)\Python36-32\lib\concurrent\futures\process.py", line 153, in _process_chunk
    return [fn(*args) for args in chunk]
  File "C:\Program Files (x86)\Python36-32\lib\concurrent\futures\process.py", line 153, in <listcomp>
    return [fn(*args) for args in chunk]
  File "C:\Users\cafama\Desktop\pool.py", line 8, in log_pid
    logger.info('Executing on process: %s' % os.getpid())
NameError: name 'logger' is not defined
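The same NameError can be reproduced on any platform without starting a process. The sketch below only imitates what a spawned child does: it runs a trimmed copy of the script under the module name __mp_main__ (the name multiprocessing gives the re-imported main module) instead of __main__. The SCRIPT string, the load_as helper and the pool_demo.py file name are made up for illustration and are not part of the original script.

```python
import os
import runpy
import tempfile

# Trimmed copy of the question's script: logger exists only under __main__.
SCRIPT = '''
import logging
import os

def log_pid(x):
    logger.info('Executing on process: %s' % os.getpid())

if __name__ == '__main__':
    logger = logging.getLogger(__name__)
'''

def load_as(run_name):
    """Run SCRIPT from a temporary file under the given module name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pool_demo.py")  # hypothetical file name
        with open(path, "w") as handle:
            handle.write(SCRIPT)
        return runpy.run_path(path, run_name=run_name)

parent_globals = load_as("__main__")     # how the parent runs the script
child_globals = load_as("__mp_main__")   # how a spawned child re-imports it

print("logger" in parent_globals)  # True
print("logger" in child_globals)   # False

try:
    child_globals["log_pid"](0)
except NameError as exc:
    print(exc)  # name 'logger' is not defined
```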
When dealing with process pools (whether from concurrent.futures or multiprocessing), always collect the results of the computation so that silent failures do not cause confusion.
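One way to collect results without letting the first failure abort the rest is to submit each task and inspect its future. The sketch below is an illustration, not part of the original answer: run_all and flaky are hypothetical names, and a ThreadPoolExecutor is swapped in only so the example runs the same way everywhere. The helper accepts any executor, so in real use a ProcessPoolExecutor can be passed instead, since futures report exceptions the same way for both.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def run_all(executor, fn, items):
    """Submit fn(item) for every item; return (results, errors) keyed by item."""
    futures = {executor.submit(fn, item): item for item in items}
    results, errors = {}, {}
    for future in as_completed(futures):
        item = futures[future]
        exc = future.exception()  # None if the call succeeded
        if exc is None:
            results[item] = future.result()
        else:
            logger.error("task %r failed: %r", item, exc)
            errors[item] = exc
    return results, errors

def flaky(x):
    # Hypothetical task that fails for odd inputs.
    if x % 2:
        raise ValueError("odd input: %d" % x)
    return x * x

# Thread pool used here for portability; a process pool fits the same helper.
with ThreadPoolExecutor(max_workers=4) as executor:
    results, errors = run_all(executor, flaky, range(4))

print(sorted(results.items()))  # [(0, 0), (2, 4)]
print(sorted(errors))           # [1, 3]
```

Every failure is logged in the parent process, so a broken worker function cannot pass unnoticed.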
To fix the problem, move the logging configuration and the logger creation to the top level of the module, so that they also run when a child process imports it. The script then works on all platforms.
import logging
import sys
from concurrent.futures import ProcessPoolExecutor
import os

logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logger = logging.getLogger(__name__)

def log_pid(x):
    logger.info('Executing on process: %s' % os.getpid())

...