Why don't all the shell processes in my promises (start blocks) run? (Is this a bug?)
Problem description
I want to run multiple shell processes, but when I try to run more than 63, they hang. When I reduce max_threads in the thread pool to n, it hangs after running the nth shell command.
As you can see in the code below, the problem is not in start blocks per se, but in start blocks that contain the shell command:
    #!/bin/env perl6
    my $*SCHEDULER = ThreadPoolScheduler.new( max_threads => 2 );
    my @processes;

    # The Promises generated by this loop work as expected when awaited
    for @*ARGS -> $item {
        @processes.append(
            start { say "Planning on processing $item" }
        );
    }

    # The nth Promise generated by the following loop hangs when awaited (where n = max_thread)
    for @*ARGS -> $item {
        @processes.append(
            start { shell "echo 'processing $item'" }
        );
    }

    await(@processes);
Running ./process_items foo bar baz gives the following output, hanging after processing bar, which is just after the nth (here 2nd) thread has run using shell:
    Planning on processing foo
    Planning on processing bar
    Planning on processing baz
    processing foo
    processing bar
What am I doing wrong? Or is this a bug?
Perl 6 distributions tested on CentOS 7:
Rakudo Star 2018.06
Rakudo Star 2018.10
Rakudo Star 2019.03-RC2
Rakudo Star 2019.03
With Rakudo Star 2019.03-RC2, use v6.c versus use v6.d did not make any difference.
Recommended answer
The shell and run subs use Proc, which is implemented in terms of Proc::Async. Proc::Async uses the thread pool internally. When the pool is filled with blocking calls to shell, it becomes exhausted and no thread is left to process events, which results in the hang.
It would be far better to use Proc::Async directly for this task. The approach of using shell with a load of real threads won't scale well; every OS thread has memory overhead, GC overhead, and so forth. Since spawning a bunch of child processes is not CPU-bound, this is rather wasteful; in reality, just one or two real threads are needed. So, in this case, perhaps the implementation pushing back on you when you do something inefficient isn't the worst thing.
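As a minimal sketch of what using Proc::Async directly might look like for the script in the question (with no concurrency limit yet), each item gets its own Proc::Async object and the Promise returned by start is collected. It assumes an echo executable is available on the PATH, and it passes the message as a single argument instead of going through a shell:

```perl6
# Sketch only: one child process per command-line argument, no limit on
# how many run at once. Assumes an `echo` executable is on the PATH.
my @promises = @*ARGS.map: -> $item {
    # .start launches the child and returns a Promise that is kept
    # when the process exits; it does not block a pool thread while waiting.
    Proc::Async.new('echo', "processing $item").start
};

# Wait for every child process to finish. Since stdout is not tapped here,
# the children's output goes to the parent's stdout.
await @promises;
```

Because nothing here blocks inside a start block, the pool threads stay free to handle process events.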
I notice that one of the reasons for using shell and the thread pool is to try and limit the number of concurrent processes. But this isn't a very reliable way to do it; just because the current thread pool implementation sets a default maximum of 64 threads does not mean it always will do so.
Here's an example of a parallel test runner that runs up to 4 processes at once, collects their output, and envelopes it. It's a little more than you perhaps need, but it nicely illustrates the shape of the overall solution:
    my $degree = 4;
    my @tests = dir('t').grep(/\.t$/);
    react {
        sub run-one {
            my $test = @tests.shift // return;
            my $proc = Proc::Async.new('perl6', '-Ilib', $test);
            my @output = "FILE: $test";
            whenever $proc.stdout.lines {
                push @output, "OUT: $_";
            }
            whenever $proc.stderr.lines {
                push @output, "ERR: $_";
            }
            my $finished = $proc.start;
            whenever $finished {
                push @output, "EXIT: {.exitcode}";
                say @output.join("\n");
                run-one();
            }
        }
        run-one for 1..$degree;
    }
The key thing here is the call to run-one when a process ends, which means that you always replace an exited process with a new one, maintaining (so long as there are things to do) up to 4 processes running at a time. The react block naturally ends when all processes have completed, because the number of events subscribed to drops to zero.
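Applied to the script from the question, the same pattern might be reduced to something like the following sketch. The limit of 2 and the @items copy are illustrative assumptions, and it again assumes an echo executable on the PATH; output is not captured, so each child writes straight to the parent's stdout:

```perl6
# Sketch only: process the command-line items with at most $degree
# child processes running at any one time.
my $degree = 2;        # assumed concurrency limit, chosen for illustration
my @items  = @*ARGS;   # work queue copied from the arguments

react {
    sub run-one {
        # Take the next item, or stop refilling once the queue is empty.
        my $item = @items.shift // return;
        my $proc = Proc::Async.new('echo', "processing $item");
        # When this child exits, start the next one in its place.
        whenever $proc.start {
            run-one();
        }
    }
    # Prime the pipeline with up to $degree processes.
    run-one for 1..$degree;
}
```

Invoked as ./process_items foo bar baz, this should start two processes immediately and launch the third as soon as one of the first two exits.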