Parallelize Bash script with maximum number of processes
Question
Let's say I have a loop in Bash:
for foo in `some-command`
do
do-something $foo
done
do-something is CPU bound and I have a nice shiny 4-core processor. I'd like to be able to run up to 4 do-something processes at once.
The naive approach seems to be:
for foo in `some-command`
do
do-something $foo &
done
This will run all the do-something processes at once, but there are a couple of downsides. The main one is that do-something may also perform significant I/O, and doing all of it at once might slow things down a bit. The other problem is that this code block returns immediately, so there is no way to do other work once all the do-something processes have finished.
How would you write this loop so that there are always X do-something processes running at once?
Recommended answer
Depending on what you want to do, xargs can also help (here: converting documents with pdf2ps):
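# Count the CPU entries exposed in sysfs (Linux only).
cpus=$( ls -d /sys/devices/system/cpu/cpu[[:digit:]]* | wc -w )
# Quote the pattern so the shell does not expand it; NUL-separate names so spaces are safe.
find . -name '*.pdf' -print0 | xargs --null --max-args=1 --max-procs="$cpus" pdf2ps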
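The same idea can be applied to the loop from the question. The following is only a minimal sketch: some_command and the inline sh -c body are hypothetical stand-ins for the question's some-command and do-something, and the limit of 4 is an assumption taken from the 4-core machine mentioned above. It uses the short options -n and -P, which correspond to --max-args and --max-procs.

```shell
# Hypothetical stand-in for the question's some-command: emits one item per line.
some_command() { printf '%s\n' a b c d e f g h; }

# Assumed limit, matching the 4-core processor in the question.
max_procs=4

# xargs starts one child per item (-n 1), keeps at most max_procs of them
# running at a time (-P), and returns only after every child has exited.
# The sh -c body is a placeholder for do-something; "$1" is the current item.
results=$(some_command | xargs -n 1 -P "$max_procs" sh -c 'sleep 0.1; echo "done $1"' _)

# Execution reaches this line only once all items have been processed.
echo "$results"
```

Because xargs itself blocks until its children finish, this also addresses the second concern in the question: any follow-up work can simply be placed after the pipeline. The order of the output lines is not guaranteed, since the children run concurrently.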
From the documentation:
--max-procs=max-procs
-P max-procs
Run up to max-procs processes at a time; the default is 1.
If max-procs is 0, xargs will run as many processes as possible at a
time. Use the -n option with -P; otherwise chances are that only one
exec will be done.
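The advice to combine -n with -P can be checked with a small experiment. This is an illustrative sketch, not part of the original answer: the four input items and the sh -c 'echo "$#"' child, which prints how many arguments it received, are assumptions chosen only to make the batching visible.

```shell
# Four short input items, one per line.
items=$(printf '%s\n' a b c d)

# Without -n, xargs packs as many items as fit onto one command line,
# so a single child receives all four arguments and -P has nothing to parallelize.
packed=$(printf '%s\n' "$items" | xargs -P 4 sh -c 'echo "$#"' _)

# With -n 1, each item gets its own child, so up to four can run at once.
split=$(printf '%s\n' "$items" | xargs -n 1 -P 4 sh -c 'echo "$#"' _)

echo "without -n, argument count per child: $packed"
echo "with -n 1, argument count per child:"
echo "$split"
```

In the first run a single line with the value 4 is printed; in the second run there are four lines, each with the value 1, one per child process.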