xargs output buffering -P parallel
Question
I have a bash function that I call in parallel using xargs -P, like so:
echo ${list} | xargs -n 1 -P 24 -I@ bash -l -c 'myAwesomeShellFunction @'
Everything works fine, but the output is messed up for obvious reasons (no buffering).
I'm trying to figure out a way to buffer the output effectively. I was thinking I could use awk, but I'm not good enough to write such a script, and I can't find anything worthwhile on Google. Can someone help me write this "output buffer" in sed or awk? Nothing fancy: just accumulate the output and spit it out after the process terminates. I don't care about the order in which the shell functions execute; I just need their output buffered. Something like:
echo ${list} | xargs -n 1 -P 24 -I@ bash -l -c 'myAwesomeShellFunction @ | sed -u ""'
P.S. Following https://unix.stackexchange.com/questions/25372/turn-off-buffering-in-pipe I tried stdbuf, but it did not work. I specified buffering on o and e, but the output is still unbuffered:
echo ${list} | xargs -n 1 -P 24 -I@ stdbuf -i0 -oL -eL bash -l -c 'myAwesomeShellFunction @'
Here's my first attempt, which only captures the first line of output:
$ bash -c "echo stuff;sleep 3; echo more stuff" | awk '{while (( getline line) > 0 )print "got ",$line;}'
$ got stuff
Recommended answer
This isn't quite atomic if your output is longer than a page (typically 4 KB), but for most cases it'll do:
xargs -P 24 bash -c 'for arg; do printf "%s\n" "$(myAwesomeShellFunction "$arg")"; done' _
The magic here is the command substitution: $(...) creates a subshell (a fork()ed-off copy of your shell), runs the code ... in it, and then reads its output back in to be substituted into the relevant position in the outer script. Because printf runs only after the substitution has finished, each call's output is emitted in one piece instead of trickling out line by line.
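A minimal sketch of that buffering effect on its own, outside of xargs. The function slow_job is a hypothetical stand-in for myAwesomeShellFunction and is not part of the original answer; it prints two lines with a pause in between.

```shell
# Hypothetical stand-in for myAwesomeShellFunction: emits two lines per call.
slow_job() {
  echo "start $1"
  sleep 1
  echo "end $1"
}

# Nothing reaches the terminal while slow_job runs; the substitution
# collects both lines first, and printf then emits them together.
captured=$(slow_job demo)
printf '%s\n' "$captured"
```

Run interactively, both lines should appear together after the pause rather than one second apart.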
Note that we don't need -n 1 when dealing with a large number of arguments (for a small number, it may improve parallelization), since the for loop iterates over however many arguments each of your 24 parallel bash instances is passed.
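To see how the arguments arrive in each bash instance, here is a small sketch with made-up input (the letters a to f) and a trivial loop body. The -n 3 is only there to force two batches for the demonstration; the trailing _ fills $0 so that the items land in $1, $2, and so on.

```shell
# Each bash instance receives a batch of arguments; "for arg" with no
# "in" clause iterates over the positional parameters of that instance.
result=$(
  printf '%s\n' a b c d e f |
    xargs -n 3 -P 2 bash -c 'for arg; do printf "%s\n" "item:$arg"; done' _ |
    sort   # batches may finish in any order, so sort for a stable listing
)
printf '%s\n' "$result"
```

Without the _ placeholder, the first item of every batch would be consumed as $0 and silently skipped by the loop.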
If you want to make it truly atomic, you can do that with a lockfile:
# generate a lockfile, arrange for it to be deleted when this shell exits
lockfile=$(mktemp -t lock.XXXXXX); export lockfile
trap 'rm -f "$lockfile"' 0
xargs -P 24 bash -c '
  for arg; do
    {
      output=$(myAwesomeShellFunction "$arg")
      flock -x 99
      printf "%s\n" "$output"
    } 99>"$lockfile"
  done
' _
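An end-to-end sketch of the lockfile approach that can be run as-is, under a few assumptions that are not in the original answer: demo_job is a hypothetical stand-in for myAwesomeShellFunction, it is defined in the current shell and made visible to the child bash processes with export -f (rather than being loaded by bash -l), the input is four made-up items, and flock from util-linux is installed.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for myAwesomeShellFunction: two lines per call.
demo_job() {
  echo "begin $1"
  sleep 1
  echo "end $1"
}
export -f demo_job   # assumption: the function lives in this shell, not in a login profile

lockfile=$(mktemp -t lock.XXXXXX); export lockfile

result=$(
  printf '%s\n' a b c d |
    xargs -n 1 -P 4 bash -c '
      for arg; do
        {
          output=$(demo_job "$arg")   # buffer the whole output first
          flock -x 99                 # then wait for exclusive access
          printf "%s\n" "$output"     # and print it in one piece
        } 99>"$lockfile"              # closing fd 99 releases the lock
      done
    ' _
)
rm -f "$lockfile"
printf '%s\n' "$result"
```

The order of the four items can vary between runs, but each "begin" line should be followed directly by its matching "end" line.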