Parallel processing with xargs in bash

Question

I had a small script that sourced each OpenStack tenant's credentials in turn and fetched some output with the help of Python. The reports took too long to generate, and I was advised to use xargs. My earlier code looked like this:

#!/bin/bash
cd /scripts/cloud01/floating_list

rm -rf ./reports/openstack_reports/
mkdir -p ./reports/openstack_reports/

source ../creds/base
for tenant in A B C D E F G H I J K L M N O P Q R S T
do
  source ../creds/$tenant
  python ../tools/openstack_resource_list.py > ./reports/openstack_reports/$tenant.html

done
lftp -f ./lftp_script

Now I have put xargs in the script, and it looks something like this:

#!/bin/bash
cd /scripts/cloud01/floating_list

rm -rf ./reports/openstack_reports/
mkdir -p ./reports/openstack_reports/

source ../creds/base

# Need xargs idea below
cat tenants_list.txt | xargs -P 8 -I '{}' # something that takes the tenant name and source
TENANT_NAME={}
python ../tools/openstack_resource_list.py > ./reports/openstack_reports/$tenant.html
lftp -f ./lftp_script

In this script, how am I supposed to implement source ../creds/$tenant? Each tenant's credentials need to be sourced while that tenant is processed, and I am not sure how to combine that with xargs for parallel execution.

Recommended answer

xargs can't easily run a shell function ... but it can run a shell.

# If the tenant names are this simple, don't put them in a file
printf '%s\n' {A..T} |
xargs -P 8 -I {} bash -c 'source ../creds/"$0"
      python ../tools/openstack_resource_list.py > ./reports/openstack_reports/"$0".html' {}

Somewhat obscurely, the first argument after bash -c '...' is exposed as $0 inside the script.
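A quick way to see this behaviour is to run bash -c with a few dummy words after the script text. This is only a minimal sketch; the words zero, one and two are placeholders and not part of the original script.

```shell
# The first word after the script text becomes $0; later words become $1, $2, ...
out=$(bash -c 'printf "%s|%s|%s\n" "$0" "$1" "$2"' zero one two)
echo "$out"   # prints: zero|one|two
```

This is why the one-tenant-per-shell variant can refer to the tenant as "$0".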

If you want to keep the tenants in a file, xargs -a filename is a good way to avoid the useless use of cat, though it's not portable to all xargs implementations. (Redirecting with xargs ... <filename is completely portable.)
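The redirection form might look roughly like the sketch below. The temporary file stands in for tenants_list.txt, and echo stands in for the real per-tenant command; both are assumptions made so the example can run anywhere.

```shell
# Hypothetical tenant list in a temporary file (stand-in for tenants_list.txt).
list=$(mktemp)
printf '%s\n' A B C > "$list"

# Portable: feed the file to xargs on standard input instead of piping from cat.
# With GNU xargs, `xargs -a "$list" -I {} ...` would read the same file.
out=$(xargs -I {} echo "tenant {}" < "$list")
echo "$out"

rm -f "$list"
```

No -P is used here, so the lines come out in file order; with -P the order of output is not guaranteed.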

For efficiency, you could refactor the script to loop over as many arguments as possible:

printf '%s\n' {A..T} |
xargs -n 3 -P 8 bash -c 'for tenant; do
      source ../creds/"$tenant"
      python ../tools/openstack_resource_list.py > ./reports/openstack_reports/"$tenant".html
  done' _

This will run a maximum of 8 parallel shell instances with a maximum of 3 tenants assigned to each (so in fact only 7 instances, since 20 tenants split into six groups of 3 and one group of 2). With this small number of arguments, the difference in performance is probably negligible.
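The batch arithmetic can be checked with a small sketch in which each shell instance just reports what it received. The tenant names are the same placeholders A to T, and -P is left out here only so that the output order is stable.

```shell
# Twenty placeholder tenant names, at most three per bash invocation.
# Each invocation prints how many arguments it received and which ones.
out=$(printf '%s\n' A B C D E F G H I J K L M N O P Q R S T |
      xargs -n 3 bash -c 'echo "$# tenants: $*"' _)
echo "$out"

# Count the invocations and look at the final, shorter batch.
batches=$(printf '%s\n' "$out" | grep -c tenants)
last=$(printf '%s\n' "$out" | tail -n 1)
echo "batches: $batches, last batch: $last"
```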

Because we are now actually receiving a list of arguments, we pass _ as the value to populate $0 with (it needs to be set to something so that the real arguments land in $1, $2, and so on).
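To illustrate why the placeholder matters, the following sketch runs the same loop with and without it. The three tenant names are placeholders, and printf stands in for the real work.

```shell
# Without a placeholder, the first tenant is consumed as $0 and the loop never sees it.
without=$(printf '%s\n' A B C | xargs bash -c 'for tenant; do printf "%s " "$tenant"; done')

# With _ filling $0, every tenant reaches the loop as a positional parameter.
with=$(printf '%s\n' A B C | xargs bash -c 'for tenant; do printf "%s " "$tenant"; done' _)

echo "without placeholder: $without"
echo "with placeholder:    $with"
```

The first line lists only B and C, while the second lists all three tenants.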

If sourcing one tenant's file can leave behind settings that are not guaranteed to be overwritten when the next tenant's file is sourced (say, some tenants set variables which need to be unset for others), that complicates matters. In that case, maybe post a separate question if you really need help resolving it, or just fall back to the first variant, where each tenant is run in a separate shell instance.
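The kind of leakage described above can be reproduced with two made-up credential files. Everything here is hypothetical: the DEMO_ variable names, the file contents, and the idea of wrapping each iteration in a subshell, which is one possible way to keep the batched loop while isolating tenants and is not something the original answer prescribes.

```shell
# Hypothetical credential files: tenant A sets an extra variable, tenant B does not.
creds=$(mktemp -d)
printf 'export DEMO_TENANT=A\nexport DEMO_EXTRA=only-for-A\n' > "$creds/A"
printf 'export DEMO_TENANT=B\n' > "$creds/B"

# Same shell for both tenants: DEMO_EXTRA from A is still set when B runs.
leaky=$(bash -c 'for t; do source "$0/$t"; echo "$t:${DEMO_EXTRA:-unset}"; done' "$creds" A B)

# One subshell per tenant: whatever A sets is discarded before B is sourced.
isolated=$(bash -c 'for t; do ( source "$0/$t"; echo "$t:${DEMO_EXTRA:-unset}" ); done' "$creds" A B)

echo "$leaky"
echo "$isolated"
rm -rf "$creds"
```

In the first run tenant B still sees the value set for tenant A; in the second it reports the variable as unset.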
