How to process files concurrently with bash?


Problem description

Suppose I have 10K files and a bash script that processes a single file. Now I would like to process all of these files concurrently, with at most K instances of the script running in parallel. I do not want (obviously) to process any file more than once.

How would you suggest implementing this in bash?

Solution

One way of executing a limited number of parallel jobs is with GNU parallel. For example, with this command:

find . -type f -print0 | parallel -0 -P 3 ./myscript {1}

This passes every file in the current directory (and its subdirectories) to myscript as an argument, one file per invocation. The -0 option tells parallel that input items are separated by the null character (matching the -print0 output of find), and the -P option sets the number of jobs that are executed in parallel. By default, the number of parallel jobs equals the number of CPU cores in the system. There are other options for parallel processing across clusters, etc., which are described in the GNU parallel documentation.
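
If GNU parallel is not installed, a similar effect can usually be had with xargs, whose -P option (available in the GNU and BSD versions, though not required by POSIX) also caps the number of concurrent processes. This is a different tool from the one recommended above, shown only as a rough sketch. The scratch directory, the sample files, the LOG variable and the stand-in myscript below are all hypothetical and exist only so that the example runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo setup: a scratch directory with a few sample files,
# including names with spaces to show why null delimiters matter.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/input/sub dir"
for i in 1 2 3 4 5; do
    echo "data $i" > "$workdir/input/file $i.txt"
done
echo "nested" > "$workdir/input/sub dir/nested.txt"

# Hypothetical stand-in for the per-file script from the question.
cat > "$workdir/myscript" <<'EOF'
#!/usr/bin/env bash
# Record which file was handled; replace this with the real processing.
printf '%s\n' "$1" >> "$LOG"
EOF
chmod +x "$workdir/myscript"

export LOG="$workdir/processed.log"
K=3  # maximum number of script instances running at once

# -print0 and -0 keep file names with spaces or newlines intact,
# -n 1 passes exactly one file per invocation,
# -P "$K" lets at most K invocations run concurrently.
(
    cd "$workdir/input"
    find . -type f -print0 | xargs -0 -n 1 -P "$K" "$workdir/myscript"
)

# Each file should appear in the log exactly once.
total=$(( $(wc -l < "$LOG") ))
unique=$(( $(sort -u "$LOG" | wc -l) ))
echo "files processed: $total, distinct files: $unique"
```

Because find emits each path exactly once and xargs hands each path to exactly one invocation, no file is processed twice. In real use, only the find | xargs line is needed, with ./myscript in place of the stand-in.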
