How to process files concurrently with bash?
Suppose I have 10K files and a bash script which processes a single file. Now I would like to process all these files concurrently, with only K instances of the script running in parallel. I do not want (obviously) to process any file more than once.
How would you suggest implementing this in bash?
One way of executing a limited number of parallel jobs is with GNU parallel. For example, with this command:
find . -type f -print0 | parallel -0 -P 3 ./myscript {1}
This passes every file in the current directory (and its subdirectories) as a parameter to myscript, one at a time. The -0 option sets the input delimiter to the null character, and the -P option sets the number of jobs that are executed in parallel. By default, the number of parallel jobs equals the number of CPU cores in the system. There are other options for parallel processing on clusters etc., which are documented in the GNU parallel documentation.
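If GNU parallel is not installed, a similar effect can usually be had with xargs, whose -P option (an extension available in GNU and BSD xargs, not part of POSIX) caps the number of concurrent processes. The sketch below is only an illustration under assumptions: K=3, the scratch directory, the sample file names, and the stand-in myscript are all made up for the demo and are not part of the original answer.

```shell
#!/usr/bin/env bash
# Sketch: run at most K copies of a per-file script using xargs -P.
set -euo pipefail

K=3                              # assumed limit on concurrent jobs
workdir=$(mktemp -d)             # scratch area for this demo only
mkdir -p "$workdir/in/sub" "$workdir/out"
export OUT="$workdir/out"        # where the stand-in script leaves its markers

# Hypothetical stand-in for ./myscript: it "processes" one file by
# creating a marker named after it in $OUT.
cat > "$workdir/myscript" <<'EOF'
#!/usr/bin/env bash
touch "$OUT/$(basename "$1")"
EOF

# Sample input files; names contain spaces to show why -print0/-0 matters.
for i in 1 2 3 4 5; do : > "$workdir/in/file $i.txt"; done
: > "$workdir/in/sub/file 6.txt"

# -0   : read null-delimited names from find
# -n 1 : pass exactly one file per invocation
# -P K : keep up to K invocations running at once
find "$workdir/in" -type f -print0 | xargs -0 -n 1 -P "$K" bash "$workdir/myscript"

processed=$(find "$OUT" -type f | wc -l)
echo "processed: $((processed))"
```

Because find emits each path exactly once and xargs hands each path to exactly one invocation, no file is processed twice. With a real executable script you would replace `bash "$workdir/myscript"` with `./myscript`.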