Bash: parallelize md5sum checksum on many files
Question
Let's say I have a 64-core server, and I need to compute the md5sum of all files in /mnt/data and store the results in a text file:
find /mnt/data -type f -exec md5sum {} \; > md5.txt
The problem with the above command is that only one process runs at any given time. I would like to harness the full power of my 64 cores. Ideally, I would like to make sure that at any given time, 64 parallel md5sum processes are running (but no more than 64).
Also, I need the output from all the processes to be stored in a single file.
NOTE: I am not looking for a way to compute the md5sum of one file in parallel. I am looking for a way to compute 64 md5sums of 64 different files in parallel, as long as there are any files coming from find.
Recommended answer
Use GNU parallel. More examples of how to use it are available in its documentation.
find /mnt/data -type f | parallel -j 64 md5sum > md5.txt