GNU parallel with rsync

Question

I'm trying to run several instances of rsync in parallel over ssh using GNU parallel. The command I'm running looks like this:

find /tmp/tempfolder -type f -name 'chunck.*' | sort | parallel --gnu -j 4 -v ssh -i access.pem user@server echo {}\; rsync -Havessh -auz -0 --files-from={} ./ user@server:/destination/path

/tmp/tempfolder contains files with the prefix chunck; each of them holds an actual file list.

With this command I do get the 4 rsync calls, but they take a while to start, they don't all start together, and they don't run in parallel.

What am I doing wrong?

Recommended answer

Are you sure the rsyncs are really not running in parallel?
Checking with ps | grep rsync while the command is running will show which rsyncs, and how many, are actually running simultaneously.
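
One way to make that check repeatable is sketched below. It assumes pgrep is available (it is not part of the original answer), and the helper names count_procs and watch_procs are made up for illustration. Plain ps usually lists only processes attached to the current terminal, so from a second terminal a name-based lookup such as pgrep may be more convenient.

```shell
#!/bin/sh
# Hypothetical helper: count processes whose name is exactly "$1".
# Assumes pgrep is installed; -x requires an exact name match.
count_procs() {
    pgrep -x "$1" | wc -l | tr -d ' '
}

# Hypothetical helper: print the count once per second for as long as
# at least one matching process is still running.
watch_procs() {
    while [ "$(count_procs "$1")" -gt 0 ]; do
        echo "$(date +%T) $1 processes running: $(count_procs "$1")"
        sleep 1
    done
}

# Usage, from a second terminal while the parallel command is running:
#   watch_procs rsync
```

A count that stays above 1 for most of the transfer suggests the rsyncs overlap; a count that never exceeds 1 suggests they run one after another.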

By default, parallel holds back the output of each job until that job has finished, so that the output of different commands doesn't get mixed together:

--group  Group output. Output from each jobs is grouped together and is only printed when the command
         is finished. stderr (standard error) first followed by stdout (standard output). This takes
         some CPU time. In rare situations GNU parallel takes up lots of CPU time and if it is
         acceptable that the outputs from different commands are mixed together, then disabling
         grouping with -u can speedup GNU parallel by a factor of 10.

         --group is the default. Can be reversed with -u.

My guess is that the rsyncs are actually running in parallel, but the output makes it look as if they were running serially. The -u option changes that.

-

For example, with this command:

$ for i in 1 2 3 ; do echo a$i ; sleep 1 ; done
a1
a2
a3

By default, parallel prints nothing from a job until that job has finished, so the output arrives grouped per job:

$ (echo a ; echo b ; echo c ) | parallel 'for i in 1 2 3 ; do echo {}$i ; sleep 1 ; done  ' 
a1
a2
a3
b1
b2
b3
c1
c2
c3

Whereas with -u, output gets printed right away:

$ (echo a ; echo b ; echo c ) | parallel -u 'for i in 1 2 3 ; do echo {}$i ; sleep 1 ; done  ' 
a1
b1
c1
a2
b2
c2
a3
b3
c3

In both cases it took 3 seconds to run, though, so the jobs really are running simultaneously. Run serially, the three jobs would have needed about 9 seconds.
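
The timing argument can be checked directly with a small script. This is only a sketch: it assumes GNU parallel is on the PATH, adds an explicit -j 3 (not in the examples above) so that three job slots are requested regardless of the number of CPU cores, and discards the job output so that grouping cannot influence the impression.

```shell
#!/bin/sh
# Sketch: measure the wall-clock time of three jobs that each sleep 3 x 1s.
# Assumption: GNU parallel is installed; otherwise nothing is measured.
elapsed=""
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    start=$(date +%s)
    # -j 3 asks for three simultaneous job slots, one per input line.
    printf '%s\n' a b c |
        parallel -j 3 'for i in 1 2 3; do echo {}$i; sleep 1; done' >/dev/null
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed: ${elapsed}s"
else
    echo "GNU parallel not found; nothing measured"
fi
```

An elapsed time of roughly 3 seconds indicates the jobs overlapped; roughly 9 seconds would indicate they ran one after another. The same idea applies to the rsync case: comparing total wall-clock time at -j 1 and -j 4 says more than the order in which output appears.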
