Upload the output of a program directly to a remote file by FTP

Question

I have a program that generates a lot of data, specifically encrypted tarballs. I want to upload the result to a remote FTP server.

The files are quite big (about 60 GB), so I don't want to waste disk space on a temporary directory, or the time it takes to write one.

Is it possible? I checked the ncftpput utility, but I did not find an option to read from standard input.

Recommended answer

I guess you could do that with any upload program by using a named pipe, but I foresee problems if some part of the upload goes wrong and you have to restart it: the data already read from the pipe is gone, so you cannot resume the upload, even if you lost only 1 byte. The same applies to a read-from-stdin strategy.

My strategy would be the following:

  1. Create a named pipe using mkfifo.
  2. Start the encryption process in the background, writing to that named pipe. The pipe buffer will soon fill up and the encryption process will block while trying to write to the pipe. It should unblock once we start reading data from the pipe.
  3. Read a certain amount of data from the named pipe (let's say 1 GB) and put it in a file. The utility dd can be used for that.
  4. Upload that file through ftp in the standard way. You can then deal with retries and network errors. Once the upload has completed, delete the file.
  5. Go back to step 3 until you get an EOF from the pipe. This means that the encryption process has finished writing to the pipe.
  6. On the server, append the files in order to an initially empty file, deleting each file once it has been appended. Using touch next_file; for f in ordered_list_of_files; do cat $f >> next_file; rm $f; done or some variant should do it.
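Steps 1 to 5 could be sketched roughly as follows. This is only an illustration under some assumptions: GNU dd (for iflag=fullblock, which keeps dd from returning short chunks when reading from a pipe), and a hypothetical upload_chunk function that merely copies the chunk into a local directory so the sketch can run without an FTP server. The function names, file names, and parameters are made up for the example.

```shell
#!/bin/sh
# Sketch only: cut a producer's output into fixed-size chunks via a FIFO.
# Assumes GNU dd (iflag=fullblock). All names here are illustrative.
set -eu

# Hypothetical upload step. It only copies into a local directory; a real
# version might call an FTP client, e.g. curl -T "$1" "ftp://host/dir/",
# and retry on failure.
upload_chunk() {
    cp "$1" "$2/"
}

# split_and_upload PRODUCER_CMD DEST_DIR BLOCK_SIZE BLOCKS_PER_CHUNK
split_and_upload() {
    producer=$1 dest=$2 bs=$3 count=$4
    work=$(mktemp -d)
    fifo=$work/stream.fifo
    mkfifo "$fifo"                          # step 1

    sh -c "$producer" > "$fifo" &           # step 2: blocks until data is read
    exec 3< "$fifo"                         # keep one reader open for the whole
                                            # run so the writer never sees a
                                            # closed pipe between chunks
    i=0
    while :; do
        chunk=$(printf '%s/part.%05d' "$work" "$i")
        # step 3: read at most bs * count bytes into the chunk file
        dd bs="$bs" count="$count" iflag=fullblock <&3 > "$chunk" 2>/dev/null
        if [ ! -s "$chunk" ]; then          # step 5: empty chunk means EOF
            rm -f "$chunk"
            break
        fi
        upload_chunk "$chunk" "$dest"       # step 4
        rm -f "$chunk"
        i=$((i + 1))
    done

    exec 3<&-
    wait                                    # reap the producer
    rm -rf "$work"
}
```

For 1 GB chunks the call might look like split_and_upload "your_encrypt_command" /some/dir 1M 1024, where your_encrypt_command is a placeholder for whatever writes the encrypted tarball to stdout. The zero-padded chunk names sort in upload order, which helps with the concatenation in step 6.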

You can of course prepare the next file while the previous one is uploading, to get as much concurrency as possible. The bottleneck will then be your encryption algorithm (CPU), your network bandwidth, or your disk bandwidth.
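One possible way to overlap the two stages is to run each upload in the background and wait for it only after the next chunk has been read. The sketch below reads the data from standard input to stay short; upload_chunk is again a hypothetical stand-in that copies to a local directory, and GNU dd is assumed for iflag=fullblock.

```shell
#!/bin/sh
# Sketch only: read chunk i+1 while chunk i is still uploading.
# Assumes GNU dd (iflag=fullblock). All names here are illustrative.
set -eu

# Hypothetical upload step; stands in for a real FTP transfer with retries.
# The chunk is deleted only after a successful copy.
upload_chunk() {
    cp "$1" "$2/" && rm -f "$1"
}

# pipelined_upload DEST_DIR BLOCK_SIZE BLOCKS_PER_CHUNK  (data on stdin)
pipelined_upload() {
    dest=$1 bs=$2 count=$3
    work=$(mktemp -d)
    i=0
    upload_pid=
    while :; do
        chunk=$(printf '%s/part.%05d' "$work" "$i")
        # Fill the next chunk; the previous upload may still be running.
        dd bs="$bs" count="$count" iflag=fullblock > "$chunk" 2>/dev/null
        # Let the previous upload finish before starting another one, so
        # at most two chunks exist on disk at any time.
        if [ -n "$upload_pid" ]; then
            wait "$upload_pid" || return 1  # failed chunk stays on disk
        fi
        if [ ! -s "$chunk" ]; then          # empty chunk means EOF
            break
        fi
        upload_chunk "$chunk" "$dest" &
        upload_pid=$!
        i=$((i + 1))
    done
    rm -rf "$work"
}
```

It could be fed directly from the producer, for example your_encrypt_command | pipelined_upload /some/dir 1M 1024, with your_encrypt_command as a placeholder.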

This method costs about 2 GB of disk space on the client side (more or less, depending on the size of the files) and 1 GB of disk space on the server side. But you can be sure that you will not have to start over if your upload hangs near the end.

If you want to be doubly sure about the result of the transfer, you could compute a hash of each file while writing it to disk on the client side, and delete the client file only once you have verified the hash on the server side. The hash can be computed on the client side at the same time as the file is written to disk, using dd ... | tee local_file | sha1sum. On the server side, you would have to compute the hash before doing the cat, and skip the cat if the hash is wrong, so I cannot see how to do it without reading the file twice (once for the hash and once for the cat).
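A minimal sketch of that check, assuming sha1sum is available on both sides and that each chunk's hash travels alongside it in a FILE.sha1 file (a convention invented for this example, as are the function names):

```shell
#!/bin/sh
# Sketch only: hash a chunk while writing it, verify before appending.
# Function names and the .sha1 side file are illustrative assumptions.

# Client side: copy stdin to FILE and store its SHA-1 in FILE.sha1,
# reading the data only once.
write_chunk_with_hash() {
    tee "$1" | sha1sum | cut -d ' ' -f 1 > "$1.sha1"
}

# Server side: append CHUNK to TARGET only if its hash matches CHUNK.sha1.
# The chunk is read twice here: once for the hash, once for the cat.
verify_and_append() {
    chunk=$1 target=$2
    expected=$(cat "$chunk.sha1")
    actual=$(sha1sum < "$chunk" | cut -d ' ' -f 1)
    if [ "$expected" != "$actual" ]; then
        echo "hash mismatch for $chunk" >&2
        return 1
    fi
    cat "$chunk" >> "$target" && rm -f "$chunk" "$chunk.sha1"
}
```

On the client this would replace the plain redirect into the chunk file, as in dd ... | write_chunk_with_hash local_file; the server would call verify_and_append for each chunk in order and report failures back so the client knows which chunk to resend.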
