In Python, how to write a string to a file on a remote machine?

Problem description

On Machine1, I have a Python 2.7 script that computes a big (up to 10MB) binary string in RAM that I'd like to write to a disk file on Machine2, which is a remote machine. What is the best way to do this?

Constraints:

  • Both machines are Ubuntu 13.04. The connection between them is fast because they are on the same network.

  • The destination directory might not yet exist on Machine2, so it might need to be created.

  • If it's easy, I would like to avoid writing the string from RAM to a temporary disk file on Machine1. Does that eliminate solutions that might use a system call to rsync?

  • Because the string is binary, it might contain bytes that could be interpreted as a newline. This would seem to rule out solutions that might use a system call to the echo command on Machine2.

  • I would like this to be as lightweight on Machine2 as possible. Thus, I would like to avoid running services like ftp on Machine2 or engaging in other configuration activities there. Plus, I don't understand security that well, and so would like to avoid opening additional ports unless truly necessary.

  • I have ssh keys set up on Machine1 and Machine2, and would like to use them for authentication.

  • Machine1 is running multiple threads, and so it is possible that more than one thread could attempt to write to the same file on Machine2 at overlapping times. I do not mind the inefficiency caused by having the file written twice (or more) in this case, but the resulting datafile on Machine2 should not be corrupted by simultaneous writes. Maybe an OS lock on Machine2 is needed?

I'm rooting for an rsync solution, since it is a self-contained entity that I understand reasonably well, and requires no configuration on Machine2.

Recommended answer

You open a new SSH process to Machine2 using subprocess.Popen and then you write your data to its STDIN.

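import subprocess

# Create the directory first; '&&' skips the write if mkdir fails.
cmd = ['ssh', 'user@machine2',
       'mkdir -p output/dir && cat > output/dir/file.dat']

p = subprocess.Popen(cmd, stdin=subprocess.PIPE)

# Dummy payload: 10 bytes * 1024 * 1024 = 10485760 bytes, including NULs.
your_inmem_data = b'foobarbaz\0' * 1024 * 1024

# Feed the data to ssh's stdin in 1 KiB chunks.
for chunk_ix in range(0, len(your_inmem_data), 1024):
    chunk = your_inmem_data[chunk_ix:chunk_ix + 1024]
    p.stdin.write(chunk)

# Close stdin so the remote cat sees EOF, then wait for ssh to exit.
p.stdin.close()
if p.wait() != 0:
    raise RuntimeError('ssh exited with status %d' % p.returncode)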

I've just verified that it works as advertised and copies all of the 10485760 dummy bytes.
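
The question also asks about overlapping writers, which the snippet above does not address. One possible approach, offered only as a sketch: have each writer stream into its own temporary file next to the target and then rename it into place, since a rename within one directory on the same filesystem replaces the target atomically on POSIX systems. The names build_remote_command and write_remote, the runner parameter, and the '.tmp.<uuid>' suffix are hypothetical choices for illustration, not part of the original answer.

```python
import posixpath
import subprocess
import uuid

try:
    from shlex import quote  # Python 3
except ImportError:
    from pipes import quote  # Python 2.7


def build_remote_command(remote_path):
    """Return a shell command that writes stdin to remote_path via a rename."""
    directory = posixpath.dirname(remote_path) or '.'
    # A per-writer temporary name keeps concurrent writers apart (assumption:
    # the temporary file lives in the same directory as the target).
    tmp_path = '%s.tmp.%s' % (remote_path, uuid.uuid4().hex)
    return 'mkdir -p %s && cat > %s && mv %s %s' % (
        quote(directory), quote(tmp_path), quote(tmp_path), quote(remote_path))


def write_remote(data, remote_path, runner=('ssh', 'user@machine2')):
    """Send a byte string to the runner's stdin and raise if it fails."""
    cmd = list(runner) + [build_remote_command(remote_path)]
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    # communicate() writes all the data, closes stdin, and waits for exit.
    p.communicate(data)
    if p.returncode != 0:
        raise RuntimeError('remote write failed with exit code %d' % p.returncode)
```

With this sketch, a call such as write_remote(your_inmem_data, 'output/dir/file.dat') would leave either the old file or one complete new file at the target path, whichever writer renames last. If a transfer fails midway, a stray temporary file may remain on Machine2 and would need separate cleanup.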

P.S. A potentially cleaner/more elegant solution would be to have the Python program write its output to sys.stdout instead and do the piping to ssh externally:

$ python process.py | ssh <the same ssh command>
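
For that pipeline to carry binary data intact, process.py has to write raw bytes to standard output. A minimal sketch follows, assuming the hypothetical helper names binary_stdout and emit. On Python 3 the byte-oriented stream is sys.stdout.buffer, while on Python 2.7 sys.stdout accepts byte strings directly.

```python
import sys


def binary_stdout():
    """Return a stdout stream that accepts byte strings on Python 2.7 and 3."""
    return getattr(sys.stdout, 'buffer', sys.stdout)


def emit(data, stream=None):
    """Write a byte string to the given stream (default: stdout) and flush."""
    if stream is None:
        stream = binary_stdout()
    stream.write(data)
    stream.flush()
```

In this sketch, process.py would end with a call such as emit(your_inmem_data), and the shell pipeline would deliver those bytes to ssh.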
