MapReduce with paramiko: how to print stdout as it streams


Problem description

I have created a small Python script using paramiko that allows me to run MapReduce jobs without using PuTTY or cmd windows to initiate the jobs. This works great, except that I don't get to see stdout until the job completes. How can I set this up so that I can see each line of stdout as it is generated, just as I would be able to via cmd window?

Here is my script:

import paramiko

# Define connection info
host_ip = 'xx.xx.xx.xx'
user = 'xxxxxxxxx'
pw = 'xxxxxxxxx'

# Commands
list_dir = "ls /nfs_home/appers/cnielsen -l"
MR = "hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming.jar -files /nfs_home/appers/cnielsen/product_lookups.xml -file /nfs_home/appers/cnielsen/Mapper.py -file /nfs_home/appers/cnielsen/Reducer.py -mapper '/usr/lib/python_2.7.3/bin/python Mapper.py test1' -file /nfs_home/appers/cnielsen/Process.py -reducer '/usr/lib/python_2.7.3/bin/python Reducer.py' -input /nfs_home/appers/extracts/*/*.xml -output /user/loc/output/cnielsen/test51"
getmerge = "hadoop fs -getmerge /user/loc/output/cnielsen/test51 /nfs_home/appers/cnielsen/test_010716_0.txt"

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(host_ip, username=user, password=pw)
##stdin, stdout, stderr = client.exec_command(list_dir)
##stdin, stdout, stderr = client.exec_command(getmerge)
stdin, stdout, stderr = client.exec_command(MR)

print "Executing command..."

for line in stdout:
    print '... ' + line.strip('\n')
for l in stderr:
    print '... ' + l.strip('\n')
client.close()


Recommended answer

This code implicitly calls stdout.read(), which blocks until EOF. You therefore have to read stdout/stderr in chunks to get output as soon as it is produced. This answer, and especially a modified version of this answer, should help you resolve the issue. I'd recommend adapting answer 2 to your use case to prevent some common stalling scenarios.

Here is an example taken from answer 1 (https://stackoverflow.com/a/33334998/1729555):

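# `client` is the connected paramiko.SSHClient from the question
sin, sout, serr = client.exec_command("while true; do uptime; done")

def line_buffered(f):
    """Yield each complete line from f as soon as it has arrived."""
    line_buf = ""
    while not f.channel.exit_status_ready():
        line_buf += f.read(1)   # blocks until one byte is available
        if line_buf.endswith('\n'):
            yield line_buf
            line_buf = ''
    # The command has exited: hand back whatever is still buffered
    line_buf += f.read()
    if line_buf:
        yield line_buf

for l in line_buffered(sout):   # or serr
    print l.rstrip('\n')        # the line already carries its newline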
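The example above follows one stream at a time, so a job that writes to both stdout and stderr can still appear to hang. Hadoop clients usually log job progress to stderr (an assumption about the default logging configuration), which is the stream the question's script reads last. Below is a minimal sketch of the chunked approach the answer recommends, written in Python 3 syntax. The function name `stream_lines` and its parameters are made up for illustration. It relies only on the paramiko `Channel` methods `recv_ready`, `recv`, `recv_stderr_ready`, `recv_stderr` and `exit_status_ready`, and has not been tested against a real cluster.

```python
import time


def stream_lines(channel, chunk_size=1024, poll_interval=0.1):
    """Yield (stream_name, line) pairs as data arrives on a paramiko Channel.

    Hypothetical helper: `channel` is expected to behave like
    paramiko.Channel after exec_command() has been called on it.
    """
    pending = {"stdout": b"", "stderr": b""}
    readers = {
        "stdout": (channel.recv_ready, channel.recv),
        "stderr": (channel.recv_stderr_ready, channel.recv_stderr),
    }
    while True:
        # Sample the exit flag first: if it is already set, the reads below
        # act as the final drain of whatever is still buffered.
        finished = channel.exit_status_ready()
        got_data = False
        for name, (ready, recv) in readers.items():
            while ready():
                chunk = recv(chunk_size)
                if not chunk:
                    break
                got_data = True
                pending[name] += chunk
                # Everything before the last newline is a complete line;
                # the remainder stays buffered until more data arrives.
                *lines, pending[name] = pending[name].split(b"\n")
                for line in lines:
                    yield name, line.decode("utf-8", "replace")
        if finished and not got_data:
            break
        if not got_data:
            time.sleep(poll_interval)  # avoid a busy loop while waiting
    # Emit trailing text that did not end with a newline.
    for name, rest in pending.items():
        if rest:
            yield name, rest.decode("utf-8", "replace")


# Usage with `client` and `MR` from the question (not executed here):
#   channel = client.get_transport().open_session()
#   channel.exec_command(MR)
#   for name, line in stream_lines(channel):
#       print("... [%s] %s" % (name, line))
```

Polling both streams in the same loop means neither one has to reach EOF before the other is printed. The price is a delay of up to `poll_interval` seconds before a new line shows up.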
