Allowing multiple inputs to a Python subprocess

Question

I have a near-identical problem to one asked several years ago, "Python subprocess with two inputs", which received one answer but no implementation. I'm hoping that this repost helps clear things up for me and others.

As in that question, I would like to use subprocess to wrap a command-line tool that takes multiple inputs. In particular, I want to avoid writing the input files to disk and would rather use, for example, named pipes, as alluded to there. (Read that as "learn how to use": admittedly, I have never tried named pipes before.) My inputs are currently two pandas dataframes, and I'd like to get one dataframe back as output.

The generic command-line invocation:

/usr/local/bin/my_command inputfileA.csv inputfileB.csv -o outputfile

My current implementation, predictably, doesn't work. I don't see how or when the dataframes get sent to the command process through the named pipes, and I'd appreciate some help!

import os
import StringIO
import subprocess
import pandas as pd
dfA = pd.DataFrame([[1,2,3],[3,4,5]], columns=["A","B","C"])
dfB = pd.DataFrame([[5,6,7],[6,7,8]], columns=["A","B","C"]) 

# make two FIFOs to host the dataframes
fnA = 'inputA'; os.mkfifo(fnA); ffA = open(fnA,"w")
fnB = 'inputB'; os.mkfifo(fnB); ffB = open(fnB,"w")

# don't know if I need to make two subprocesses to pipe inputs 
ppA  = subprocess.Popen("echo", 
                    stdin =subprocess.PIPE,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE)
ppB  = subprocess.Popen("echo", 
                    stdin =subprocess.PIPE,
                    stdout=subprocess.PIPE,
                    stderr=subprocess.PIPE)

ppA.communicate(input = dfA.to_csv(header=False,index=False,sep="\t"))
ppB.communicate(input = dfB.to_csv(header=False,index=False,sep="\t"))


pope = subprocess.Popen(["/usr/local/bin/my_command",
                        fnA,fnB,"stdout"],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
(out,err) = pope.communicate()

try:
    out = pd.read_csv(StringIO.StringIO(out), header=None,sep="\t")
except ValueError: # fail
    out = ""
    print("\n###command failed###\n")

os.unlink(fnA); os.remove(fnA)
os.unlink(fnB); os.remove(fnB)

Recommended answer

You don't need additional processes to pass data to a child process without writing it to disk:

#!/usr/bin/env python
import os
import shutil
import subprocess
import tempfile
import threading
from contextlib import contextmanager    
import pandas as pd

@contextmanager
def named_pipes(count):
    dirname = tempfile.mkdtemp()
    try:
        paths = []
        for i in range(count):
            paths.append(os.path.join(dirname, 'named_pipe' + str(i)))
            os.mkfifo(paths[-1])
        yield paths
    finally:
        shutil.rmtree(dirname)

def write_command_input(df, path):
    df.to_csv(path, header=False,index=False, sep="\t")

dfA = pd.DataFrame([[1,2,3],[3,4,5]], columns=["A","B","C"])
dfB = pd.DataFrame([[5,6,7],[6,7,8]], columns=["A","B","C"])

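with named_pipes(2) as paths:
    # The child opens each FIFO for reading as if it were a regular file.
    p = subprocess.Popen(["cat"] + paths, stdout=subprocess.PIPE)
    with p.stdout:
        for df, path in zip([dfA, dfB], paths):
            # Opening a FIFO for writing blocks until the child opens it for
            # reading, so each dataframe is written from its own thread.
            t = threading.Thread(target=write_command_input, args=[df, path])
            t.daemon = True
            t.start()
        # Read the child's output until EOF and parse it as tab-separated data.
        result = pd.read_csv(p.stdout, header=None, sep="\t")
p.wait()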

cat is used for demonstration; use your own command ("/usr/local/bin/my_command") instead. I assume that you can't pass the data on standard input and have to pass the input via files. The result is read from the subprocess's standard output.
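
The same approach can be packaged as a reusable helper. The following is only a sketch under stated assumptions, not a definitive implementation: it assumes Python 3 on a POSIX system (os.mkfifo is not available on Windows), and the names run_with_fifo_inputs, make_argv and _feed are hypothetical, introduced here for illustration. It takes the inputs as bytes so that it runs without pandas. The point it illustrates is when the data actually moves: opening a FIFO for writing blocks until the child opens the same path for reading, which is why each input is written from its own thread while the main thread reads the child's standard output.

```python
import os
import shutil
import subprocess
import tempfile
import threading
from contextlib import contextmanager


@contextmanager
def named_pipes(count):
    """Create `count` FIFOs in a private temporary directory; remove them on exit."""
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, "named_pipe%d" % i) for i in range(count)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)


def _feed(path, data):
    """Write `data` (bytes) into the FIFO at `path` (hypothetical helper)."""
    try:
        # open() blocks here until the child opens the same FIFO for reading.
        with open(path, "wb") as fifo:
            fifo.write(data)
    except BrokenPipeError:
        # The child closed its end before consuming all of the input.
        pass


def run_with_fifo_inputs(make_argv, inputs):
    """Run a command whose file arguments are FIFOs fed from memory.

    make_argv: callable mapping the list of FIFO paths to the argv list
               (hypothetical callback, so any argument layout can be used).
    inputs:    list of bytes objects, one per FIFO.
    Returns (returncode, stdout_bytes).
    """
    with named_pipes(len(inputs)) as paths:
        proc = subprocess.Popen(make_argv(paths), stdout=subprocess.PIPE)
        writers = [
            threading.Thread(target=_feed, args=(path, data), daemon=True)
            for path, data in zip(paths, inputs)
        ]
        for writer in writers:
            writer.start()
        # Read stdout to EOF and wait for the child to exit.
        out, _ = proc.communicate()
        for writer in writers:
            # A writer stays blocked if the child never opened its FIFO,
            # so do not wait for it indefinitely.
            writer.join(timeout=5)
    return proc.returncode, out


if __name__ == "__main__":
    # cat stands in for the real command, as in the answer above.
    code, out = run_with_fifo_inputs(
        lambda paths: ["cat"] + paths,
        [b"1\t2\t3\n3\t4\t5\n", b"5\t6\t7\n6\t7\t8\n"],
    )
    print(code, out)
```

With pandas, each input could be produced with df.to_csv(header=False, index=False, sep="\t").encode(), and the returned bytes parsed with pd.read_csv(io.BytesIO(out), header=None, sep="\t"). How the real command is told to write its result to standard output depends on that tool and is not assumed here.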
