Python subprocess with two inputs

Problem description

I am writing a Python program that needs to call an external program, hmm3align, which operates as follows from the command line:

hmm3align hmm_file fasta_file -o output_file

So normally, the program expects two input files and writes the results to a third file. My program actually has multiple cases where it is calling an external program, but this is the only case where the external program has two file inputs. My intention is to avoid writing and reading files to allow these external programs to communicate with one another; I would prefer to have all data stored as Python variables during the session and feed these variables to the external programs when needed.

At the point in the Python program where hmm3align needs to be called, I already have two Python variables, hmm_model and fasta_model, that contain the info that would normally be included in hmm_file and fasta_file, respectively. What I want to do is call hmm3align by passing it hmm_model and fasta_model via stdin (because I think that's the only way possible to feed them as inputs) and then capture the results from stdout into a third Python variable named align_results. To do this, I created a separate function that uses the subprocess module as follows:

def hmmalign(hmm_model,fasta):
     args = ["/clusterfs/oha/software/bin/hmm3align",
             "-", "-",
             "-o", "/dev/stdout"]
     process = subprocess.Popen(args, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     return process.communicate(hmm_model,fasta)[0]

So as you can see, I am trying to send both variables via stdin. The two "-" in the args list are meant to capture these two variables; I have seen the "-" used in other examples but their purpose was not clear and I may be misunderstanding things.

Sure enough, I get the following error at the end of the Traceback:

TypeError: communicate() takes at most 2 arguments (3 given)

So I cannot pass two separate variables via stdin to the program. I should mention that I have been able to make subprocess work on a similar external program when that program needed only one input file.

How do I make this work? Is it possible to use subprocess with more than one input? I have looked at the documentation and haven't seen this question answered. Thanks in advance.

Recommended answer

Standard input is a single data stream; on Unix it is a file descriptor connected to the output end of a unidirectional pipe. By convention, programs that read from a single file specified on the command line will understand - as an instruction to read from stdin instead of from a file. However, for a program that reads from two files there is no way to read from stdin twice as it is a single stream of data.
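To illustrate the single-input case, here is a minimal sketch of feeding one Python variable to a program through stdin. It uses `cat -` as a stand-in for a program that honors the `-` convention; the helper name `run_with_stdin` is hypothetical. Note that `communicate()` (and the `input` argument of `subprocess.run`) accepts exactly one block of input data, which is why passing two variables raises the TypeError shown in the question.

```python
import subprocess


def run_with_stdin(args, data):
    """Run a command, send `data` (bytes) to its stdin, and return its stdout."""
    # subprocess.run feeds `input` to the child's stdin and waits for it to exit.
    completed = subprocess.run(
        args,
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return completed.stdout


# "cat -" stands in for any program that reads stdin when given "-".
echoed = run_with_stdin(["cat", "-"], b"one stream of data\n")
```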

Other file descriptors can be used for communication beyond the three standard ones (stdin is fd 0, stdout is fd 1, stderr is fd 2), but there is no conventional way to specify them on the command line in place of files.

The solution that is most likely to work here is named pipes (FIFOs); in Python, use os.mkfifo to create a named pipe and os.unlink to delete it. You can then pass its name to the program (it will appear as a file that can be read from) while writing to it (using open).
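The named-pipe approach might look roughly like the sketch below. It is an illustration under stated assumptions, not a tested hmm3align recipe: the helper names `run_with_two_inputs` and `_feed` are hypothetical, the inputs are assumed to be bytes, and the external program is assumed to open each input file once and read it sequentially to the end (a program that seeks within its inputs will not work with a FIFO). Each FIFO is written from its own thread because opening a FIFO for writing blocks until the other side opens it for reading.

```python
import os
import subprocess
import tempfile
import threading


def _feed(path, data):
    # Opening a FIFO for writing blocks until the child opens it for reading.
    with open(path, "wb") as fifo:
        fifo.write(data)


def run_with_two_inputs(args_before, args_after, first, second):
    """Run a command that expects two input file names, feeding it bytes via FIFOs.

    The command line is args_before + [fifo1, fifo2] + args_after.
    Returns whatever the command writes to stdout.
    """
    # Removing the temporary directory on exit also deletes the FIFOs,
    # which has the same effect as calling os.unlink on each of them.
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = [os.path.join(tmpdir, name) for name in ("first.fifo", "second.fifo")]
        for path in paths:
            os.mkfifo(path)

        process = subprocess.Popen(
            list(args_before) + paths + list(args_after),
            stdout=subprocess.PIPE,
        )

        # One writer thread per FIFO, so the child may open them in any order.
        writers = [
            threading.Thread(target=_feed, args=(path, data), daemon=True)
            for path, data in zip(paths, (first, second))
        ]
        for writer in writers:
            writer.start()

        output, _ = process.communicate()
        for writer in writers:
            writer.join()
        return output
```

For a quick check, `run_with_two_inputs(["cat"], [], b"first\n", b"second\n")` concatenates the two inputs. For the program in the question, the call would presumably be `run_with_two_inputs(["/clusterfs/oha/software/bin/hmm3align"], ["-o", "/dev/stdout"], hmm_model, fasta_model)`, though whether hmm3align accepts FIFOs has not been verified here. One limitation of this sketch: if the child exits without opening one of the FIFOs, the corresponding writer thread stays blocked and the final `join` hangs.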
