Detecting when a child process is waiting for input

Problem description

I'm writing a Python program for running user-uploaded arbitrary (and thus, in the worst case, unsafe, erroneous and crashing) code on a Linux server. Security questions aside, my objective is to determine whether the code (which might be in any language, compiled or interpreted) writes the correct things to stdout, stderr and other files for a given input fed into the program's stdin. After this, I need to display the results to the user.

Currently, my solution is to spawn the child process using subprocess.Popen(...) with file handles for the stdout, stderr and stdin. The file behind the stdin handle contains the inputs that the program reads during operation, and after the program has terminated, the stdout and stderr files are read and checked for correctness.
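For reference, the file-backed approach described above might look roughly like the sketch below. It is written for Python 3, and the helper name `run_with_files` and the tiny upper-casing child are made up for illustration; they are not taken from the original program.

```python
import subprocess
import sys
import tempfile


def run_with_files(argv, stdin_text):
    """Run argv with stdin, stdout and stderr all backed by temporary files.

    Returns (returncode, stdout_text, stderr_text) after the child has exited.
    """
    with tempfile.TemporaryFile("w+") as fin, \
         tempfile.TemporaryFile("w+") as fout, \
         tempfile.TemporaryFile("w+") as ferr:
        fin.write(stdin_text)
        fin.seek(0)  # flushes the data and rewinds so the child reads from the start
        proc = subprocess.Popen(argv, stdin=fin, stdout=fout, stderr=ferr)
        returncode = proc.wait()
        # The child advanced the shared file offsets; rewind before reading.
        fout.seek(0)
        ferr.seek(0)
        return returncode, fout.read(), ferr.read()


if __name__ == "__main__":
    # Hypothetical child: echoes its stdin in upper case.
    child = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    print(run_with_files([sys.executable, "-c", child], "Anonymous\n"))
```

With this setup the parent only ever sees the finished files, which is why the relative order of inputs and outputs is lost.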

This approach works otherwise perfectly, but when I display the results, I can't combine the given inputs and outputs so that the inputs would appear in the same places as they would when running the program from a terminal. I.e. for a program like

print "Hello."
name = raw_input("Type your name: ")
print "Nice to meet you, %s!" % (name)

the contents of the file containing the program's stdout would, after running, be:

Hello.
Type your name: 
Nice to meet you, Anonymous!

given that the contents of the file containing the stdin were Anonymous<LF>. So, in short, for the given example code (and, equivalently, for any other code) I want to achieve a result like:

Hello.
Type your name: Anonymous
Nice to meet you, Anonymous!

Thus, the problem is to detect when the program is waiting for input.

I've tried the following methods for solving the problem:

Popen.communicate(...): This allows the parent process to separately send data along a pipe, but it can only be called once, and is therefore not suitable for programs with multiple outputs and inputs, just as can be inferred from the documentation.

Reading and writing Popen.stdout and Popen.stdin directly: The documentation warns against this, and Popen.stdout's .read() and .readline() calls seem to block indefinitely when the program starts to wait for input.

Using select.select(...) doesn't seem to improve anything. Apparently the pipes are always ready for reading or writing, so it doesn't help much here.

As suggested in this answer, I have tried creating a separate Thread() that stores results from reading from the stdout into a Queue(). The output lines before a line demanding user input are displayed nicely, but the line on which the program starts to wait for user input ("Type your name: " in the example above) never gets read.

As directed here, I've tried pty.openpty() to create a pseudo terminal with master and slave file descriptors. After that, I've given the slave file descriptor as an argument for the subprocess.Popen(...) call's stdout, stderr and stdin parameters. Reading through the master file descriptor opened with os.fdopen(...) yields the same result as using a different thread: the line demanding input doesn't get read.

Using @Antti Haapala's example of pty.fork() for child process creation instead of subprocess.Popen(...) seems to allow me to also read the output created by raw_input(...).

I've also tried the read(), read_nonblocking() and readline() methods (documented here) of a process spawned with pexpect, but the best result, which I got with read_nonblocking(), is the same as with a PTY created with pty.fork(): the line demanding input does get read.

Edit: Using sys.stdout.write(...) and sys.stdout.flush() instead of print in my master program, which creates the child, seemed to fix the prompt line not getting displayed; it actually got read in both cases, though.

I've also tried select.poll(...), but it seemed that the pipe or PTY master file descriptors are always ready for writing.

  • What also crossed my mind is to try feeding the input when some time has passed without new output having been generated. This, however, is risky, because there's no way to know if the program is just in the middle of doing a heavy calculation.
  • As @Antti Haapala mentioned in his answer, the read() system call wrapper from glibc could be replaced to communicate the inputs to the master program. However, this doesn't work with statically linked or assembly programs. (Although, now that I think of it, any such calls could be intercepted from the source code and replaced with the patched version of read() - could be painstaking to implement still.)
  • Modifying the Linux kernel code to communicate the read() syscalls to the program is probably insane...

I think the PTY is the way to go, since it fakes a terminal and interactive programs are run on terminals everywhere. The question is, how?

Recommended answer

Have you noticed that raw_input writes the prompt string to stderr if stdout is a terminal (isatty)? If stdout is not a terminal, the prompt is written to stdout as well, but stdout will then be in fully buffered mode.

With stdout on a tty:

write(1, "Hello.\n", 7)                  = 7
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(2, "Type your name: ", 16)         = 16
fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fb114059000
read(0, "abc\n", 1024)                   = 4
write(1, "Nice to meet you, abc!\n", 23) = 23

With stdout not on a tty:

ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff8d9d3410) = -1 ENOTTY (Inappropriate ioctl for device)
# oops, python noticed that stdout is NOTTY.
fstat(0, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 3), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f29895f0000
read(0, "abc\n", 1024)                     = 4
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x7f29891c4bd0}, {0x451f62, [], SA_RESTORER, 0x7f29891c4bd0}, 8) = 0
write(1, "Hello.\nType your name: Nice to m"..., 46) = 46
# squeeze all output at the same time into stdout... pfft.

Thus all writes are squeezed into stdout at the same time and, what is worse, only after the input has been read.
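The buffering effect can be reproduced from the parent side with a small experiment. The sketch below is Python 3 and makes a few assumptions: the child uses print and sys.stdin.readline() to mimic the example (Python 3's input() flushes the prompt itself, so it would hide the effect), PYTHONUNBUFFERED is removed from the child's environment, and the name `observe_buffering` is invented for the example.

```python
import os
import select
import subprocess
import sys

# Child that prints, reads one line from stdin, then prints again.
CHILD = (
    "import sys\n"
    "print('Hello.')\n"
    "name = sys.stdin.readline().strip()\n"
    "print('Nice to meet you, %s!' % name)\n"
)


def observe_buffering(wait=1.0):
    """Return (output_seen_before_input, full_output) for a child on pipes."""
    env = dict(os.environ)
    env.pop("PYTHONUNBUFFERED", None)  # make sure the child buffers normally
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, env=env,
    )
    # Wait a moment: has anything reached the pipe before we send input?
    ready, _, _ = select.select([proc.stdout], [], [], wait)
    out, _ = proc.communicate(b"Anonymous\n")
    return bool(ready), out.decode()


if __name__ == "__main__":
    print(observe_buffering())
```

On a pipe, the first element should come back False: "Hello." stays in the child's buffer until the child exits, which matches the single large write in the trace.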

The real solution is thus to use the pty. However, you are doing it wrong. For the pty to work, you must use pty.fork(), not subprocess. (This will be very tricky.) I have some working code that goes like this:

import os
import tty
import pty

program = "python"

# command name in argv[0]
argv = [ "python", "foo.py" ]

pid, master_fd = pty.fork()

# we are in the child process
if pid == pty.CHILD:
    # execute the program
    os.execlp(program, *argv)

# else we are still in the parent, and pty.fork returned the pid of 
# the child. Now you can read, write in master_fd, or use select:
# rfds, wfds, xfds = select.select([master_fd], [], [], timeout)

Notice that depending on the terminal mode set by the child program there might be different kinds of linefeeds coming out, etc.
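To make the pty.fork() idea a bit more concrete, here is one possible read loop around it, written for Python 3. It uses the idle-timeout heuristic from the question (feed the next input line once the child has been quiet for a while), which is only a guess about when the child is waiting and will misfire on slow programs. The function name `run_on_pty` and its parameters are invented for this sketch.

```python
import os
import pty
import select
import signal
import sys
import time


def run_on_pty(argv, inputs, idle=0.5, timeout=10.0):
    """Run argv on a pseudo terminal and return everything read from it.

    Heuristic: whenever the child has produced no output for `idle` seconds,
    write the next pending input line. The terminal echoes that input, so it
    shows up in the transcript at the point where it was "typed".
    """
    pid, master_fd = pty.fork()
    if pid == pty.CHILD:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    pending = list(inputs)
    chunks = []
    deadline = time.monotonic() + timeout
    while True:
        if time.monotonic() > deadline:
            os.kill(pid, signal.SIGKILL)  # give up on a stuck child
            break
        ready, _, _ = select.select([master_fd], [], [], idle)
        if ready:
            try:
                data = os.read(master_fd, 1024)
            except OSError:
                break  # on Linux, EIO here means the child side was closed
            if not data:
                break
            chunks.append(data)
        elif pending:
            os.write(master_fd, pending.pop(0).encode())
    os.close(master_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks).decode(errors="replace")


if __name__ == "__main__":
    child = ('print("Hello.")\n'
             'name = input("Type your name: ")\n'
             'print("Nice to meet you, %s!" % name)\n')
    print(run_on_pty([sys.executable, "-c", child], ["Anonymous\n"]))
```

Because the pty is in its default mode, lines in the transcript usually end in "\r\n" rather than "\n", so a comparison against expected output probably needs to normalise line endings first.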

Now about the "waiting for input" problem: that cannot really be helped, as one can always write to a pseudoterminal; the characters will simply wait in the buffer. Likewise, a pipe always allows one to write up to 4K or 32K or some other implementation-defined amount before blocking. One ugly way is to strace the program and notice whenever it enters the read system call with fd = 0. The other would be to make a C module with a replacement "read()" system call wrapper and link it in before glibc for the dynamic linker (this fails if the executable is statically linked or makes system calls directly from assembler), which would then signal Python whenever read(0, ...) is executed. All in all, it is probably not worth the trouble.
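If one did go down the strace route, the parent would need to recognise reads on fd 0 in the trace output. A minimal sketch of that recognition step follows; it assumes the trace was produced with something like `strace -e trace=read` and that lines look like the ones shown earlier, optionally prefixed with a pid when several processes are followed. The helper name `is_stdin_read` is made up, and how promptly strace emits the start of a blocked call is something to verify on the target system.

```python
import re

# Start of a read() on file descriptor 0, optionally preceded by a
# "[pid N] " or bare "N " prefix (assumed formats for multi-process traces).
_STDIN_READ = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?read\(0,")


def is_stdin_read(line):
    """Return True if a line of strace output starts a read() on fd 0."""
    return _STDIN_READ.match(line) is not None


if __name__ == "__main__":
    for sample in ('read(0, "abc\\n", 1024)                   = 4',
                   'write(2, "Type your name: ", 16)         = 16'):
        print(is_stdin_read(sample), sample)
```

This only covers programs that read their input through read() on fd 0; a child that duplicates the descriptor or uses another call would need a wider pattern.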
