Catch universal newlines but preserve original
Problem description
Here is my problem: I'm trying to write a simple program that runs another process using Python's subprocess module, and I want to capture the process's output in real time.
I know this can be done like this:
import subprocess

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
# stdout is a binary pipe here, so end of file is signalled by b"", not ""
for line in iter(proc.stdout.readline, b""):
    line = line.decode().rstrip()
    if line:
        print(line)
The issue is that the process might generate output with a carriage return \r, and I want to simulate that behavior in my program.
If I use the universal_newlines flag in Popen, I can catch output that was generated with a carriage return, but I can't tell that it was, so I can only print it "regularly", with a newline. I want to avoid that, as there could be a lot of output.
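To make that loss concrete, here is a minimal sketch. The child command is a hypothetical stand-in (a Python one-liner, not the real process) that writes two progress updates ending in \r followed by a final line ending in \n.

```python
import subprocess
import sys

# Hypothetical stand-in for the real command:
# two "\r"-terminated progress updates, then one "\n"-terminated line.
CHILD = [sys.executable, "-c",
         r"import sys; sys.stdout.write('10%\r50%\rdone\n')"]


def read_translated(cmd):
    """Collect the child's output lines with universal_newlines=True."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          universal_newlines=True) as proc:
        return list(proc.stdout)


lines = read_translated(CHILD)
print(lines)  # every entry ends in "\n"; nothing says which ones were "\r"
```

All three lines come back with the same ending, so the reader can no longer tell a progress update from a regular line.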
My question is basically whether I can catch \r output as if it were \n, but still differentiate it from actual \n output.
Edit
Here is some simplified code of what I tried:
File download.py:
import subprocess

try:
    subprocess.check_call(
        [
            "aws",
            "s3",
            "cp",
            "S3_LINK",
            "TARGET",
        ]
    )
except subprocess.CalledProcessError as err:
    print(err)
    raise SystemExit(1)
File process_runner.py:
import subprocess
import sys

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
# read(1) returns bytes, so stop on b"" and write to the binary stdout
for char in iter(lambda: proc.stdout.read(1), b""):
    sys.stdout.buffer.write(char)
    sys.stdout.buffer.flush()
The code in download uses aws s3 cp, which reports download progress using carriage returns. I want to simulate this output behavior in my program process_runner, which receives download's output.
At first I tried to iterate with readline instead of read(1). That did not work because the CR was overlooked.
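The likely reason is that readline on a binary stream only treats \n as a line terminator. A small sketch using an in-memory stream (the sample bytes are made up for illustration) shows the effect:

```python
import io

# Made-up bytes resembling what a progress-reporting child might write.
raw = io.BytesIO(b"10%\r50%\rdone\nnext\n")

first = raw.readline()
print(first)  # the "\r" updates are bundled into the first "\n"-terminated line
```

On a real pipe this means the progress updates only show up once the final \n arrives.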
Recommended answer
A possible way is to use the binary interface of Popen by specifying neither encoding nor errors, and of course not universal_newlines. We can then wrap the binary stream in a TextIOWrapper with newline=''. The documentation for TextIOWrapper says:
... if newline is None ... If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated
(which conforms to PEP 3116)
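The difference between the two modes can be checked without any subprocess. This is a small sketch over an in-memory stream; the sample bytes and the read_lines helper are made up for illustration.

```python
import io

# Made-up sample mixing a lone "\r", a "\r\n" pair and a plain "\n".
data = b"10%\r50%\r\ndone\n"


def read_lines(newline):
    """Decode the sample through a TextIOWrapper with the given newline mode."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8",
                               newline=newline)
    return list(wrapper)


translated = read_lines(None)  # line endings normalised to "\n"
preserved = read_lines("")     # same line splitting, endings kept as written
```

Both modes split the text at the same places; only newline='' leaves each line's original terminator attached.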
Your original code could be changed to:
import io
import subprocess

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out = io.TextIOWrapper(proc.stdout, newline='')
for line in out:
    # line is delimited with the universal newline convention and actually contains
    # the original end of line, be it a raw \r, \n or the pair \r\n
    ...
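Putting it together, one way the loop body could be filled in is to write each line back out unchanged, so that \r updates overwrite each other on a terminal just as they would for the child. This is only a sketch: the relay function and its sink parameter are hypothetical names, not part of the question or of subprocess.

```python
import io
import subprocess
import sys


def relay(cmd, sink=sys.stdout):
    """Copy cmd's stdout to sink line by line, keeping each original line ending."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        out = io.TextIOWrapper(proc.stdout, newline="")
        for line in out:
            sink.write(line)  # line still ends in "\r", "\n" or "\r\n"
            sink.flush()      # push progress updates out immediately
    return proc.returncode
```

To tell the two cases apart instead of just forwarding them, test line.endswith("\r"). As far as I can tell, a \r that arrives at the very end of a chunk may be held back until more data comes in, because the wrapper has to check whether a \n follows, so a progress update can appear slightly late.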