Why is printing to stdout so slow? Can it be sped up?


Question

I've always been amazed/frustrated with how long it takes to simply output to the terminal with a print statement. After some recent painfully slow logging I decided to look into it and was quite surprised to find that almost all the time spent is waiting for the terminal to process the results.

Can writing to stdout be sped up somehow?

I wrote a script ('print_timer.py' at the bottom of this question) to compare timing when writing 100k lines to stdout, to file, and with stdout redirected to /dev/null. Here is the timing result:

$ python print_timer.py
this is a test
this is a test
<snipped 99997 lines>
this is a test
-----
timing summary (100k lines each)
-----
print                         :11.950 s
write to file (+ fsync)       : 0.122 s
print with stdout = /dev/null : 0.050 s

Wow. To make sure python isn't doing something behind the scenes like recognizing that I reassigned stdout to /dev/null or something, I did the redirection outside the script...

$ python print_timer.py > /dev/null
-----
timing summary (100k lines each)
-----
print                         : 0.053 s
write to file (+fsync)        : 0.108 s
print with stdout = /dev/null : 0.045 s

So it isn't a python trick, it is just the terminal. I always knew dumping output to /dev/null sped things up, but never figured it was that significant!

It amazes me how slow the tty is. How can it be that writing to physical disk is WAY faster than writing to the "screen" (presumably an all-RAM op), and is effectively as fast as simply dumping to the garbage with /dev/null?

This link talks about how the terminal will block I/O so it can "parse [the input], update its frame buffer, communicate with the X server in order to scroll the window and so on"... but I don't fully get it. What can be taking so long?

I expect there is no way out (short of a faster tty implementation?) but I figured I'd ask anyway.


UPDATE: after reading some comments I wondered how much impact my screen size actually has on the print time, and it does have some significance. The really slow numbers above are with my Gnome terminal blown up to 1920x1200. If I reduce it very small I get...

-----
timing summary (100k lines each)
-----
print                         : 2.920 s
write to file (+fsync)        : 0.121 s
print with stdout = /dev/null : 0.048 s

That is certainly better (~4x), but doesn't change my question. It only adds to my question as I don't understand why the terminal screen rendering should slow down an application writing to stdout. Why does my program need to wait for screen rendering to continue?

Are all terminal/tty apps not created equal? I have yet to experiment. It really seems to me like a terminal should be able to buffer all incoming data, parse/render it invisibly, and only render the most recent chunk that is visible in the current screen configuration at a sensible frame rate. So if I can write+fsync to disk in ~0.1 seconds, a terminal should be able to complete the same operation in something of that order (with maybe a few screen updates while it did it).
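The buffering idea in the paragraph above can be sketched in a few lines. This is a toy model, not how any real terminal emulator is implemented: `FrameLimitedScreen`, its parameters, and the injected `clock` are all hypothetical names. It accepts every line immediately, keeps only the rows that would be visible, and redraws at most once per frame interval.

```python
import time
from collections import deque


class FrameLimitedScreen:
    """Toy model of a terminal that decouples input from rendering (hypothetical)."""

    def __init__(self, rows=50, fps=60.0, clock=time.monotonic):
        self.visible = deque(maxlen=rows)  # only the newest `rows` lines are kept
        self.frame_interval = 1.0 / fps
        self.clock = clock
        self.last_render = None
        self.render_count = 0

    def write(self, line):
        # Accepting input is cheap: append and return without drawing.
        self.visible.append(line)
        now = self.clock()
        if self.last_render is None or now - self.last_render >= self.frame_interval:
            self.render(now)

    def render(self, now):
        # A real emulator would rasterize `self.visible` here; this only counts frames.
        self.last_render = now
        self.render_count += 1


if __name__ == "__main__":
    screen = FrameLimitedScreen(rows=50, fps=60.0)
    for _ in range(100000):
        screen.write("this is a test")
    print("lines kept:", len(screen.visible), "frames drawn:", screen.render_count)
```

In this model the number of redraws depends on elapsed time rather than on the number of lines written. A real implementation would also need a final redraw once input goes idle, otherwise the last lines might never be shown.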

I'm still kind of hoping there is a tty setting that can be changed from the application side to make this behaviour better for the programmer. If this is strictly a terminal application issue, then maybe this doesn't even belong on StackOverflow?

What am I missing?


Here is the python program used to generate the timing:

import time, sys, tty
import os

lineCount = 100000
line = "this is a test"
summary = ""

cmd = "print"
startTime_s = time.time()
for x in range(lineCount):
    print line
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

# Add a newline to match line outputs above...
line += "\n"

cmd = "write to file (+fsync)"
fp = file("out.txt", "w")
startTime_s = time.time()
for x in range(lineCount):
    fp.write(line)
fp.flush()  # flush Python's own buffer first so fsync covers everything written
os.fsync(fp.fileno())
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

cmd = "print with stdout = /dev/null"
sys.stdout = file(os.devnull, "w")
startTime_s = time.time()
for x in range(lineCount):
    sys.stdout.write(line)  # go through the redirected stdout; line already ends in a newline
t = time.time() - startTime_s
summary += "%-30s:%6.3f s\n" % (cmd, t)

print >> sys.stderr, "-----"
print >> sys.stderr, "timing summary (100k lines each)"
print >> sys.stderr, "-----"
print >> sys.stderr, summary
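
For readers on Python 3, the script above could be ported roughly as follows. This is a sketch, not the version that produced the timings in this question: the `time_it`, `print_lines`, `write_file`, and `main` names are introduced here for illustration, `time.perf_counter()` replaces `time.time()`, the file is flushed before `os.fsync`, and the third test prints through the reassigned `sys.stdout`.

```python
import os
import sys
import time

LINE_COUNT = 100000
LINE = "this is a test"


def time_it(label, action):
    """Run action() once and return a formatted summary row (hypothetical helper)."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    return "%-30s:%6.3f s\n" % (label, elapsed)


def print_lines(count=LINE_COUNT):
    # print() writes to whatever sys.stdout currently refers to
    for _ in range(count):
        print(LINE)


def write_file(path="out.txt", count=LINE_COUNT):
    with open(path, "w") as fp:
        for _ in range(count):
            fp.write(LINE + "\n")
        fp.flush()             # push Python's buffer to the OS first
        os.fsync(fp.fileno())  # then ask the OS to commit it to disk


def main():
    summary = time_it("print", print_lines)
    summary += time_it("write to file (+fsync)", write_file)

    real_stdout = sys.stdout
    with open(os.devnull, "w") as devnull:
        sys.stdout = devnull
        try:
            summary += time_it("print with stdout = /dev/null", print_lines)
        finally:
            sys.stdout = real_stdout

    # Report on stderr so the summary survives `> /dev/null`
    print("-----", file=sys.stderr)
    print("timing summary (100k lines each)", file=sys.stderr)
    print("-----", file=sys.stderr)
    print(summary, file=sys.stderr)
```

Call `main()` (for example under an `if __name__ == "__main__":` guard) to run all three tests. Absolute numbers will differ from the Python 2 figures quoted in this question.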

Answer

Thanks for all the comments! I've ended up answering it myself with your help. It feels dirty answering your own question, though.

Question 1: Why is printing to stdout slow?

Answer: Printing to stdout is not inherently slow. It is the terminal you work with that is slow. And it has pretty much zero to do with I/O buffering on the application side (e.g. Python file buffering). See below.
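
One way to check the buffering claim on your own terminal is to compare many small writes against a single large one. The sketch below is only an experiment outline: `write_per_line`, `write_batched`, `timed`, and `compare` are hypothetical helper names, and no particular result is implied.

```python
import sys
import time


def write_per_line(stream, line, count):
    # One write() call per line, similar to a print loop
    for _ in range(count):
        stream.write(line + "\n")
    stream.flush()


def write_batched(stream, line, count):
    # Build the whole payload in memory and hand it over in a single write()
    stream.write((line + "\n") * count)
    stream.flush()


def timed(func, *args):
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def compare(stream=None, line="this is a test", count=100000):
    """Return (per_line_seconds, batched_seconds) for the given stream."""
    stream = sys.stdout if stream is None else stream
    per_line = timed(write_per_line, stream, line, count)
    batched = timed(write_batched, stream, line, count)
    return per_line, batched
```

Calling `compare()` in a terminal and printing the two numbers to stderr shows how much application-side batching matters there. If the answer's claim holds, both figures should be dominated by the terminal rather than by the number of `write()` calls.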

Question 2: Can it be sped up?

Answer: Yes it can, but seemingly not from the program side (the side doing the 'printing' to stdout). To speed it up, use a different, faster terminal emulator.

Explanation...

I tried a self-described 'lightweight' terminal program called wterm and got significantly better results. Below is the output of my test script (at the bottom of the question) when running in wterm at 1920x1200 on the same system where the basic print option took 12s using gnome-terminal:

-----
timing summary (100k lines each)
-----
print                         : 0.261 s
write to file (+fsync)        : 0.110 s
print with stdout = /dev/null : 0.050 s

0.26s is MUCH better than 12s! I don't know whether wterm is more intelligent about how it renders to screen along the lines of how I was suggesting (render the 'visible' tail at a reasonable frame rate), or whether it just "does less" than gnome-terminal. For the purposes of my question I've got the answer, though. gnome-terminal is slow.

So - If you have a long running script that you feel is slow and it spews massive amounts of text to stdout... try a different terminal and see if it is any better!

Note that I pretty much randomly pulled wterm from the ubuntu/debian repositories. This link might be the same terminal, but I'm not sure. I did not test any other terminal emulators.


Update: Because I had to scratch the itch, I tested a whole pile of other terminal emulators with the same script and full screen (1920x1200). My manually collected stats are here:

wterm           0.3s
aterm           0.3s
rxvt            0.3s
mrxvt           0.4s
konsole         0.6s
yakuake         0.7s
lxterminal        7s
xterm             9s
gnome-terminal   12s
xfce4-terminal   12s
vala-terminal    18s
xvt              48s

The recorded times are manually collected, but they were pretty consistent. I recorded the best(ish) value. YMMV, obviously.
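
To put the table in perspective, the timings can be converted to throughput and to a speed-up relative to gnome-terminal. The numbers are the ones listed above; the dictionary and function names are only for illustration.

```python
LINE_COUNT = 100000

# Seconds to print 100k lines, copied from the table above
timings_s = {
    "wterm": 0.3, "aterm": 0.3, "rxvt": 0.3, "mrxvt": 0.4,
    "konsole": 0.6, "yakuake": 0.7, "lxterminal": 7, "xterm": 9,
    "gnome-terminal": 12, "xfce4-terminal": 12, "vala-terminal": 18, "xvt": 48,
}


def lines_per_second(seconds, lines=LINE_COUNT):
    return lines / seconds


def speedup_vs(baseline, name, timings=timings_s):
    """How many times faster `name` is than `baseline`."""
    return timings[baseline] / timings[name]


if __name__ == "__main__":
    for name, seconds in sorted(timings_s.items(), key=lambda kv: kv[1]):
        print("%-15s %9.0f lines/s  %5.1fx vs gnome-terminal"
              % (name, lines_per_second(seconds), speedup_vs("gnome-terminal", name)))
```

On these figures the fastest emulators come out at about 40 times the speed of gnome-terminal (12 s / 0.3 s), while xvt is 4 times slower than it.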

As a bonus, it was an interesting tour of some of the various terminal emulators available out there! I'm amazed my first 'alternate' test turned out to be the best of the bunch.
