Bash: Slow redirection and filter

Published: 2016/8/4 8:58:22. Tags: bash, logging, filtering, redirect, pipe

Question

I have a bash script that calls a program which generates a humongous amount of output. Much of this output comes from a Python package that I did not write; I can't really control what it prints, and it doesn't interest me.

I tried to filter the output generated by that external Python package and redirect the "cleaned" output to a log file. When I used regular pipes and grep expressions, I lost many chunks of information. I read that this is something that can actually happen with redirections (1 and 2: http://stackoverflow.com/questions/4290684/using-named-pipes-with-bash-problem-with-data-loss).

In order to fix that, I made the redirections like this:

#!/bin/bash
regexTxnFilterer="\[txn\.-[[:digit:]]+\]"
regexThreadPoolFilterer="\[paste\.httpserver\.ThreadPool\]"
bin/paster serve --reload --pid-file="/var/run/myServer//server.pid"  parts/etc/debug.ini 2>&1 < "/dev/null" | while IFS='' read -r thingy ; do
        if [[ ! "$thingy" =~ $regexTxnFilterer ]] && [[ ! "$thingy" =~ $regexThreadPoolFilterer ]]; then
                  echo "$thingy" >> "/var/log/myOutput.log" 
        fi
done

This doesn't lose any information (at least not that I could tell) and filters out the lines I don't need (using the two regular expressions above).

The issue is that it has made the application (the bin/paster process I'm executing) unbearably slow. Is there any way to achieve the same effect with better performance?

Thank you in advance!

Update @2012-04-13: As shellter pointed out in one of the comments on this question, it may be useful to provide examples of the output I want to filter. Here's a bunch of them:

2012-04-13 19:30:37,996 DEBUG [txn.-1220917568] new transaction
2012-04-13 19:30:37,997 DEBUG [txn.-1220917568] commit <zope.sqlalchemy.datamanager.SessionDataManager object at 0xbf4062c>
2012-04-13 19:30:37,997 DEBUG [txn.-1220917568] commit
Starting server in PID 18262.
2012-04-13 19:30:38,292 DEBUG [paste.httpserver.ThreadPool] Started new worker -1269716112: Initial worker pool
2012-04-13 19:33:08,158 DEBUG [txn.-1244144784] new transaction
2012-04-13 19:33:08,158 DEBUG [txn.-1244144784] commit
2012-04-13 19:32:06,980 DEBUG [paste.httpserver.ThreadPool] Added task (0 tasks queued)
2012-04-13 19:32:06,980 INFO [paste.httpserver.ThreadPool] kill_hung_threads status: 10 threads (0 working, 10 idle, 0 starting) ave time N/A, max time 0.00sec, killed 0 workers

There are a few other messages involving the ThreadPool, but I couldn't capture any of them.

Recommended answer

It may be faster to use a grep-based solution for this:

#!/bin/bash
regexTxnFilterer="\[txn\.-[[:digit:]]+\]"
regexThreadPoolFilterer="\[paste\.httpserver\.ThreadPool\]"
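# -E selects extended regular expressions; in grep's default basic syntax the + in the first pattern is a literal plus sign
bin/paster serve --reload --pid-file="/var/run/myServer//server.pid"  parts/etc/debug.ini 2>&1 < "/dev/null" | grep -Evf <(echo "$regexTxnFilterer"; echo "$regexThreadPoolFilterer") >> "/var/log/myOutput.log"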

Your loop may be slow because the echo "$thingy" >> "/var/log/myOutput.log" line opens and closes the log file every time it executes. I wouldn't expect a big performance difference between grep's regex matching and bash's, but it wouldn't surprise me if there were one.
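As a rough check of the grep-based filter, the sketch below runs the two patterns from the question over three of the sample lines the question lists. The helper name filter_log and the sample variable are made up for illustration. The patterns are passed with -e instead of -f, and -E is used because the first pattern relies on + meaning "one or more".

```shell
#!/bin/bash
# The two patterns from the question, unchanged.
regexTxnFilterer="\[txn\.-[[:digit:]]+\]"
regexThreadPoolFilterer="\[paste\.httpserver\.ThreadPool\]"

# filter_log is a hypothetical helper name, not part of the original script.
# It reads lines on stdin and prints only those matching neither pattern.
filter_log() {
    grep -E -v -e "$regexTxnFilterer" -e "$regexThreadPoolFilterer"
}

# Sample lines copied from the question stand in for the server output.
sample='2012-04-13 19:30:37,996 DEBUG [txn.-1220917568] new transaction
Starting server in PID 18262.
2012-04-13 19:30:38,292 DEBUG [paste.httpserver.ThreadPool] Started new worker -1269716112: Initial worker pool'

filtered=$(printf '%s\n' "$sample" | filter_log)
echo "$filtered"
```

Only the "Starting server in PID 18262." line should survive, since the other two lines each match one of the patterns.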

Later edit

There's a far simpler way to fix the performance issue caused by opening and closing the output file once per line. Why this didn't occur to me before, I have no idea. Just move the >> outside your loop:

#!/bin/bash
regexTxnFilterer="\[txn\.-[[:digit:]]+\]"
regexThreadPoolFilterer="\[paste\.httpserver\.ThreadPool\]"
bin/paster serve --reload --pid-file="/var/run/myServer//server.pid"  parts/etc/debug.ini 2>&1 < "/dev/null" | while IFS='' read -r thingy ; do
        if [[ ! "$thingy" =~ $regexTxnFilterer ]] && [[ ! "$thingy" =~ $regexThreadPoolFilterer ]]; then
                  echo "$thingy"
        fi
done  >> "/var/log/myOutput.log"

I can't see any compelling reason why this would be either faster or slower than the grep solution, but it's a lot closer to the original code and a little less cryptic.
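One way to see how much the placement of >> matters on a given machine is to time both loop variants on synthetic input. This is only a sketch: the scratch directory, the file names, and the 5000 generated lines are assumptions chosen for illustration, and the filtering test is left out so that only the redirection cost differs between the two variants.

```shell
#!/bin/bash
# Hypothetical scratch files; nothing here touches the real log.
tmpdir=$(mktemp -d)
per_line="$tmpdir/per_line.log"
single="$tmpdir/single.log"

# 5000 synthetic lines stand in for the server output.
gen() { seq 1 5000; }

# Variant 1: the log file is opened and closed for every line.
time (gen | while IFS='' read -r thingy; do
    echo "$thingy" >> "$per_line"
done)

# Variant 2: the log file is opened once for the whole loop.
time (gen | while IFS='' read -r thingy; do
    echo "$thingy"
done >> "$single")

# Both variants should have written identical content.
result="differ"
cmp -s "$per_line" "$single" && result="match"
echo "outputs $result"
rm -rf "$tmpdir"
```

The timings are printed by bash's time keyword on standard error. The same harness could be pointed at the grep variant to compare all three approaches.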
