What is slowing down my batch file?

Question

I've been fooling around a bit with batch files, and I'm wondering why there is a huge difference in the time required to write the output file in the two scenarios below:

Scenario 1: Simply traverse a log file and, for every row, take the 5th token, unless it contains a filter string.

(for /f "tokens=5" %%a in (test.log) do @echo(%%a) |  findstr /v "filter_1 filter_2" > !filter!.txt

This works great: going through a 50 MB file produces a smaller 10 MB file in 10 seconds.

Scenario 2: Do exactly the same, but add something before and after the token so that I can output an XML file rather than a text file. To do so, I had to restructure it a bit, as below:

echo ^<rows^> > test.xml

>>test.xml (
for /f "tokens=5" %%a in (
    'findstr /v "filter1 filter2" test.log'
    ) do echo ^<r a="%%a"/^> 
)

echo ^</rows^> >> test.xml

It works as expected for small files, but takes forever for large files. Is there any way to achieve what I want in scenario 2 using the scenario 1 syntax, since that seems much more efficient?

Recommended answer

FOR /F always buffers the content of the IN() clause before beginning any iterations. This is true both when reading a file and when processing the output of a command. However, I believe there is some fundamental difference in how command output is buffered that makes it particularly slow with large output. MC ND has a nice explanation of why buffering large output is so slow.

Most people are surprised to learn that sometimes the fastest batch solution is to write the command output to a temp file, and then use FOR /F to read the temp file. This will be fast as long as your disk drive is fast.

I believe the following will speed things up considerably:

findstr /v "filter1 filter2" test.log >test.log.mod
>test.xml (
  echo ^<rows^>
  for /f "tokens=5" %%A in (test.log.mod) do echo ^<r a="%%A"/^>
  echo ^</rows^>
)
del test.log.mod
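
If writing test.log.mod next to the log is undesirable, the same idea can be sketched with a scratch file under %TEMP%. The variable name tmpFile and the filtered_%RANDOM%.tmp naming pattern are assumptions for illustration, not part of the original answer. The usebackq option is needed because the file name is quoted.

```batch
@echo off
setlocal

rem Hypothetical scratch-file name; %RANDOM% lowers, but does not eliminate,
rem the chance of colliding with an existing file.
set "tmpFile=%TEMP%\filtered_%RANDOM%.tmp"

rem Filter once, straight to disk, so FOR /F reads a file instead of command output.
findstr /v "filter1 filter2" test.log >"%tmpFile%"

>test.xml (
  echo ^<rows^>
  rem usebackq makes FOR /F treat the quoted value as a file name, not a string.
  for /f "usebackq tokens=5" %%A in ("%tmpFile%") do echo ^<r a="%%A"/^>
  echo ^</rows^>
)

del "%tmpFile%"
```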

Another option would be to add the XML wrapper to the left side of your original pipe, and then modify your FINDSTR filters appropriately. But the above solution may still be faster, depending on the number of lines that get filtered out.

(
  echo ^<rows^>
  for /f "tokens=5" %%A in (test.log) do echo ^<r a="%%A"/^>
  echo ^</rows^>
) | findstr /v /c:"modifiedFilter_1" /c:"modifiedFilter_2" > test.xml

FINDSTR will also need the /R option if the filters are regular expressions.
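
As a rough illustration of why /R matters, the sketch below assumes the two filters are the question's filter_1 and filter_2 and collapses them into a single pattern. That pattern is an assumption for illustration only. A /C: search string is taken literally unless /R is also given.

```batch
rem Without /R, a /C: string is literal: this only drops lines
rem containing the exact text "filter_[12]".
findstr /v /c:"filter_[12]" test.log

rem With /R, the same string is a regular expression: this drops lines
rem containing filter_1 or filter_2.
findstr /v /r /c:"filter_[12]" test.log
```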

But a far faster solution would be to use something like sed for Windows, or one of the JScript/batch hybrid utilities: my REPL.BAT or Aacini's FINDREPL.BAT.
