Batch file to keep one of ten lines
Problem description
I have a file with n lines (n is above 100 million).
I want to output a file with only 1 of every 10 lines. I can't split the file into ten parts and keep only one of them, because the selection must be a little more random: I have to do a statistical analysis later, and I can't afford to introduce a strong bias into the data.
I was thinking of reading the file and, for each record, outputting it if the record number mod 10 is zero.
The constraints are:
- It's a Windows (likely hardened) computer, possibly XP, Vista or Windows Server 2003.
- No development tools are available.
- No network, USB or CD-ROM; read: no external communication.
Therefore I was thinking of a Windows batch file (I can't assume PowerShell, and VBScript is likely to have been removed), and at the moment I am looking at the FOR /F command. Still, I am not an expert and I don't know how to achieve this.
Thanks to Paul for his answer.
I reformatted the answer (with help from 胡沙姆) to put it in a batch file:
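@echo off
setlocal
rem Number the lines, then keep those whose number ends in 0.
rem The pattern is quoted so that cmd does not consume the ^ anchor.
findstr /N . inputFile | findstr /R "^[0-9]*0:" > temporaryFile
rem Strip the "NNN:" prefix. The whole loop is redirected once, so
rem outputFile is not overwritten by each echo.
(FOR /F "tokens=1,* delims=:" %%i in (temporaryFile) do echo %%j) > outputFile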
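Thanks to QUUX and 大同 for similar alternative solutions. However, after a quick test on a larger file, Paul's answer is about 8 times faster. I guess the evaluation (in SET) is rather slow, even if the logic seemed brilliant.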
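For comparison, a counter-based version of the kind mentioned here might look like the following. This is a minimal sketch, not the code those answers actually posted. `inputFile` and `outputFile` are the same placeholders used in the batch file above, and `n` and `r` are illustrative variable names.

```batch
@echo off
setlocal EnableDelayedExpansion
rem n counts the lines read so far; r holds n modulo 10.
set /a n=0
rem Assumption: no line contains "!", because delayed expansion would strip it.
rem FOR /F skips empty lines and lines starting with ";", so those are not counted.
(for /F "usebackq delims=" %%L in ("inputFile") do (
    set /a n+=1
    set /a r=n %% 10
    if !r! equ 0 echo %%L
)) > outputFile
```

Every input line goes through two SET /A evaluations in the batch interpreter here, whereas the findstr pipeline does its filtering inside findstr itself. That may explain the speed difference.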
Recommended answer
Ok, I think I've cracked it:
findstr /N . path-to-log-file | findstr /R "^[0-9]*0:"
(Use findstr to add the line number to the beginning of each line, then use it again to print only the lines whose line number ends in zero.)
So you'll get one line in 10, but with the line number and a colon prepended to each line.
Remove the line number and colon with:
FOR /F "tokens=1* delims=:" %i in (file-with-linenumbers) do echo %j
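Inside a batch file the two steps can probably be combined so that no intermediate file is needed. The following is a sketch under that assumption. `path-to-log-file` is the placeholder used above, and `sampled.txt` is a hypothetical output name. Note the doubled `%%` that batch files require and the `^|` that escapes the pipe inside the FOR /F command.

```batch
@echo off
rem Number the lines, keep those whose number ends in 0, then strip the "NNN:" prefix.
rem sampled.txt is a hypothetical output file name.
(for /F "tokens=1* delims=:" %%i in ('findstr /N . path-to-log-file ^| findstr /R "^[0-9]*0:"') do echo %%j) > sampled.txt
```

To sample a different tenth of the file, change the `0` before the colon in the pattern to another digit.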
Paul.