DOS batch script to parse CSV file and output a text file

Question

I've seen a response on another page (http://stackoverflow.com/questions/6470570/help-in-writing-a-batch-script-to-parse-csv-file-and-output-a-text-file), brilliant code by the way:

@ECHO OFF
IF "%~1"=="" GOTO :EOF
SET "filename=%~1"
SET fcount=0
SET linenum=0
FOR /F "usebackq tokens=1-10 delims=," %%a IN ("%filename%") DO ^
CALL :process "%%a" "%%b" "%%c" "%%d" "%%e" "%%f" "%%g" "%%h" "%%i" "%%j"
GOTO :EOF

:trim
SET "tmp=%~1"
:trimlead
IF NOT "%tmp:~0,1%"==" " GOTO :EOF
SET "tmp=%tmp:~1%"
GOTO trimlead

:process
SET /A linenum+=1
IF "%linenum%"=="1" GOTO picknames

SET ind=0
:display
IF "%fcount%"=="%ind%" (ECHO.&GOTO :EOF)
SET /A ind+=1
CALL :trim %1
SETLOCAL ENABLEDELAYEDEXPANSION
ECHO !f%ind%!!tmp!
ENDLOCAL
SHIFT
GOTO display

:picknames
IF %1=="" GOTO :EOF
CALL :trim %1
SET /a fcount+=1
SET "f%fcount%=%tmp%"
SHIFT
GOTO picknames

It works brilliantly for an example csv file I made in the format:

Header,Name,Place
one,two,three
four,five,six

However, the actual file I want to process has 64 fields, so I altered tokens=1-10 to tokens=1-64 and extended the %%a etc. right up to 64 variables (the last being called %%BL, for example). Now, however, when I run the batch on my 'big' csv file (with the 64 tokens) nothing happens. No errors (good) but no output! (bad). If anyone can help that would be fantastic... I am soooo close to getting the whole app working if I can just nail this last bit! Or if anyone has some example code that will do something similar for an indefinite number of tokens... Ultimately I want to make a string which will be something like:

field7,field12,field15,field18

Thanks in advance, Jeff

Recommended answer

Important update - I don't think Windows batch is a good option for your needs because a single FOR /F cannot parse more than 31 tokens. See the bottom of the Addendum below for an explanation.

However, it is possible to do what you want with batch. This ugly code will give you access to all 64 tokens.

for /f "usebackq tokens=1-29* delims=," %%A in ("%filename%") do (
  for /f "tokens=1-26* delims=," %%a in ("%%^") do (
    for /f "tokens=1-9 delims=," %%1 in ("%%{") do (
      rem Tokens 1-26 are in variables %%A - %%Z
      rem Token  27 is in %%[
      rem Token  28 is in %%\
      rem Token  29 is in %%]
      rem Tokens 30-55 are in %%a - %%z
      rem Tokens 56-64 are in %%1 - %%9
    )
  )
)

The addendum provides important info on how the above works.

If you only need a few of the tokens spread out amongst the 64 on the line, then the solution is marginally easier in that you might be able to avoid using crazy characters as FOR variables. But there is still careful bookkeeping to be done.

For example, the following will give you access to tokens 5, 27, 46 and 64

for /f "usebackq tokens=5,27,30* delims=," %%A in ("%filename%") do (
  for /f "tokens=16,30* delims=," %%E in ("%%D") do (
    for /f "tokens=4 delims=," %%H in ("%%G") do (
      rem Token  5 is in %%A
      rem Token 27 is in %%B
      rem Token 46 is in %%E
      rem Token 64 is in %%H
    )
  )
)
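
The bookkeeping above can be checked with a small test harness. This is only a sketch: the generated sample line and the token names t1 to t64 are made up for the test and are not part of the original data. It builds a 64-field comma-separated string and pushes it through the same three nested loops. The first loop leaves tokens 31 onward in %%D, so token 46 is the 16th token of that remainder. The second loop leaves tokens 61 onward in %%G, so token 64 is the 4th token there.

```batch
@echo off
setlocal enableDelayedExpansion

rem Build a sample line "t1,t2,...,t64" (hypothetical test data).
set "str="
for /l %%n in (1 1 64) do set "str=!str!,t%%n"
rem Drop the leading comma added by the loop above.
set "str=!str:~1!"

rem %%A = token 5, %%B = token 27, %%C = token 30, %%D = tokens 31 onward
for /f "tokens=5,27,30* delims=," %%A in ("!str!") do (
  rem %%E = token 46 (16th of the remainder), %%F = token 60, %%G = tokens 61 onward
  for /f "tokens=16,30* delims=," %%E in ("%%D") do (
    rem %%H = token 64 (4th of the remainder)
    for /f "tokens=4 delims=," %%H in ("%%G") do echo %%A %%B %%E %%H
  )
)
```

If the offsets are right, the echoed names should be the ones for tokens 5, 27, 46 and 64. The same harness can be reused to check any other combination of token numbers before pointing the loops at the real file.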

Original Answer

FOR variables are limited to a single character, so your %%BL strategy can't work. The variables are case sensitive. According to Microsoft you are limited to capturing 26 tokens within one FOR statement, but it is possible to get more if you use more than just alpha. It's a pain because you need an ASCII table to figure out which characters go where. FOR does not allow just any character, however, and the maximum number of tokens that a single FOR /F can assign is 31, plus one more variable for the remainder of the line when the token list ends with *. Any attempt to parse and assign more than 31 will quietly fail, as you have discovered.

Thankfully, I don't think you need that many tokens. You simply specify which tokens you want with the TOKENS option.

for /f "usebackq tokens=7,12,15,18 delims=," %%A in ("%filename%") do echo %%A,%%B,%%C,%%D

will give you your 7th, 12th, 15th and 18th tokens.
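
To turn that one-liner into the text file the question asks for, something like the following might do. It is a minimal sketch: input.csv and output.txt are placeholder names, and skipping the first line assumes the file starts with a header row, as in the small example in the question.

```batch
@echo off
rem Sketch only: input.csv and output.txt are placeholder names.
rem skip=1 drops the header line; the parenthesised block lets one
rem redirection capture the output of every iteration.
(
  for /f "usebackq skip=1 tokens=7,12,15,18 delims=," %%A in ("input.csv") do echo %%A,%%B,%%C,%%D
) > "output.txt"
```

One caveat: FOR /F treats consecutive delimiters as a single delimiter, so an empty field (two commas in a row) shifts every later token down by one. This approach is only reliable when all fields up to the last one you need are non-empty.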

Addendum

I performed some tests, and can report the following (updated in response to jeb's comment):

Most characters can be used as a FOR variable, including extended ASCII 128-254. But some characters cannot be used to define a variable in the first part of a FOR statement, but can be used in the DO clause. A few can't be used for either. Some have no restrictions, but require special syntax.

The following is a summary of characters that have restrictions or require special syntax. Note that text within angle brackets like <space> represents a single character.

Dec  Hex   Character   Define     Access
  0  0x00  <nul>         No       No
 09  0x09  <tab>         No       %%^<tab>  or  "%%<tab>"
 10  0x0A  <LF>          No       %%^<CR><LF><CR><LF>  or  %%^<LF><LF>
 11  0x0B  <VT>          No       %%<VT>
 12  0x0C  <FF>          No       %%<FF>
 13  0x0D  <CR>          No       No
 32  0x20  <space>       No       %%^<space>  or  "%%<space>"
 34  0x22  "             %%^"     %%"  or  %%^"
 36  0x24  $             %%$      %%$ works, but %%~$ does not
 37  0x25  %             %%%%     %%~%%
 38  0x26  &             %%^&     %%^&  or  "%%&"
 41  0x29  )             %%^)     %%^)  or  "%%)"
 44  0x2C  ,             No       %%^,  or  "%%,"
 59  0x3B  ;             No       %%^;  or  "%%;"
 60  0x3C  <             %%^<     %%^<  or  "%%<"
 61  0x3D  =             No       %%^=  or  "%%="
 62  0x3E  >             %%^>     %%^>  or  "%%>"
 94  0x5E  ^             %%^^     %%^^  or  "%%^"
124  0x7C  |             %%^|     %%^|  or  "%%|"
126  0x7E  ~             %%~      %%~~ (%%~ may crash CMD.EXE if at end of line)
255  0xFF  <NB space>    No       No

Special characters like ^ < > | & must be either escaped or quoted. For example, the following works:

for /f %%^< in ("OK") do echo "%%<" %%^<

Some characters cannot be used to define a FOR variable. For example, the following gives a syntax error:

for /f %%^= in ("No can do") do echo anything

But %%= can be implicitly defined by using the TOKENS option, and the value accessed in the DO clause like so:

for /f "tokens=1-3" %%^< in ("A B C") do echo %%^< %%^= %%^>

The % is odd: you can define a FOR variable using %%%%, but the value cannot be accessed unless you use the ~ modifier. This means enclosing quotes cannot be preserved.

for /f "usebackq tokens=1,2" %%%% in ('"A"') do echo %%%% %%~%%

The above yields %% A

The ~ is a potentially dangerous FOR variable. If you attempt to access the variable using %%~ at the end of a line, you can get unpredictable results, and may even crash CMD.EXE! The only reliable way to access it without restrictions is to use %%~~, which of course strips any enclosing quotes.

for /f %%~ in ("A") do echo This can crash because its the end of line: %%~

for /f %%~ in ("A") do echo But this (%%~) should be safe

for /f %%~ in ("A") do echo This works even at end of line: %%~~

As already stated, a single FOR /F can parse and assign a maximum of 31 tokens. For example:

@echo off
setlocal enableDelayedExpansion
set "str="
for /l %%n in (1 1 35) do set "str=!str! %%n"
for /f "tokens=1-31" %%A in ("!str!") do echo A=%%A _=%%_

The above yields A=1 _=31. Note: tokens 2-30 work just fine, I just wanted a small example.

Any attempt to parse and assign more than 31 tokens will silently fail without setting ERRORLEVEL.

@echo off
setlocal enableDelayedExpansion
set "str="
for /l %%n in (1 1 35) do set "str=!str! %%n"
for /f "tokens=1-32" %%A in ("!str!") do echo this example fails entirely

You can parse and assign up to 31 tokens and assign the remainder to another token as follows:

@echo off
setlocal enableDelayedExpansion
set "str="
for /l %%n in (1 1 35) do set "str=!str! %%n"
for /f "tokens=1-31*" %%A in ("!str!") do echo A=%%A  _=%%_  `=%%`

The above yields A=1  _=31  `=32 33 34 35

And now for the really bad news. A single FOR /F can never parse more than 31 tokens, as I learned when I looked at the question "Number of tokens limit in a FOR command in a Windows batch script":

@echo off
setlocal enableDelayedExpansion
set "str="
for /l %%n in (1 1 35) do set "str=!str! %%n"
for /f "tokens=1,31,32" %%A in ("!str!") do echo A=%%A  B=%%B  C=%%C

The above yields the very unfortunate output A=1  B=31  C=%C
