Deleting/replacing characters delimited by commas


Question

I'm trying to use a batch file or VBScript to delete a comma-delimited (CSV) value that is always in the same position. The first line should not be affected, only lines 2 onwards.

Sample text from the file:

Code,Batch,File #,Reg Hours,O/T,Cost Number,Rate,Earnings,Earnings,Memo Code,Memo Amount,Earnings Code,Earnings Amount,Hours Code,Hours Amount,Earnings Code,Earnings Amount,Adjust Code,Adjust Amount
ABC,123,3980    ,78.52,,12331,10.00,,,,,,,, 
ABC,123,4026    ,29.38,,12331,10.00,,,,,,,, 
ABC,123,5065    ,64.46,,12331,10.00,,,,,,,, 
ABC,123,5125    ,80.00, 0.54,12331,11.00,,,,,,,, 

I would like to end up with this text:

Code,Batch,File #,Reg Hours,O/T,Cost Number,Rate,Earnings,Earnings,Memo Code,Memo Amount,Earnings Code,Earnings Amount,Hours Code,Hours Amount,Earnings Code,Earnings Amount,Adjust Code,Adjust Amount
ABC,123,3980    ,78.52,,12331,,,,,,,,, 
ABC,123,4026    ,29.38,,12331,,,,,,,,, 
ABC,123,5065    ,64.46,,12331,,,,,,,,, 
ABC,123,5125    ,80.00, 0.54,12331,,,,,,,,, 

The only difference is the Rate field. It is the 7th comma-separated value from the left, or the 9th from the right. The first line remains intact.

Is there a way for batch or VBScript to locate a comma-separated value by its position, delete the value or replace it with nothing, and ignore the first line?

For this example, we can assume the file will always be named file.csv and located in D:\location, that is, 'D:\location\file.csv'.

Thanks!

Recommended answer

@ECHO Off
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q46534752.txt"
SET "outfile=%destdir%\outfile.txt"

:: Remove the output file

DEL "%outfile%" >NUL 2>nul

:: To reproduce the first line intact

FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO >"%outfile%" ECHO %%a&GOTO hdrdone

:hdrdone

(
REM to process the header line, remove the "skip=1" from the "for...%%a" command
FOR /f "usebackqskip=1delims=" %%a IN ("%filename1%") DO (
 REM step 1 - replace all commas with "|," to separate separators
 SET "line=%%a"
 SET "line=!line:,=|,!"
 FOR /f "tokens=1-7*delims=|" %%A IN ("!line!") DO (
  SET "line=%%A%%B%%C%%D%%E%%F%%H"
  ECHO !line:^|=!
 )
)
)>>"%outfile%"

GOTO :EOF

You would need to change the settings of sourcedir and destdir to suit your circumstances.

I used a file named q46534752.txt containing your data for my testing.

The script produces the file defined as %outfile%.

Processing of the header line is an issue. The code as presented should do as you ask, but it seems illogical to retain the column name in the resultant file when the process is intended to remove that column. To process the header line as well, delete the first for line and remove the skip=1 (which skips the first line) from the second.

The fundamental issue is that batch treats a run of consecutive delimiters as a single delimiter, so empty fields vanish and the delimiters have to be separated first. String replacement cannot be applied to a metavariable (%%a) directly, but within the loop the metavariable can be copied into an ordinary environment variable (line), and the substitution can then be performed on that variable in delayed expansion mode.
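The delimiter-collapsing behaviour described above can be illustrated outside cmd. The following is a rough Python model, not the batch implementation itself: the function name split_like_for_f is hypothetical, and it imitates only the "consecutive delimiters count as one" rule, ignoring other FOR /F details such as eol and the * token.

```python
def split_like_for_f(line, delims=","):
    """Rough model of FOR /F tokenising: a run of delimiters acts as one,
    so empty fields never produce a token. (Hypothetical helper.)"""
    tokens = []
    current = ""
    for ch in line:
        if ch in delims:
            if current:          # close the token only if it has content
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


row = "ABC,123,5065,64.46,,12331,10.00"

# Splitting on "," loses the empty O/T field, so Rate is no longer the 7th token.
print(split_like_for_f(row))

# After replacing "," with "|,", every field has content and keeps its position.
print(split_like_for_f(row.replace(",", "|,"), delims="|"))
```

In this model the first call yields six tokens and the second yields seven, which is why the extra | marker is needed before splitting on position.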

So: replace each , with |, and then process the resulting string using | as the delimiter. Note that the metavariable for the second for is in a different case (%%A rather than %%a), one of the few occasions where cmd is case-sensitive. Reconstruct the string, omitting column 7 (%%G); because of the * in the tokens list, the eighth metavariable (%%H) receives the remainder of the line after the highest explicitly mentioned token number (7). Finally, echo the result after removing the remaining | characters.
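For comparison, here is a minimal Python sketch of the goal stated in the question: empty the 7th comma-separated value on every line except the first. It is not a translation of the batch code. The name blank_field and its index parameter are hypothetical, and the plain split on commas assumes that no field contains a quoted comma.

```python
def blank_field(lines, index=6):
    """Return a copy of lines with field `index` (0-based) emptied on every
    line except the first. The commas around the field are kept, so later
    columns stay in position. (Hypothetical helper.)"""
    result = []
    for number, line in enumerate(lines):
        if number > 0:                       # leave the header line intact
            fields = line.split(",")
            if len(fields) > index:          # skip lines that are too short
                fields[index] = ""
            line = ",".join(fields)
        result.append(line)
    return result


sample = [
    "Code,Batch,File #,Reg Hours,O/T,Cost Number,Rate,Earnings",
    "ABC,123,5125    ,80.00, 0.54,12331,11.00,, ",
]

for line in blank_field(sample):
    print(line)
```

Because the field is emptied rather than removed, each line keeps the same number of commas as before.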

Note that it is normal policy to refuse code requests on SO and to respond only by fixing faulty code. In this case, however, later visitors may find this response to be the key to a similar task and hence refrain from posting unnecessarily. Also, I'm bored witless.
