Remove header from multiple csv files

Problem description

I have multiple csv files coming in on a daily basis from a different server. These files are huge (over 200 MB). I have to remove the header from all of these csv files and replace it with the required column headers using a batch file.

The code below works fine, but it removes the column headers from one single file only:

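@echo off
set "csv=mycsv.csv"
rem Write every line except the first one to a temporary file.
>"%csv%.new" (
    for /f skip^=1^ usebackq^ delims^=^ eol^= %%A in ("%csv%") do echo %%A
)
rem Replace the original file with the copy that lacks the header.
move /y "%csv%.new" "%csv%" >nul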

Recommended answer

Provided that the CSV files do not contain any TAB characters (the more command used below would replace them with sequences of SPACE characters) and that no file is longer than 65534 lines (beyond that, more expects user interaction), you could try one of the following:

  1. The new column header is provided by another file, headerfile.csv:

< "headerfile.csv" set /P "HEADER="
for %%F in ("*.csv") do (
    if /I not "%%~F"=="headerfile.csv" (
        > "%%~F.tmp" echo(%HEADER%
        >>"%%~F.tmp" more +1 "%%~F"
        move /Y "%%~F.tmp" "%%~F"
    )
)

If headerfile.csv is not located in the current directory together with all the other CSV files, there is no need to exclude it from processing; simply remove the if condition in that case.

  2. The new column header is given as a string constant:

set "HEADER=new,header,string,here"
for %%F in ("*.csv") do (
    > "%%~F.tmp" echo(%HEADER%
    >>"%%~F.tmp" more +1 "%%~F"
    move /Y "%%~F.tmp" "%%~F"
)
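
Because both variants above depend on more, it may be worth checking the line counts before running them. The following is only a sketch: it assumes that find /C /V "" reports the number of lines in each file and that the 65534-line figure mentioned above is the relevant threshold. It does not detect TAB characters.

```batch
@echo off
rem Sketch: report CSV files whose line count exceeds what more handles without prompting.
for %%F in ("*.csv") do (
    rem find /C /V "" counts the lines that do not match the empty string, i.e. all lines.
    for /F %%C in ('find /C /V "" ^< "%%~F"') do (
        if %%C GTR 65534 echo "%%~F" has %%C lines; more would wait for user interaction.
    )
)
```

Any file reported by this check is probably better handled with the approach from the update below, which does not use more.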


Update

Here is a way that does not use the more command, so its limitations no longer apply. It also does not use for /F, which would limit the length of each line to 8191 bytes/characters:

  1. The new column header is provided by another file, headerfile.csv:

< "headerfile.csv" set /P "HEADER="
for %%F in ("*.csv") do (
    if /I not "%%~F"=="headerfile.csv" (
        > "%%~F.tmp" echo(%HEADER%
        >>"%%~F.tmp" < "%%~F" (set /P = & findstr "^")
        move /Y "%%~F.tmp" "%%~F"
    )
)

  2. The new column header is given as a string constant:

    set "HEADER=new,header,string,here"
    for %%F in ("*.csv") do (
        > "%%~F.tmp" echo(%HEADER%
        >>"%%~F.tmp" < "%%~F" (set /P = & findstr "^")
        move /Y "%%~F.tmp" "%%~F"
    )
    

Note that the header line is still limited to 8191 bytes/characters, because it is stored in a variable (in order to avoid multiple file read operations) and because the related echo(%HEADER% command line is limited to that size as well. To overcome this limit, place only the header into a text file and, within the loop, copy it to %%~F.tmp prior to appending the data.
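
The workaround described in the note could look roughly like the sketch below. The file name header.txt is a hypothetical choice, and the sketch assumes that this file contains nothing but the new header line, terminated by a line break; without that line break, the first data row would end up on the same line as the header.

```batch
@echo off
rem Sketch: header.txt is an assumed name for a file that holds only the new header line.
for %%F in ("*.csv") do (
    rem Start the temporary file as a copy of the header file instead of echoing a variable.
    copy /Y "header.txt" "%%~F.tmp" > nul
    rem set /P consumes the old header line; findstr then passes the remaining lines through.
    >>"%%~F.tmp" < "%%~F" (set /P = & findstr "^")
    move /Y "%%~F.tmp" "%%~F" > nul
)
```

Since the header file has a .txt extension here, the *.csv pattern does not pick it up, so no if condition is needed to exclude it.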
