How to merge multiple CSV files from different subfolders?

Question

I know this is a common question, but I have run into some bugs and hope for some help.

I want to merge over 1000 CSV files in multiple subfolders into one file. The script is in MainFolder and should run through the subfolders, e.g. 01_2015 to 05_2015, and merge the CSV files into one file in MainFolder.

I have the following folder structure:

-MainFolder
    -01_2015
    -02_2015
    -03_2015
    -04_2015
    -05_2015

I use the following script (I got it from here: http://stackoverflow.com/questions/32995274/how-to-merge-multiple-csv-files-which-are-under-different-subfolders-into-a-sing):

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION

SET SUMMARY_FILE=sumfile.csv
IF EXIST "%SUMMARY_FILE%" (DEL "%SUMMARY_FILE%")


SET /A LINE_COUNT=1

FOR /F "usebackq tokens=*" %%f IN (`DIR /S /B *.csv`) DO (
    FOR /F "usebackq tokens=*" %%s IN ("%%~f") DO (
        ECHO !LINE_COUNT!,%%s >>"%SUMMARY_FILE%"
        SET /A LINE_COUNT=!LINE_COUNT! + 1
    )
)
EXIT /B 0

The script actually runs through all of the more than 1000 files, but the files don't get merged. What should I do?

Recommended answer

Try this slightly modified code:

@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
PUSHD "%~dp0"

SET "SUMMARY_FILE=sumfile.csv"
DEL /F "%SUMMARY_FILE%" 2>nul

SET "LINE_COUNT=1"

FOR /F "tokens=*" %%f IN ('DIR /S /B *.csv 2^>nul') DO (
    FOR /F "usebackq tokens=* eol=ÿ" %%s IN ("%%~f") DO (
        >>"%SUMMARY_FILE%" ECHO !LINE_COUNT!%%s
        SET /A LINE_COUNT+=1
    )
)

POPD
ENDLOCAL

The redirection >>"%SUMMARY_FILE%" is now at the beginning of the command line that writes each line of the current CSV file, prefixed with the line number, into the summary file. This avoids appending a space at the end of every line in the summary file.
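
The effect of the redirection position can be checked with a small experiment. The following is only a sketch: the two scratch file names in %TEMP% are hypothetical and exist solely for this demonstration.

```batch
@ECHO OFF
SETLOCAL
REM Hypothetical scratch files, used only for this demonstration.
SET "WITH_SPACE=%TEMP%\demo_with_space.txt"
SET "NO_SPACE=%TEMP%\demo_no_space.txt"

REM Redirection at the end: the space in front of the operator is written too.
ECHO 1,abc >"%WITH_SPACE%"

REM Redirection at the beginning: nothing follows the text, so no space is added.
>"%NO_SPACE%" ECHO 1,abc

REM The modifier ~z expands to the file size in bytes, ~nx to name and extension.
FOR %%A IN ("%WITH_SPACE%" "%NO_SPACE%") DO ECHO %%~nxA: %%~zA bytes

DEL "%WITH_SPACE%" "%NO_SPACE%"
ENDLOCAL
```

The first file should come out one byte larger than the second, because of the trailing space.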

Do you have write permission in the directory that is the current directory when the batch file runs?

I added the line PUSHD "%~dp0" to make sure that the directory of the batch file is the current directory before processing starts. POPD restores the previous working directory before the batch file exits.
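
To see what PUSHD and POPD do to the working directory, a minimal sketch like the following can be saved next to the merge script and started from some other directory. It only prints the current directory at each step.

```batch
@ECHO OFF
REM Working directory inherited from the caller.
ECHO Before PUSHD: %CD%

REM %~dp0 is the drive and path of this batch file, with a trailing backslash.
PUSHD "%~dp0"
ECHO After PUSHD:  %CD%

REM Relative names such as sumfile.csv now resolve next to the batch file.

POPD
ECHO After POPD:   %CD%
```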

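eol=ÿ defines a character that most likely does not occur in the CSV files as the end-of-line character, replacing the default, which is the semicolon. FOR /F skips every line that begins with the eol character. German CSV files use ; as the separator, so with the default setting any line whose first field is empty, and which therefore starts with ;, would be dropped.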
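
The skipping behaviour can be reproduced with a two-line sample file. This is a sketch under assumptions: the file name eol_demo.csv is hypothetical, and the pipe character is used as the replacement eol character (instead of ÿ) only so that the example does not depend on the encoding in which the batch file is saved.

```batch
@ECHO OFF
SETLOCAL
REM Hypothetical sample file whose first line starts with a semicolon.
SET "SAMPLE=%TEMP%\eol_demo.csv"
>"%SAMPLE%" ECHO ;second field;third field
>>"%SAMPLE%" ECHO first field;second field;third field

ECHO With the default eol character the first line is skipped:
FOR /F "usebackq tokens=*" %%s IN ("%SAMPLE%") DO ECHO   %%s

ECHO With eol set to the pipe character both lines are read:
FOR /F "usebackq tokens=* eol=|" %%s IN ("%SAMPLE%") DO ECHO   %%s

DEL "%SAMPLE%"
ENDLOCAL
```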

The character ÿ has the decimal value 255 in code page Windows-1252, i.e. it is the last character of that code page. In OEM code page 850 this byte is a non-breaking space. So when the batch file is displayed or edited with OEM code page 850 or 437, the viewer or editor shows eol= followed by what looks like a space.

No separator should be placed between !LINE_COUNT! and %%s if all lines in the CSV files already start with a semicolon, which is also the separator between the field values. Otherwise the separator (comma, semicolon, escaped pipe, or tab) should be inserted to the left of %%s.
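
One way to keep the separator configurable is to hold it in a variable and expand it with delayed expansion. This is a sketch, not part of the original answer: the variable name SEP and the sample values are hypothetical.

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical values for illustration only.
SET "LINE_COUNT=7"
SET "SEP=;"

REM The quoted string stands in for one line read from a CSV file.
FOR %%s IN ("field1;field2") DO (
    ECHO !LINE_COUNT!!SEP!%%~s
)
ENDLOCAL
```

With these values the script prints 7;field1;field2. Because !SEP! is expanded only after the line has been parsed, a pipe stored in SEP needs no escaping here, whereas a pipe written literally in the ECHO line must be escaped with a caret.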

Another problem would arise if the CSV files are Unicode files encoded in UTF-16. In this case no summary file would be created, because the command FOR does not read any line from CSV files that contain lots of null bytes.
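
A commonly used workaround for UTF-16 input is to let TYPE read the file and have FOR /F parse the output of TYPE instead of the file itself. The following is a sketch under assumptions: the file name utf16_input.csv is hypothetical, the file is assumed to be UTF-16 little endian with a byte order mark, and characters that do not exist in the active code page are likely to be replaced.

```batch
@ECHO OFF
SETLOCAL
REM Hypothetical input file, assumed to be UTF-16 LE with a byte order mark.
SET "INPUT=utf16_input.csv"

REM TYPE converts the text to the active code page before FOR /F parses it.
FOR /F "tokens=* eol=|" %%s IN ('TYPE "%INPUT%"') DO ECHO %%s
ENDLOCAL
```

In the merge script, the same idea would mean replacing the inner loop's file name with a TYPE command in single quotes and dropping usebackq.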
