Merging CSV files: appending instead of merging
Question
So basically I want to merge a couple of CSV files. I'm using the following command to do that:
paste -d , *.csv > final.txt
This has worked for me in the past, but this time it doesn't. It places the data next to each other instead of below each other. For instance, two files that contain records in the following format:
CreatedAt ID
Mon Jul 07 20:43:47 +0000 2014 4.86249E+17
Mon Jul 07 19:58:29 +0000 2014 4.86238E+17
Mon Jul 07 19:42:33 +0000 2014 4.86234E+17
give, when merged:
CreatedAt ID CreatedAt ID
Mon Jul 07 20:43:47 +0000 2014 4.86249E+17 Mon Jul 07 18:25:53 +0000 2014 4.86215E+17
Mon Jul 07 19:58:29 +0000 2014 4.86238E+17 Mon Jul 07 17:19:18 +0000 2014 4.86198E+17
Mon Jul 07 19:42:33 +0000 2014 4.86234E+17 Mon Jul 07 15:45:13 +0000 2014 4.86174E+17
Mon Jul 07 15:34:13 +0000 2014 4.86176E+17
Would anyone know the reason behind this? Or what can I do to force the records to be merged below each other?
Recommended answer
Assuming that all the CSV files have the same format and all start with the same header, you can write a small script like the following to append all the files into a single one while keeping the header only once.
#!/bin/bash
OutFileName="X.csv"                            # Fix the output name
i=0                                            # Reset a counter
for filename in ./*.csv; do
  # Avoid recursion: the glob yields ./X.csv, so compare against that form
  if [ "$filename" != "./$OutFileName" ]; then
    if [ "$i" -eq 0 ]; then
      head -n 1 "$filename" > "$OutFileName"   # Copy the header if it is the first file
    fi
    tail -n +2 "$filename" >> "$OutFileName"   # Append each file from its 2nd line onward
    i=$(( i + 1 ))                             # Increase the counter
  fi
done
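As an alternative to the loop, the same "header once, then every data row" idea can be sketched with a single awk call. This is only an illustration, not part of the original answer: the sample files, the temporary directory, and the output name merged.txt are hypothetical, and the output is given a non-.csv name so that the *.csv glob never picks it up as input.

```shell
#!/bin/bash
# Hypothetical sample data: two CSV files sharing the same header.
workdir=$(mktemp -d)
printf 'CreatedAt,ID\nMon Jul 07 20:43:47 +0000 2014,1\n' > "$workdir/a.csv"
printf 'CreatedAt,ID\nMon Jul 07 18:25:53 +0000 2014,2\n' > "$workdir/b.csv"

# FNR is the line number within the current file, NR the overall line number.
# Line 1 of every file except the very first one is skipped; all else is printed.
awk 'FNR == 1 && NR != 1 { next } { print }' "$workdir"/*.csv > "$workdir/merged.txt"

cat "$workdir/merged.txt"
```

Like the script above, this assumes that every input file starts with the same header line.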
Notes:
- The head -1 or head -n 1 command prints the first line of a file (the head).
- tail -n +2 prints the tail of a file starting from line number 2 (+2).
- The test [ ... ] is used to exclude the output file from the input list.
- The output file is rewritten each time.
- The command cat a.csv b.csv > X.csv can simply be used to append a.csv and b.csv into a single file (but the header is then copied twice).
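To see what the two commands from the notes select, here is a small sketch on a made-up three-line file; the file name sample.csv and its contents are assumptions chosen only for the demonstration.

```shell
#!/bin/bash
# Hypothetical sample file: one header line followed by two data rows.
workdir=$(mktemp -d)
printf 'CreatedAt,ID\nrow1,1\nrow2,2\n' > "$workdir/sample.csv"

header=$(head -n 1 "$workdir/sample.csv")   # first line only
body=$(tail -n +2 "$workdir/sample.csv")    # everything from line 2 onward

printf 'header: %s\n' "$header"
printf 'body:\n%s\n' "$body"
```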
The paste command places the files side by side: each output line is made of the corresponding lines of the input files, joined together. That is why the records end up next to each other rather than below each other. If a file contains lines made only of white space, you can get output like the one you reported above.
The -d , option asks paste to separate the joined fields with a comma, but that is not the format of the files you reported above.
The cat command, instead, concatenates the files and prints them on standard output, which means it writes one file after the other.
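The difference between the two commands can be checked on a pair of tiny files. The following is a minimal sketch; the file names and contents are hypothetical and kept short so the two results are easy to compare.

```shell
#!/bin/bash
# Hypothetical sample data: two files with the same header and one row each.
workdir=$(mktemp -d)
printf 'CreatedAt,ID\nx,1\n' > "$workdir/a.csv"
printf 'CreatedAt,ID\ny,2\n' > "$workdir/b.csv"

# paste joins line N of a.csv with line N of b.csv (columns grow).
side_by_side=$(paste -d , "$workdir/a.csv" "$workdir/b.csv")

# cat writes a.csv and then b.csv (rows grow, header repeated).
stacked=$(cat "$workdir/a.csv" "$workdir/b.csv")

printf 'paste:\n%s\n\ncat:\n%s\n' "$side_by_side" "$stacked"
```

With paste the result has two lines of four fields each, while cat gives four lines of two fields each, including the second copy of the header.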
Refer to man head or man tail for the syntax of the individual options (some versions allow head -1, others require head -n 1).