Merging CSV files: Appending instead of merging
Problem description
So basically I want to merge a couple of CSV files. I'm using the following command to do that:
paste -d , *.csv > final.txt
This has worked for me in the past, but this time it doesn't. It places the data side by side instead of one file below the other. For instance, two files that contain records in the following format
CreatedAt ID
Mon Jul 07 20:43:47 +0000 2014 4.86249E+17
Mon Jul 07 19:58:29 +0000 2014 4.86238E+17
Mon Jul 07 19:42:33 +0000 2014 4.86234E+17
give the following when merged:
CreatedAt ID CreatedAt ID
Mon Jul 07 20:43:47 +0000 2014 4.86249E+17 Mon Jul 07 18:25:53 +0000 2014 4.86215E+17
Mon Jul 07 19:58:29 +0000 2014 4.86238E+17 Mon Jul 07 17:19:18 +0000 2014 4.86198E+17
Mon Jul 07 19:42:33 +0000 2014 4.86234E+17 Mon Jul 07 15:45:13 +0000 2014 4.86174E+17
Mon Jul 07 15:34:13 +0000 2014 4.86176E+17
Would anyone know what the reason behind this is? Or what can I do to force the records to be merged one below the other?
Recommended answer
Assuming that all the CSV files have the same format and all start with the same header, you can write a little script like the following to append all the files into a single one and keep the header only once.
#!/bin/bash
OutFileName="./X.csv"                            # Output name, spelled the way the glob below yields it
i=0                                              # Counter of processed files
for filename in ./*.csv; do
    if [ "$filename" != "$OutFileName" ]; then   # Skip the output file itself
        if [ "$i" -eq 0 ]; then
            head -n 1 "$filename" > "$OutFileName"   # Copy the header from the first file only
        fi
        tail -n +2 "$filename" >> "$OutFileName"     # Append each file from its 2nd line on
        i=$(( i + 1 ))                               # Increase the counter
    fi
done
Notes:
- The head -1 (or head -n 1) command prints the first line of a file (the head).
- The tail -n +2 command prints the tail of a file starting from line number 2 (+2).
- The test [ ... ] is used to exclude the output file from the input list.
- The output file is rewritten each time the script runs.
- The command cat a.csv b.csv > X.csv can simply be used to append a.csv and b.csv into a single file (but the header is then copied twice).
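To see what the two commands in the notes do on their own, here is a minimal sketch. The temporary directory, the file name a.csv and its contents are made up for illustration; they are not from the question.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

# Hypothetical sample file: one header line followed by two records.
printf 'CreatedAt,ID\nrow1,1\nrow2,2\n' > "$tmpdir/a.csv"

header=$(head -n 1 "$tmpdir/a.csv")   # first line only
body=$(tail -n +2 "$tmpdir/a.csv")    # everything from line 2 onward

printf 'header: %s\n' "$header"
printf 'body:\n%s\n' "$body"
```

Here header holds only the CreatedAt,ID line and body holds the two records, which is exactly the split the script relies on.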
The paste command pastes the files side by side: corresponding lines of each file are joined into one output line. If a file has lines made of white space, you can obtain the output that you reported above.
The -d , option asks paste to separate the joined fields with a comma, but this is not the format of the files you reported above.
The cat command instead concatenates the files and prints them on standard output, which means it writes one file after the other.
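The difference between the two commands is easiest to see on two tiny files. This is only an illustrative sketch: the directory, the file names and the a1/b1 values are invented, and the header line is left out to keep the output short.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

# Two hypothetical two-line files.
printf 'a1\na2\n' > "$tmpdir/a.csv"
printf 'b1\nb2\n' > "$tmpdir/b.csv"

# paste joins line N of every file into one line, separated by a comma.
side_by_side=$(paste -d , "$tmpdir/a.csv" "$tmpdir/b.csv")

# cat writes the whole first file, then the whole second file.
stacked=$(cat "$tmpdir/a.csv" "$tmpdir/b.csv")

printf 'paste:\n%s\n\ncat:\n%s\n' "$side_by_side" "$stacked"
```

paste yields the two lines a1,b1 and a2,b2, while cat yields four lines a1, a2, b1, b2, which is the "one below the other" layout the question asks for.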
Refer to man head or man tail for the syntax of the individual options (some versions allow head -1, others require head -n 1).
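The same result as the loop in the script above can probably also be had from a single awk call. awk is not part of the original answer, so treat this as an alternative sketch; the directory, file names and sample rows are hypothetical. The output is written to a .txt file so that it can never be matched by the *.csv glob.

```shell
# Work in a throwaway directory so no real files are touched.
tmpdir=$(mktemp -d)

# Two hypothetical files sharing the same header.
printf 'CreatedAt,ID\na1,1\na2,2\n' > "$tmpdir/a.csv"
printf 'CreatedAt,ID\nb1,3\n' > "$tmpdir/b.csv"

# FNR is the line number within the current file, NR the overall line number.
# Skip the first line of every file except the very first line read.
awk 'FNR == 1 && NR != 1 { next } { print }' "$tmpdir"/*.csv > "$tmpdir/final.txt"

merged=$(cat "$tmpdir/final.txt")
printf '%s\n' "$merged"
```

As with the script, this assumes every file starts with the same single-line header.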