How to merge rows from the same column using unix tools
Question
I have a text file that looks like the following:
1000000    45    M    This is a line        This is another line  Another line
                      that breaks into      that also breaks      that has a blank
                      multiple rows         into multiple rows -  row below.
                                            How annoying!
1000001    50    F    I am another          I am well behaved.
                      column that has
                      text spanning
                      multiple rows
I would like to convert this into a csv file that looks like:
1000000, 45, M, This is a line that breaks into multiple rows, This is another line that also breaks into multiple rows - How annoying!
1000001, 50, F, I am another column that has text spanning multiple rows, I am well behaved.
The text file output comes from a program that was written in 1984, and I have no way to modify the output. I want it in csv format so that I can convert it to Excel as painlessly as possible. I am not sure where to start, and rather than reinvent the wheel, was hoping someone could point me in the right direction. Thanks!
== Edit ==
I've modified the text file to have \n between rows; maybe this will be helpful?
== Edit 2 ==
I've modified the text file to have a blank row.
Recommended answer
Using GNU awk:
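gawk '
    # Fixed-width input: tell gawk how wide each column is.
    BEGIN { FIELDWIDTHS="11 6 5 22 22" }
    # Skip lines too short to hold a full first column (e.g. blank rows).
    length($1) == 11 {
        if ($1 ~ /[^[:blank:]]/) {
            # A non-blank first column starts a new record.
            if (f1) print_line()
            f1=$1; f2=$2; f3=$3; f4=$4; f5=$5
        }
        else {
            # Continuation row: append to the two text columns.
            f4 = f4" "$4; f5 = f5" "$5
        }
    }
    function rtrim(str) {
        sub(/[[:blank:]]+$/, "", str)
        return str
    }
    function print_line() {
        # Squeeze runs of blanks and double any embedded double quotes.
        gsub(/[[:blank:]]{2,}/, " ", f4); gsub(/"/, "&&", f4)
        gsub(/[[:blank:]]{2,}/, " ", f5); gsub(/"/, "&&", f5)
        printf "%s,%s,%s,\"%s\",\"%s\"\n", rtrim(f1), rtrim(f2), rtrim(f3),f4,f5
    }
    END {if (f1) print_line()}
' file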
Output:
1000000,45,M,"This is a line that breaks into multiple rows ","This is another line that also breaks into multiple rows - How annoying!"
1000001,50,F,"I am another column that has text spanning multiple rows","I am well behaved. "
I've quoted the last 2 columns in case they contain commas, and doubled any potential inner double quotes.
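FIELDWIDTHS is a gawk extension, so the script above will not run under other awk implementations. As a rough sketch of the same idea for a plain POSIX awk, the columns can be cut out with substr() instead. The names merge_rows, clean, quote and emit are made up for this illustration and are not part of the original answer, and the offsets assume the same column widths (11, 6, 5, 22, 22). Unlike the original, this variant also trims the trailing blanks inside the quoted columns.

```sh
# Hypothetical helper: merge continuation rows of a fixed-width report into CSV.
# Assumed column layout (same as the FIELDWIDTHS above): 11, 6, 5, 22, 22.
merge_rows() {
    awk '
    # Squeeze runs of blanks to one space and strip blanks at both ends.
    function clean(s) {
        gsub(/[ \t]+/, " ", s)
        sub(/^ /, "", s)
        sub(/ $/, "", s)
        return s
    }
    # Wrap a value in double quotes, doubling any embedded double quote.
    function quote(s) {
        gsub(/"/, "\"\"", s)
        return "\"" s "\""
    }
    # Print the record collected so far, if there is one.
    function emit() {
        if (have)
            print clean(f1) "," clean(f2) "," clean(f3) "," quote(clean(f4)) "," quote(clean(f5))
    }
    {
        key = substr($0, 1, 11)
        if (key ~ /[^ \t]/) {
            # A non-blank first column starts a new record.
            emit()
            have = 1
            f1 = key
            f2 = substr($0, 12, 6)
            f3 = substr($0, 18, 5)
            f4 = substr($0, 23, 22)
            f5 = substr($0, 45, 22)
        } else if (have) {
            # Continuation (or blank) row: append to the two text columns.
            f4 = f4 " " substr($0, 23, 22)
            f5 = f5 " " substr($0, 45, 22)
        }
    }
    END { emit() }
    ' "$@"
}
```

It would be called like the original, for example merge_rows file > file.csv, or with the report piped in on standard input. Anything past the fifth column is ignored, as it is with the FIELDWIDTHS version.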