Awk: merge orphaned text to specific field in line above
Question
Given a tab-delimited text file containing item information:
41850 0.4 0.5 LG EN RP Billy Makes a Fridgewell, Norm
Friend
9338 0.4 0.5 LG EN RP Shine, The Musical! Mustard, Colonel
7255 0.5 0.5 LG EN RP Can You Play the Truman, Harriet
Jew's Harp
9314 0.5 0.5 LG EN RP Hi, Skippy Plum, Prof
Note the "orphaned" titles on two of the lines. Using Awk, how can I merge this orphan back into the title field above?
Pseudo-awk:
awk '/^[[:digit:]]/{getline; ???
if next line ~ /^[[:alpha:]]/ title=$7 + previous
END{print $0}' <FILE
Anyway the steps seem to be:
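Either
- find a "normal" line,
- test whether the next line is an "orphan",
- if so, append the "orphan" to field 7 [the title field],
- print the line
or
- find an "orphan",
- somehow append it to field 7 of the previous line [there will never be two consecutive orphans]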
The first way seems easiest to me --- but then, I'm the one in ignorance here.
Recommended answer
I realize the question is tagged awk, but this might be one of those times when it's easier with Perl:
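perl -F"\t" -lane 'BEGIN { $, = "\t" }                      # join printed fields with tabs
    if (/^\d{2}/) { print @saved if @saved; @saved = @F }   # new record: flush the held one
    else { $saved[6] .= " $_" };                            # orphan: append to the title field
    END { print @saved }' foo.txt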
Though here's an awk version of the same idea (with some improvements via Ed Morton):
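awk -F"\t" '
/^[0-9][0-9]/ {                                      # a normal record starts with two digits
    if (prefix) { print prefix"\t"title"\t"suffix }  # flush the previous record
    prefix=$1
    for ( i=2; i<=6; ++i ) prefix=prefix"\t"$i       # rebuild fields 1-6
    title=$7; suffix=$8
    next }
{ title = title" "$0 }                               # orphan line: append to the held title
END { print prefix"\t"title"\t"suffix }' foo.txt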
Both scripts give me this output, which looks like what you want:
41850 0.4 0.5 LG EN RP Billy Makes a Friend Fridgewell, Norm
9338 0.4 0.5 LG EN RP Shine, The Musical! Mustard, Colonel
7255 0.5 0.5 LG EN RP Can You Play the Jew's Harp Truman, Harriet
9314 0.5 0.5 LG EN RP Hi, Skippy Plum, Prof
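For what it's worth, here is a rough sketch of the same buffering idea wrapped in a shell function, in case you want to reuse it or pipe into it. The name merge_orphans is made up for illustration, and the sketch assumes that normal records start with a digit, that orphans never do, and that the title is always field 7. Unlike the scripts above, it keeps however many fields the record has rather than hard-coding eight, and it strips leading whitespace from the orphan before appending it.

```shell
# Hypothetical helper (name invented for this sketch): merge each orphan line
# into field 7 of the record above it. Assumes tab-delimited input in which
# normal records start with a digit and orphan lines do not.
merge_orphans() {
    awk 'BEGIN { FS = OFS = "\t" }
    /^[0-9]/ {                         # a normal record
        if (have) print held           # flush the previous record
        held = $0; have = 1
        next
    }
    have {                             # an orphan: fold it into the held record
        n = split(held, f, FS)
        sub(/^[ \t]+/, "")             # drop leading whitespace on the orphan
        f[7] = f[7] " " $0
        held = f[1]
        for (i = 2; i <= n; i++) held = held OFS f[i]
    }
    END { if (have) print held }' "$@"
}
```

It can be called as merge_orphans foo.txt, or with no argument at the end of a pipeline. Holding each record until the next one arrives is what lets an orphan be attached to the line above it without getline.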