Appending an element to an associative array in awk


Question

I've got an input file (input.txt) with a few fields:

A1  B1  C1  D1  E1
A2  B2  C2  D1  E2
A3  B3  C3  D2  E3
A4  B4  C4  D2  E4

I want to append to the elements of an associative array:

awk '{a[$4]=a[$4] $5; print a[$4]} END {for(b in a) {print a[b]}}' input.txt

I think the output should be (i.e., E2 is concatenated to E1, and E4 is concatenated to E3):

E1 E2
E3 E4

But the output is:

E2
E4

I'm not sure what's wrong with my code.

Recommended answer

Your output isn't consistent with your command, but I assume that you want the following:

  • Build a list of 5th-column values for each unique 4th-column value.
  • Print these lists, each preceded by the corresponding 4th-column value.

A naïve fix to get what you want would be:

$ awk '{a[$4]=a[$4] " " $5} END {for (b in a) { print b; print a[b]}}' input.txt
D1
 E1 E2
D2
 E3 E4

However, there are two things to note:

  • The accumulated 5th-column values will have a leading space, which happens to help with grouped output in this case.
  • Because the keys are enumerated with for (b in a), the 4th-column values are NOT guaranteed to appear in the order they appear in the input. awk arrays are always associative, and the order in which awk enumerates their keys is based on internal hash values, which has no guaranteed relationship to the order in which array elements were added (nor is any particular order guaranteed in general).
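
If both points matter, one possible approach (a minimal sketch, not part of the original answer) is to handle the first occurrence of each key separately, so that no leading space is added, and to record each key in an integer-indexed helper array so the END block can print in input order. The names order and n are hypothetical, introduced only for this illustration. The sketch also recreates the question's input.txt so that it can be run as is.

```shell
# Recreate the sample input from the question (whitespace-separated fields).
printf '%s\n' \
  'A1  B1  C1  D1  E1' \
  'A2  B2  C2  D1  E2' \
  'A3  B3  C3  D2  E3' \
  'A4  B4  C4  D2  E4' > input.txt

# order[] and n are illustrative helper names, not from the original answer.
awk '
  # First time a 4th-column value is seen: remember its position, start its list.
  !($4 in a) { order[++n] = $4; a[$4] = $5; next }
  # Later occurrences: append with a single space as separator.
  { a[$4] = a[$4] " " $5 }
  # Walk the keys by numeric index, which preserves first-seen order.
  END { for (i = 1; i <= n; i++) { print order[i]; print a[order[i]] } }
' input.txt
```

With the sample input this should print D1, E1 E2, D2, E3 E4 on four separate lines. The `$4 in a` test is used because, unlike referencing a[$4] directly, it does not create the element as a side effect.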
