Conditional Awk hashmap match lookup

Question

I have 2 tabular files. One file, lookup_file.txt, contains only a mapping of 50 key/value pairs. The other file, data.txt, holds the actual tabular data, with 30 columns and millions of rows. I would like to replace the id column of the second file with the values from lookup_file.txt.

How can I do this? I would prefer using awk in a bash script. Also, is there a hashmap data structure I can use in bash to store the 50 key/value pairs, rather than another file?

Recommended answer

Assuming your files have comma-separated fields and the "id column" is field 3:

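awk '
BEGIN { FS = OFS = "," }            # read and write comma-separated fields
NR == FNR { map[$1] = $2; next }    # first file: load key -> value pairs
{ $3 = map[$3]; print }             # data file: replace field 3 (unmapped ids become empty)
' lookup_file.txt data.txt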
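
The command above overwrites field 3 on every row, so an id with no entry in the map ends up empty. If the replacement should be conditional, a guard with awk's `in` operator is one way to do it. The following is a minimal sketch, not part of the original answer: the sample keys, values, and temporary files are hypothetical and exist only to make it runnable.

```shell
# Hypothetical sample data, created only so the sketch can be run as-is.
tmpdir=$(mktemp -d)
printf '%s\n' 'k1,alpha' 'k2,beta' > "$tmpdir/lookup_file.txt"
printf '%s\n' 'a,b,k1,d' 'e,f,zz,h' 'i,j,k2,l' > "$tmpdir/data.txt"

result=$(awk '
BEGIN { FS = OFS = "," }
NR == FNR { map[$1] = $2; next }   # first file: load key -> value pairs
$3 in map { $3 = map[$3] }         # replace the id only when a mapping exists
{ print }                          # print every data row, changed or not
' "$tmpdir/lookup_file.txt" "$tmpdir/data.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Here the row with id `zz` passes through unchanged, because `zz` is not a key in the map. Testing with `$3 in map` also avoids creating an empty `map[$3]` entry as a side effect of the lookup.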

If any of those assumptions are wrong, clue us in if the fix isn't obvious...

EDIT: if you want to avoid the (IMHO negligible) performance impact of the NR==FNR test, this would be one of those very rare cases where use of getline is appropriate:

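awk '
BEGIN {
   FS = OFS = ","
   # load the whole lookup file up front, so no per-record NR==FNR test is needed
   while ( (getline line < "lookup_file.txt") > 0 ) {
      split(line, f)            # splits on FS, i.e. ","
      map[f[1]] = f[2]
   }
   close("lookup_file.txt")
}
{ $3 = map[$3]; print }
' data.txt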
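
On the second part of the question: Bash 4 and later do provide associative arrays (`declare -A`), but since the lookup itself happens inside awk, it may be simpler to keep the 50 pairs inline in the script and feed them to awk on standard input. The sketch below assumes that approach. The function name `lookup_pairs`, the sample pairs, and the temporary data file are all hypothetical.

```shell
# Hypothetical inline mapping: the key,value pairs live in the script itself
# instead of in a separate lookup file.
lookup_pairs() {
  cat <<'EOF'
k1,alpha
k2,beta
EOF
}

# Hypothetical sample data file, created only so the sketch can be run as-is.
tmpdir=$(mktemp -d)
printf '%s\n' 'a,b,k1,d' 'e,f,k2,h' > "$tmpdir/data.txt"

# "-" tells awk to read standard input as its first input file.
result=$(lookup_pairs | awk '
BEGIN { FS = OFS = "," }
NR == FNR { map[$1] = $2; next }   # stdin: load key -> value pairs
$3 in map { $3 = map[$3] }         # replace the id only when a mapping exists
{ print }
' - "$tmpdir/data.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Because the pairs arrive on standard input, the awk program is the same as in the two-file form, and only the source of the mapping changes.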
