How to slice a variable into array indexes?


Question

A typical problem: given a list of values, check whether each one is present in an array.

In awk, the val in array idiom works well for this. The usual approach is therefore to store all the data in an array and then test each input against it. For example, this prints every line whose first column value is present in the array:

awk 'BEGIN {<<initialize the array>>} $1 in array_var' file

However, initializing the array takes some work, because val in array checks whether val is an index of array, whereas what we normally have stored in an array is a set of values.
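The index-versus-value distinction can be checked directly. This is a minimal sketch, assuming any POSIX awk; the value list is only illustrative. The in operator yields 1 or 0, so the results are printed as numbers.

```shell
# Assumption: any POSIX awk; "hello adieu" is just sample data.
out=$(awk -v values="hello adieu" 'BEGIN {
    split(values, v)                      # v[1]="hello", v[2]="adieu"
    print "hello in v:", ("hello" in v)   # 0: "hello" is a value, not an index
    print "1 in v:", (1 in v)             # 1: 1 is an index
}')
printf '%s\n' "$out"
```

After split(), the words are stored as values under the numeric indexes 1 and 2, so testing a word with in fails until the words themselves are used as indexes.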

This matters more when the values are supplied on the command line and are exactly the elements we want as array indexes. Take this basic example (based on a recent answer of mine, which triggered my curiosity):

$ cat file
hello 23
bye 45
adieu 99
$ awk -v values="hello adieu" 'BEGIN {split(values,v); for (i in v) names[v[i]]} $1 in names' file
hello 23
adieu 99

  • split(values,v) slices the variable values into an array: v[1]="hello"; v[2]="adieu"
  • for (i in v) names[v[i]] initializes another array names[], creating names["hello"] and names["adieu"] with empty values. This way, we are ready for
  • $1 in names, which checks whether the first column is one of the indexes in names[].

As you see, we slice into a temporary array v in order to later initialize the final, useful array names[].
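If the two-step initialization is needed in several places, it can be wrapped in a user-defined function. This is only a sketch: make_set is a hypothetical helper name, not an awk built-in, and it performs the same split-then-loop steps described above rather than a faster alternative.

```shell
# Assumption: any POSIX awk; the input mirrors the sample file above.
out=$(printf 'hello 23\nbye 45\nadieu 99\n' | awk -v values="hello adieu" '
# make_set is a hypothetical helper, not a built-in.
# It fills set[] with the words of str as indexes; tmp, n and i are locals.
function make_set(str, set,    tmp, n, i) {
    n = split(str, tmp)
    for (i = 1; i <= n; i++)
        set[tmp[i]]        # referencing an element creates it with an empty value
}
BEGIN { make_set(values, names) }
$1 in names
')
printf '%s\n' "$out"
```

The temporary array still exists, but it is local to the function, so the main program only ever sees names[].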

Is there a faster way to initialize the indexes of an array than setting up one array and then using its values as the indexes of the definitive one?

Recommended answer

No, that is the fastest (due to hash lookup) and most robust (due to string comparison) way to do what you want.

This:

BEGIN{split(values,v); for (i in v) names[v[i]]}

happens once on startup and takes close to no time, while this:

$1 in array_var

which happens once for every line of input (and so is the place that needs optimal performance), is a hash lookup and so the fastest way to compare a string value to a set of strings.
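One way to see the robustness point is to compare the lookup with a regexp match. The regexp variant below is an assumption made for illustration and does not appear in the original answer. It uses a value containing a dot, which a regexp treats as "any character" but an array index treats literally.

```shell
# Assumption: any POSIX awk; "a.c" is an illustrative value containing a regexp metacharacter.
# Index lookup: string comparison, so only the literal "a.c" matches.
out_in=$(printf 'a.c 1\nabc 2\n' | awk -v values="a.c" '
BEGIN { split(values, v); for (i in v) names[v[i]] }
$1 in names')

# Regexp match built from the same value: "." matches any character, so "abc" matches too.
out_re=$(printf 'a.c 1\nabc 2\n' | awk -v re='^(a.c)$' '$1 ~ re')

printf 'in:\n%s\nregexp:\n%s\n' "$out_in" "$out_re"
```

The index lookup prints only the a.c line, while the regexp version also prints the abc line.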
