How to slice a variable into array indexes?
Question
There is this typical problem: given a list of values, check if they are present in an array.
In awk, the trick val in array works pretty well. Hence, the typical idea is to store all the data in an array and then keep doing the check. For example, this prints all lines whose first-column value is present in the array:
awk 'BEGIN {<<initialize the array>>} $1 in array_var' file
However, initializing the array takes some time, because val in array checks whether the index val is in array, whereas what we normally have stored in an array is a set of values.
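To make the index-versus-value distinction concrete, here is a minimal sketch. The array name v and the sample words are only illustrative, and the shell variable out exists just to capture the output for display.

```shell
out=$(awk 'BEGIN {
    # split() stores the words as VALUES under the numeric indexes 1, 2, ...
    split("hello adieu", v)

    # "in" only looks at indexes, so the word itself is not found ...
    print (("hello" in v) ? "found" : "not found")

    # ... while the numeric index 1 is.
    print ((1 in v) ? "found" : "not found")
}')
printf '%s\n' "$out"
```

This prints "not found" followed by "found", which is why the words have to be turned into indexes before $1 in array can be used.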
This becomes more relevant when the values are provided from the command line and are the elements we want to use as indexes of an array. For example, in this basic example (based on a recent answer of mine, which triggered my curiosity):
$ cat file
hello 23
bye 45
adieu 99
$ awk -v values="hello adieu" 'BEGIN {split(values,v); for (i in v) names[v[i]]} $1 in names' file
hello 23
adieu 99
- split(values,v) slices the variable values into an array: v[1]="hello"; v[2]="adieu"
- for (i in v) names[v[i]] initializes another array names[] with the indexes names["hello"] and names["adieu"], each holding an empty value. This way, we are ready for the next step.
- $1 in names checks whether the first column is any of the indexes in names[].
As you can see, we slice into a temporary variable v and only later initialize the final, useful variable names[].
Is there any faster way to initialize the indexes of an array, instead of setting one array up and then using its values as the indexes of the definitive one?
Recommended answer
No, that is the fastest (due to hash lookup) and most robust (due to string comparison) way to do what you want.
This:
BEGIN{split(values,v); for (i in v) names[v[i]]}
happens once on startup and will take close to no time, while this:
$1 in array_var
which happens once for every line of input (and so is the place that needs optimal performance), is a hash lookup, and so is the fastest way to compare a string value to a set of strings.
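If the temporary array is the only annoyance, one possible sketch is to wrap the same two steps in a user-defined function so that the temporary stays local. The function name str2set and its parameters are hypothetical names chosen for illustration; nothing here is an awk built-in. The sample file is recreated in a temporary file so the snippet is self-contained.

```shell
tmpfile=$(mktemp)
printf 'hello 23\nbye 45\nadieu 99\n' > "$tmpfile"

out=$(awk -v values="hello adieu" '
# str2set is a hypothetical helper name, not an awk built-in.
# The extra parameters tmp, n and i act as local variables (awk convention).
function str2set(str, set,    tmp, n, i) {
    n = split(str, tmp)        # tmp[1..n] hold the words as values
    for (i = 1; i <= n; i++)
        set[tmp[i]]            # referencing the element creates the index
}
BEGIN { str2set(values, names) }   # runs once, before any input is read
$1 in names                        # hash lookup, once per input line
' "$tmpfile")

printf '%s\n' "$out"
rm -f "$tmpfile"
```

This does the same work as the original one-liner and prints the same two lines (hello 23 and adieu 99). It is not faster; it only keeps the scratch array out of the global namespace.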