Nested associative arrays in bash

Question

Can one construct an associative array whose elements contain arrays in bash? For instance, suppose one has the following arrays:

a=(a aa)
b=(b bb bbb)
c=(c cc ccc cccc)

Can one create an associative array to access these variables? For instance,

declare -A letters
letters[a]=$a
letters[b]=$b
letters[c]=$c

and then access individual elements by a command such as

letter=${letters[a]}
echo ${letter[1]}

This mock syntax for creating and accessing elements of the associative array does not work. Do valid expressions accomplishing the same goals exist?

Recommended answer

I think the more straightforward answer is "No, bash arrays cannot be nested." Anything that simulates nested arrays is actually just creating fancy mapping functions for the keyspace of the (single layered) arrays.
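To see why the mock syntax from the question fails, here is a small sketch. The variable names stored and second are only for illustration and are not part of the question.

```bash
#!/usr/bin/env bash
a=(a aa)
declare -A letters

# $a is shorthand for ${a[0]}, so only the first element is stored.
letters[a]=$a
stored=${letters[a]}

# The stored value is a plain string; it has no element at index 1.
letter=${letters[a]}
second=${letter[1]}

echo "stored: '$stored'"
echo "second: '$second'"
```

The array a is flattened to its first element on assignment, so nothing of the inner structure survives inside letters.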

Not that that's bad: it may be exactly what you want, but especially when you don't control the keys into your array, doing it properly becomes harder. Although I like the solution given by @konsolebox of using a delimiter, it ultimately falls over if your keyspace includes keys like "p|q". It does have a nice benefit in that you can operate transparently on your keys, as in array[abc|def] to look up the key def in array[abc], which is very clear and readable. Because it relies on the delimiter not appearing in the keys, this is only a good approach when you know what the keyspace looks like now and in all future uses of the code. This is only a safe assumption when you have strict control over the data.
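For reference, the delimiter approach might look roughly like the sketch below. This is my own illustration rather than @konsolebox's exact code, and it assumes "|" as the delimiter and a flat array named letters.

```bash
#!/usr/bin/env bash
# Sketch of the delimiter approach: one flat associative array whose keys
# have the form "key|subkey". Assumes no key ever contains "|".
declare -A letters

k="a|0"; letters[$k]=a
k="a|1"; letters[$k]=aa
k="b|0"; letters[$k]=b
k="b|1"; letters[$k]=bb
k="b|2"; letters[$k]=bbb

# Look up element 1 of the "a" sub-array.
k="a|1"
second_a=${letters[$k]}
echo "$second_a"

# The weakness: two different (key, subkey) pairs map to the same flat key.
k="p|q|r"
letters[$k]="meant as key 'p|q', subkey 'r'"
letters[$k]="meant as key 'p', subkey 'q|r'"   # silently overwrites the line above
echo "${#letters[@]}"
```

The last two assignments land on the same flat key, which is exactly the failure mode described above for keys like "p|q".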

If you need any kind of robustness, I would recommend concatenating hashes of your array keys. This is a simple technique that is extremely likely to eliminate conflicts, although they are possible if you are operating on extremely carefully crafted data.

To borrow a bit from how Git handles hashes, let's take the first 8 characters of the sha512sums of keys as our hashed keys. If you feel nervous about this, you can always use the whole sha512sum, since there are no known collisions for sha512. Using the whole checksum makes sure that you are safe, but it is a little bit more burdensome.

So, if I want the semantics of storing an element in array[abc][def] what I should do is store the value in array["$(keyhash "abc")$(keyhash "def")"] where keyhash looks like this:

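# Hash a key down to 8 hex characters.
# printf is used instead of echo so that keys such as "-n" or "-e" are
# hashed literally rather than being swallowed as echo options.
keyhash () {
    printf '%s' "$1" | sha512sum | cut -c-8
}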
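As a rough sketch of how storing and fetching could look with this function, the helpers nested_set and nested_get below are hypothetical names I made up for illustration. The sketch assumes bash 4 or later and that sha512sum is installed.

```bash
#!/usr/bin/env bash
declare -A array

# Same idea as keyhash above: first 8 hex characters of the SHA-512 of the key.
keyhash() {
    printf '%s' "$1" | sha512sum | cut -c-8
}

# nested_set KEY SUBKEY VALUE (hypothetical helper)
nested_set() {
    local k
    k="$(keyhash "$1")$(keyhash "$2")"
    array[$k]=$3
}

# nested_get KEY SUBKEY (hypothetical helper): prints the stored value
nested_get() {
    local k
    k="$(keyhash "$1")$(keyhash "$2")"
    printf '%s\n' "${array[$k]}"
}

nested_set abc def "hello"
# Keys that break the delimiter approach stay distinct here.
nested_set "p|q" r "first"
nested_set p "q|r" "second"

nested_get abc def
nested_get "p|q" r
nested_get p "q|r"
```

Every flat key is 16 characters long (two 8-character hashes), regardless of what the original keys contained.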

You can then pull out the elements of the associative array using the same keyhash function. Funnily, there's a memoized version of keyhash you can write which uses an array to store the hashes, preventing extra calls to sha512sum, but it gets expensive in terms of memory if the script takes many keys:

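declare -A keyhash_array
keyhash () {
    # Hash each key only once and cache the result.
    # Caveat: the cache only persists when keyhash runs in the current shell;
    # a call made as "$(keyhash ...)" runs in a subshell and its cache
    # update is discarded when the subshell exits.
    if [[ -z "${keyhash_array["$1"]}" ]]
    then
        keyhash_array["$1"]="$(printf '%s' "$1" | sha512sum | cut -c-8)"
    fi
    echo "${keyhash_array["$1"]}"
}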
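One thing to watch with the memoized version: a call written as "$(keyhash ...)" runs in a subshell, so anything it adds to the cache is lost when the substitution returns. A possible workaround, sketched here under my own assumptions, is to hand the result back through a named variable instead of stdout. The names keyhash_into, keyhash_cache and keyhash_calls are hypothetical, and keys are assumed to be non-empty ordinary strings.

```bash
#!/usr/bin/env bash
declare -A keyhash_cache
keyhash_calls=0   # counts real sha512sum invocations, for demonstration only

# keyhash_into VARNAME KEY: store the 8-character hash of KEY in VARNAME.
# Runs in the current shell, so the cache survives between calls.
keyhash_into() {
    local _key=$2
    if [[ -z ${keyhash_cache[$_key]+set} ]]; then
        keyhash_cache[$_key]=$(printf '%s' "$_key" | sha512sum | cut -c-8)
        keyhash_calls=$((keyhash_calls + 1))
    fi
    printf -v "$1" '%s' "${keyhash_cache[$_key]}"
}

keyhash_into h1 abc
keyhash_into h2 abc   # served from the cache, no second sha512sum call
keyhash_into h3 def
echo "$h1 $h2 $h3 calls=$keyhash_calls"
```

The trade-off is a slightly clumsier call site in exchange for a cache that actually persists.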

A length inspection on a given key tells me how many layers deep it looks into the array, since that's just len/8, and I can see the subkeys for a "nested array" by listing keys and keeping those that have the correct prefix. So if I want all of the keys in array[abc], what I should really do is this:

prefix="$(keyhash "abc")"   # hash the parent key once, outside the loop
for key in "${!array[@]}"
do
    if [[ "$key" == "$prefix"* ]]
    then
        # do stuff with "$key" since it's a key directly into the array
        :
    fi
done
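Combining the prefix test with the len/8 depth check lets you pick out only the direct children of a key. The following is a sketch with made-up sample data; the variable names prefix, depth and children are my own.

```bash
#!/usr/bin/env bash
declare -A array
keyhash() { printf '%s' "$1" | sha512sum | cut -c-8; }

# Sample data at several depths (illustrative values only).
array["$(keyhash abc)"]="value at abc"
array["$(keyhash abc)$(keyhash def)"]="value at abc/def"
array["$(keyhash abc)$(keyhash ghi)"]="value at abc/ghi"
array["$(keyhash abc)$(keyhash def)$(keyhash jkl)"]="value at abc/def/jkl"
array["$(keyhash xyz)$(keyhash def)"]="value at xyz/def"

prefix=$(keyhash abc)
children=0
for key in "${!array[@]}"; do
    depth=$(( ${#key} / 8 ))   # each level contributes 8 characters
    # A direct child of abc starts with abc's hash and sits at depth 2.
    if [[ $key == "$prefix"* && $depth -eq 2 ]]; then
        children=$((children + 1))
        echo "depth=$depth value=${array[$key]}"
    fi
done
echo "direct children of abc: $children"
```

Note that the hashes cannot be reversed, so if you need the original subkey names back you would have to store them alongside the values.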

Interestingly, this also means that first level keys are valid and can contain values. So, array["$(keyhash "abc")"] is completely valid, which means this "nested array" construction can have some interesting semantics.

In one form or another, any solution for nested arrays in Bash is pulling this exact same trick: produce a (hopefully injective) mapping function f(key,subkey) whose output strings can be used as array keys. This can always be applied further as f(f(key,subkey),subsubkey) or, in the case of the keyhash function above, I prefer to define f(key) and apply it to subkeys as concat(f(key),f(subkey)) and concat(f(key),f(subkey),f(subsubkey)). In combination with memoization for f, this is a lot more efficient. In the case of the delimiter solution, nested applications of f are necessary, of course.
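The concat(f(key),f(subkey),...) form generalizes to any depth. Here is one way it could be sketched; path_key is a hypothetical helper name, not something from the answers above.

```bash
#!/usr/bin/env bash
declare -A array
keyhash() { printf '%s' "$1" | sha512sum | cut -c-8; }

# path_key KEY [SUBKEY...]: concatenate the hashes of every path component,
# i.e. concat(f(key), f(subkey), f(subsubkey), ...).
path_key() {
    local part out=""
    for part in "$@"; do
        out+=$(keyhash "$part")
    done
    printf '%s\n' "$out"
}

k=$(path_key abc def ghi)
array[$k]="three levels deep"

lookup=$(path_key abc def ghi)
echo "key length: ${#k}"
echo "value: ${array[$lookup]}"
```

Because each component is hashed independently, the flat key for a parent is always a prefix of the flat keys of its children.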

With that known, the best solution that I know of is to take a short hash of the key and subkey values.

I recognize that there's a general dislike for answers of the type "You're doing it wrong, use this other tool!" but associative arrays in bash are messy on numerous levels, and run you into trouble when you try to port code to a platform that (for some silly reason or another) doesn't have bash on it, or has an ancient (pre-4.x) version. If you are willing to look into another language for your scripting needs, I'd recommend picking up some awk.

It provides the simplicity of shell scripting with the flexibility that comes with more feature rich languages. There are a few reasons I think this is a good idea:


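  • GNU awk (the most prevalent variant) has fully fledged associative arrays which can nest properly, with the intuitive syntax of array[key][subkey]

  • You can embed awk in shell scripts, so you still get the tools of the shell when you really need them

  • awk is stupidly simple at times, which puts it in stark contrast with other shell replacement languages like Perl and Python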

That's not to say that awk is without its failings. It can be hard to understand when you're first learning it because it's heavily oriented towards stream processing (a lot like sed), but it's a great tool for a lot of tasks that are just barely outside of the scope of the shell.

Note that above I said that "GNU awk" (gawk) has multidimensional arrays. Other awks actually do the trick of separating keys with a well-defined separator, SUBSEP. You can do this yourself, as with the array[a|b] solution in bash, but nawk has this feature built in if you do array[key,subkey]. It's still a bit more fluid and clear than bash's array syntax.
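To give a feel for the SUBSEP style, here is a small sketch of awk embedded in a shell script. The array name letters and the sample data are only illustrative, and it assumes some POSIX-compatible awk is on the PATH.

```bash
#!/usr/bin/env bash
result=$(awk 'BEGIN {
    letters["a", 1] = "a";  letters["a", 2] = "aa"
    letters["b", 1] = "b";  letters["b", 2] = "bb";  letters["b", 3] = "bbb"

    # letters["b", 3] is stored under the flat key "b" SUBSEP "3".
    print letters["b", 3]

    # Count the subkeys under "b" by splitting each flat key on SUBSEP.
    n = 0
    for (k in letters) {
        split(k, parts, SUBSEP)
        if (parts[1] == "b") n++
    }
    print n
}')
echo "$result"
```

This is the same flat-key trick as the bash delimiter solution, only with the separator handled by the language instead of by hand.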
