具有关联数组的问题备忘bash功能范围界定 [英] Scoping issue memoising bash function with associative array

查看:94
本文介绍了具有关联数组的问题备忘bash功能范围界定的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个bash脚本,该脚本使用jq在某些JSON中查找依赖项"数据,并进行闭包(查找依赖项的依赖项等).

I have a bash script which uses jq to look up 'dependency' data in some JSON, and take the closure (find dependencies of dependencies of dependencies, etc.).

这很好用,但可能会非常慢,因为它可能一遍又一遍地查找相同的依赖项,所以我想记住它.

This works fine, but can be very slow, since it may look up the same dependencies over and over, so I'd like to memoise it.

我尝试使用全局关联数组将参数与结果关联,但是该数组似乎没有存储任何内容.

I tried using a global associative array to associate arguments with results, but the array doesn't seem to be storing anything.

我已将相关代码提取到以下演示中:

I've extracted the relevant code into the following demo:

#!/usr/bin/env bash

# Associative arrays must be declared; use 'g' for global
declare -Ag MYCACHE
function expensive {
    # Look up $1 in MYCACHE
    if [ "${MYCACHE[$1]+_}" ]
    then
        echo "Using cached version" >> /dev/stderr
        echo "${MYCACHE[$1]}"
        return
    fi

    # Not found, perform expensive calculation
    RESULT="foo"

    echo "Caching result" >> /dev/stderr
    MYCACHE["$1"]="$RESULT"

    # Check if the result was cached
    if [ "${MYCACHE[$1]+_}" ]
    then
        echo "Cached" >> /dev/stderr
    else
        abort "Didn't cache"
    fi

    # Done
    echo "$RESULT"
}

function abort {
    echo "$1" >> /dev/stderr
    exit 1
}

# Run once, make sure result is "foo"
[[ "x$(expensive "hello")" = "xfoo" ]] ||
    abort "Failed for hello"

# Run again, make sure "Using cached version" is in stderr
expensive "hello" 2>&1 > /dev/null | grep "Using cached version" ||
    abort "Didn't use cache"

这是我的结果:

$ ./foo.sh 
Caching result
Cached
Didn't use cache

我们得到Cached的事实似乎表明我正在正确地存储和查找值,但是由于我们点击Didn't use cache分支,因此在expensive的调用中并未保留它们.

The fact we get Cached seems to indicate that I'm storing and looking up values correctly, but they're not preserved across invocations of expensive since we hit the Didn't use cache branch.

对我来说,这似乎是一个范围界定问题,可能是由declare引起的.但是,declare -A似乎是使用关联数组的必要条件.

It looks like a scoping issue to me, maybe caused by the declare. However, declare -A seems to be a requirement for using associative arrays.

这是我的bash版本:

Here's my bash version:

$ bash --version
GNU bash, version 4.3.42(1)-release (i686-pc-linux-gnu)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

除了弄清我所遇到的行为,我还将欣赏bash中记忆功能的其他方法(最好是没有任何东西可以碰到文件系统,即使它是基于RAM的)

As well as figuring out the behaviour I'm experiencing, I'd also appreciate alternative ways of memoising functions in bash (preferably nothing which touches the filesystem, even if it's RAM-based)

推荐答案

您遇到了一些问题:

  1. declare -g仅在函数内部有意义.在外部,变量已经是全局变量.

  1. declare -g is only meaningful inside a function. Outside, a variable is already global.

全局变量仅在声明该变量的进程中是全局变量.您不能在进程之间共享全局变量.

A global variable is only global to the process in which it is declared. You can't have global variables shared across processes.

在命令替换中运行expensive会在一个单独的进程中进行,因此它创建并填充的缓存将在该进程中消失.

Running expensive inside a command substitution does so in a separate process, so the cache it creates and populates disappears with that process.

expensive作为管道的第一条命令运行也会创建一个新进程;它使用的缓存仅对该进程可见.

Running expensive as the first command of a pipeline also creates a new process; the cache it uses is only visible to that process.

您可以通过确保expensive仅在当前shell中运行,并通过

You can work around this by making sure expensive is only run in the current shell with

expensive "hello" > tmp.txt && read result < tmp.txt
[[ $foo = foo ]] || abort ...
expensive "hello" 2>&1 > /dev/null < <(grep "Using cached version") ||
abort "Didn't use cache"

但是,

Shell脚本根本不是设计用于这种类型的数据处理的.如果缓存很重要,请使用另一种语言,以更好地支持数据结构和数据的内存中处理. Shell已针对启动新流程和管理输入/输出文件进行了优化.

Shell scripting, however, is simply not designed for this type of data processing. If caching is important, use a different language with better support for data structures and in-memory handling of data. Shell is optimized for starting new processes and managing input/output files.

这篇关于具有关联数组的问题备忘bash功能范围界定的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆