Dynamic indirect Bash array


Question

I have logs in the following format:

log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc

I want to create a list for every person in the format

"name"=("no.lines" "line" "line" ...)

For example:

peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")

I already have this structure and know how to create variables like

declare "${FIELD[1]}"=1

but I don't know how to increase the number of records, and I get an error when I try to create a list like this and append to it.

#!/bin/bash

F=("log1,john,time,etc" "log2,peter,time,etc" "log3,jack,time,etc" "log4,peter,time,etc")
echo "${F[@]}"

declare -a CLIENTS
for LINE in "${F[@]}"
do
    echo "$LINE"
    IFS=',' read -ra  FIELD < <(echo "$LINE")

    if [ -z "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # check if there is already record for given line, if not create
    then 
            CLIENTS=("${CLIENTS[@]}" "${FIELD[1]}") # add person to list of variables records for later access
            declare -a "${FIELD[1]}"=("1" "LINE") # ERROR

    elif [ -n "${!FIELD[1]}" ] && [ -n "${FIELD[1]}" ] # if already record for client
    then 
            echo "Increase records number" # ???
            echo "Append record"
            "${FIELD[@]}"=("${FIELD[@]}" "$LINE") # ERROR

    else    
            echo "ELSE"
    fi

done

echo -e "CLIENTS: \n ${CLIENTS[@]}"
echo "Client ${CLIENTS[0]} has ${!CLIENTS[0]} records"
echo "Client ${CLIENTS[1]} has ${!CLIENTS[1]} records"
echo "Client ${CLIENTS[2]} has ${!CLIENTS[2]} records"
echo "Client ${CLIENTS[3]} has ${!CLIENTS[3]} records"

Recommended answer

Bash with Coreutils, grep and sed

If I understand your code correctly, you are trying to build multidimensional arrays, which Bash doesn't support. If I were to solve this problem from scratch, I'd use this mix of command line tools (see the security concerns at the end of the answer!):

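#!/bin/bash

# Usage: ./SO.sh infile
# Loop over the unique names found in the second field of the log file.
while IFS= read -r name; do
    # Print the name, the number of matching lines, then the lines themselves.
    printf "%s=(\"%d\" \"%s\")\n" \
        "$name" \
        "$(grep -c "$name" "$1")" \
        "$(grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//')"
done < <(cut -d ',' -f 2 "$1" | sort -u)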

Example output:

$ ./SO.sh infile
jack=("1" "log3,jack,time,etc")
john=("1" "log1,john,time,etc")
peter=("2" "log2,peter,time,etc" "log4,peter,time,etc")

This uses process substitution to prepare the log file so we can loop over unique names; the output of the substitution looks like

$ cut -d ',' -f 2 "$1" | sort -u
jack
john
peter

i.e., a list of unique names.
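To see this step in isolation, the following sketch writes the sample log from the question to a temporary file and collects the unique names into an array. The temporary file and the variable names (logfile, names) are assumptions made for illustration, not part of the original script.

```bash
#!/usr/bin/env bash
# Illustrative only: the data is the sample from the question; the
# temporary file and variable names are assumptions for this sketch.
logfile=$(mktemp)
printf '%s\n' \
    'log1,john,time,etc' \
    'log2,peter,time,etc' \
    'log3,jack,time,etc' \
    'log4,peter,time,etc' > "$logfile"

# Take field 2 of every line and de-duplicate it; sort -u also sorts.
names=()
while IFS= read -r name; do
    names+=("$name")
done < <(cut -d ',' -f 2 "$logfile" | LC_ALL=C sort -u)

printf '%s\n' "${names[@]}"
rm -f "$logfile"
```

Feeding the loop through process substitution rather than a pipe keeps the while loop in the current shell, so the names array is still populated after the loop ends.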

For each name, we then print the summarized log line with

printf "%s=(\"%d\" \"%s\")\n"

where

  • The %s string is just the name ("$name").
  • The log line count is the output of a grep command,

grep -c "$name" "$1"

which counts the number of lines containing "$name". If the name can occur elsewhere in the log line, we can limit the search to just the second field of the log lines with

grep -c "$name" <(cut -d ',' -f 2 "$1")
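Even when the search is restricted to the second field, grep still treats the name as a regular expression and matches substrings. The sketch below compares the two commands above with a stricter variant that uses the standard grep options -x (whole-line match) and -F (fixed string). The log lines, including the name peterson, are invented for this illustration.

```bash
#!/usr/bin/env bash
# Invented sample data: "peter" appears as a full name, as part of
# another name, and in a free-text field.
logfile=$(mktemp)
printf '%s\n' \
    'log1,peter,time,etc' \
    'log2,peterson,time,etc' \
    'log3,jack,said hi to peter,etc' > "$logfile"

name=peter

# Matches anywhere on the line.
whole_line=$(grep -c "$name" "$logfile")
# Matches only in field 2, but still as a substring.
field_only=$(grep -c "$name" <(cut -d ',' -f 2 "$logfile"))
# Field 2 must equal the name exactly.
exact=$(grep -c -x -F -- "$name" <(cut -d ',' -f 2 "$logfile"))

printf 'whole line: %s, field only: %s, exact: %s\n' \
    "$whole_line" "$field_only" "$exact"
rm -f "$logfile"
```

With this data the three counts are 3, 2 and 1, so only the last form counts peter alone.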

  • Finally, to get all log lines on one line with proper quoting, we use

    grep "$name" "$1" | tr $'\n' ' ' | sed 's/ /" "/g;s/" "$//'
    

    This gets all lines containing "$name", replaces newlines with spaces, then replaces every space with " " (quote, space, quote) and removes the extra " " from the end of the line.
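The pipeline can be tried on its own with a fixed string in place of the grep output. The second input, a log line containing a space, is an invented case that shows a limitation of this approach: the space is also turned into a quote boundary.

```bash
#!/usr/bin/env bash
# Stand-in for the output of: grep "$name" "$1"
matches=$'log2,peter,time,etc\nlog4,peter,time,etc'

# Newlines become spaces, every space becomes '" "', and the
# trailing '" "' is removed again.
joined=$(printf '%s\n' "$matches" | tr '\n' ' ' | sed 's/ /" "/g;s/" "$//')
printf '("%s")\n' "$joined"

# Invented line with a space inside a field: it is split in two.
with_space=$(printf '%s\n' 'log5,peter,10 am,etc' \
    | tr '\n' ' ' | sed 's/ /" "/g;s/" "$//')
printf '("%s")\n' "$with_space"
```

If log lines may contain spaces, the pure Bash approach that quotes whole lines is the safer choice.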

I initially thought that pure Bash would be too cumbersome, but it turned out to be not all that complicated:

    #!/bin/bash

    declare -A count   # name -> number of log lines
    declare -A lines   # name -> quoted log lines, separated by spaces

    old_ifs=$IFS       # save IFS so it can be restored at the end
    IFS=,
    while read -r -a line; do
        name="${line[1]}"
        (( ++count[$name] ))
        lines[$name]+="\"${line[*]}\" "
    done < "$1"

    for name in "${!count[@]}"; do
        printf "%s=(\"%d\" %s)\n" "$name" "${count[$name]}" "${lines[$name]% }"
    done

    IFS="$old_ifs"
    

This updates two associative arrays while looping over the input file: count keeps track of the number of times a certain name occurs, and lines collects the log lines in one entry per name.

To separate fields by commas, we set the input field separator IFS to a comma (but save it beforehand so it can be reset at the end).

read -r -a reads each line into an array line with comma-separated fields, so the name is now in ${line[1]}. We increase the count for that name in the arithmetic expression (( ... )), and append (+=) the log line in the next line.
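As an alternative to saving and restoring IFS globally, the assignment can be written as a prefix to read, which limits it to that one command. The following is a minimal sketch of that variant, not the answer's own code: it reads the raw line first (the variable raw is an assumption of this sketch) so the unsplit text is still available, and it takes its input from a here-document instead of "$1".

```bash
#!/usr/bin/env bash
declare -A count=() lines=()

while IFS= read -r raw; do
    # IFS is set to a comma for this read only; the global IFS is untouched.
    IFS=, read -r -a line <<< "$raw"
    name=${line[1]}
    [[ -n $name ]] || continue          # skip lines without a second field
    count[$name]=$(( ${count[$name]:-0} + 1 ))
    lines[$name]+="\"$raw\" "
done <<'EOF'
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
EOF

for name in "${!count[@]}"; do
    printf '%s=("%d" %s)\n' "$name" "${count[$name]}" "${lines[$name]% }"
done
```

Because IFS never changes outside the read, there is nothing to restore afterwards, and the rest of the script keeps the default word splitting.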

${line[*]} expands to all fields of the array joined by the first character of IFS (here a comma), which is exactly what we want. We also add a space here; the unwanted space at the end of the line (after the last element) will be removed later.
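The effect of IFS on the [*] expansion can be checked with a small sketch. The helper name join_by_comma is hypothetical and exists only to confine the IFS change to a function.

```bash
#!/usr/bin/env bash
fields=(log2 peter time etc)

# Hypothetical helper: "$*" joins the arguments with the first
# character of IFS, which is a comma inside this function only.
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

with_default_ifs="${fields[*]}"                  # joined with spaces
with_comma_ifs=$(join_by_comma "${fields[@]}")   # joined with commas

printf '%s\n%s\n' "$with_default_ifs" "$with_comma_ifs"
```

The quoted "${fields[*]}" and "$*" forms both use the first character of IFS as the separator, which is why the script above reproduces the original log line while IFS is a comma.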

The second loop iterates over all the keys of the count array (the names), then prints the properly formatted line for each. ${lines[$name]% } removes the space from the end of the line.
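The question asked for one real array variable per person. The script above avoids that by using associative arrays, but if separate variables are wanted, namerefs (declare -n, available from Bash 4.3) are one possible route. This is a different technique from the one in the answer, shown only as a sketch: the function name add_record, the rec_ prefix (chosen so a person called PATH cannot overwrite a shell variable) and the identifier check are all assumptions.

```bash
#!/usr/bin/env bash
# Requires Bash 4.3 or newer for namerefs (declare -n).
clients=()

# Hypothetical helper: append one log line to the array rec_<person>,
# whose element 0 holds the number of records.
add_record() {
    local record=$1 person
    IFS=, read -r _ person _ <<< "$record"
    # Only accept names that are valid shell identifiers.
    [[ $person =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1

    declare -ga "rec_$person"        # make sure the global array exists
    declare -n arr="rec_$person"     # local alias for that array
    if (( ${#arr[@]} == 0 )); then
        clients+=("$person")
        arr=(0)
    fi
    arr+=("$record")
    arr[0]=$(( ${#arr[@]} - 1 ))
}

while IFS= read -r line; do
    add_record "$line"
done <<'EOF'
log1,john,time,etc
log2,peter,time,etc
log3,jack,time,etc
log4,peter,time,etc
EOF

for person in "${clients[@]}"; do
    count_ref="rec_${person}[0]"     # indirect expansion, as in the question
    printf 'Client %s has %s records\n' "$person" "${!count_ref}"
done
```

After the loop, rec_peter holds ("2" "log2,peter,time,etc" "log4,peter,time,etc"), which is the layout the question describes apart from the prefix.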

Security concerns

As the output of these scripts is apparently meant to be reused by the shell, we might want to prevent malicious code execution if we can't trust the contents of the log file.

One way to do that for the Bash solution (hat tip: Charles Duffy) is to replace the for loop with

    for name in "${!count[@]}"; do
        IFS=' ' read -r -a words <<< "${lines[$name]}"
        printf -v words_str '%q ' "${words[@]}"
        printf "%q=(\"%d\" %s)\n" "$name" "${count[$name]}" "${words_str% }"
    done
    

That is, we split the combined log lines into an array words, print that with the %q format specifier into a string words_str, and then use that string for our output, resulting in escaped output like this:

    peter=("2" \"log2\,peter\,time\,etc\" \"log4\,peter\,time\,etc\")
    jack=("1" \"log3\,jack\,time\,etc\")
    john=("1" \"log1\,john\,time\,etc\")
    

The same could be done for the first solution.
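To illustrate why %q helps, the sketch below quotes a deliberately hostile string and feeds the result back to the shell with eval. The log line is invented for this example; it contains a command substitution, quotes and a space.

```bash
#!/usr/bin/env bash
# Invented hostile "log line".
line='log5,$(echo pwned),"time" etc'

# %q escapes the string so that the shell reads it back as one literal word.
printf -v quoted '%q' "$line"
printf 'quoted form: %s\n' "$quoted"

# Re-reading the quoted form gives back the original text;
# the embedded command substitution is not executed.
eval "restored=( $quoted )"
printf 'restored:    %s\n' "${restored[0]}"
```

The array ends up with exactly one element that is identical to the input, which is the property needed when generated text is later sourced or evaluated.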
