Multidimensional arrays in awk

Problem description

I tried creating a pseudo-multidimensional array in awk.

# Calculate cumulative context score

BEGIN { FS=OFS="\t" }

{
        a[$2+FS+$7,$3]+=$6
}

END { for (i,j) in a
        { print i,j,a[i,j] }

}

Output:

awk: ccstscan.awk:9: END { for (i,j) in a
awk: ccstscan.awk:9:             ^ syntax error

This is what the GNU awk manual says:

To test whether a particular index sequence exists in a multidimensional array, use the same operator (in) that is used for single dimensional arrays. Write the whole sequence of indices in parentheses, separated by commas, as the left operand:

 (subscript1, subscript2, ...) in array

I tried modifying the script to create a true multidimensional array:

BEGIN { FS=OFS="\t" }

{
    a[$2+FS+$7][$3]+=$6
}

END { for i in a
    {
     for j in a[i]
        { print i,j,a[i][j]
        }

    }
}

I ran it with gawk. It also gave errors:

gawk: ccstscan.awk:6:   a[$2+FS+$7][$3]+=$6
gawk: ccstscan.awk:6:              ^ syntax error
gawk: ccstscan.awk:9: END { for i in a
gawk: ccstscan.awk:9:           ^ syntax error
gawk: ccstscan.awk:11:   for j in a[i]
gawk: ccstscan.awk:11:       ^ syntax error
gawk: ccstscan.awk:11:   for j in a[i]
gawk: ccstscan.awk:11:             ^ syntax error
gawk: ccstscan.awk:12:          { print i,j,a[i][j]
gawk: ccstscan.awk:12:                          ^ syntax error

What is the correct syntax to create and scan multidimensional associative arrays?

Solution

If you are using simulated multidimensional arrays, your loop needs to look like this:

  END { 
    for (ij in a) {
      split(ij,indices,SUBSEP);
      i=indices[1];
      j=indices[2];
      print i,j,a[ij]
    }
  }

The (i,j) in a syntax only works for testing whether a particular index is in the array. It does not work in a for loop, even though for (key in a) looks similar: the loop takes a single variable, which receives the combined key with the subscripts joined by SUBSEP.
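
As a minimal sketch of both uses of in, the shell snippet below feeds three made-up, tab-separated records to awk. The sample data and the array name idx are assumptions for illustration and are not part of the original script. It tests membership with the parenthesised form and iterates with a single loop variable that is split on SUBSEP. The output is piped through sort because awk does not guarantee the order in which for (key in array) visits keys.

```shell
# Hypothetical 7-column input; only fields 2, 3, 6 and 7 are used.
out=$(printf 'r1\tA\tx\t-\t-\t2\tB\nr2\tA\tx\t-\t-\t3\tB\nr3\tC\ty\t-\t-\t5\tD\n' |
awk '
BEGIN { FS = OFS = "\t" }
{ a[$2 FS $7, $3] += $6 }       # stored under the key ($2 FS $7) SUBSEP $3
END {
    # Membership test: the parenthesised subscript list is valid here.
    if (("A" FS "B", "x") in a)
        print "found", a["A" FS "B", "x"]
    # Iteration: one loop variable, then split the combined key.
    for (ij in a) {
        split(ij, idx, SUBSEP)
        print idx[1], idx[2], a[ij]
    }
}' | LC_ALL=C sort)
printf '%s\n' "$out"
```

With this sample input the two records that share the key (A, B, x) are summed, so the script should print one line per distinct key plus the "found" line.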

For true multidimensional arrays (arrays of arrays), you can write it like this:

BEGIN { FS=OFS="\t" }

{ a[$2 FS $7][$3] += $6 }   # $2 FS $7 concatenates; + would add numerically

END { 
  for (i in a) {
    for (j in a[i]) { 
      print i,j,a[i][j]
    }
  }
}

However, arrays of arrays are a gawk extension that was only added in gawk 4.0, so your version of gawk may not support them.
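
One way to guard against an older gawk is to check the version before running the arrays-of-arrays script. The sketch below is only an illustration: it assumes gawk exposes its version in PROCINFO["version"], uses the same made-up sample data as a stand-in for the real input, and falls back to a message when gawk 4.0 or later is not available.

```shell
# Run the arrays-of-arrays version only if gawk >= 4.0 is installed.
if command -v gawk >/dev/null 2>&1 &&
   gawk 'BEGIN { split(PROCINFO["version"], v, "."); exit (v[1] + 0 >= 4) ? 0 : 1 }'
then
    # Hypothetical 7-column input; only fields 2, 3, 6 and 7 are used.
    out=$(printf 'r1\tA\tx\t-\t-\t2\tB\nr2\tA\tx\t-\t-\t3\tB\nr3\tC\ty\t-\t-\t5\tD\n' |
    gawk '
    BEGIN { FS = OFS = "\t" }
    { a[$2 FS $7][$3] += $6 }   # a[key] is itself an array indexed by $3
    END {
        for (i in a)
            for (j in a[i])
                print i, j, a[i][j]
    }' | LC_ALL=C sort)
else
    out="gawk 4.0 or later not found"
fi
printf '%s\n' "$out"
```

If the check fails, the simulated form with SUBSEP remains the portable choice, since it works in any POSIX awk.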

One more note. On this line:

a[$2+FS+$7,$3]+=$6

It looks like you are trying to concatenate $2, FS, and $7, but "+" performs numeric addition, not concatenation. awk concatenates strings when they are written next to each other, so you would need to write it like this:

a[$2 FS $7,$3] += $6
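
The difference is easy to see in isolation. In this small sketch, x and y are hypothetical values standing in for $2 and $7. With "+", each non-numeric string converts to 0, so every record would end up under the same subscript.

```shell
out=$(awk 'BEGIN {
    FS = "\t"
    x = "chr1"; y = "left"   # hypothetical stand-ins for $2 and $7
    print x + FS + y         # numeric addition: each string converts to 0
    print x FS y             # concatenation: chr1<TAB>left
}')
printf '%s\n' "$out"
```

The first line printed is 0, and the second is the two values joined by a tab.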
