print output of user-defined function in awk gives unexpected token error


Problem description


I wanted to flexibly print the output of two small awk pipelines in bash that use variables (they worked originally). I initially thought I could store the whole command in a variable, but that did not work, and apparently (see "store awk command in a variable of bash script") it is not a good idea. So I wrote two functions, but now I get an "unexpected token" error near "done", even though the code is formatted as in the linked question.

Where is my mistake?

for coverage_file in */*.cov
do
    #gene_count=$(awk '{print $5}' $coverage_file |sort | uniq -c | wc -l) #this is apparently not a good idea
    #contig_count=$(awk '{print $1}' $coverage_file |sort | uniq -c | wc -l) #this is apparently not a good idea
    cmd_gene() { awk '{print $5}' $coverage_file |sort | uniq -c | wc -l }
    cmd_contig() { awk '{print $1}' $coverage_file |sort | uniq -c | wc -l }
    cmd_gene $coverage_file
    cmd_contig $coverage_file
    #print "we found", $gene_count, "genes on ",$contig_count" contigs
done

The cov files look like this:

k141_85332.3 4119 19 A5 phnM_031
k141_85332.3 4119 19 A5 phnM_031
k141_85332.3 4119 28 A1 phnM_031
k141_85332.3 4119 28 A1 phnM_031
k141_85332.3 4119 8 A2 phnM_031
k141_85332.3 4119 8 A2 phnM_031
k141_88684 267 5 B10 phnM_032
k141_88684 268 5 B10 phnM_032
k141_88684 269 5 B10 phnM_032
k141_88684 270 5 B10 phnM_032
k141_88684 271 5 B10 phnM_032
k141_88684 272 5 B10 phnM_032

EDIT: this includes the accepted answer + a possible way to print it plainly:

#!/bin/bash

#define variables
gene="phnM"
threshold="5"

#define functions
cmd_gene() { awk '{print $5}' "$1" | sort | uniq -c | wc -l ; } #semicolon is important here!
cmd_contig() { awk '{print $1}' "$1" | sort | uniq -c | wc -l ; } #semicolon is important here!

#loop over files and print results (would be prettier with printf)
for coverage_file in */*.cov
do
    echo $gene" was found" $(cmd_gene "$coverage_file") "times on" $(cmd_contig "$coverage_file")" contigs with minimum coverage of" $threshold in $coverage_file
done

OUTPUT:

phnM was found 67 times on 65 contigs with minimum coverage of 5 in phnm/test.cov
phnM was found 3 times on 2 contigs with minimum coverage of 5 in test/test.cov
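
The comment in the script above says the output would be prettier with printf. Here is a minimal sketch of that, not a definitive version. The names `count_genes`, `count_contigs` and `report` are hypothetical and do not appear in the original script. The pipelines are the same as above, and `gene` and `threshold` are assumed to be set as in the script.

```shell
#!/bin/bash
# Hypothetical helper names. "$1" is the shell function's first argument,
# while $5 and $1 inside the single quotes are awk field references.
count_genes()   { awk '{print $5}' "$1" | sort | uniq -c | wc -l; }
count_contigs() { awk '{print $1}' "$1" | sort | uniq -c | wc -l; }

report() {
    # %d reads each count as an integer, so leading padding that some
    # wc implementations emit does not end up in the sentence.
    printf '%s was found %d times on %d contigs with minimum coverage of %s in %s\n' \
        "$gene" "$(count_genes "$1")" "$(count_contigs "$1")" "$threshold" "$1"
}

gene="phnM"
threshold="5"

for coverage_file in */*.cov
do
    # If nothing matches, the glob stays literal; skip it in that case.
    [ -e "$coverage_file" ] || continue
    report "$coverage_file"
done
```

Keeping the format string in one place makes the sentence easier to adjust than a chain of quoted and unquoted echo arguments.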

Solution

The unexpected token error occurs because, in a function definition, the closing } must either be on its own line or be preceded by a ;.
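
As a small illustration of that rule (a sketch with hypothetical function names, not code from the question), the first two definitions below parse, while the third form is checked in a child shell so its syntax error does not abort the script:

```shell
#!/bin/bash
# One-line definition: the ';' before '}' terminates the last command.
count_widths_inline() { awk '{print NF}' | sort -u | wc -l; }

# Multi-line definition: the newline before '}' does the same job.
count_widths_multiline() {
    awk '{print NF}' | sort -u | wc -l
}

# Without ';' or a newline, '}' is read as just another argument to wc,
# so the function body never closes and the parser fails further on.
if bash -c 'broken() { wc -l }
echo unreachable' 2>/dev/null
then
    broken_status="parsed"
else
    broken_status="syntax error"
fi

# Two input lines with different field counts, so each call counts 2.
printf 'a b\nc d e\n' | count_widths_inline
printf 'a b\nc d e\n' | count_widths_multiline
echo "$broken_status"
```

This is also why the error in the question is reported near "done" and not at the function definition: the shell is still looking for the closing } when it reaches that keyword.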

Also, since you're using $coverage_file in the definition of the function, you don't have to pass it.

for coverage_file in */*.cov
do
    cmd_gene() { awk '{print $5}' $coverage_file |sort | uniq -c | wc -l; }
    cmd_contig() { awk '{print $1}' $coverage_file |sort | uniq -c | wc -l; }
    cmd_gene 
    cmd_contig 
    #print "we found", $gene_count, "genes on ",$contig_count" contigs
done

If you want to define the functions outside the for loop, you would use $1 (not to be confused with awk's $1) and pass $coverage_file like you were doing before.

EDIT: Example of the above:

$ cat a.sh
cmd_gene() { awk '{print $5}' $1 |sort | uniq -c | wc -l; }
cmd_contig() { awk '{print $1}' $1 |sort | uniq -c | wc -l; }

for coverage_file in */*.cov
do
    cmd_gene $coverage_file
    cmd_contig $coverage_file
done

$ ls */*.cov
bf/a.cov

$ cat */*.cov
k141_85332.3 4119 19 A5 phnM_031
k141_85332.3 4119 19 A5 phnM_031
k141_85332.3 4119 28 A1 phnM_031
k141_85332.3 4119 28 A1 phnM_031
k141_85332.3 4119 8 A2 phnM_031
k141_85332.3 4119 8 A2 phnM_031
k141_88684 267 5 B10 phnM_032
k141_88684 268 5 B10 phnM_032
k141_88684 269 5 B10 phnM_032
k141_88684 270 5 B10 phnM_032
k141_88684 271 5 B10 phnM_032
k141_88684 272 5 B10 phnM_032

$ sh a.sh
       2
       2
