print output of user-defined function in awk gives unexpected token error
Problem description
I wanted to flexibly print the output of two small awk pipelines in bash that use variables (they worked originally). I initially thought I could store the whole command in a variable, but that did not work, and apparently it is not a good idea anyway (see "store awk command in a variable of bash script"). So I wrote two functions instead, but now I get an "unexpected token" error near "done", even though the code is formatted as in the linked question.
Where is my mistake?
    for coverage_file in */*.cov
    do
        #gene_count=$(awk '{print $5}' $coverage_file | sort | uniq -c | wc -l) #this is apparently not a good idea
        #contig_count=$(awk '{print $1}' $coverage_file | sort | uniq -c | wc -l) #this is apparently not a good idea
        cmd_gene() { awk '{print $5}' $coverage_file | sort | uniq -c | wc -l }
        cmd_contig() { awk '{print $1}' $coverage_file | sort | uniq -c | wc -l }
        cmd_gene $coverage_file
        cmd_contig $coverage_file
        #print "we found", $gene_count, "genes on ",$contig_count" contigs
    done
The cov files look like this:

    k141_85332.3 4119 19 A5 phnM_031
    k141_85332.3 4119 19 A5 phnM_031
    k141_85332.3 4119 28 A1 phnM_031
    k141_85332.3 4119 28 A1 phnM_031
    k141_85332.3 4119 8 A2 phnM_031
    k141_85332.3 4119 8 A2 phnM_031
    k141_88684 267 5 B10 phnM_032
    k141_88684 268 5 B10 phnM_032
    k141_88684 269 5 B10 phnM_032
    k141_88684 270 5 B10 phnM_032
    k141_88684 271 5 B10 phnM_032
    k141_88684 272 5 B10 phnM_032
EDIT: this includes the accepted answer + a possible way to print it plainly:
    #!/bin/bash
    #define variables
    gene="phnM"
    threshold="5"
    #define functions
    cmd_gene() { awk '{print $5}' $1 | sort | uniq -c | wc -l ; } #semicolon is important here!
    cmd_contig() { awk '{print $1}' $1 | sort | uniq -c | wc -l ; } #semicolon is important here!
    #loop over files and print results (would be prettier with printf)
    for coverage_file in */*.cov
    do
        echo $gene" was found" $(cmd_gene "$coverage_file") "times on" $(cmd_contig "$coverage_file")" contigs with minimum coverage of" $threshold in $coverage_file
    done
OUTPUT:

    phnM was found 67 times on 65 contigs with minimum coverage of 5 in phnm/test.cov
    phnM was found 3 times on 2 contigs with minimum coverage of 5 in test/test.cov
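The comment in the script notes that the output would be prettier with printf. A minimal sketch of that variant follows; `report` is a hypothetical helper name that does not appear in the question, `"$1"` is quoted as a precaution for paths containing spaces, and the `[ -e ... ]` guard is an added assumption for the case where no file matches the pattern.

```shell
#!/bin/bash
gene="phnM"
threshold="5"

# Same pipelines as in the question, with "$1" quoted.
cmd_gene()   { awk '{print $5}' "$1" | sort | uniq -c | wc -l; }
cmd_contig() { awk '{print $1}' "$1" | sort | uniq -c | wc -l; }

# report (hypothetical helper) formats one result line with printf.
report() {
    printf '%s was found %d times on %d contigs with minimum coverage of %s in %s\n' \
        "$gene" "$(cmd_gene "$1")" "$(cmd_contig "$1")" "$threshold" "$1"
}

for coverage_file in */*.cov
do
    # If nothing matches, the loop sees the literal pattern; skip it.
    [ -e "$coverage_file" ] || continue
    report "$coverage_file"
done
```

With printf, the format string holds all the fixed text in one place, so the quoting around each `$(...)` substitution no longer has to be interleaved with the words of the sentence.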
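Solution

The unexpected token error occurs because, when you define a function, the closing } has to be on its own line or preceded by a ;. Otherwise the shell reads } as just another argument to wc, the function body is never closed, and the parser eventually trips over done.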
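A minimal sketch of the two accepted forms, using hypothetical function names (`count_lines_a`, `count_lines_b`) and a temporary file that are not part of the question:

```shell
#!/bin/bash
# One-line form: the last command in the body is terminated by ';' before '}'.
count_lines_a() { wc -l < "$1"; }

# Multi-line form: the newline terminates the command, so '}' stands alone.
count_lines_b() {
    wc -l < "$1"
}

# Broken form, kept as a comment because it would not parse:
# count_lines_c() { wc -l < "$1" }   # '}' is read as an argument to wc

tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"
count_lines_a "$tmp"   # both calls count the three lines of the temp file
count_lines_b "$tmp"
rm -f "$tmp"
```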
Also, since you're using $coverage_file in the definition of the function, you don't have to pass it:

    for coverage_file in */*.cov
    do
        cmd_gene() { awk '{print $5}' $coverage_file | sort | uniq -c | wc -l; }
        cmd_contig() { awk '{print $1}' $coverage_file | sort | uniq -c | wc -l; }
        cmd_gene
        cmd_contig
        #print "we found", $gene_count, "genes on ",$contig_count" contigs
    done
If you want to define the functions outside the for loop, you would use $1 (not to be confused with awk's $1) and pass $coverage_file like you were doing before.

EDIT: Example of the above
    $ cat a.sh
    cmd_gene() { awk '{print $5}' $1 | sort | uniq -c | wc -l; }
    cmd_contig() { awk '{print $1}' $1 | sort | uniq -c | wc -l; }
    for coverage_file in */*.cov
    do
        cmd_gene $coverage_file
        cmd_contig $coverage_file
    done
    $ ls */*.cov
    bf/a.cov
    $ cat */*.cov
    k141_85332.3 4119 19 A5 phnM_031
    k141_85332.3 4119 19 A5 phnM_031
    k141_85332.3 4119 28 A1 phnM_031
    k141_85332.3 4119 28 A1 phnM_031
    k141_85332.3 4119 8 A2 phnM_031
    k141_85332.3 4119 8 A2 phnM_031
    k141_88684 267 5 B10 phnM_032
    k141_88684 268 5 B10 phnM_032
    k141_88684 269 5 B10 phnM_032
    k141_88684 270 5 B10 phnM_032
    k141_88684 271 5 B10 phnM_032
    k141_88684 272 5 B10 phnM_032
    $ sh a.sh
    2
    2
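To make the difference between the shell's $1 and awk's $1 more concrete, here is a hedged sketch that merges the two functions into one. `count_distinct` is a hypothetical name, the column number is handed to awk through its standard `-v` option, and `sort -u | wc -l` is swapped in as a shorter way to get the same count as `sort | uniq -c | wc -l`.

```shell
#!/bin/bash
# count_distinct is a hypothetical generalisation of cmd_gene and cmd_contig.
# Shell "$1" is the column number, shell "$2" is the file name.
# Inside the single quotes the shell expands nothing, so $col is awk's own
# field reference: the field whose number is stored in the awk variable col.
count_distinct() {
    awk -v col="$1" '{print $col}' "$2" | sort -u | wc -l
}

tmp=$(mktemp)
printf 'k1 4119 19 A5 g1\nk1 4119 28 A1 g1\nk2 267 5 B10 g2\n' > "$tmp"
count_distinct 5 "$tmp"   # distinct values in column 5
count_distinct 1 "$tmp"   # distinct values in column 1
rm -f "$tmp"
```

The double-quoted `"$1"` and `"$2"` are expanded by the shell before awk starts, while everything between the single quotes reaches awk untouched.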