Text into sliding window and count
Problem description
I have a file (more than 1 lakh, i.e. 100,000, lines) like this:
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ 0|0:48:1:51,51
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ 0|0:49:3:58,50
20 1110696 rs6040355 A G,T 67 PASS NS=2;DP=10;AF=0.333,0.667;AA=T;DB GT:GQ:DP:HQ 0|0:21:6:23,27
20 1230237 . T . 47 PASS NS=3;DP=13;AA=T GT:GQ:DP:HQ 0|0:54:7:56,60
20 1234567 GTC G,GTCT 50 PASS NS=3;DP=9;AA=G GT:GQ:DP 0/1:35:4
I need to split it into sliding windows and count the "0/0" positions in each window, like this:
Pos Count
1-10001 0
2-10002 1
3-10003 0
To count within every 10000 positions, I used this command:
tail -n +11 file |
awk -v n=10000 '/0\/0/{c++} NR%n==0{print c; c=0} END {if (NR%n!=0) print c}'
Recommended answer
1st solution: Based entirely on your shown attempt, written in GNU awk. I couldn't test it much since the samples do not contain any 0/0 values, but it should work. The tail command is taken from the OP's attempt itself.
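tail -n +11 Input_file |
awk -v n="10000" '
/0\/0/{                      # count lines that contain 0/0
  count++
}
NR%n==0{                     # every n lines, close the current block
  ++occur
  print n+occur,count+0      # count+0 prints 0 instead of an empty string
  count=0
}
END{                         # flush the last, partial block if it has matches
  ++occur
  if(count){ print n+occur,count }
}
'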
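Both the command in the question and the solution above group records by line number (NR%n), so they produce consecutive blocks of n lines rather than windows over the positions in column 2. If windows over positions are what is wanted, the following is a minimal sketch, not a tested solution for real data. It assumes a single chromosome, the position in column 2, input sorted by position, and header lines starting with "#". The function name sliding_count and its size and step parameters are hypothetical.

```shell
# Hypothetical helper: count lines containing 0/0 per position window.
# Assumes one chromosome, position in column 2, input sorted by position.
sliding_count() {
  awk -v size="${1:-10000}" -v step="${2:-1}" '
    /^#/ { next }                        # skip VCF header lines
    { last = $2 + 0 }                    # largest position seen so far
    /0\/0/ { pos[++m] = $2 + 0 }         # positions of matching lines
    END {
      lo = 1; hi = 0
      for (s = 1; s <= last; s += step) {
        e = s + size                     # window is s..e inclusive, as in 1-10001
        while (hi < m && pos[hi + 1] <= e) hi++   # extend the right edge
        while (lo <= hi && pos[lo] < s) lo++      # drop positions left of the window
        print s "-" e, hi - lo + 1
      }
    }
  '
}
```

It could be called as sliding_count 10000 1 < Input_file. With a step of 1 the output has one line per start position, which can be very large, so a bigger step may be more practical.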
2nd solution: In case 0/0 occurs multiple times in a line and you want to count every occurrence on each line, try the following, which differs slightly from the 1st solution.
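tail -n +11 Input_file |
awk -v n="10000" '
{                            # gsub returns the number of 0/0 matches on the line
  count+=gsub(/0\/0/,"&")
}
NR%n==0{                     # every n lines, close the current block
  ++occur
  print n+occur,count+0      # count+0 prints 0 instead of an empty string
  count=0
}
END{                         # flush the last, partial block if it has matches
  ++occur
  if(count){ print n+occur,count }
}
'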
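The second solution relies on gsub returning the number of substitutions it made; replacing each match with "&" (the matched text itself) leaves the line unchanged, so the call acts as a counter. A small sketch to illustrate this in isolation, where the function name count_per_line is hypothetical:

```shell
# Hypothetical helper: print, for each input line, how many times 0/0 occurs.
count_per_line() {
  awk '{ print gsub(/0\/0/, "&") }'
}

# Two made-up genotype lines: the first has two 0/0 entries, the second none.
printf '%s\n' 'GT 0/0 0/1 0/0' 'GT 0/1 1/1' | count_per_line
```

By contrast, the pattern rule /0\/0/{count++} in the 1st solution adds at most one per line, however many matches the line contains.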