Find the durations and their average between the dataset in an interval in shell script
Problem description
This is related to my old question. My dataset is:
ifile.txt
2
3
2
3
2
20
2
0
2
0
0
2
1
2
5
6
7
0
3
0
3
4
5
I would like to find the different durations (the number of values counted before the next 0) and their averages, within each interval of 6 values.
My desired output is:
ofile.txt
6 5.33
1 2
1 2
1 2
5 4.2
1 3
3 4
where
6 is the number of values counted until the next 0 within the first 6 values (i.e. 2,3,2,3,2,20), and 5.33 is their average;
1 is the number of values counted until the next 0 within the next 6 values (i.e. 2,0,2,0,0,2), and 2 is the average;
the next two "1 2" pairs come from the same 6 values;
5 is the number of values counted until the next 0 within the next 6 values (i.e. 1,2,5,6,7,0), and 4.2 is their average;
and so on.
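To make the grouping above easier to see, the sample can be printed six values per row. This is only an illustrative sketch: show_windows is a hypothetical helper name that does not appear in the original question, and it assumes one number per line on standard input.

```shell
# Hypothetical helper: print stdin in windows of n values per row (default 6).
show_windows() {
  awk -v n="${1:-6}" '
    {
      # Put a comma before every value except the first one of a window.
      printf "%s%s", ((NR - 1) % n == 0 ? "" : ","), $0
      # Break the row once the window is full.
      if (NR % n == 0) print ""
    }
    END { if (NR % n) print "" }   # terminate a partial last window
  '
}
```

Running `show_windows < ifile.txt` on the sample gives the rows 2,3,2,3,2,20 then 2,0,2,0,0,2 then 1,2,5,6,7,0 and finally the partial row 3,0,3,4,5.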
As per the answer to my previous question, I tried this:
awk '
$0!=0{
count++
sum=sum+$0
found=""
}
$0==0{
print count,max
count=max=0
next
}
FNR%6==0{
print count,max
count=max=0
found=1
}
END{
if(!found){
print count,max
}
}
' Input_file | awk '!/^ /' | awk '$1 != 0'
Recommended answer
One more try. Since the 2nd set of 6 lines is 2 0 2 0 0 2, its output should be 1 2, 1 2, 0 0, 1 2. If that is the case (which I believe it ideally should be), then try the following.
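awk '
{
  occur++  ##Position of the current line inside its 6-line window.
}
{
  count=$0!=0?++count:count  ##Count only the non-zero values.
  sum+=$0  ##Zeros add nothing to sum.
}
$0==0 || occur==6{  ##A zero, or the 6th line of a window, closes the current run.
  printf("%d %0.2f\n",count,count?sum/count:prev)
  prev=count?sum/count:0
  prev_count=count
  count=sum=prev=prev_count=""  ##Everything is cleared here, including prev and prev_count just set.
  if(occur==6){
    occur=""  ##Start a new 6-line window.
  }
}
END{
  if(occur){  ##A partial last window is still open, so print what is left.
    printf("%d %0.2f\n",count?count:prev_count,count?sum/count:prev)
  }
}
' Input_file | awk '$1 != 0'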
Output will be as follows:
6 5.33
1 2.00
1 2.00
1 2.00
5 4.20
1 3.00
3 4.00
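A possible single-pass variant is sketched below; it is not the answerer's code. It skips empty runs inside awk itself, so the trailing awk '$1 != 0' filter is not needed. It assumes a single input stream with one number per line and uses NR for the window position; runs_in_windows and emit_run are hypothetical names chosen for this illustration.

```shell
# Hypothetical helper: print "count average" for every run of non-zero values,
# where a run is closed by a zero or by the end of a 6-line window.
runs_in_windows() {
  awk -v n=6 '
    function emit_run() {
      # Print the current run only if it holds at least one value, then reset it.
      if (count) printf "%d %.2f\n", count, sum / count
      count = sum = 0
    }
    $0 != 0 { count++; sum += $0 }        # extend the current run
    $0 == 0 || NR % n == 0 { emit_run() } # a zero or a full window closes the run
    END { emit_run() }                    # emit a run left open by a partial last window
  ' "$@"
}
```

On the sample data, `runs_in_windows ifile.txt` prints the same seven lines as the output listed above.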
The EDITs below may help with similar problems that differ slightly from this one, so I am keeping them in the post.
In case you don't want to reset the line counter (occur) whenever a zero occurs in Input_file, try the following. It keeps looking at consecutive blocks of 6 lines and does NOT reset that counter on a zero.
awk '
{
occur++
}
$0!=0{
count++
sum+=$0
found=prev_count=prev=""
}
$0==0 && occur!=6{
printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)
prev=count?sum/count:0
prev_count=count
count=sum=""
found=1
next
}
occur==6{
printf("%d,%0.2f\n",count,count?sum/count:prev)
prev=count?sum/count:0
prev_count=count
count=sum=occur=""
found=1
}
END{
if(!found){
printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)
}
}
' Input_file
Could you please try the following? Written and tested with the provided samples only.
awk '
{
occur++
}
$0!=0{
count++
sum+=$0
found=prev_count=prev=""
}
$0==0{
printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)
prev=count?sum/count:0
prev_count=count
count=sum=occur=""
found=1
next
}
occur==6{
printf("%d,%0.2f\n",count,count?sum/count:prev)
prev=count?sum/count:0
prev_count=count
count=sum=occur=""
found=1
}
END{
if(!found){
printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)
}
}
' Input_file
What the code takes care of:
- If two consecutive lines both have the value 0, it prints the previous count and average for the second of them.
It also takes care of edge cases like:
a- If the input does NOT end with a 0, it uses the found flag I created to check whether there are still values left to print.
b- If the number of lines in Input_file is NOT a multiple of 6, that case is also covered by the END block, which checks the found flag.
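The flag idea in the edge cases above can be seen in isolation in a stripped-down sketch. It is not the answer's code: sum_groups and pending are hypothetical names, the flag is inverted (set while something is waiting to be printed, rather than after printing), and the task is reduced to summing groups of n lines.

```shell
# Hypothetical helper: sum stdin in groups of n lines (default 3),
# printing a partial last group from the END block.
sum_groups() {
  awk -v n="${1:-3}" '
    { sum += $0; pending = 1 }                       # a value is waiting to be reported
    NR % n == 0 { print sum; sum = 0; pending = 0 }  # full group: report it and clear the flag
    END { if (pending) print sum }                   # partial last group: report what is left
  '
}
```

For example, `printf '%s\n' 1 2 3 4 5 | sum_groups` prints 6 for the full group and then 9 for the leftover two lines.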
Explanation: Adding a detailed explanation for the above code.
awk '  ##Starting awk program from here.
{
  occur++  ##Counting the lines seen since the last reset (a zero or 6 lines).
}
$0!=0{  ##Checking condition if a line is NOT having zero value then do following.
  count++  ##Increment variable count with 1 each time it comes here.
  sum+=$0  ##Creating variable sum and keep adding current line value in it.
  found=prev_count=prev=""  ##Nullifying variables found, prev_count, prev here.
}  ##Closing BLOCK for condition $0!=0 here.
$0==0{  ##Checking condition if a line is having value zero then do following.
  printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)  ##Printing count and sum/count here, making sure the latter is NOT getting divided by 0.
  prev=count?sum/count:0  ##Creating variable prev which will be sum/count or zero in case count variable is NULL.
  prev_count=count  ##Creating variable prev_count whose value is count.
  count=sum=occur=""  ##Nullifying variables count, sum and occur here.
  found=1  ##Setting value 1 to variable found here.
  next  ##next will skip all further statements from here.
}  ##Closing BLOCK for condition $0==0 here.
occur==6{  ##Checking if 6 lines have been read since the last reset then do following.
  printf("%d,%0.2f\n",count,count?sum/count:prev)  ##Printing count and sum/count here, making sure the latter is NOT getting divided by 0.
  prev=count?sum/count:0  ##Creating variable prev which will be sum/count or zero in case count variable is NULL.
  prev_count=count  ##Creating variable prev_count whose value is count.
  count=sum=occur=""  ##Nullifying variables count, sum and occur here.
  found=1  ##Setting value 1 to variable found here.
}  ##Closing BLOCK for condition occur==6 here.
END{  ##Starting END block for this awk program here.
  if(!found){  ##Checking condition if variable found is NULL then do following.
    printf("%d,%0.2f\n",count?count:prev_count,count?sum/count:prev)  ##Printing count and sum/count here, making sure the latter is NOT getting divided by 0.
  }
}
' Input_file  ##Mentioning Input_file name here.