Compare different columns of subsequent rows to merge ranges

Problem description

I have a list of ranges, and I am trying to merge subsequent entries that lie within a given distance of each other.

In my data, the first column contains the lower bound of each range and the second column contains the upper bound.
The logic is as follows: if the value in column 1 is less than or equal to the value in column 2 of the previous row plus a given value, print the entry in column 1 of the previous row and the entry in column 2 of the current row.

If two ranges lie within the distance specified by the variable 'dist', they should be merged; otherwise the rows should be printed as they are.

Input:    
1   10  
9   19  
51  60

Desired output if dist=10:
1   19  
51  60  

Using bash, I've tried things along these lines:

dist=10  
awk '$1 -le (p + ${dist}) { print q, $2 } {p=$2;} {q=$1} ' input.txt > output.txt

This returns a syntax error.

Any help is appreciated!

Recommended answer

This assumes that if the condition is satisfied for two consecutive pairs of records (i.e., three consecutive records in total), the third record treats the merged output of the first two as its previous record.

awk -v dist=10 'FNR==1{prev_1=$1; prev_2=$2; next} ($1<=prev_2+dist){print prev_1,$2; prev_2=$2;next} {prev_1=$1; prev_2=$2}1' file

Input:

$ cat file
1 10
9 19
10 30
51 60

Output:

1 19
1 30
51 60
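
The one-liner above prints a line for every merge step (hence both 1 19 and 1 30), and as written it does not appear to print the first row when that row is not merged with the second. If a single line per fully merged range is wanted instead, a minimal sketch follows. It assumes the rows are sorted by lower bound and that a merged range should keep the larger of the two upper bounds; merge_ranges is a hypothetical helper name, not part of the original answer.

```shell
# merge_ranges DIST: hypothetical helper, reads "lower upper" rows from stdin.
# Assumption: rows are already sorted by lower bound.
merge_ranges() {
  # Pass the shell value into awk with -v; inside single quotes the shell
  # does not expand ${dist}, and awk compares with <= rather than -le.
  awk -v dist="$1" '
    NR == 1 { lo = $1; hi = $2; next }      # start the first pending range
    $1 + 0 <= hi + dist {                   # close enough: extend pending range
      if ($2 + 0 > hi + 0) hi = $2          # keep the larger upper bound
      next
    }
    { print lo, hi; lo = $1; hi = $2 }      # gap too large: flush, start anew
    END { if (NR > 0) print lo, hi }        # flush the last pending range
  '
}
```

With the four-row input above, merge_ranges 10 < file should print 1 30 followed by 51 60; with the three-row input from the question it should print 1 19 followed by 51 60.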
