Awk: Match column values from 2 files if their numerical values are close

Views: 160 Published: 2016/7/29 11:15:33 Tags: bash awk


Question

Following up on my first question here (Awk: Length of column number)

My data:

File 1

8.193506084253E+06 1.900521460E+01
8.193538509494E+06 1.899919490E+01
8.193540934736E+06 1.899317535E+01
8.193543359977E+06 1.898720476E+01
8.193546406105E+06 1.897934066E+01

File 2

8.193505938557E+06 1.572155163E+01
8.193509618041E+06 1.573016361E+01 
8.193513297526E+06 1.573874442E+01 
8.193516977010E+06 1.574725969E+01

I want to take $1 from File 2 and search File 1 for the closest* value in $1, in order to get output like this example:

 8.193505938557E+06 1.572155163E+01 1.900521460E+01

In this case only the first value of column $1 in File 2 has a match. The other values of $1 from File 2 are not close enough (under some defined condition) to any value of $1 in File 1.

Note that the two files have different numbers of rows.
*closest = the difference between the two numbers is smaller than some threshold

Recommended answer

To my understanding, according to your description the result should be:

1235.34 d a
3457.23 e b
7589.34 f b

i.e. including a line for "f", which is closest to "b".

This can be done using the following script:

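# Requires GNU awk (gawk): ARGIND and ** are gawk extensions.
# First file: remember each $1 value together with its $2.
ARGIND == 1 {
    haystack[$1] = $2;
}
# Second file: find the first-file entry whose $1 is numerically closest.
ARGIND == 2 {
    bestdiff=-1;   # -1 means "no candidate seen yet"
    for (v in haystack)
        if (bestdiff < 0 || (v-$1)**2 < bestdiff) {
            bestkey=haystack[v];
            bestdiff=(v-$1)**2;   # squared difference, always >= 0
        }
    print $1, $2, bestkey;
}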

(I'm using squaring via **2 as a substitute for taking the absolute value.)

If you want to suppress results when the difference is, for example, greater than 10, and so get the result you quoted, use something like this:

if (bestdiff < 10**2)
    print $1, $2, bestkey;
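
As a rough sketch of how the threshold check fits into the whole script, here it is applied to the data currently shown in the question. A few things here are my assumptions and not part of the answer: the helper name match_close, the file names file1.txt and file2.txt, passing the threshold with -v maxdiff, and the threshold value 1. I have also swapped ARGIND for the common NR == FNR idiom and **2 for a plain multiplication, so that the sketch does not depend on gawk.

```sh
#!/bin/sh
# Sample data copied from the question (file names are assumptions).
cat > file1.txt <<'EOF'
8.193506084253E+06 1.900521460E+01
8.193538509494E+06 1.899919490E+01
8.193540934736E+06 1.899317535E+01
8.193543359977E+06 1.898720476E+01
8.193546406105E+06 1.897934066E+01
EOF
cat > file2.txt <<'EOF'
8.193505938557E+06 1.572155163E+01
8.193509618041E+06 1.573016361E+01
8.193513297526E+06 1.573874442E+01
8.193516977010E+06 1.574725969E+01
EOF

# match_close MAXDIFF FILE1 FILE2   (hypothetical helper, not from the answer)
# Prints each FILE2 row whose $1 lies within MAXDIFF of some $1 in FILE1,
# followed by the $2 of the closest FILE1 row.
match_close() {
    awk -v maxdiff="$1" '
        # First file (NR == FNR only holds there): store $1 -> $2.
        NR == FNR { haystack[$1] = $2; next }
        # Second file: linear search for the closest stored key.
        {
            bestdiff = -1
            for (v in haystack) {
                d = (v - $1) * (v - $1)      # squared difference
                if (bestdiff < 0 || d < bestdiff) {
                    bestdiff = d
                    bestkey = haystack[v]
                }
            }
            # Compare squared values, as the answer does with 10**2.
            if (bestdiff >= 0 && bestdiff < maxdiff * maxdiff)
                print $1, $2, bestkey
        }
    ' "$2" "$3"
}

match_close 1 file1.txt file2.txt
```

With a threshold of 1 this prints only the line 8.193505938557E+06 1.572155163E+01 1.900521460E+01, which is the output the question asks for. With the threshold of 10 used above, the second and third rows of File 2 would also match, because they lie about 3.5 and 7.2 away from the first value in File 1. For this data the threshold therefore has to be chosen tighter.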

Edit: The OP changed the example input and output in the question. Here are the original example files for reference. File 1:

1234.34  a 
3456.23  b 
2325.89  c 
2326.20  c2

File 2:

1235.34 d
3457.23 e
7589.34 f

Output:

1235.34 d a
3457.23 e b

Note: ARGIND and ** are GNU extensions. See the comment from mklement0 below for details.
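
For awk implementations without these extensions, one possible rewrite is sketched below. It is my own variation and not taken from the answer or from the comment mentioned above. It counts files with FNR == 1 in place of ARGIND, and uses a small user-defined abs() function in place of squaring. The names nearest_portable, fileno, orig1.txt and orig2.txt are made up for illustration. The data is the original example from the edit note.

```sh
#!/bin/sh
# Original example files from the answer (file names are assumptions).
cat > orig1.txt <<'EOF'
1234.34  a
3456.23  b
2325.89  c
2326.20  c2
EOF
cat > orig2.txt <<'EOF'
1235.34 d
3457.23 e
7589.34 f
EOF

# nearest_portable FILE1 FILE2   (hypothetical helper)
# For every row of FILE2, print it with the $2 of the numerically
# closest $1 in FILE1, using only POSIX awk features.
nearest_portable() {
    awk '
        function abs(x) { return x < 0 ? -x : x }
        # FNR restarts at 1 for each input file, so this counts files.
        # Caveat: an empty file has no first line and is not counted.
        FNR == 1 { fileno++ }
        fileno == 1 { haystack[$1] = $2; next }
        {
            bestdiff = -1
            for (v in haystack) {
                d = abs(v - $1)
                if (bestdiff < 0 || d < bestdiff) {
                    bestdiff = d
                    bestkey = haystack[v]
                }
            }
            print $1, $2, bestkey
        }
    ' "$1" "$2"
}

nearest_portable orig1.txt orig2.txt
```

This prints the three lines given at the top of the answer (1235.34 d a, 3457.23 e b, 7589.34 f b). Because it compares absolute differences and not squared ones, a threshold test here would be written as bestdiff < 10, not bestdiff < 10**2.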
