How to compare values of rows and columns of two text files in bash?

Question

I have two text files and I want to compare their correspondent values according to their rows and columns. By comparing, I mean to check if the values are equal and echo if the values are the same or not. Here are the files:

file1.txt

Name  Col1  Col2  Col3  
-----------------------
row1  1     4     7        
row2  2     5     8         
row3  3     6     9   

file2.txt

Name  Col1  Col2  Col3  
-----------------------
row2  1     4     11        
row1  2     5     12           

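Here are the constraints:


  • Compare only rows that exist in both files (i.e., since row3 exists in file1.txt but not in file2.txt, it does not need to be compared)

  • The comparison has to be done one row and one column at a time.

  • Must use awk

  • The rows may be out of order in the files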

What I have in mind is something like this:

awk 'NR>2
    for (i=2;i<NR;i++)              #for each row of file1.txt
    {     
        for(j=1;i<NF;j++)           #for each column of file1.txt
        {
             // check if row and column of file1.txt is equal to row and column of file2.txt
        } 
    }

' file1.txt file2.txt

I am a beginner in bash so please excuse all my errors. Is something like this possible? Also how do you compare values of two different text files? Thanks. Let me know if more explanation is needed.

Recommended answer

Clearly this is a homework problem, so I'll just give you some tips. You must have an awk book to read where you can learn the details.

The way you're trying to use NR is incorrect. It is not the total number of records, but the number of the current record.
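
To make the point about NR concrete, here is a minimal sketch. The three-line input file and the variable name nr_out are assumptions made up for the demo, not part of the question.

```shell
tmp=$(mktemp -d)                       # scratch directory for the demo
printf 'a\nb\nc\n' > "$tmp/demo.txt"   # hypothetical three-line input

# NR is the number of the record currently being processed. It only
# equals the total record count once the END rule runs.
nr_out=$(awk '{ print NR, $0 } END { print "total:", NR }' "$tmp/demo.txt")
printf '%s\n' "$nr_out"
```

This prints "1 a", "2 b", "3 c" and finally "total: 3", so a loop bounded by NR inside an ordinary rule would only ever see the records read so far.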

Remember that an awk script is a list of rules, each with the form pattern {actions}, so your program should take that form as much as possible. awk's basic mechanism is to read a record and test it against each rule's pattern in turn; whenever a pattern matches, awk executes the associated actions, and once it reaches the end of the rules it goes on to the next record. It's "data driven", which is very different from a language like C or Java, for instance.
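
As a small illustration of the rule-per-record model, the sketch below feeds four made-up numbers through two rules. The input values and the variable name rules_out are assumptions for the demo.

```shell
# Each record is tested against every rule in order. A record can
# trigger zero, one, or several actions.
rules_out=$(printf '5\n12\n8\n20\n' | awk '
    $1 > 10     { print $1, "is large" }   # rule 1: pattern { action }
    $1 % 2 == 0 { print $1, "is even" }    # rule 2: tested on the same record
')
printf '%s\n' "$rules_out"
```

Here 5 matches neither rule, 8 matches only the second, and 12 and 20 match both, without any explicit loop over the input.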

You can skip the first two lines of both files with an initial rule like this:

FNR < 3 { next }  # if file record number < 3, go to next record

There is an idiomatic way to deal with two files. NR == FNR will only be true in the first file, since NR (record number) keeps incrementing across files whereas FNR (file record number) is reset between files. So you can do this:

NR == FNR {
    # Only the first file's records will be processed here

    next  # go on to the next record
}
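
To see how NR and FNR behave across the two files, the following sketch recreates the question's files in a scratch directory and prints both counters for every data row. The labels "file1"/"file2" and the variable name idiom_out are assumptions added for the demo.

```shell
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
Name  Col1  Col2  Col3
-----------------------
row1  1     4     7
row2  2     5     8
row3  3     6     9
EOF
cat > "$tmp/file2.txt" <<'EOF'
Name  Col1  Col2  Col3
-----------------------
row2  1     4     11
row1  2     5     12
EOF

idiom_out=$(awk '
    FNR < 3   { next }                                        # skip header and separator in each file
    NR == FNR { print "file1", $1, "NR=" NR, "FNR=" FNR; next }
              { print "file2", $1, "NR=" NR, "FNR=" FNR }     # reached only for the second file
' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$idiom_out"
```

For file1.txt the two counters agree (3, 4, 5). For file2.txt, FNR restarts at 1 while NR keeps counting, so its data rows show NR=8 and NR=9 with FNR=3 and FNR=4.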

While processing the first file, you'll want to use an associative array to save the records, keyed by the first field.

The final rule will deal only with the second file, testing if the first field is a key in the associative array, and if it is, comparing the other fields to see if they match.

So your program might have this structure:

FNR < 3 { next }  # if file record number < 3, go to next record

NR == FNR {
    # Only the first file's records will be processed here

    # Save info in an associative array.
    aa[$1] = ...

    next  # go on to the next record
}

# If a rule has no pattern, it matches every record
{
    # Only the second file's records will be processed here

    if ($1 in aa) {
        # compare fields
    }
}
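
One possible way to fill in that skeleton is sketched below. It is not the answerer's solution: the array names seen and cell, the "ColN" labels, the output wording and the variable name compare_out are all assumptions. It stores each cell of the first file under a (row name, column index) key, then compares cell by cell while reading the second file.

```shell
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
Name  Col1  Col2  Col3
-----------------------
row1  1     4     7
row2  2     5     8
row3  3     6     9
EOF
cat > "$tmp/file2.txt" <<'EOF'
Name  Col1  Col2  Col3
-----------------------
row2  1     4     11
row1  2     5     12
EOF

compare_out=$(awk '
    FNR < 3 { next }                  # skip the header and separator lines

    NR == FNR {                       # first file: remember every cell
        seen[$1] = 1
        for (i = 2; i <= NF; i++)
            cell[$1, i] = $i          # key is (row name, column index)
        next
    }

    ($1 in seen) {                    # second file: only rows also present in the first
        for (i = 2; i <= NF; i++) {
            verdict = (cell[$1, i] == $i) ? "same" : "different"
            print $1, "Col" (i - 1) ":", cell[$1, i], "vs", $i, "->", verdict
        }
    }
' "$tmp/file1.txt" "$tmp/file2.txt")
printf '%s\n' "$compare_out"
```

Because rows are looked up by name, their order in the two files does not matter, and row3 is never compared since it is missing from file2.txt. With the question's data every compared pair happens to differ, for example "row2 Col1: 2 vs 1 -> different".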
