Same column of different files into the same new file

Problem description

I have multiple folders, Case-1, Case-2, ..., Case-N, and each of them contains a file named PPD. I want to extract the second column of every PPD file and put them all into one file named 123.dat. It seems that I cannot use awk inside a for loop.

case=$1
for (( i = 1; i <= $case ; i ++ ))
do
    file=Case-$i
    cp $file/PPD temp$i.dat

    awk 'FNR==1{f++}{a[f,FNR]=$2}
         END
         {for(x=1;x<=FNR;x++)
             {for(y=1;y<ARGC;y++)
             printf("%s ",a[y,x]);print ""} }'  

    temp$i.dat >> 123.dat   
done

Now 123.dat only contains the data from the last PPD, the one in Case-N.

I know I could use join (I have used that command before) if every PPD file had at least one column in common, but it turns out to be extremely slow when there are lots of Case folders.

Recommended answer

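The AWK program below can help you. It relies on BEGINFILE and ERRNO, which are gawk extensions, so the awk named in the shebang line must be GNU awk (gawk).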

#!/usr/bin/awk -f

BEGIN {
    # Defaults
    nrecord=1
    nfiles=0
}

BEGINFILE {
    # Check if the input file is accessible,
    # if not skip the file and print error.
    if (ERRNO != "") {
        print("Error: ",FILENAME, ERRNO)
        nextfile
    }
}

{
    # Check if the file is accessed for the first time
    # if so then increment nfiles. This is to keep count of
    # number of files processed.
    if ( FNR == 1 ) {
        nfiles++
    } else if (FNR > nrecord) {
        # Fetching the maximum size of the record processed so far.
        nrecord=FNR
    }

    # Fetch the second column from the file.
    array[nfiles,FNR]=$2

}

END {
    # Iterate through the array and print the records.
    for (i=1; i<=nrecord; i++) {
        for (j=1; j<=nfiles; j++) {
            printf("%5s", array[j,i])
        }
        print ""
    }
}

Output:

$ ./get.awk Case-*/PPD
    1   11   21
    2   12   22
    3   13   23
    4   14   24
    5   15   25
    6   16   26
    7   17   27
    8   18   28
    9   19   29
   10   20   30

Here Case-*/PPD expands to Case-1/PPD, Case-2/PPD, Case-3/PPD and so on. Below are the source files from which the output was generated.
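One thing to watch with the glob: the shell sorts the expanded names as text, so once there are ten or more folders, Case-10/PPD comes before Case-2/PPD and the output columns are no longer in case order. A minimal sketch of one way around this, assuming the folders are numbered 1..N as in the question (the helper name list_ppd_files is made up for illustration):

```shell
# Hypothetical helper: print Case-1/PPD ... Case-N/PPD in numeric order,
# one path per line, skipping folders that have no PPD file.
list_ppd_files() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        if [ -f "Case-$i/PPD" ]; then
            printf '%s\n' "Case-$i/PPD"
        fi
        i=$((i + 1))
    done
}
```

Assuming the script is saved as get.awk as in the output above and the paths contain no whitespace, it could then be called as ./get.awk $(list_ppd_files 12) > 123.dat, which also writes the combined columns into 123.dat in a single awk run.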

$ cat Case-1/PPD 
1   1   1   1
2   2   2   2
3   3   3   3
4   4   4   4
5   5   5   5
6   6   6   6
7   7   7   7
8   8   8   8
9   9   9   9
10  10  10  10
$ cat Case-2/PPD 
11  11  11  11
12  12  12  12
13  13  13  13
14  14  14  14
15  15  15  15
16  16  16  16
17  17  17  17
18  18  18  18
19  19  19  19
20  20  20  20
$ cat Case-3/PPD 
21  21  21  21
22  22  22  22
23  23  23  23
24  24  24  24
25  25  25  25
26  26  26  26
27  27  27  27
28  28  28  28
29  29  29  29
30  30  30  30
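If GNU awk is not available, the same idea can be expressed with POSIX awk only. The following is a minimal sketch rather than the answer's script: it drops the BEGINFILE readability check and leaves unreadable files to awk's own error handling, and the function name collect_col2 is made up for illustration.

```shell
# Hypothetical helper: print column 2 of every file given as an argument,
# side by side, one output row per input line number.
collect_col2() {
    awk '
        FNR == 1      { nfiles++ }              # a new input file starts
        FNR > nrecord { nrecord = FNR }         # remember the longest file
                      { col[nfiles, FNR] = $2 } # store column 2 by file and line
        END {
            for (i = 1; i <= nrecord; i++) {
                for (j = 1; j <= nfiles; j++)
                    printf("%5s", col[j, i])
                print ""
            }
        }
    ' "$@"
}
```

With the sample files above, collect_col2 Case-*/PPD > 123.dat would write the combined table into 123.dat. Because awk is started once with all files as arguments, nothing is overwritten between iterations, which is what happened in the original loop.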
