Paste the same column from multiple files into one


Problem description

I have around 50 tab-delimited files from which I want to print column $7 to a new file. All files have the same number of columns and the same number of lines. In the output, the columns from the different files should be pasted next to each other, separated by tabs.

I was thinking of using a combination of 'ls', 'xargs' and 'awk': ls to find all the files I want, then awk to print the 7th column and create output.txt.

ls /folder/*_name.txt | awk '{print $7}' xargs {} > output.txt

My main issue is the use of xargs, and how to print each file's $7 in its own column of the output file.

Recommended answer

If I understand correctly what you're trying to do, then with awk you could use:

awk -F '\t' 'FNR == 1 { ++file } { col[FNR, file] = $7 } END { for(i = 1; i <= FNR; ++i) { line = col[i, 1]; for(j = 2; j <= file; ++j) { line = line "\t" col[i, j] }; print line } }' file1 file2 file3 file4

The code is:

FNR == 1 { ++file }                 # in the first line of a file, increase
                                    # the file counter, so file is the number
                                    # of the file we're processing
{                         
  col[FNR, file] = $7               # remember the 7th column from all lines
}                                   # by line and file number

END {                               # at the end:
  for(i = 1; i <= FNR; ++i) {       # walk through the lines,
    line = col[i, 1]                # paste together the columns in that line
    for(j = 2; j <= file; ++j) {    # from each file
      line = line "\t" col[i, j]
    }
    print line                      # and print the result.
  }
}
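
To make the behaviour concrete, here is a minimal sketch that runs the program above on two tiny inputs. The file names (one_name.txt, two_name.txt, paste_col7.awk) and the sample data are made up for illustration; only the awk program itself comes from the answer.

```shell
# Scratch directory for the demonstration (all file names here are hypothetical).
tmpdir=$(mktemp -d)

# Two small inputs: 7 tab-separated columns, 2 lines each.
printf 'a1\ta2\ta3\ta4\ta5\ta6\ta7\nb1\tb2\tb3\tb4\tb5\tb6\tb7\n' > "$tmpdir/one_name.txt"
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\nd1\td2\td3\td4\td5\td6\td7\n' > "$tmpdir/two_name.txt"

# Store the awk program in a file; the quoted 'EOF' stops the shell from expanding $7.
cat > "$tmpdir/paste_col7.awk" <<'EOF'
FNR == 1 { ++file }                 # new input file: bump the file counter
{ col[FNR, file] = $7 }             # remember column 7 by line and file number
END {
  for (i = 1; i <= FNR; ++i) {      # one output line per input line
    line = col[i, 1]
    for (j = 2; j <= file; ++j) {
      line = line "\t" col[i, j]
    }
    print line
  }
}
EOF

# The shell expands the glob in sorted order, so columns follow file-name order.
awk -F '\t' -f "$tmpdir/paste_col7.awk" "$tmpdir"/*_name.txt > "$tmpdir/output.txt"
cat "$tmpdir/output.txt"
```

With these inputs, output.txt should contain two lines, a7 and c7 on the first and b7 and d7 on the second, each pair separated by a tab.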

Tweaked to assemble the lines on the fly rather than at the end, this could be shortened to

awk -F '\t' 'FNR == 1 && FNR != NR { sep = "\t" } { line[FNR] = line[FNR] sep $7 } END { for(i = 1; i <= FNR; ++i) { print line[i] } }' file1 file2 file3 file4

That is:

FNR == 1 && FNR != NR {   # in the first line, but not in the first file
  sep = "\t"              # set the separator to a tab (in the first it's empty)
}
{                         # assemble the line on the fly
  line[FNR] = line[FNR] sep $7
}
END {                     # and in the end, print the lines.
  for(i = 1; i <= FNR; ++i) {
    print line[i]
  }
}
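
As for ls and xargs: awk accepts any number of file names as arguments, so the shell glob from the question can be handed to it directly. A minimal sketch, assuming a scratch directory that stands in for /folder and three made-up input files:

```shell
dir=$(mktemp -d)   # hypothetical stand-in for /folder

# Three sample inputs, 7 tab-separated columns and 2 lines each;
# column 7 encodes the file number and the row number.
for n in 1 2 3; do
  printf 'x\tx\tx\tx\tx\tx\tf%sr1\nx\tx\tx\tx\tx\tx\tf%sr2\n' "$n" "$n" > "$dir/${n}_name.txt"
done

# No ls or xargs: the shell expands the glob and awk reads the files in that order.
awk -F '\t' '
  FNR == 1 && FNR != NR { sep = "\t" }      # from the second file on, separate with a tab
  { line[FNR] = line[FNR] sep $7 }          # append column 7 to the matching output line
  END { for (i = 1; i <= FNR; ++i) print line[i] }
' "$dir"/*_name.txt > "$dir/output.txt"

cat "$dir/output.txt"
```

Each output line then holds the 7th column of the corresponding line of every input file, in glob order, separated by tabs.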

Tethering yourself to gawk, this could be further shortened to

awk -F '\t' '{ line[FNR] = line[FNR] sep $7 } ENDFILE { sep = "\t" } END { for(i = 1; i <= FNR; ++i) { print line[i] } }' file1 file2 file3 file4

...but ENDFILE is not known to other awk implementations such as mawk, so you may prefer to avoid it.
