Padding Empty Field in Unix Join Operation
Question
I have two files on which I want to perform a union operation based on the first column:
file1.txt
foo 1
bar 2
qux 3
file2.txt
foo x
qux y
boo z
The result I hope to get looks like this:
foo 1 x
bar 2 -
qux 3 y
boo - z
where the empty fields are padded with "-".
But why doesn't this join command work as I expected?
$ join -a1 -a2 -e"-" file1.txt file2.txt
What is the correct way to do this?
Recommended answer
"Important: FILE1 and FILE2 must be sorted on the join fields." (from the online manpage for join).
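The sorting step might be sketched as follows. This is only an illustration under a few assumptions: the sample data is copied from the question, the sfile1.txt and sfile2.txt names match the ones used later in this answer, and the temporary directory and the LC_ALL=C setting are my own additions to keep the collation order predictable.

```shell
# Work in a scratch directory so no existing files are overwritten (assumption).
cd "$(mktemp -d)" || exit 1

# Use the same collation for sort and join (assumption: C locale is acceptable).
export LC_ALL=C

# Recreate the two sample files from the question.
printf 'foo 1\nbar 2\nqux 3\n' > file1.txt
printf 'foo x\nqux y\nboo z\n' > file2.txt

# Sort each file on its first field, which is the join field.
sort -k 1,1 file1.txt > sfile1.txt
sort -k 1,1 file2.txt > sfile2.txt

cat sfile1.txt sfile2.txt
```

Sorting into separate files leaves the originals untouched, which is handy if the original line order matters elsewhere.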
That is problem #1. Problem #2 is worse: option -e is badly documented. It only works in conjunction with -o, so for example:
$ join -a 1 -a 2 -e'-' -o '0,1.2,2.2' sfile1.txt sfile2.txt
bar 2 -
boo - z
foo 1 x
qux 3 y
where the s prefix in the file names indicates files that I've sorted beforehand.
man join explains the -o switch (so does the online manpage I point to above). It takes a comma-separated list of the fields to output: 1.2 means the second field from file 1, and so on, while 0 means the join field. (I didn't remember the 0 value, actually, so I had originally given a clumsier solution requiring awk post-processing, but the current solution is better, with no awk needed!)