Bash Directory Sorting Issue - Removing Duplicate Lines?


Question

I'm using this command to merge multiple identical directories and to remove duplicate lines from each of the corresponding files:

for f in app1/*; do 
   bn="$(basename "$f")"
   sort -u "$f" "app2/$bn" > "app/$bn"
done

Is there a way to edit this so that it checks the lines of all the files and removes all the duplicates as well? I do need to keep the existing file structure with individual files.

The end result creates a directory with 300 text files that's no larger than 30mb.

Example:

**Directory app1**
*1.txt*       
a
b
c

*2.txt*
d
e
f

**Directory app2**
*1.txt*
a
b
c
g

*2.txt*
a
b
c
d
e
f

**Results in Directory app**
*1.txt*
a
b
c
g

*2.txt*
a
b
c
d
e
f

Desired Result in Directory app Should Be:
*1.txt*
a
b
c
g

*2.txt*
d
e
f

As you can see, it's not removing the duplicate "a b c" lines from 2.txt when they are also found in 1.txt. All lines in each file should remain unique, and all duplicates should be removed.

Answer



> As you can see, it's not removing the duplicate "a b c" lines from 2.txt when they are also found in 1.txt. All lines in each file should remain unique, and all duplicates should be removed.

You can accomplish this goal by applying 7171u's answer to your other question "Unix Bash Remove Duplicate Lines From Directory Files?" to the result of your command above (after having changed the tmp/* in his script to app/*, which should be trivial).
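The referenced answer isn't reproduced here, so the following is only a sketch of one way to do the cross-file pass: run the original merge first, then walk the output files in name order and drop any line that already appeared in an earlier file. The temporary "seen" file and the `awk` filter are my additions, not part of the linked answer; the directory names match the question.

```shell
#!/bin/sh
# Recreate the example from the question (assumed layout).
mkdir -p app1 app2 app
printf 'a\nb\nc\n'          > app1/1.txt
printf 'd\ne\nf\n'          > app1/2.txt
printf 'a\nb\nc\ng\n'       > app2/1.txt
printf 'a\nb\nc\nd\ne\nf\n' > app2/2.txt

# Step 1: the original merge, deduplicating within each file pair.
for f in app1/*; do
    bn="$(basename "$f")"
    sort -u "$f" "app2/$bn" > "app/$bn"
done

# Step 2: cross-file dedup, keeping the per-file structure.
# Files are processed in glob (name) order; earlier files win.
seen=$(mktemp)
for f in app/*; do
    # keep only lines of "$f" not yet recorded in the seen-list
    awk -v sf="$seen" '
        BEGIN { while ((getline l < sf) > 0) seen[l] }
        !($0 in seen)
    ' "$f" > "$f.tmp"
    cat "$f.tmp" >> "$seen"   # remember these lines for later files
    mv "$f.tmp" "$f"
done
rm -f "$seen"
```

With the example data this leaves `app/1.txt` containing `a b c g` and `app/2.txt` containing only `d e f`, matching the desired result. Note that "earlier" is decided by filename order, so which copy of a duplicated line survives depends on how the files sort.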
