How to compare two different files line by line and write the difference to a third file?
Problem description
I would like to compare two text files that have three columns each. One file has 999 rows and the other has 757 rows. I want the 242 rows that differ to be stored in a separate file. I created the first file (999 rows) using a random network generator. Each of the 999 rows is an edge: the first and second columns are the source and destination nodes, and the third column is the weight between them.
File format (files 1 and 2):
1 3 1
16 36 1
I tried:
"Compare two files line by line and generate the difference in another file", "Find difference between two text files with one item per line", and http://www.daniweb.com/software-development/python/threads/124932/610058#post610058
None of them worked for me.
I think the problem is with string comparison. I want to compare the numbers in the first and second columns. If both differ, I want to write the row to a third file.
Any help would be appreciated!
Update
I am posting the following code, which I tried after @MK posted his comment.
f = open("results.txt","w")
for line in file("100rwsnMore.txt"):
    rwsncount += 1
    line = line.split()
    src = line[0]
    dest = line[1]
    for row in file("100rwsnDeleted.txt"):
        row = row.split()
        s = row[0]
        d = row[1]
        if(s != src and d != dest):
            f.write(str(s))
            f.write(' ')
            f.write(str(d))
            f.write('\n')
f.close()
Recommended answer
If you're on a *nix system, the best general-purpose option is just to use:
sort filea fileb | uniq -u
But if you need to use Python:
Your code reopens the inner file on every iteration over the outer file. Open it outside the loop.
Using a nested loop is less efficient than looping over the first file while storing the values found, and then comparing the second file against those stored values.
def build_set(filename):
    # A set stores a collection of unique items. Both adding items and searching for them
    # are quick, so it's perfect for this application.
    found = set()
    with open(filename) as f:
        for line in f:
            # [:2] gives us the first two elements of the list.
            # Tuples, unlike lists, cannot be changed, which is a requirement for anything
            # being stored in a set.
            found.add(tuple(sorted(line.split()[:2])))
    return found

set_more = build_set('100rwsnMore.txt')
set_del = build_set('100rwsnDeleted.txt')

with open('results.txt', 'w') as out_file:
    # Using with to open files ensures that they are properly closed, even if the code
    # raises an exception.
    for res in (set_more - set_del):
        # The - computes the elements in set_more not in set_del.
        out_file.write(" ".join(res) + "\n")
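The set-based approach above sorts each pair as strings and writes results in arbitrary set order, without the weight column. If the numbers should be compared as integers (so that "02" and "2" match) and the output should keep the original rows in file order, a variation along these lines might work. This is only a sketch under stated assumptions: the helper names `read_edges` and `write_missing_edges` are hypothetical, edges are treated as directed (unlike the `sorted()` call above), and every row is assumed to start with two integer columns.

```python
import os
import tempfile


def read_edges(path):
    """Yield (source, destination, original_line) for each usable row."""
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 2:
                continue  # skip blank or malformed rows
            # int() makes "02" and "2" compare equal, unlike raw strings.
            yield int(parts[0]), int(parts[1]), line.rstrip("\n")


def write_missing_edges(more_path, deleted_path, out_path):
    """Write rows of more_path whose (source, destination) pair does not
    appear in deleted_path. Returns the number of rows written."""
    # Assumption: edges are directed, so (1, 3) and (3, 1) are different.
    seen = {(src, dst) for src, dst, _ in read_edges(deleted_path)}
    count = 0
    with open(out_path, "w") as out:
        for src, dst, line in read_edges(more_path):
            if (src, dst) not in seen:
                out.write(line + "\n")  # keeps the weight column and file order
                count += 1
    return count


if __name__ == "__main__":
    # Small demonstration on throwaway files rather than the real data.
    with tempfile.TemporaryDirectory() as tmp:
        more = os.path.join(tmp, "more.txt")
        deleted = os.path.join(tmp, "deleted.txt")
        result = os.path.join(tmp, "results.txt")
        with open(more, "w") as f:
            f.write("1 3 1\n16 36 1\n2 5 1\n")
        with open(deleted, "w") as f:
            f.write("1 3 1\n")
        print(write_missing_edges(more, deleted, result))
        with open(result) as f:
            print(f.read(), end="")
```

With the file names from the question, the call would be `write_missing_edges('100rwsnMore.txt', '100rwsnDeleted.txt', 'results.txt')`. If the network is undirected, storing `tuple(sorted((src, dst)))` in `seen` and testing the same sorted pair would restore the behaviour of the answer above.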