Compare two files and report the difference in Python
Question
I have two files called "hosts" (in different directories).
I want to compare them using Python to see whether they are identical. If they are not, I want to print the differences on the screen.
So far I have tried this:
hosts0 = open(dst1 + "/hosts", "r")
hosts1 = open(dst2 + "/hosts", "r")
lines1 = hosts0.readlines()
for i, lines2 in enumerate(hosts1):
    # lines1[i] raises IndexError once hosts1 has more lines than hosts0
    if lines2 != lines1[i]:
        print("line", i, "in hosts1 is different\n")
        print(lines2)
    else:
        print("same")
But I get:
File "./audit.py", line 34, in <module>
if lines2 != lines1[i]:
IndexError: list index out of range
Which means one of the hosts files has more lines than the other. Is there a better method to compare two files and report the difference?
Recommended answer
import difflib
lines1 = '''
dog
cat
bird
buffalo
gophers
hound
horse
'''.strip().splitlines()
lines2 = '''
cat
dog
bird
buffalo
gopher
horse
mouse
'''.strip().splitlines()
# Changes:
# swapped positions of cat and dog
# changed gophers to gopher
# removed hound
# added mouse
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm=''):
    print(line)
Output:
--- file1
+++ file2
@@ -1,7 +1,7 @@
+cat
 dog
-cat
 bird
 buffalo
-gophers
-hound
+gopher
 horse
+mouse
This diff gives you context: surrounding lines that help make it clear how the files differ. "cat" appears twice because it was removed from below "dog" and added above it.
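The example above diffs two in-memory lists. To apply the same idea to the two "hosts" files from the question, something like the following sketch could work. The function name `report_difference` is made up for illustration, and the files are assumed to be text files small enough to read into memory. `filecmp.cmp` with `shallow=False` answers the "are they identical" part by comparing contents rather than only file metadata.

```python
import difflib
import filecmp


def report_difference(path1, path2):
    """Print a unified diff of two text files; return True if they are identical."""
    # shallow=False forces a comparison of the file contents,
    # not just the os.stat() signatures.
    if filecmp.cmp(path1, path2, shallow=False):
        print("files are identical")
        return True

    with open(path1) as f1, open(path2) as f2:
        lines1 = f1.read().splitlines()
        lines2 = f2.read().splitlines()

    # unified_diff copes with files of different lengths,
    # so there is no index to run out of range.
    for line in difflib.unified_diff(lines1, lines2,
                                     fromfile=path1, tofile=path2,
                                     lineterm=''):
        print(line)
    return False
```

With the question's variables this would be called as `report_difference(dst1 + "/hosts", dst2 + "/hosts")`. Note that files differing only in line endings or a trailing newline are reported as not identical, yet produce an empty diff, because `splitlines()` discards the line terminators.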
You can use n=0 to remove the context.
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    print(line)
Output:
--- file1
+++ file2
@@ -0,0 +1 @@
+cat
@@ -2 +2,0 @@
-cat
@@ -5,2 +5 @@
-gophers
-hound
+gopher
@@ -7,0 +7 @@
+mouse
But now it's full of the "@@" lines telling you the position in the file that has changed. Let's remove the extra lines to make it more readable.
for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    for prefix in ('---', '+++', '@@'):
        if line.startswith(prefix):
            break
    else:
        # for/else: runs only when no prefix matched (the loop never hit break)
        print(line)
Giving us this output:
+cat
-cat
-gophers
-hound
+gopher
+mouse
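One caveat with the prefix filter above: a removed line whose own text starts with "--" is emitted as "---..." and gets dropped along with the file headers (likewise an added line starting with "++"). A possible workaround, sketched below, relies on the fact that `unified_diff` always yields the two file header lines first, so they can be skipped by position and only the "@@" hunk headers need to be filtered by prefix. The helper name `changed_lines` and the sample lists are hypothetical.

```python
import difflib
from itertools import islice


def changed_lines(a, b):
    """Return only the '+' and '-' lines of a zero-context unified diff."""
    diff = difflib.unified_diff(a, b, lineterm='', n=0)
    # The first two yielded lines are the '---' and '+++' file headers.
    # Content lines always start with '+' or '-', so a line starting
    # with '@@' can only be a hunk header.
    return [line for line in islice(diff, 2, None)
            if not line.startswith('@@')]


old = ['dog', 'cat', '-- note', 'horse']
new = ['cat', 'dog', 'horse', 'mouse']
for line in changed_lines(old, new):
    print(line)
```

Here the removed line `-- note` shows up as `--- note` in the result instead of being mistaken for a header.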
Now what do you want it to do? If you ignore all removed lines, then you won't see that "hound" was removed. If you're happy just showing the additions to the file, then you could do this:
diff = difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0)
lines = list(diff)[2:]  # skip the '---' and '+++' header lines
added = [line[1:] for line in lines if line[0] == '+']
removed = [line[1:] for line in lines if line[0] == '-']
print('additions:')
for line in added:
    print(line)
print()
print('additions, ignoring position:')
for line in added:
    if line not in removed:
        print(line)
Output:
additions:
cat
gopher
mouse
additions, ignoring position:
gopher
mouse
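If position never matters, the same "additions, ignoring position" result can be reached without difflib at all. The sketch below swaps in a different technique, a multiset comparison with `collections.Counter`, which also accounts for lines that occur more than once. The function name `net_changes` is invented for this example, and the results are sorted, so the original line order is not preserved.

```python
from collections import Counter


def net_changes(old_lines, new_lines):
    """Return (added, removed) lines, ignoring where they appear."""
    old_counts = Counter(old_lines)
    new_counts = Counter(new_lines)
    # Counter subtraction keeps only positive counts, so each result
    # holds the lines that one side has more copies of than the other.
    added = new_counts - old_counts
    removed = old_counts - new_counts
    return sorted(added.elements()), sorted(removed.elements())


old = ['dog', 'cat', 'bird', 'buffalo', 'gophers', 'hound', 'horse']
new = ['cat', 'dog', 'bird', 'buffalo', 'gopher', 'horse', 'mouse']
added, removed = net_changes(old, new)
print('added:', added)      # added: ['gopher', 'mouse']
print('removed:', removed)  # removed: ['gophers', 'hound']
```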
You can probably tell by now that there are various ways to "print the differences" of two files, so you will need to be very specific if you want more help.
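For completeness, if a plain position-by-position comparison like the one attempted in the question is all that is needed, the IndexError can be avoided with `itertools.zip_longest`, which pads the shorter input with `None` instead of running past its end. This is only a sketch; the function name `compare_line_by_line` is hypothetical. Unlike a diff, it reports every line after an insertion as changed, because it does not realign the two files.

```python
from itertools import zip_longest


def compare_line_by_line(lines1, lines2):
    """Return (line_number, line1, line2) for every position that differs."""
    differences = []
    # zip_longest fills in None for positions that exist in only one list.
    for number, (a, b) in enumerate(zip_longest(lines1, lines2), start=1):
        if a != b:
            differences.append((number, a, b))
    return differences


for number, a, b in compare_line_by_line(['a', 'b'], ['a', 'c', 'd']):
    print(number, repr(a), repr(b))
```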