Compare two files and report the difference in Python


Question

I have two files called "hosts" (in different directories).

I want to compare them using Python to see whether they are identical. If they are not, I want to print the differences on the screen.

So far I have tried this:

hosts0 = open(dst1 + "/hosts","r") 
hosts1 = open(dst2 + "/hosts","r")

lines1 = hosts0.readlines()

for i,lines2 in enumerate(hosts1):
    if lines2 != lines1[i]:
        print "line ", i, " in hosts1 is different \n"
        print lines2
    else:
        print "same"

I get:

File "./audit.py", line 34, in <module>
  if lines2 != lines1[i]:
IndexError: list index out of range

This means one of the hosts files has more lines than the other. Is there a better method to compare two files and report the differences?

Recommended answer

import difflib

lines1 = '''
dog
cat
bird
buffalo
gophers
hound
horse
'''.strip().splitlines()

lines2 = '''
cat
dog
bird
buffalo
gopher
horse
mouse
'''.strip().splitlines()

# Changes:
# swapped positions of cat and dog
# changed gophers to gopher
# removed hound
# added mouse

for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm=''):
    print(line)

--- file1
+++ file2
@@ -1,7 +1,7 @@
+cat
 dog
-cat
 bird
 buffalo
-gophers
-hound
+gopher
 horse
+mouse

This diff gives you context: surrounding lines that help make it clear how the files differ. "cat" appears twice because it was removed from below "dog" and added above it.
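To apply the same idea to the two hosts files from the question, one option is to read both files and hand the line lists to difflib.unified_diff. The following is only a sketch, written for Python 3: the helper name diff_files is made up for illustration, and filecmp.cmp with shallow=False is used as a quick byte-for-byte identity check before any diff is built.

```python
import difflib
import filecmp


def diff_files(path_a, path_b):
    """Return unified-diff lines for two text files (hypothetical helper)."""
    # shallow=False compares file contents instead of trusting os.stat() data.
    if filecmp.cmp(path_a, path_b, shallow=False):
        return []
    with open(path_a) as fa, open(path_b) as fb:
        lines_a = fa.read().splitlines()
        lines_b = fb.read().splitlines()
    # lineterm='' because splitlines() already removed the newlines.
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=path_a, tofile=path_b,
                                     lineterm=''))
```

With the question's variables this would be called as diff_files(dst1 + "/hosts", dst2 + "/hosts"), printing each returned line. Note that files differing only in line endings or a trailing newline are not byte-identical, yet still produce an empty diff here because splitlines() discards that difference.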

You can use n=0 to remove the context:

for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    print(line)

Output:

--- file1
+++ file2
@@ -0,0 +1 @@
+cat
@@ -2 +2,0 @@
-cat
@@ -5,2 +5 @@
-gophers
-hound
+gopher
@@ -7,0 +7 @@
+mouse

But now it's full of "@@" lines telling you the position in the file that has changed. Let's remove those extra lines to make the output more readable.

for line in difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0):
    for prefix in ('---', '+++', '@@'):
        if line.startswith(prefix):
            break
    else:
        print(line)

This gives us the following output:

+cat
-cat
-gophers
-hound
+gopher
+mouse
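The same header filtering can be written more compactly, because str.startswith accepts a tuple of prefixes. A minimal sketch for Python 3, where changed_lines is a name chosen only for this illustration:

```python
import difflib


def changed_lines(lines1, lines2):
    """Return only the '+' and '-' lines of a zero-context unified diff."""
    diff = difflib.unified_diff(lines1, lines2, lineterm='', n=0)
    # One startswith() call with a tuple covers all three header prefixes.
    return [line for line in diff
            if not line.startswith(('---', '+++', '@@'))]
```

Like the loop above, this also drops any changed line whose own text begins with "--" or "++", since the diff marker plus that text looks like a header. For a hosts file that is unlikely to matter, but it is worth knowing.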

Now what do you want it to do? If you ignore all removed lines, you won't see that "hound" was removed. If you're happy just showing the additions to the file, you could do this:

diff = difflib.unified_diff(lines1, lines2, fromfile='file1', tofile='file2', lineterm='', n=0)
lines = list(diff)[2:]
added = [line[1:] for line in lines if line[0] == '+']
removed = [line[1:] for line in lines if line[0] == '-']

print('additions:')
for line in added:
    print(line)
print('')
print('additions, ignoring position:')
for line in added:
    if line not in removed:
        print(line)

Output:

additions:
cat
gopher
mouse

additions, ignoring position:
gopher
mouse
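If position really does not matter, the "additions, ignoring position" list can also be computed without a diff at all, by checking each line of the second file against a set built from the first. This is a sketch under that assumption, for Python 3, with a function name invented for the example:

```python
def added_ignoring_position(lines1, lines2):
    """Lines of lines2 that occur nowhere in lines1, in their original order."""
    seen = set(lines1)  # set membership tests are O(1) on average
    return [line for line in lines2 if line not in seen]
```

This treats the files as collections of lines, so a line that appears once in the first file and twice in the second is not reported as an addition.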

You can probably tell by now that there are various ways to "print the differences" between two files, so you will need to be very specific if you want more help.
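One more variant, for completeness: if a plain line-by-line comparison like the one in the question is what you want, itertools.zip_longest avoids the IndexError by padding the shorter file with None. A minimal Python 3 sketch; line_by_line is a hypothetical name:

```python
from itertools import zip_longest


def line_by_line(lines1, lines2):
    """Yield (line_number, left, right) for each position where the lines differ."""
    # zip_longest pads the shorter input with None instead of raising IndexError.
    pairs = zip_longest(lines1, lines2)
    for number, (left, right) in enumerate(pairs, start=1):
        if left != right:
            yield number, left, right
```

Keep in mind that with this approach a single inserted line makes every following line count as different, which is the main reason difflib is usually the better tool.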
