Returning lines that differ between two files (Python)

Problem description

I have two files, output1.txt and output2.txt, each with tens of thousands of lines. I want to iterate through both files and return the line number (and content) of every line that differs between the two. They're mostly the same, which is why I can't find the differences (filecmp.cmp returns False).

Recommended answer

You can do something like this:

import difflib
import sys

tl = 100000    # number of lines in the first test file

# create two test files (Unix paths; adjust /tmp on other systems)

with open('/tmp/f1.txt', 'w') as f:
    for x in range(tl):
        f.write('line {}\n'.format(x))

with open('/tmp/f2.txt', 'w') as f:
    for x in range(tl + 10):   # add 10 lines at the end
        if x in (500, 505, 1000, tl - 2):
            continue           # skip these lines
        f.write('line {}\n'.format(x))

# compare the files; ndiff prefixes each line with '- ', '+ ', '  ' or '? '
with open('/tmp/f1.txt', 'r') as f1, open('/tmp/f2.txt', 'r') as f2:
    diff = difflib.ndiff(f1.readlines(), f2.readlines())
    for line in diff:
        if line.startswith('-'):      # only in f1.txt
            sys.stdout.write(line)
        elif line.startswith('+'):    # only in f2.txt, indented
            sys.stdout.write('\t\t' + line)

Prints (in 400 ms):

- line 500
- line 505
- line 1000
- line 99998
        + line 100000
        + line 100001
        + line 100002
        + line 100003
        + line 100004
        + line 100005
        + line 100006
        + line 100007
        + line 100008
        + line 100009
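
If the full ndiff listing is more than you need, a unified diff with zero context lines is one possible alternative. The sketch below is not part of the original answer: it assumes both files fit in memory, and the helper name unified_changes and the small sample lists are made up for illustration. Each "@@" hunk header reports where the change sits in each file.

```python
import difflib
import sys


def unified_changes(a_lines, b_lines,
                    a_name='output1.txt', b_name='output2.txt'):
    """Return unified-diff lines with no context (hypothetical helper)."""
    # n=0 drops the unchanged context lines around each change
    return list(difflib.unified_diff(a_lines, b_lines,
                                     fromfile=a_name, tofile=b_name, n=0))


if __name__ == '__main__':
    # tiny in-memory stand-ins for the contents of the two files
    old = ['line 0\n', 'line 1\n', 'line 2\n', 'line 3\n']
    new = ['line 0\n', 'line 2\n', 'line 3\n', 'line 4\n']
    sys.stdout.writelines(unified_changes(old, new))
```

For real files, pass f1.readlines() and f2.readlines() the same way the ndiff example does.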

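If you want a running count, use enumerate. Note that the index counts lines of the diff output (unchanged lines included), so it is not the line number in either file: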

with open('/tmp/f1.txt', 'r') as f1, open('/tmp/f2.txt', 'r') as f2:
    diff = difflib.ndiff(f1.readlines(), f2.readlines())
    for i, line in enumerate(diff):
        if line.startswith(' '):
            continue    # unchanged line, skip it
        sys.stdout.write('My count: {}, text: {}'.format(i, line))
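
If you need the actual line number within each file rather than a position in the diff output, one way is to keep a separate counter per file and advance it according to the ndiff prefix. This is a minimal sketch rather than part of the original answer; the function name changed_lines and the sample data are hypothetical.

```python
import difflib


def changed_lines(a_lines, b_lines):
    """Yield (tag, line_number, text) for lines unique to either input.

    tag is '-' for a line only in a_lines and '+' for a line only in
    b_lines; line_number is 1-based within that input.
    """
    a_no = b_no = 0
    for entry in difflib.ndiff(a_lines, b_lines):
        tag, text = entry[0], entry[2:]
        if tag == ' ':        # present in both inputs
            a_no += 1
            b_no += 1
        elif tag == '-':      # only in the first input
            a_no += 1
            yield tag, a_no, text
        elif tag == '+':      # only in the second input
            b_no += 1
            yield tag, b_no, text
        # '?' entries are ndiff's intraline hints, not file lines; skip them


if __name__ == '__main__':
    # tiny in-memory stand-ins for the contents of the two files
    old = ['line 0\n', 'line 1\n', 'line 2\n', 'line 3\n']
    new = ['line 0\n', 'line 2\n', 'line 3\n', 'line 4\n']
    for tag, number, text in changed_lines(old, new):
        print('{} {:>6}: {}'.format(tag, number, text.rstrip('\n')))
```

A '-' entry is numbered by its position in the first file and a '+' entry by its position in the second, so the two counters drift apart as lines are added or removed.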
