Compare directories on file/folder names only, printing any differences?


Question

How do I recursively compare two directories (the comparison should be based only on file name) and print out the files/folders that exist in only one directory or the other?

I'm using Python 3.3.

I've seen the filecmp module, but it doesn't seem to do quite what I need. Most importantly, it compares files based on more than just the filename.

Here's what I've got so far:

import filecmp
dcmp = filecmp.dircmp('./dir1', './dir2')
dcmp.report_full_closure()

dir1 looks like this:

dir1
  - atextfile.txt
  - anotherfile.xml
  - afolder
    - testscript.py
  - anotherfolder
    - file.txt
  - athirdfolder

dir2 looks like this:

dir2
  - atextfile.txt
  - afolder
    - testscript.py
  - anotherfolder
    - file.txt
    - file2.txt

I want the results to look something like this:

files/folders only in dir1
  * anotherfile.xml
  * athirdfolder

files/folders only in dir2
  * anotherfolder/file2.txt

I need a simple, Pythonic way to compare two directories based only on file/folder names and print out the differences.

Also, I need a way to check whether the directories are identical or not.

Note: I have searched Stack Overflow and Google for something like this. I see lots of examples of how to compare files taking the file content into account, but I can't find anything about comparing just file names.

Recommended answer

My solution uses the set() type to store relative paths. The comparison is then just a matter of set subtraction.

import os

def build_files_set(rootdir):
    """Return the set of file and folder paths under rootdir, relative to rootdir."""
    files_set = set()
    for (dirpath, dirnames, filenames) in os.walk(rootdir):
        for filename in filenames + dirnames:
            full_path = os.path.join(dirpath, filename)
            # Strip the root prefix so paths from both trees are comparable.
            relative_path = os.path.relpath(full_path, rootdir)
            files_set.add(relative_path)

    return files_set

def compare_directories(dir1, dir2):
    """Return (names only in dir1, names only in dir2)."""
    files_set1 = build_files_set(dir1)
    files_set2 = build_files_set(dir2)
    return (files_set1 - files_set2, files_set2 - files_set1)

if __name__ == '__main__':
    dir1 = 'old'
    dir2 = 'new'
    in_dir1, in_dir2 = compare_directories(dir1, dir2)

    # print() is a function in Python 3; sorted() keeps the output stable.
    print('\nFiles only in {}:'.format(dir1))
    for relative_path in sorted(in_dir1):
        print('* {0}'.format(relative_path))

    print('\nFiles only in {}:'.format(dir2))
    for relative_path in sorted(in_dir2):
        print('* {0}'.format(relative_path))
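
To make the set-subtraction step concrete, here is a small sketch that skips the filesystem entirely and works on hand-written sets of relative paths. The paths are copied from the dir1/dir2 listings in the question (with "/" as the separator); the variable names are only illustrative.

```python
# Relative paths taken from the dir1 / dir2 listings in the question.
dir1_paths = {
    'atextfile.txt',
    'anotherfile.xml',
    'afolder',
    'afolder/testscript.py',
    'anotherfolder',
    'anotherfolder/file.txt',
    'athirdfolder',
}
dir2_paths = {
    'atextfile.txt',
    'afolder',
    'afolder/testscript.py',
    'anotherfolder',
    'anotherfolder/file.txt',
    'anotherfolder/file2.txt',
}

# Set subtraction keeps the names present on one side only.
only_in_dir1 = sorted(dir1_paths - dir2_paths)
only_in_dir2 = sorted(dir2_paths - dir1_paths)

print(only_in_dir1)  # ['anotherfile.xml', 'athirdfolder']
print(only_in_dir2)  # ['anotherfolder/file2.txt']
```

These are the same entries the question lists as the desired result.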



Discussion

  • The workhorse is the function build_files_set(). It traverses a directory and builds a set of relative file/directory names.

  • The function compare_directories() builds the file set for each of the two directories and returns the differences in both directions. Very straightforward.
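
The question also asks for a way to check whether the two directories are identical. One possible sketch, following the same set-based idea, is to compare the two sets of relative names for equality. The helper names relative_names() and directories_identical() are hypothetical and not part of the original answer, and the temporary directories exist only for the demonstration.

```python
import os
import tempfile

def relative_names(rootdir):
    """Return the set of file and directory paths under rootdir, relative to it."""
    names = set()
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for name in dirnames + filenames:
            names.add(os.path.relpath(os.path.join(dirpath, name), rootdir))
    return names

def directories_identical(dir1, dir2):
    """True when both trees hold exactly the same relative names.

    File contents are never read; only names are compared.
    """
    return relative_names(dir1) == relative_names(dir2)

# Small demonstration on throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, 'a')
    b = os.path.join(tmp, 'b')
    os.makedirs(os.path.join(a, 'sub'))
    os.makedirs(os.path.join(b, 'sub'))

    # Only 'a' has sub/x.txt at this point.
    open(os.path.join(a, 'sub', 'x.txt'), 'w').close()
    same_before = directories_identical(a, b)

    # After creating the same name in 'b', the name sets match.
    open(os.path.join(b, 'sub', 'x.txt'), 'w').close()
    same_after = directories_identical(a, b)

print(same_before, same_after)  # False True
```

Because only names are compared, a file in one tree and a folder with the same relative path in the other would still count as identical under this check.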
