Compare directories on file/folder names only, printing any differences?
Question
How do I recursively compare two directories (comparison should be based only on file name) and print out files/folders only in one or the other directory?
I'm using Python 3.3.
I've seen the filecmp module; however, it doesn't seem to quite do what I need. Most importantly, it compares files based on more than just the filename.
Here's what I've got so far:
import filecmp
dcmp = filecmp.dircmp('./dir1', './dir2')
dcmp.report_full_closure()
dir1 looks like this:
dir1
  - atextfile.txt
  - anotherfile.xml
  - afolder
    - testscript.py
  - anotherfolder
    - file.txt
  - athirdfolder
and dir2 looks like this:
dir2
  - atextfile.txt
  - afolder
    - testscript.py
  - anotherfolder
    - file.txt
    - file2.txt
I want results to look something like:
files/folders only in dir1
* anotherfile.xml
* athirdfolder
files/folders only in dir2
* anotherfolder/file2.txt
I need a simple, Pythonic way to compare two directories based only on file/folder name, and print out the differences.
Also, I need a way to check whether the directories are identical or not.
Note: I have searched Stack Overflow and Google for something like this. I see lots of examples of how to compare files taking the file content into account, but I can't find anything about comparing just file names.
Recommended answer
My solution uses the set() type to store relative paths. Then comparison is just a matter of set subtraction.
import os
import re

def build_files_set(rootdir):
    # Escape rootdir so characters such as '.' or '+' in the path are matched literally.
    root_to_subtract = re.compile(r'^.*?' + re.escape(rootdir) + r'[\\/]?')

    files_set = set()
    for (dirpath, dirnames, filenames) in os.walk(rootdir):
        for filename in filenames + dirnames:
            full_path = os.path.join(dirpath, filename)
            # Strip the leading root directory to get a path relative to rootdir.
            relative_path = root_to_subtract.sub('', full_path, count=1)
            files_set.add(relative_path)

    return files_set

def compare_directories(dir1, dir2):
    files_set1 = build_files_set(dir1)
    files_set2 = build_files_set(dir2)
    return (files_set1 - files_set2, files_set2 - files_set1)

if __name__ == '__main__':
    dir1 = 'old'
    dir2 = 'new'
    in_dir1, in_dir2 = compare_directories(dir1, dir2)

    # print() is a function in Python 3; sorted() gives a stable output order.
    print('\nFiles only in {}:'.format(dir1))
    for relative_path in sorted(in_dir1):
        print('* {0}'.format(relative_path))

    print('\nFiles only in {}:'.format(dir2))
    for relative_path in sorted(in_dir2):
        print('* {0}'.format(relative_path))
Discussion
The workhorse is the function build_files_set(). It traverses a directory and creates a set of relative file/dir names.
The function compare_directories() takes two directory paths, builds a set of relative names for each, and returns the differences in both directions. Very straightforward.
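The question also asks how to check whether the two directories are identical, which the answer does not show. Under the same name-only definition, one possible sketch is to compare the two sets for equality. The code below is an illustration and not part of the original answer: relative_names(), directories_identical() and make_tree() are hypothetical names, os.path.relpath() is swapped in for the regular expression, and the nesting of the sample tree (testscript.py inside afolder, file.txt and file2.txt inside anotherfolder) is an assumption based on the expected output in the question.

```python
import os
import tempfile


def relative_names(rootdir):
    """Collect every file and directory path under rootdir, relative to rootdir."""
    names = set()
    for dirpath, dirnames, filenames in os.walk(rootdir):
        for name in dirnames + filenames:
            names.add(os.path.relpath(os.path.join(dirpath, name), rootdir))
    return names


def directories_identical(dir1, dir2):
    """True when both trees hold exactly the same relative names (contents ignored)."""
    return relative_names(dir1) == relative_names(dir2)


def make_tree(root, paths):
    """Hypothetical helper: create empty files; a path ending in '/' is a directory."""
    for path in paths:
        full = os.path.join(root, *path.rstrip('/').split('/'))
        if path.endswith('/'):
            os.makedirs(full, exist_ok=True)
        else:
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, 'w'):
                pass  # an empty file is enough, only names are compared


# Rebuild the two sample trees from the question in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    dir1 = os.path.join(tmp, 'dir1')
    dir2 = os.path.join(tmp, 'dir2')
    make_tree(dir1, ['atextfile.txt', 'anotherfile.xml', 'afolder/testscript.py',
                     'anotherfolder/file.txt', 'athirdfolder/'])
    make_tree(dir2, ['atextfile.txt', 'afolder/testscript.py',
                     'anotherfolder/file.txt', 'anotherfolder/file2.txt'])

    only_in_dir1 = sorted(relative_names(dir1) - relative_names(dir2))
    only_in_dir2 = sorted(relative_names(dir2) - relative_names(dir1))
    same = directories_identical(dir1, dir2)
    same_as_self = directories_identical(dir1, dir1)

    print('only in dir1:', only_in_dir1)
    print('only in dir2:', only_in_dir2)
    print('identical:', same)
```

Note that a directory present on only one side is reported together with everything beneath it, because each nested path is also missing from the other set. Both arguments should be existing directories: os.walk() silently yields nothing for a missing path, so two missing directories would compare as identical.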