Python: iterate over a list of files, finding the same filenames with different extensions
Question
So I have a list as follows:
mylist = ['movie1.mp4','movie2.srt','movie1.srt','movie3.mp4','movie1.mp4']
Note: this is a simple list for testing; the script will deal with unknown file names, and more of them.
So I want to find the movie files that have a paired srt file and put those in a dictionary. Anything left over (i.e. movie3.mp4) will stay in the list and be dealt with later.
I've been playing a bit with list comprehensions, though a comprehension might not leave the leftover data in place or let me construct the dictionary.
import re
matches = [ x for x, a in mylist if (re.sub('\.srt$', '\.mp4$', a ) == x or re.sub('\.srt$', '\.mp4$', a ) == x) ]
This returns:
ValueError: too many values to unpack
Any ideas on how I might approach this?
Recommended answer
You are taking the wrong approach to the problem. The easiest way is to determine the base name of each file (the name without its extension) using os.path.splitext and group the files by it. One possible approach is itertools.groupby, which only groups adjacent items, so the list has to be sorted by the same key first.
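As a quick illustration of the first step, `os.path.splitext` splits a name at its last dot into a root and an extension. The file names below come from the question's test list; the variable names are only illustrative.

```python
import os

# splitext splits at the last dot and keeps the dot with the extension.
root, ext = os.path.splitext('movie1.mp4')
print(root, ext)  # movie1 .mp4

# Files that share a root are candidates for pairing.
same_root = os.path.splitext('movie1.srt')[0] == root
print(same_root)  # True
```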
Implementation
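import os
from itertools import groupby

# Key: the file name without its extension. groupby only merges adjacent
# items, so the list is sorted by the same key first.
basename = lambda e: os.path.splitext(e)[0]
groups = {key: set(value)
          for key, value in groupby(sorted(mylist, key=basename), key=basename)}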
Example
>>> pprint.pprint(groups)
{'movie1': set(['movie1.mp4', 'movie1.srt']),
'movie2': set(['movie2.srt']),
'movie3': set(['movie3.mp4'])}
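The grouping alone does not yet produce the dictionary of pairs and the leftover list that the question asks for. A minimal sketch of that last step follows, assuming a "pair" means an `.mp4` and an `.srt` file with the same base name. The names `split_pairs`, `pairs`, and `leftovers` are hypothetical and not part of the original answer. The sketch swaps `itertools.groupby` for a plain dictionary keyed by base name, so no sorting is needed before grouping.

```python
import os
from collections import defaultdict

def split_pairs(filenames, video_ext='.mp4', sub_ext='.srt'):
    """Return (pairs, leftovers); pairs maps each video file to its subtitle file."""
    by_root = defaultdict(dict)
    for name in set(filenames):  # set() drops exact duplicates
        root, ext = os.path.splitext(name)
        by_root[root][ext] = name

    pairs, leftovers = {}, []
    for exts in by_root.values():
        if video_ext in exts and sub_ext in exts:
            pairs[exts[video_ext]] = exts[sub_ext]
            # Any other extension sharing this root is left for later handling.
            leftovers.extend(n for e, n in exts.items()
                             if e not in (video_ext, sub_ext))
        else:
            leftovers.extend(exts.values())
    return pairs, sorted(leftovers)

mylist = ['movie1.mp4', 'movie2.srt', 'movie1.srt', 'movie3.mp4', 'movie1.mp4']
pairs, leftovers = split_pairs(mylist)
print(pairs)      # {'movie1.mp4': 'movie1.srt'}
print(leftovers)  # ['movie2.srt', 'movie3.mp4']
```

Extensions are compared exactly as written here, so `MOVIE.MP4` would not match `.mp4`; lower-casing `ext` before storing it is one way to relax that if needed.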