Python: iterate over a list of files, finding same filenames but different extensions


Problem description

So I have a list like this:

mylist = ['movie1.mp4','movie2.srt','movie1.srt','movie3.mp4','movie1.mp4']

Note: this is a simple list for testing; the script will deal with unknown file names, and more of them.

I want to find the movie files that have a paired srt file and put those in a dictionary. Anything left (e.g. movie3.mp4) will stay in the list and be dealt with later.

I've been playing a bit with a list comprehension, though it might not leave me the leftover data or let me construct the dictionary.

import re
matches = [ x for x, a in mylist if (re.sub('\.srt$', '\.mp4$', a ) == x or re.sub('\.srt$', '\.mp4$', a ) == x) ]

This returns: ValueError: too many values to unpack

Any ideas on how I might approach this?

Recommended answer

Your approach is the wrong fit for this problem. The easiest way is to determine the base name of each file with os.path.splitext and group the files by it. One possible approach is itertools.groupby, which only merges adjacent items and therefore needs its input sorted by the same key.
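To make the grouping key concrete, here is a minimal sketch of what os.path.splitext returns for the file names in the question. The variable name stems is only illustrative and is not part of the original answer.

```python
import os

mylist = ['movie1.mp4', 'movie2.srt', 'movie1.srt', 'movie3.mp4', 'movie1.mp4']

# os.path.splitext splits at the last dot and returns (root, extension).
root, ext = os.path.splitext('movie1.mp4')
print(root, ext)  # movie1 .mp4

# The root is the key that files will be grouped by.
stems = [os.path.splitext(name)[0] for name in mylist]
print(stems)  # ['movie1', 'movie2', 'movie1', 'movie3', 'movie1']
```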

Implementation

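import os
from itertools import groupby

# groupby only merges adjacent items, so sort by the same key first
groups = {key: set(value)
          for key, value in groupby(sorted(mylist,
                                           key=lambda e: os.path.splitext(e)[0]),
                                    key=lambda e: os.path.splitext(e)[0])}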

Example

>>> pprint.pprint(groups)
{'movie1': set(['movie1.mp4', 'movie1.srt']),
 'movie2': set(['movie2.srt']),
 'movie3': set(['movie3.mp4'])}
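The grouping above does not yet separate paired movies from leftovers, which is what the question asks for. The following is one possible sketch of that last step, not part of the original answer. The function name pair_movies and its parameters are hypothetical, and it assumes that extensions match exactly (case-sensitive), that duplicate names can be collapsed, and that an unmatched .srt file counts as a leftover too. It uses collections.defaultdict instead of groupby, so no sorting is needed.

```python
import os
from collections import defaultdict


def pair_movies(filenames, movie_ext='.mp4', sub_ext='.srt'):
    """Return (pairs, leftovers).

    pairs maps a base name to a (movie, subtitle) tuple;
    leftovers lists every file that is not part of a pair.
    """
    # Group names by base name; a set drops exact duplicates.
    by_stem = defaultdict(set)
    for name in filenames:
        by_stem[os.path.splitext(name)[0]].add(name)

    pairs, leftovers = {}, []
    for stem, names in by_stem.items():
        movie, sub = stem + movie_ext, stem + sub_ext
        if movie in names and sub in names:
            pairs[stem] = (movie, sub)
            names = names - {movie, sub}
        # Whatever remains for this base name is unpaired.
        leftovers.extend(sorted(names))
    return pairs, leftovers


mylist = ['movie1.mp4', 'movie2.srt', 'movie1.srt', 'movie3.mp4', 'movie1.mp4']
pairs, leftovers = pair_movies(mylist)
print(pairs)      # {'movie1': ('movie1.mp4', 'movie1.srt')}
print(leftovers)  # ['movie2.srt', 'movie3.mp4']
```

If unmatched subtitle files should be ignored rather than kept, filter leftovers by extension afterwards.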
