Returning the latest file in a directory for a specific format
Problem description
I have a directory with files of the format:
test_report-01-13-2014.11_53-en.zip
test_report-12-04-2013.11_53-en.zip
I need to return the latest file based on the date in the file name, not the date the file was last touched. If I went by the modification date, I could end up with the 2013 file instead, which would be wrong. I am doing the following, but it's not working. I am passing in the following parameters:
mypath = "C:\\temp\\test\\"
mypattern = "test_report-%m-%d-%Y*"
myfile = getLatestFile(mypath, mypattern)
def getLatestFile(path="./", pattern="*"):
    fformat = path + pattern
    archives = glob.glob(fformat)
    if len(archives) > 0:
        return archives[-1]
    else:
        return None
Any idea what could be the cause of the problem?
Recommended answer
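glob returns matching paths in an arbitrary order, and it doesn't understand %m-%d-%Y (it's not that smart). It only expands shell-style wildcards such as * and ?, so the strptime-style directives in your pattern are treated as literal text and match nothing.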
You need to read the list of paths, extract the file name, then get the date from the file name. This will be the key that you will use to sort the list of files.
Here is one way to do just that:
import glob
import os
import datetime

def sorter(path):
    # Key function: parse the date embedded in the file name
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

pattern = "test_report-*"
search_path = 'C:\\temp\\test\\'  # or 'c:/temp/test/'
# The directory comes first, then the wildcard pattern
file_list = glob.glob(search_path + pattern)

# Order by the date, latest first
ordered_list = sorted(file_list, key=sorter, reverse=True)
os.path.basename is a function that returns the last component of a path; since glob returns the full path here, the last component will be the file name.
As your file name has a fixed format, instead of mucking with regular expressions I just grabbed the date part by slicing the file name (characters 12 to 21, right after the 12-character "test_report-" prefix) and converted it to a datetime object.
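As a quick check of the slicing step, here is a minimal sketch that applies it to the two file names from the question. The helper name date_key is made up for this illustration; it does the same thing as sorter above. It also shows why the conversion matters: compared as plain strings, "12-04-2013" would sort after "01-13-2014".

```python
import datetime
import os

def date_key(path):
    """Parse the MM-DD-YYYY date that follows the 12-character 'test_report-' prefix."""
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

newer = date_key("test_report-01-13-2014.11_53-en.zip")
older = date_key("test_report-12-04-2013.11_53-en.zip")

# datetime objects compare chronologically, so the 2014 file wins
print(newer > older)

# The raw date strings compare character by character, which gets it wrong
print("01-13-2014" > "12-04-2013")
```

The slice only works while every file name keeps the same prefix length; a name with a different prefix would make strptime raise ValueError.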
Finally, sorted returns the result of the sort as a new list (the normal list sort method sorts in place). The key function is what extracts the date and returns it; reverse=True is required to get the returned list ordered latest first.
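To make the difference between sorted and the in-place sort method concrete, here is a small sketch on an in-memory list of the two file names from the question. The helper date_key is a hypothetical stand-in for the key function and takes bare file names rather than full paths.

```python
import datetime

def date_key(filename):
    # Same slice as before, applied to a bare file name
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

names = [
    "test_report-12-04-2013.11_53-en.zip",
    "test_report-01-13-2014.11_53-en.zip",
]

# sorted() builds a new list; with reverse=True the latest date comes first
newest_first = sorted(names, key=date_key, reverse=True)

# list.sort() reorders the list in place and returns None;
# without reverse=True the oldest date comes first
result = names.sort(key=date_key)

print(newest_first[0])
print(names[-1])
```

Both approaches end up pointing at the same latest file; which one to use mostly depends on whether the original list order still matters.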
You can shorten the code a bit by passing the result of glob.glob directly to sorted:
ordered_list = sorted(glob.glob(search_path + pattern), key=sorter, reverse=True)
To combine this with the function you have written:
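import glob, os, datetime

def sorter(path):
    # Key function: parse the date embedded in the file name
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

def getLatestFile(path="./", pattern="*"):
    fformat = path + pattern
    archives = glob.glob(fformat)
    if len(archives):
        # Latest date first, so the first element is the newest file
        return sorted(archives, key=sorter, reverse=True)[0]
    return None  # nothing matched the pattern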
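For an end-to-end check that does not depend on C:\temp\test, the sketch below creates the two sample files in a temporary directory and picks the latest one. It is a variation on the function above, not the original code: the names date_key and latest_by_name are invented for this example, the path is built with os.path.join instead of string concatenation, and max replaces the full sort because only one file is needed.

```python
import datetime
import glob
import os
import tempfile

def date_key(path):
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

def latest_by_name(directory, pattern="test_report-*"):
    """Return the match whose file name carries the most recent date, or None."""
    matches = glob.glob(os.path.join(directory, pattern))
    # default=None covers the case where nothing matches
    return max(matches, key=date_key, default=None)

with tempfile.TemporaryDirectory() as tmp:
    # Create empty files with the names from the question
    for name in ("test_report-01-13-2014.11_53-en.zip",
                 "test_report-12-04-2013.11_53-en.zip"):
        open(os.path.join(tmp, name), "w").close()

    latest = os.path.basename(latest_by_name(tmp))
    empty_result = latest_by_name(tmp, "no_such_prefix-*")

print(latest)
```

Note that the pattern passed to glob uses only the * wildcard; the date format string stays inside the key function, where strptime can interpret it.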