Returning the latest file in a directory for a specific format


Question

I have a directory with files named in this format:

test_report-01-13-2014.11_53-en.zip
test_report-12-04-2013.11_53-en.zip

I need to return the latest file based on the date in the file name, not the date the file was last modified. If I go by modification time, I could end up with the 2013 file instead, which would be wrong. I am doing the following, but it's not working. I am passing in the following parameters:

mypath = "C:\\temp\\test\\"
mypattern = "test_report-%m-%d-%Y*"
myfile = getLatestFile(mypath, mypattern)

def getLatestFile(path="./", pattern="*"):
   fformat= path + pattern
   archives = glob.glob(fformat)

   if len(archives) > 0:
       return archives[-1]
   else:
       return None

Any idea what could be causing the problem?

Recommended answer

glob returns matching paths in an arbitrary order, and it doesn't understand %m-%d-%Y (it's not that smart). It only knows shell-style wildcards such as *, ? and [...], so the % codes are treated as literal text and your pattern matches nothing.
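To see the pattern problem in isolation, here is a small sketch using fnmatch, the standard-library module that glob relies on for name matching. The variable names are only illustrative.

```python
import fnmatch

name = "test_report-01-13-2014.11_53-en.zip"

# Shell-style patterns only know *, ? and [...]; "%m", "%d" and "%Y"
# are plain literal text here, so they never match "01", "13", "2014".
matches_strptime_style = fnmatch.fnmatch(name, "test_report-%m-%d-%Y*")
matches_wildcard = fnmatch.fnmatch(name, "test_report-*")

print(matches_strptime_style)  # False
print(matches_wildcard)        # True
```

So the wildcard pattern has to be loose enough to find the files, and the date has to be parsed separately afterwards.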

You need to go through the list of paths, extract the file name from each one, then parse the date out of the file name. That date is the key you will use to sort the list of files.

Here is one way to do just that:

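import glob
import os
import datetime

def sorter(path):
    # "test_report-" is 12 characters long, so [12:22] is the MM-DD-YYYY part
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

pattern = "test_report-*"
# A raw string cannot end in a backslash, so use forward slashes
# (or the escaped form 'C:\\temp\\test\\')
search_path = 'C:/temp/test/'

# The directory comes first, then the wildcard pattern
file_list = glob.glob(search_path + pattern)

# Order by the date, latest first
ordered_list = sorted(file_list, key=sorter, reverse=True)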

os.path.basename is a function that returns the last component of a path; since glob returns the full path here, the last component is the file name.

As your file name has a fixed format, instead of mucking about with regular expressions I just grabbed the date part by slicing the file name, and converted it to a datetime object.
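If the prefix length ever varies, slicing at fixed offsets stops working. As a hedged alternative (not part of the original answer), the sketch below uses a regular expression and also parses the part after the date, assuming that "11_53" means hour and minute. The names STAMP and parse_stamp are hypothetical.

```python
import datetime
import re

# Assumed layout: <prefix>-MM-DD-YYYY.HH_MM-<lang>.zip
STAMP = re.compile(r"(\d{2}-\d{2}-\d{4})\.(\d{2}_\d{2})")

def parse_stamp(filename):
    """Return the datetime embedded in filename, or None if there is none."""
    match = STAMP.search(filename)
    if match is None:
        return None
    return datetime.datetime.strptime(
        match.group(1) + "." + match.group(2), "%m-%d-%Y.%H_%M"
    )

stamp = parse_stamp("test_report-01-13-2014.11_53-en.zip")
print(stamp)
```

Including the time also breaks ties between two reports created on the same day, which the date-only slice cannot do.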

Finally, sorted returns a new sorted list (the list.sort method sorts in place instead). The key function is what extracts the date and returns it, and reverse=True is required to get the returned list ordered latest first.
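To make the effect of the key function concrete, this sketch sorts the two sample names from the question with and without it. The C:/temp/test/ paths are just strings here; nothing is read from disk.

```python
import datetime
import os

def sorter(path):
    # Same key as in the answer: parse MM-DD-YYYY out of the file name
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

paths = [
    "C:/temp/test/test_report-12-04-2013.11_53-en.zip",
    "C:/temp/test/test_report-01-13-2014.11_53-en.zip",
]

# Plain string ordering compares the month first, so "12-04-2013" wins
by_name = sorted(paths, reverse=True)
# Ordering by the parsed date puts the 2014 file first
by_date = sorted(paths, key=sorter, reverse=True)

print(os.path.basename(by_name[0]))
print(os.path.basename(by_date[0]))
```

This is why sorting the names alphabetically is not enough for a month-first date format.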

You can shorten the code a bit by passing the result of glob.glob directly to sorted:

ordered_list = sorted(glob.glob(search_path + pattern), key=sorter, reverse=True)

To combine this with the function you have written:

import glob, os, datetime

def sorter(path):
    # Parse the MM-DD-YYYY part that follows the 12-character "test_report-" prefix
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], '%m-%d-%Y')

def getLatestFile(path="./", pattern="*"):
    fformat = path + pattern
    archives = glob.glob(fformat)

    if archives:
        # Latest date first, so the first element is the newest file
        return sorted(archives, key=sorter, reverse=True)[0]
    return None
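Since only the newest file is needed, a possible variation (a sketch, not the original answer's code) is to use max with the same key instead of sorting the whole list. The name get_latest_file and the temporary directory are assumptions made so the example can run anywhere.

```python
import datetime
import glob
import os
import tempfile

def sorter(path):
    filename = os.path.basename(path)
    return datetime.datetime.strptime(filename[12:22], "%m-%d-%Y")

def get_latest_file(path="./", pattern="*"):
    # max() with default=None handles the "no matches" case without sorting
    return max(glob.glob(os.path.join(path, pattern)), key=sorter, default=None)

# Try it against a throwaway directory holding the two sample names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("test_report-01-13-2014.11_53-en.zip",
                 "test_report-12-04-2013.11_53-en.zip"):
        open(os.path.join(tmp, name), "w").close()
    latest = os.path.basename(get_latest_file(tmp, "test_report-*"))
    missing = get_latest_file(tmp, "no_such_prefix-*")

print(latest)
print(missing)
```

os.path.join is used so the directory argument works with or without a trailing separator.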
