Find the oldest file (recursively) in a directory


Problem description

I'm writing a Python backup script and I need to find the oldest file in a directory (and its sub-directories). I also need to filter it down to *.avi files only.

The script will always be running on a Linux machine. Is there some way to do it in Python or would running some shell commands be better?

At the moment I'm running df to get the free space on a particular partition, and if there is less than 5 gigabytes free, I want to start deleting the oldest *.avi files until that condition is met.

Solution

Hm. Nadia's answer is closer to what you meant to ask; however, for finding the (single) oldest file in a tree, try this:

import os
def oldest_file_in_tree(rootfolder, extension=".avi"):
    return min(
        (os.path.join(dirname, filename)
        for dirname, dirnames, filenames in os.walk(rootfolder)
        for filename in filenames
        if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)
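
One caveat with the function above is that min() raises ValueError when no file matches. As a minimal sketch (the name oldest_file_or_none is hypothetical, and the default= argument assumes Python 3.4 or later), a variant that returns None instead could look like this:

```python
import os

def oldest_file_or_none(rootfolder, extension=".avi"):
    """Return the path of the oldest matching file, or None if nothing matches."""
    candidates = (
        os.path.join(dirname, filename)
        for dirname, _dirnames, filenames in os.walk(rootfolder)
        for filename in filenames
        if filename.endswith(extension)
    )
    # default=None avoids the ValueError that min() raises on an empty iterable.
    return min(candidates, key=lambda fn: os.stat(fn).st_mtime, default=None)
```

A caller can then test the result with "is None" before trying to delete anything.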

With a little modification, you can get the n oldest files (similar to Nadia's answer):

import os, heapq
def oldest_files_in_tree(rootfolder, count=1, extension=".avi"):
    return heapq.nsmallest(count,
        (os.path.join(dirname, filename)
        for dirname, dirnames, filenames in os.walk(rootfolder)
        for filename in filenames
        if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime)

Note that because str.endswith also accepts a tuple of suffixes, you can make calls such as:

oldest_files_in_tree("/home/user", 20, (".avi", ".mov"))

to select more than one extension.
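
To illustrate the tuple form of endswith on its own, here is a small sketch with made-up file names. Note that the match is case-sensitive, so lowering the name first is one possible way (an assumption about what you want, not part of the answer above) to also catch names like "D.AVI":

```python
# Hypothetical file names, used only for illustration.
names = ["a.avi", "b.mov", "c.txt", "D.AVI"]
extensions = (".avi", ".mov")

# str.endswith returns True if the string ends with any suffix in the tuple.
matches = [n for n in names if n.endswith(extensions)]

# Case-insensitive variant: compare against the lowercased name.
matches_ci = [n for n in names if n.lower().endswith(extensions)]
```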

Finally, should you want the complete list of files, ordered by modification time, in order to delete as many as required to free space, here's some code:

import os
def files_to_delete(rootfolder, extension=".avi"):
    return sorted(
        (os.path.join(dirname, filename)
         for dirname, dirnames, filenames in os.walk(rootfolder)
         for filename in filenames
         if filename.endswith(extension)),
        key=lambda fn: os.stat(fn).st_mtime,
        reverse=True)

Note that reverse=True puts the oldest files at the end of the list, so to get the next file to delete you just call file_list.pop().

By the way, for a complete solution to your issue, since you are running on Linux, where the os.statvfs is available, you can do:

import os
def free_space_up_to(free_bytes_required, rootfolder, extension=".avi"):
    file_list = files_to_delete(rootfolder, extension)
    while file_list:
        statv = os.statvfs(rootfolder)
        if statv.f_bfree * statv.f_bsize >= free_bytes_required:
            break
        os.remove(file_list.pop())

statvfs.f_bfree is the number of free blocks on the device and statvfs.f_bsize is the block size. We call statvfs on rootfolder, so mind any symbolic links pointing to other devices: there we could delete many files without actually freeing up space on this device.
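
One way to guard against that, sketched here under assumptions rather than taken from the answer, is to collect only regular files whose device matches that of rootfolder. The helper name files_on_same_device is hypothetical. It uses os.lstat so that symbolic links are skipped outright, since removing a link deletes only the link and not the data it points to:

```python
import os
import stat

def files_on_same_device(rootfolder, extension=".avi"):
    """Yield matching regular files stored on the same device as rootfolder."""
    root_dev = os.stat(rootfolder).st_dev
    for dirname, _dirnames, filenames in os.walk(rootfolder):
        for filename in filenames:
            if not filename.endswith(extension):
                continue
            path = os.path.join(dirname, filename)
            try:
                # lstat does not follow symlinks, so a link is never
                # reported as a regular file here.
                st = os.lstat(path)
            except OSError:
                continue  # the file disappeared while we were walking
            if stat.S_ISREG(st.st_mode) and st.st_dev == root_dev:
                yield path
```

The device check also filters out files under other filesystems mounted inside the tree, which os.walk descends into like any other directory.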

UPDATE (copying a comment by Juan):

Depending on the OS and filesystem implementation, you may want to multiply f_bfree by f_frsize rather than f_bsize. In some implementations, the latter is the preferred I/O request size. For example, on a FreeBSD 9 system I just tested, f_frsize was 4096 and f_bsize was 16384. POSIX says the block count fields are "in units of f_frsize" (see http://pubs.opengroup.org/onlinepubs/9699919799//basedefs/sys_statvfs.h.html).
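
Following that comment, a free-space helper based on f_frsize might look like the sketch below. The name free_bytes is hypothetical, and the choice of f_bavail (blocks available to unprivileged users) over f_bfree (which also counts blocks reserved for the superuser) is an assumption about what a backup script running as a normal user would want:

```python
import os

def free_bytes(path):
    """Return the bytes available to an unprivileged user on path's filesystem."""
    st = os.statvfs(path)
    # POSIX defines the block counts in units of f_frsize, not f_bsize.
    return st.f_bavail * st.f_frsize
```

With this, the 5 gigabyte condition from the question could be written as free_bytes(rootfolder) < 5 * 1024**3. The standard library's shutil.disk_usage(path).free reports a comparable figure if you prefer not to call statvfs directly.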
