Find the oldest file (recursively) in a directory
I'm writing a Python backup script and I need to find the oldest file in a directory (and its sub-directories). I also need to filter it down to *.avi files only.
The script will always be running on a Linux machine. Is there some way to do it in Python or would running some shell commands be better?
At the moment I'm running df to get the free space on a particular partition, and if there is less than 5 gigabytes free, I want to start deleting the oldest *.avi files until that condition is met.
Hm. Nadia's answer is closer to what you meant to ask; however, for finding the (single) oldest file in a tree, try this:
    import os

    def oldest_file_in_tree(rootfolder, extension=".avi"):
        # Walk the tree and return the matching path with the smallest mtime
        return min(
            (os.path.join(dirname, filename)
             for dirname, dirnames, filenames in os.walk(rootfolder)
             for filename in filenames
             if filename.endswith(extension)),
            key=lambda fn: os.stat(fn).st_mtime)
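One caveat worth illustrating: min() raises ValueError when the generator yields nothing, which happens whenever the tree contains no matching file. The variant below is only a sketch, assuming Python 3.4 or later (where min() accepts a default keyword); the name oldest_file_or_none is made up for this example and is not part of the original answer.

```python
import os

def oldest_file_or_none(rootfolder, extension=".avi"):
    """Return the path of the oldest matching file, or None if none match."""
    candidates = (
        os.path.join(dirname, filename)
        for dirname, _dirnames, filenames in os.walk(rootfolder)
        for filename in filenames
        if filename.endswith(extension)
    )
    # default=None avoids the ValueError that min() raises on an empty iterable
    return min(candidates, key=lambda fn: os.stat(fn).st_mtime, default=None)
```

A caller can then test the result against None before trying to delete anything.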
With a little modification, you can get the n oldest files (similar to Nadia's answer):
    import os, heapq

    def oldest_files_in_tree(rootfolder, count=1, extension=".avi"):
        # heapq.nsmallest returns the count paths with the smallest mtime, oldest first
        return heapq.nsmallest(count,
            (os.path.join(dirname, filename)
             for dirname, dirnames, filenames in os.walk(rootfolder)
             for filename in filenames
             if filename.endswith(extension)),
            key=lambda fn: os.stat(fn).st_mtime)
Note that because the .endswith method also accepts a tuple of suffixes, you can select more than one extension with a call such as:

    oldest_files_in_tree("/home/user", 20, (".avi", ".mov"))
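To make the tuple behaviour concrete, here is a small standalone sketch. The file names are invented for illustration. Note that str.endswith needs a tuple (a list raises TypeError) and that the comparison is case-sensitive.

```python
# Hypothetical file names, used only to show how str.endswith treats a tuple
names = ["clip.avi", "movie.mov", "notes.txt", "CLIP.AVI"]
video_extensions = (".avi", ".mov")

# A name matches if it ends with any suffix in the tuple
matches = [name for name in names if name.endswith(video_extensions)]

# endswith is case-sensitive; lowercase the name first to also catch "CLIP.AVI"
matches_any_case = [name for name in names
                    if name.lower().endswith(video_extensions)]
```

Whether upper-case extensions should match depends on how the files were created, so the lower-casing step is optional.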
Finally, should you want the complete list of files, ordered by modification time, in order to delete as many as required to free space, here's some code:
    import os

    def files_to_delete(rootfolder, extension=".avi"):
        # Newest first, so the oldest file ends up at the end of the list
        return sorted(
            (os.path.join(dirname, filename)
             for dirname, dirnames, filenames in os.walk(rootfolder)
             for filename in filenames
             if filename.endswith(extension)),
            key=lambda fn: os.stat(fn).st_mtime,
            reverse=True)
Note that reverse=True puts the oldest files at the end of the list, so to get the next file to delete you just call file_list.pop().
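The ordering can be checked on toy data. The sketch below uses invented (path, mtime) pairs in place of real os.stat results and shows that popping from a newest-first list yields the files oldest first. Popping from the end of a list is also cheap, whereas pop(0) has to shift every remaining element.

```python
# Hypothetical (path, mtime) pairs standing in for real os.stat results
files = [("b.avi", 300), ("a.avi", 100), ("c.avi", 200)]

# Sort newest first, exactly as reverse=True does with real modification times
newest_first = [path for path, mtime in
                sorted(files, key=lambda item: item[1], reverse=True)]

deletion_order = []
while newest_first:
    # pop() removes the last element, which is the oldest remaining file
    deletion_order.append(newest_first.pop())

# deletion_order is now ["a.avi", "c.avi", "b.avi"]
```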
By the way, for a complete solution to your issue: since you are running on Linux, where os.statvfs is available, you can do:
    import os

    def free_space_up_to(free_bytes_required, rootfolder, extension=".avi"):
        file_list = files_to_delete(rootfolder, extension)
        while file_list:
            # Re-check free space after every deletion
            statv = os.statvfs(rootfolder)
            if statv.f_bfree * statv.f_bsize >= free_bytes_required:
                break
            # The oldest remaining file is at the end of the list
            os.remove(file_list.pop())
statvfs.f_bfree is the number of free blocks on the device and statvfs.f_bsize is the block size. We call statvfs on rootfolder, so watch out for symbolic links pointing to other devices: we could delete many files without actually freeing any space on this device.
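One possible way to guard against that, sketched here as an assumption rather than as part of the original answer, is to compare each file's st_dev with that of rootfolder and only consider files on the same device. The helper name files_on_same_device is hypothetical.

```python
import os

def files_on_same_device(rootfolder, extension=".avi"):
    """Yield matching paths whose data lives on the same device as rootfolder."""
    root_device = os.stat(rootfolder).st_dev
    for dirname, _dirnames, filenames in os.walk(rootfolder):
        for filename in filenames:
            if not filename.endswith(extension):
                continue
            path = os.path.join(dirname, filename)
            try:
                # os.stat follows symlinks, so st_dev is the device of the target
                if os.stat(path).st_dev == root_device:
                    yield path
            except OSError:
                # Broken symlink or file removed mid-walk; skip it
                continue
```

The same check also filters out files under another filesystem mounted inside the tree, since os.walk descends into mount points.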
UPDATE (copying a comment by Juan):
Depending on the OS and filesystem implementation, you may want to multiply f_bfree by f_frsize rather than f_bsize. In some implementations, the latter is the preferred I/O request size. For example, on a FreeBSD 9 system I just tested, f_frsize was 4096 and f_bsize was 16384. POSIX says the block count fields are "in units of f_frsize" (see http://pubs.opengroup.org/onlinepubs/9699919799//basedefs/sys_statvfs.h.html).
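Following that comment, a free-space helper might look like the sketch below. The function name free_bytes is made up for this example, and using f_bavail (blocks available to unprivileged users) instead of f_bfree is an extra assumption on my part, on the basis that a backup script not running as root cannot use the reserved blocks.

```python
import os

def free_bytes(path):
    """Return bytes available to unprivileged users on the filesystem holding path."""
    stats = os.statvfs(path)
    # POSIX defines the block counts in units of f_frsize, not f_bsize.
    # f_bavail excludes blocks reserved for root; f_bfree includes them.
    return stats.f_bavail * stats.f_frsize
```

On Python 3.3 and later, shutil.disk_usage(path).free should give a comparable figure without touching statvfs directly.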