Is this the best way to get a unique version of a filename with Python?

Question

Still 'diving in' to Python, and want to make sure I'm not overlooking something. I wrote a script that extracts files from several zip files, and saves the extracted files together in one directory. To prevent duplicate filenames from being over-written, I wrote this little function - and I'm just wondering if there is a better way to do this? Thanks!

import os

def unique_filename(file_name):
    counter = 1
    file_name_parts = os.path.splitext(file_name) # returns ('/path/file', '.ext')
    while os.path.isfile(file_name):
        file_name = file_name_parts[0] + '_' + str(counter) + file_name_parts[1]
        counter += 1
    return file_name

I really do require the files to be in a single directory, and numbering duplicates is definitely acceptable in my case, so I'm not looking for a more robust method (tho' I suppose any pointers are welcome), but just to make sure that what this accomplishes is getting done the right way.

Recommended answer

One issue is that there is a race condition in your code above, since there is a gap between testing for existence and creating the file. There may be security implications to this (think about someone maliciously inserting a symlink to a sensitive file which they wouldn't be able to overwrite, but your program running with a higher privilege could). Attacks like these are why things like os.tempnam() were deprecated (and later removed in Python 3).
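
To make that gap concrete, here is a minimal sketch. The name check_only_name is hypothetical and simply restates the question's check-then-return logic. It assumes Python 3 (for tempfile.TemporaryDirectory) and simulates two callers that both ask for a name before either of them has created anything:

```python
import os
import tempfile

def check_only_name(file_name):
    """Check-then-return logic from the question: only tests, never creates."""
    counter = 1
    root, ext = os.path.splitext(file_name)
    while os.path.isfile(file_name):
        file_name = root + '_' + str(counter) + ext
        counter += 1
    return file_name

# Hypothetical scenario: two callers pick a name before either writes a file.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'foo.txt')
    first = check_only_name(target)
    second = check_only_name(target)   # nothing was created in between
    names_collide = (first == second)  # both callers were handed the same path
```

Both calls return the same path, so whichever caller writes second overwrites the first. The same window exists between a single call returning and the caller opening the file.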

To get around it, the best approach is to actually try to create the file in such a way that you get an exception if it already exists, and on success, return the opened file object. This can be done with the lower-level os.open function by passing both the os.O_CREAT and os.O_EXCL flags. Once opened, return the actual file (and optionally the filename) you created. For example, here's your code modified to use this approach (returning a (file, filename) tuple):

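import errno
import os

def unique_file(file_name):
    counter = 1
    file_name_parts = os.path.splitext(file_name) # returns ('/path/file', '.ext')
    while True:
        try:
            # O_CREAT | O_EXCL: create the file, failing if the name already exists
            fd = os.open(file_name, os.O_CREAT | os.O_EXCL | os.O_RDWR)
            return os.fdopen(fd, 'w+'), file_name
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # a real error (e.g. permissions), not a name clash
        file_name = file_name_parts[0] + '_' + str(counter) + file_name_parts[1]
        counter += 1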
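
As a side note, on Python 3.3 and later the built-in open() accepts an 'x' (exclusive creation) mode that raises FileExistsError when the path already exists, so the same idea can be written without os.open. The sketch below is only an illustration under that assumption; the function name unique_file_x and its mode parameter are hypothetical, not part of the original answer:

```python
import os
import tempfile

def unique_file_x(file_name, mode='x'):
    """Create file_name, or the first free name with _1, _2, ... appended.

    Returns a tuple of (open file object, name actually used).
    """
    root, ext = os.path.splitext(file_name)
    candidate = file_name
    counter = 1
    while True:
        try:
            # 'x' creates the file and fails if the path already exists
            return open(candidate, mode), candidate
        except FileExistsError:
            candidate = '{}_{}{}'.format(root, counter, ext)
            counter += 1

# Usage: the second request for the same name gets a numbered variant.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = os.path.join(tmp_dir, 'foo.txt')
    f1, name1 = unique_file_x(target)
    f2, name2 = unique_file_x(target)
    f1.close()
    f2.close()
    created = (os.path.basename(name1), os.path.basename(name2))
```

For files extracted from zip archives, passing mode='xb' would open the new file in binary mode instead of text mode.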

Actually, a better way, which will handle the above issues for you, is probably to use the tempfile module, though you may lose some control over the naming. Here's an example of using it (keeping a similar interface):

import os
import tempfile

def unique_file(file_name):
    dirname, filename = os.path.split(file_name)
    prefix, suffix = os.path.splitext(filename)

    # mkstemp creates the file securely and returns (fd, path)
    fd, filename = tempfile.mkstemp(suffix, prefix+"_", dirname)
    return os.fdopen(fd, 'w+'), filename

>>> f, filename = unique_file('/home/some_dir/foo.txt')
>>> print(filename)
/home/some_dir/foo_z8f_2Z.txt

The only downside with this approach is that you will always get a filename with some random characters in it, as there's no attempt to create an unmodified file (/home/some_dir/foo.txt) first. You may also want to look at tempfile.TemporaryFile and NamedTemporaryFile, which will do the above and also automatically delete from disk when closed.
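
If keeping the unmodified name whenever it is free matters, one possible compromise is to try an exclusive create on the original path first and only fall back to tempfile.mkstemp on a clash. This is a sketch of that idea rather than anything from the original answer; the name unique_file_hybrid is hypothetical:

```python
import errno
import os
import tempfile

def unique_file_hybrid(file_name):
    """Try the exact name first; fall back to a random mkstemp name on a clash."""
    try:
        fd = os.open(file_name, os.O_CREAT | os.O_EXCL | os.O_RDWR)
        return os.fdopen(fd, 'w+'), file_name
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise  # not a name clash, so let the caller see it
    dirname, basename = os.path.split(file_name)
    prefix, suffix = os.path.splitext(basename)
    # mkstemp picks a random, unused name such as foo_XXXXXXXX.txt
    fd, new_name = tempfile.mkstemp(suffix, prefix + '_', dirname or '.')
    return os.fdopen(fd, 'w+'), new_name
```

The first file written for a given name keeps that name, and only later duplicates pick up the random characters.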
