Extract file name from path, no matter what the OS/path format

Question

Which Python library can I use to extract filenames from paths, no matter what the operating system or path format could be?

For example, I'd like all of these paths to return c:

a/b/c/
a/b/c
\a\b\c
\a\b\c\
a\b\c
a/b/../../a/b/c/
a/b/../../a/b/c

Solution

Using os.path.split or os.path.basename as others suggest won't work in all cases: if you're running the script on Linux and try to process a classic Windows-style path, it will fail, because os.path on Linux treats only the forward slash as a separator.

Windows paths can use either a backslash or a forward slash as the path separator. Therefore, the ntpath module (which is what os.path resolves to when running on Windows) will work for all(1) paths on all platforms.

import ntpath
ntpath.basename("a/b/c")
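
To make the difference concrete, here is a minimal sketch that compares the two modules on the same input. It uses posixpath directly (the module that os.path resolves to on Linux and macOS) so the result is the same whichever platform runs it; the variable names are only illustrative.

```python
import ntpath
import posixpath  # the module os.path resolves to on Linux and macOS

windows_style = r"\a\b\c"

# posixpath only splits on "/", so the whole string comes back unchanged.
posix_leaf = posixpath.basename(windows_style)

# ntpath splits on both "\" and "/", so it finds the last component.
nt_leaf = ntpath.basename(windows_style)
nt_leaf_forward = ntpath.basename("a/b/c")

print(repr(posix_leaf), repr(nt_leaf), repr(nt_leaf_forward))
```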

Of course, if the path ends with a slash, the basename will be empty, so write your own function to deal with it:

def path_leaf(path):
    head, tail = ntpath.split(path)
    return tail or ntpath.basename(head)

Verification:

>>> paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c', 
...     'a/b/../../a/b/c/', 'a/b/../../a/b/c']
>>> [path_leaf(path) for path in paths]
['c', 'c', 'c', 'c', 'c', 'c', 'c']
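
As a possible alternative (not part of the answer above), the standard library's pathlib offers PureWindowsPath, which also accepts both separators and ignores a trailing slash when computing .name. The sketch below assumes that behaviour is what you want; path_leaf_pathlib is a hypothetical name chosen for illustration.

```python
from pathlib import PureWindowsPath


def path_leaf_pathlib(path):
    # PureWindowsPath does no filesystem access, so it works on any OS.
    # .name is the final path component, with trailing separators ignored.
    return PureWindowsPath(path).name


paths = ['a/b/c/', 'a/b/c', '\\a\\b\\c', '\\a\\b\\c\\', 'a\\b\\c',
         'a/b/../../a/b/c/', 'a/b/../../a/b/c']
leaves = [path_leaf_pathlib(p) for p in paths]
print(leaves)
```

The same caveat about backslashes in Linux filenames applies here, since this also interprets every path as a Windows path.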


(1) There's one caveat: Linux filenames may contain backslashes. So on Linux, r'a/b\c' always refers to the file b\c in the a folder, while on Windows it always refers to the c file in the b subfolder of the a folder. So when both forward slashes and backslashes are used in a path, you need to know the associated platform to interpret it correctly. In practice it's usually safe to assume it's a Windows path, since backslashes are seldom used in Linux filenames, but keep this in mind when you code so you don't create accidental security holes.
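
The ambiguity in the caveat can be shown directly. The following is a rough sketch, assuming the caller knows which platform a path came from; path_leaf_for and its windows parameter are hypothetical names, not part of any library.

```python
import ntpath
import posixpath


def path_leaf_for(path, windows=True):
    # Pick the path module that matches the platform the path came from.
    mod = ntpath if windows else posixpath
    head, tail = mod.split(path)
    # tail is empty when the path ends with a separator, so fall back to head.
    return tail or mod.basename(head)


mixed = r'a/b\c'
as_windows = path_leaf_for(mixed, windows=True)   # "\" is a separator
as_linux = path_leaf_for(mixed, windows=False)    # "\" is part of the name
print(repr(as_windows), repr(as_linux))
```

If the paths come from untrusted input, making the platform explicit like this is one way to avoid silently treating a Linux filename such as b\c as a subdirectory.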
