Parsing a mix of backslashes and forward slashes in a filename
Question
I am getting a filename from an API in a format that mixes / and \.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
When I try to parse the directory structure, a \ followed by a character is converted into a single character.
Is there a way to get each component correctly?
What I have tried:
os.path.normpath didn't help.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
os.path.normpath(infilename)
out:
'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'
Recommended answer
It is barely visible in your example (only \123 is affected: it is an octal escape and becomes S), but writing this:
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
isn't a good idea, because some lowercase (and a few uppercase) letters are interpreted as escape sequences when they follow a backslash. Notorious examples are \t and \b, and there are others. For instance:
infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
fails twice over, because two of the sequences are interpreted as a tab and a backspace.
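To make the effect concrete, here is a small sketch that compares the same literal written without and with the raw prefix. The variable names plain and raw are only illustrative.

```python
# The same text written as a normal literal and as a raw literal.
plain = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
raw = r'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'

# repr() shows what is really stored in each string.
print(repr(plain))
print(repr(raw))

# In the normal literal, \t became a tab, \b a backspace,
# and \123 (an octal escape) the single character 'S'.
print('\t' in plain, '\b' in plain, plain.endswith('Sxyz.csv'))

# No backslash survives in the normal literal at all.
print('\\' in plain, '\\' in raw)
```

The raw literal keeps all three backslashes, while the normal literal has lost them before any path function gets to see the string.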
When dealing with literal Windows-style paths (or regexes), you have to use the raw prefix and, better still, normalize your path to get rid of the forward slashes.
infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
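Note that os.path.normpath only turns forward slashes into backslashes when the code runs on Windows. As a sketch of a platform-independent variant, the standard-library modules ntpath (the Windows flavour of os.path) and pathlib.PureWindowsPath apply Windows rules everywhere and also give you the individual components. Using them here is a suggestion on top of the answer, not part of it.

```python
import ntpath
from pathlib import PureWindowsPath

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'

# ntpath applies Windows path rules on any operating system.
normalized = ntpath.normpath(infilename)
print(normalized)

# PureWindowsPath treats both / and \ as separators and splits the path.
parts = PureWindowsPath(infilename).parts
print(parts)

# The last component is the file name.
print(PureWindowsPath(infilename).name)
```

If the string really arrives from an API at runtime rather than from a literal in your source code, its backslashes are ordinary characters and it can be passed to these functions directly.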
However, the raw prefix only applies to literals. If the returned string shows up as 'the\terrible\\dir' when you print repr(string), then tab characters have already been put in the string, and there is nothing you can do except some lousy post-processing.
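As an illustration of what such post-processing might look like, the sketch below maps control characters back to the two-character sequences that most likely produced them. The helper restore_backslashes and its mapping table are hypothetical names invented for this example, and the repair is a best-effort guess: it assumes a path never legitimately contains a control character, and it cannot undo octal or hex escapes (for example, \123 has already become an ordinary S).

```python
# Hypothetical table: control character -> the escape text it likely came from.
CONTROL_TO_ESCAPE = {
    '\a': r'\a',
    '\b': r'\b',
    '\f': r'\f',
    '\n': r'\n',
    '\r': r'\r',
    '\t': r'\t',
    '\v': r'\v',
}


def restore_backslashes(damaged):
    """Best-effort repair: put back a backslash plus letter for each control char."""
    return ''.join(CONTROL_TO_ESCAPE.get(ch, ch) for ch in damaged)


damaged = 'the\terrible\\dir'   # contains a real tab character
repaired = restore_backslashes(damaged)
print(repr(damaged))
print(repr(repaired))
```

Fixing the source of the string, so that the backslashes are never interpreted in the first place, remains the more reliable option.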