Pathlib 'normalizes' UNC paths with "$"

Problem description

On Python 3.8, I'm trying to use pathlib to join a string onto a UNC path that points at a remote computer's C drive.
It's weirdly inconsistent.
For example:

>>> from pathlib import Path
>>> remote = Path("\\\\remote\\", "C$\\Some\\Path")
>>> remote
WindowsPath('//remote//C$/Some/Path')

>>> remote2 = Path(remote, "More")
>>> remote2
WindowsPath('/remote/C$/Some/Path/More')

Notice how the initial // has turned into /?
Put the initial path in a single string, though, and everything is fine:

>>> remote = Path("\\\\remote\\C$\\Some\\Path")
>>> remote
WindowsPath('//remote/C$/Some/Path')

>>> remote2 = Path(remote, "more")
>>> remote2
WindowsPath('//remote/C$/Some/Path/more')

This works as a workaround, but I suspect I'm misunderstanding how it's supposed to work or doing it wrong.
Anyone got a clue what's happening?

Solution

tldr: you should give the entire UNC share (\\\\host\\share) as a single unit. pathlib has special-case handling of UNC paths, but it needs exactly this prefix in order to recognize a path as UNC. You can't use pathlib's facilities to manage host and share separately; it makes pathlib blow a gasket.
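
A minimal sketch of that advice, assuming you start with the host and share as separate strings: glue the prefix together as plain text and only then hand it to pathlib. The helper name unc_path is hypothetical, not part of pathlib, and PureWindowsPath is used so the snippet also runs on non-Windows machines.

```python
from pathlib import PureWindowsPath


def unc_path(host, share, *segments):
    """Build a UNC path by giving pathlib the host/share prefix as one string.

    Hypothetical helper for illustration; not a pathlib API.
    """
    # Strip stray separators so inputs like "remote\\" or "/C$" are accepted.
    host = host.strip("\\/")
    share = share.strip("\\/")
    if not host or not share:
        raise ValueError("both host and share are required")
    # The prefix is assembled as text first, so pathlib only ever sees the
    # complete host/share anchor and never a lone host with a trailing slash.
    return PureWindowsPath("\\\\" + host + "\\" + share + "\\", *segments)


remote = unc_path("remote\\", "C$", "Some", "Path")
print(remote)        # the full UNC path
print(remote.drive)  # the drive should be the whole host/share prefix
```

Because the share is already recognised as the drive, passing the result back into the constructor (as in the question) should keep the leading double separator.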

The Path constructor normalises (deduplicates) path separators:

>>> from pathlib import PurePosixPath as PPP, PureWindowsPath as PWP
>>> PPP('///foo//bar////qux')
PurePosixPath('/foo/bar/qux')
>>> PWP('///foo//bar////qux')
PureWindowsPath('/foo/bar/qux')

PureWindowsPath has a special case for paths it recognises as UNC, that is, paths of the form //host/share..., where it avoids collapsing the leading separators.
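
To see what "recognised as UNC" means in practice, here is a small sketch that inspects how pathlib splits a well-formed UNC path. The path itself is made up; the attributes used (drive, root, anchor, parts) are the standard PurePath ones.

```python
from pathlib import PureWindowsPath

p = PureWindowsPath("//host/share/dir/file.txt")

# For a UNC path, host and share together form the "drive".
print(repr(p.drive))   # '\\\\host\\share'
print(repr(p.root))    # '\\'
print(repr(p.anchor))  # '\\\\host\\share\\'
print(p.parts)         # ('\\\\host\\share\\', 'dir', 'file.txt')
```

The host never shows up as a component of its own, which is why trying to manage it separately from the share goes wrong.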

However, your initial concatenation puts it in a weird funk, because it creates a path of the form //host//share... When that path is passed to the constructor again it gets converted back to a string, at which point it no longer matches a UNC path and all the separators get collapsed:

>>> PWP("\\\\remote\\", "C$\\Some\\Path")
PureWindowsPath('//remote//C$/Some/Path')
>>> str(PWP("\\\\remote\\", "C$\\Some\\Path"))
'\\\\remote\\\\C$\\Some\\Path'
>>> PWP(str(PWP("\\\\remote\\", "C$\\Some\\Path")))
PureWindowsPath('/remote/C$/Some/Path')
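
One way to catch this kind of collapse early is to check whether pathlib still sees a host/share drive. This is only a rough sketch: is_unc is a hypothetical helper, and it deliberately treats extended-length and device prefixes (\\?\ and \\.\) as "not UNC".

```python
from pathlib import PureWindowsPath


def is_unc(path):
    """Rough check: does pathlib parse a host/share drive out of this path?"""
    drive = PureWindowsPath(path).drive
    if not drive.startswith("\\\\"):
        return False
    host, _, share = drive[2:].partition("\\")
    # "?" and "." mark extended-length and device paths, not UNC hosts.
    return bool(host) and bool(share) and host not in ("?", ".")


print(is_unc("//remote/C$/Some/Path"))  # True
print(is_unc("/remote/C$/Some/Path"))   # False: leading separators collapsed
print(is_unc("C:/Users"))               # False: ordinary drive letter
```

A check like this can be asserted right after building a path, so a lost leading separator fails loudly instead of silently pointing at a local directory.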

The issue seems to be specifically the presence of a trailing separator on a UNC-looking path. I don't know if it's a bug or if it's matching some other UNC-style (but not UNC) special case:

>>> PWP("//remote")
PureWindowsPath('/remote')
>>> PWP("//remote/")
PureWindowsPath('//remote//') # this one is weird, the trailing separator gets doubled which breaks everything
>>> PWP("//remote/foo")
PureWindowsPath('//remote/foo/')
>>> PWP("//remote//foo")
PureWindowsPath('/remote/foo')

These behaviours don't really seem to be documented. The pathlib docs specifically note that it collapses path separators, and have a few UNC examples showing that it doesn't, but I don't really know what's supposed to happen exactly. Either way, it only seems to handle UNC paths somewhat properly if the first two segments are kept as a single "drive" unit; the fact that the share path is treated as a drive is specifically documented.

Of note: using joinpath or the / operator doesn't seem to trigger a re-normalisation. Your path remains improper (because the separator between host and share stays doubled), but it doesn't get completely collapsed.
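
For comparison, here is a short sketch of joining onto a UNC path that was given as a single string in the first place. It only covers the well-formed case; the path is the one from the question.

```python
from pathlib import PureWindowsPath

base = PureWindowsPath("//remote/C$/Some/Path")

# Both spellings of "append a segment" keep the host/share drive intact.
with_operator = base / "More"
with_joinpath = base.joinpath("More")

print(with_operator)                      # \\remote\C$\Some\Path\More
print(with_operator == with_joinpath)     # True
print(with_operator.drive == base.drive)  # True
```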
