What is the most correct regular expression for a UNIX file path?

Question

What is the most correct regular expression (regex) for a UNIX file path?

For example, to detect something like this:

/usr/lib/libgccpp.so.1.0.2

It's pretty easy to write a regular expression that matches most file paths, but what is the best one, including one that handles escaped whitespace sequences and unusual characters you don't usually find in UNIX file paths?

Also, are there library functions in several different programming languages that provide a file path regex?

Recommended answer

If you don't mind false positives when identifying paths, then you really just need to ensure the path doesn't contain a NUL character; everything else is permitted (in particular, / is the name-separator character). A better approach is to resolve the given path using the appropriate file I/O functions (e.g. File.exists() and File.getCanonicalFile() in Java).
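As a rough illustration of the short answer, here is a minimal Java sketch. It assumes the permissive rule can be written as the regex [^\x00]+ (one or more characters, none of them NUL), and it uses File.exists() and File.getCanonicalFile() for the resolution step. The class and method names (UnixPathCheck, looksLikeUnixPath, resolveIfExists) are made up for this example and do not come from any library.

```java
import java.io.File;
import java.io.IOException;
import java.util.regex.Pattern;

class UnixPathCheck {
    // Permissive rule: one or more characters, none of them NUL.
    // This accepts many strings that are not real paths (false positives).
    private static final Pattern NO_NUL = Pattern.compile("[^\\x00]+");

    // Hypothetical helper: true if the whole string passes the permissive rule.
    static boolean looksLikeUnixPath(String candidate) {
        return NO_NUL.matcher(candidate).matches();
    }

    // Hypothetical helper: ask the file system instead of guessing.
    // Returns the canonical file if the path exists, otherwise null.
    static File resolveIfExists(String candidate) throws IOException {
        if (!looksLikeUnixPath(candidate)) {
            return null;
        }
        File file = new File(candidate);
        return file.exists() ? file.getCanonicalFile() : null;
    }

    public static void main(String[] args) throws IOException {
        String path = "/usr/lib/libgccpp.so.1.0.2";
        System.out.println(looksLikeUnixPath(path));
        // The result depends on whether the file exists on this machine.
        System.out.println(resolveIfExists(path));
    }
}
```

The regex only rules out strings that can never be a path; whether a path actually refers to a file is something only the file system can tell you, which is why the second helper is usually the more useful one.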

Long answer:

This is both operating system and file system dependent. For example, the Wikipedia comparison of file systems notes that, besides the limits imposed by the file system:

MS-DOS, Microsoft Windows, and OS/2 disallow the characters \ / : ? * " > < | and NUL in file and directory names across all filesystems. Unices and Linux disallow the characters / and NUL in file and directory names across all filesystems.

In Windows, the following reserved device names are also not permitted as filenames:

CON, PRN, AUX, NUL, COM1, COM2, COM3, COM4, COM5,
COM6, COM7, COM8, COM9, LPT1, LPT2, LPT3, LPT4, 
LPT5, LPT6, LPT7, LPT8, LPT9
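
The per-platform rules above apply to a single file or directory name, not to a whole path. A minimal Java sketch of them follows. The class and method names (FileNameRules, isValidUnixName, isValidWindowsName) are hypothetical, the Windows character set includes the backslash because it is the Windows path separator, and comparing reserved names case-insensitively is an assumption of this sketch. It does not cover any further limits a particular file system may impose.

```java
import java.util.Locale;
import java.util.Set;

class FileNameRules {
    // Reserved Windows device names, taken from the list above.
    private static final Set<String> WINDOWS_RESERVED = Set.of(
            "CON", "PRN", "AUX", "NUL",
            "COM1", "COM2", "COM3", "COM4", "COM5",
            "COM6", "COM7", "COM8", "COM9",
            "LPT1", "LPT2", "LPT3", "LPT4", "LPT5",
            "LPT6", "LPT7", "LPT8", "LPT9");

    // Characters disallowed in Windows names: \ / : ? * " > < | and NUL.
    private static final String WINDOWS_FORBIDDEN = "\\/:?*\"><|\0";

    // UNIX/Linux: a name may contain anything except '/' and NUL.
    static boolean isValidUnixName(String name) {
        return !name.isEmpty()
                && name.indexOf('/') < 0
                && name.indexOf('\0') < 0;
    }

    // Windows: no forbidden character and not a reserved device name.
    // Assumption: reserved names are matched regardless of case.
    static boolean isValidWindowsName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            if (WINDOWS_FORBIDDEN.indexOf(name.charAt(i)) >= 0) {
                return false;
            }
        }
        return !WINDOWS_RESERVED.contains(name.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // A colon is fine in a UNIX name but not in a Windows name.
        System.out.println(isValidUnixName("a:b"));
        System.out.println(isValidWindowsName("a:b"));
    }
}
```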
