PHP Regex for matching a UNC path


Question

I'm after a bit of regex to be used in PHP to validate a UNC path passed through a form. It should be of the format:

\\server\something

... and allow for further sub-folders. It might be good to strip off a trailing slash for consistency although I can easily do this with substr if need be.

I've read online that matching a single backslash in PHP requires 4 backslashes (when using a "C-like string"), and I think I understand why: PHP string escaping halves them (4 become 2), then regex engine escaping halves them again (the remaining 2 become 1). I've seen the following two quoted as equivalent regexes for matching a single backslash:

$regex = "/\\\\/s";

or apparently this also:

$regex = "/[\\]/s";

However, these produce different results, although that is slightly aside from my final aim of matching a complete UNC path.

To see if I could match two backslashes I used the following to test:

$path = "\\\\server";
echo "the path is: $path <br />"; // which is \\server
$regex = "/\\\\\\\\/s";
if (preg_match($regex, $path)) 
{
    echo "matched";
}
else
{
    echo "not matched";
}

The above however seems to match on two or more backslashes :( The pattern is 8 slashes, translating to 2, so why would an input of 3 backslashes ($path = "\\\\\\server") match?

I thought perhaps the following would work:

$regex = "/[\\][\\]/s";

and again, no :(

Please help before I jump out a window lol :)

Solution

Use this little gem:

$UNC_regex = '=^\\\\\\\\[a-zA-Z0-9-]+(\\\\[a-zA-Z0-9`~!@#$%^&(){}\'._-]+([ ]+[a-zA-Z0-9`~!@#$%^&(){}\'._-]+)*)+$=s';

Source: http://regexlib.com/REDetails.aspx?regexp_id=2285 (adapted to PHP string escaping)

The RegEx shown above matches a valid hostname (which allows only a few characters) followed by at least one path segment (which allows many, but not all, characters).
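As a rough sketch of how the pattern could be used for form validation, the snippet below strips trailing backslashes with rtrim before matching, which covers the consistency concern from the question. The helper name normalize_unc_path is hypothetical and only for illustration; it is not part of the original answer.

```php
<?php
// Pattern copied from the answer above; '=' is used as the delimiter.
$UNC_regex = '=^\\\\\\\\[a-zA-Z0-9-]+(\\\\[a-zA-Z0-9`~!@#$%^&(){}\'._-]+([ ]+[a-zA-Z0-9`~!@#$%^&(){}\'._-]+)*)+$=s';

// Hypothetical helper: returns the path without trailing backslashes,
// or null when the input is not a valid UNC path.
function normalize_unc_path(string $path, string $regex): ?string
{
    $trimmed = rtrim($path, '\\'); // '\\' is a single backslash
    return preg_match($regex, $trimmed) === 1 ? $trimmed : null;
}

// \\server\share\sub folder\  -> accepted, trailing backslash removed
var_dump(normalize_unc_path('\\\\server\\share\\sub folder\\', $UNC_regex));
// \server\share               -> rejected, only one leading backslash
var_dump(normalize_unc_path('\\server\\share', $UNC_regex));
// \\server                    -> rejected, no share after the hostname
var_dump(normalize_unc_path('\\\\server', $UNC_regex));
```

Note that the pattern requires at least one path segment after the hostname, so a bare \\server is treated as invalid.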


Sidenote on the backslashes issue:

  • When you use double quotes (") to enclose your string, you must be aware of PHP special character escaping: "\\" is a single \ in PHP.

  • Important: even with single quotes (') those backslashes must be escaped.
    A PHP string with single quotes takes everything in the string literally (unescaped) with a few exceptions:

    1. A backslash followed by a backslash (\\) is interpreted as a single backslash.
      ('C:\\*.*' => C:\*.*)
    2. A backslash followed by a single-quote (\') is interpreted as a single quote.
      ('I\'ll be back' => I'll be back)
    3. A backslash followed by anything else is interpreted as a backslash.
      ('Just a \ somewhere' => Just a \ somewhere)

  • Also, you must be aware of PCRE escape sequences.
    The RegEx parser also uses \ as its escape character (for example to introduce character classes such as \d), so a literal backslash has to be escaped a second time for the RegEx.
    To match two \\ you must write $regex = "\\\\\\\\" or $regex = '\\\\\\\\'

    From the PHP docs on PCRE escape sequences:

    Single and double quoted PHP strings have special meaning of backslash. Thus if \ has to be matched with a regular expression \\, then "\\\\" or '\\\\' must be used in PHP code.
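The two escaping layers described above can be checked directly. This is a minimal sketch with illustrative variable names: the PHP literal first collapses eight typed backslashes into four characters, and PCRE then reads those four as two literal backslashes.

```php
<?php
// Eight backslashes in source become four characters in the PHP string,
// whether single or double quotes are used.
$double = "\\\\\\\\";
$single = '\\\\\\\\';
var_dump(strlen($double));      // int(4)
var_dump($double === $single);  // bool(true)

// PCRE reads the four characters as two escaped (literal) backslashes.
$regex = '/' . $single . '/';
var_dump(preg_match($regex, '\\\\server')); // int(1): subject is \\server
var_dump(preg_match($regex, '\\server'));   // int(0): subject is \server
```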


Regarding your Question:

why would an input of 3 backslashes ($path = "\\\\\\server") match with regex "/\\\\\\\\/s"?

The reason is that you have no boundaries defined (use ^ for beginning and $ for end of string), thus it finds \\ "somewhere" resulting in a positive match. To get the expected result, you should do something like this:

$regex = '/^\\\\\\\\[^\\\\]/s';

The RegEx above has 2 modifications:

  • ^ at the beginning to only match two \\ at the beginning of the string
  • [^\\] negated character class to say: the two backslashes must be followed by a character that is not another backslash
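To see the effect of the two modifications side by side, here is a small comparison of the unanchored pattern from the question and the anchored one above. The variable names are illustrative only.

```php
<?php
$loose  = '/\\\\\\\\/s';          // two backslashes anywhere in the subject
$strict = '/^\\\\\\\\[^\\\\]/s';  // exactly two backslashes at the start

$two   = '\\\\server';    // subject is \\server
$three = '\\\\\\server';  // subject is \\\server

var_dump(preg_match($loose, $two));    // int(1)
var_dump(preg_match($loose, $three));  // int(1): finds \\ inside \\\
var_dump(preg_match($strict, $two));   // int(1)
var_dump(preg_match($strict, $three)); // int(0): third backslash rejected
```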

Regarding your last RegEx:

$regex = "/[\\][\\]/s";

The backslash escaping is the problem here (see the sidenote above). PHP turns "/[\\][\\]/s" into /[\][\]/s before the RegEx engine sees it. There, each \] is an escaped literal ], so the character class is never closed and the pattern fails to compile. To put a literal backslash inside a character class, it must itself be escaped for the RegEx engine.

This variant of your RegEx would work, but it also matches any occurrence of two backslashes, for the same reason I already explained above:

$regex = '/[\\\\][\\\\]/s';
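The difference between the broken and the working character-class patterns can be observed as follows. This is a sketch: the @ operator is used here only to silence the compile warning so the return value is easy to inspect, and the final anchored pattern is an illustrative addition that combines the class form with the boundary fix from earlier.

```php
<?php
// PCRE receives /[\][\]/s: each \] is a literal ], so the class never closes.
$broken = "/[\\][\\]/s";
$result = @preg_match($broken, '\\\\server'); // @ hides the compile warning
var_dump($result); // bool(false): the pattern did not compile

// PCRE receives /[\\][\\]/s: two classes, each holding one backslash.
$fixed = '/[\\\\][\\\\]/s';
var_dump(preg_match($fixed, '\\\\server'));   // int(1)
var_dump(preg_match($fixed, '\\\\\\server')); // int(1): still unanchored

// Anchored variant (illustrative): exactly two backslashes at the start.
var_dump(preg_match('/^[\\\\]{2}[^\\\\]/', '\\\\\\server')); // int(0)
```

preg_match returns false (not 0) on a compile error, so comparing the result with === 1 is the safer check in validation code.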
