Regexp for subdomain

Question

Does anyone know how to write a regexp that only allows a-zA-Z0-9.- (letters, digits, dots, and dashes) but never starts or ends with a dot or a dash?

I tried this:

/^[^.-][a-zA-Z0-9.-]+[^.-]$/

... but if I write something like "john@", it matches, and I don't want that because @ is not allowed.

Recommended answer

Subdomain

According to the pertinent internet recommendations (RFC3986 section 3.2.2, which in turn refers to RFC1034 section 3.5 and RFC1123 section 2.1), a subdomain (which is one part of a DNS domain host name) must meet several requirements:

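  • Each subdomain part must be no longer than 63 characters.
  • Each subdomain part must begin and end with an alphanumeric character (i.e. a letter [A-Za-z] or a digit [0-9]).
  • Each subdomain part may contain hyphens (dashes), but may not begin or end with a hyphen.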

Here is an expression fragment for a subdomain part which meets these requirements:

[A-Za-z0-9](?:[A-Za-z0-9\-]{0,61}[A-Za-z0-9])?

Note that this expression fragment should not be used alone. It needs boundary conditions supplied by a larger context, as demonstrated in the expression for a DNS host name further down.
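To illustrate why the boundary conditions matter, here is a minimal Python sketch (Python is used only for illustration; the names `LABEL`, `is_valid_label` and `QUESTION_RE` are made up for this example). It anchors the fragment with `re.fullmatch()` and also shows why the pattern from the question accepts "john@": the negated class `[^.-]` matches any character that is not a dot or a dash, and that includes `@`.

```python
import re

# The subdomain fragment from the answer, copied unchanged.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9\-]{0,61}[A-Za-z0-9])?"
LABEL_RE = re.compile(LABEL)


def is_valid_label(text: str) -> bool:
    """Return True if the whole string is one valid subdomain part."""
    # fullmatch() supplies the boundary conditions: the pattern has to
    # cover the entire string, not just some piece of it.
    return LABEL_RE.fullmatch(text) is not None


# The pattern from the question, for comparison.
QUESTION_RE = re.compile(r"^[^.-][a-zA-Z0-9.-]+[^.-]$")

if __name__ == "__main__":
    # Unanchored, the fragment finds "john" inside "john@".
    print(LABEL_RE.search("john@") is not None)    # True
    # Anchored, the same input is rejected.
    print(is_valid_label("john@"))                 # False
    # The question's pattern accepts it because [^.-] matches "@".
    print(QUESTION_RE.match("john@") is not None)  # True
```

Used this way, `is_valid_label` checks a single dot-free part only. A full host name still needs the larger expression described next.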

A named host (not an IP address) must meet additional requirements:

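  • A host name may consist of multiple subdomain parts, each separated by a single dot.
  • The length of the whole host name should not exceed 255 characters.
  • The top-level domain (the rightmost part of the DNS host name) must be one of the internationally recognized values. The list of valid top-level domains is maintained by IANA.ORG. (See the bare-bones current list here: http://data.iana.org/TLD/tlds-alpha-by-domain.txt).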

With this in mind, here is a commented regex (in PHP syntax) which will pseudo-validate a DNS host name. (Note that it incorporates a modified version of the above expression for a subdomain and adds comments to it as well.)

Update 2016-08-20: Since this answer was originally posted back in 2011, the number of top-level domains has exploded. As of August 2016 there are more than 1400. The original regex in this answer listed all of them, but this is no longer practical. The new regex below uses a different expression for the top-level domain. The algorithm comes from: Top Level Domain Name Specification draft-liman-tld-names-06.

$DNS_named_host = '%(?#!php/i DNS_named_host Rev:20160820_0800)
    # Match DNS named host domain having one or more subdomains.
    # See: http://stackoverflow.com/a/7933253/433790
    ^                     # Anchor to start of string.
    (?!.{256})            # Whole domain must be 255 or less.
    (?:                   # One or more sub-domains.
      [a-z0-9]            # Subdomain begins with alpha-num.
      (?:                 # Optionally more than one char.
        [a-z0-9-]{0,61}   # Middle part may have dashes.
        [a-z0-9]          # Subdomain ends with alpha-num.
      )?                  # Subdomain length from 1 to 63.
      \.                  # Required dot separates subdomains.
    )+                    # End one or more sub-domains.
    (?:                   # Top level domain (length from 1 to 63).
      [a-z]{1,63}         # Either traditional-tld-label = 1*63(ALPHA).
    | xn--[a-z0-9]{1,59}  # Or an idn-label = Restricted-A-Label.
    )                     # End top level domain.
    $                     # Anchor to end of string.
    %xi';  // End $DNS_named_host.

Note that this expression is not perfect. It requires one or more subdomains, but technically, a host can consist of a TLD having no subdomain (but this is rare).
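For readers working outside PHP, the following is a rough Python port of the expression above, offered as a sketch rather than a definitive implementation. The names `DNS_NAMED_HOST` and `is_dns_named_host` are hypothetical. It assumes that `re.VERBOSE` and `re.IGNORECASE` stand in for the PHP `x` and `i` modifiers and that `fullmatch()` replaces the explicit `^` and `$` anchors.

```python
import re

# Python port of the PHP pattern; the comments mirror the original.
DNS_NAMED_HOST = re.compile(
    r"""
    (?!.{256})            # Whole domain must be 255 or less.
    (?:                   # One or more sub-domains.
      [a-z0-9]            # Subdomain begins with alpha-num.
      (?:                 # Optionally more than one char.
        [a-z0-9-]{0,61}   # Middle part may have dashes.
        [a-z0-9]          # Subdomain ends with alpha-num.
      )?                  # Subdomain length from 1 to 63.
      \.                  # Required dot separates subdomains.
    )+                    # End one or more sub-domains.
    (?:                   # Top level domain (length from 1 to 63).
      [a-z]{1,63}         # Either traditional-tld-label = 1*63(ALPHA).
    | xn--[a-z0-9]{1,59}  # Or an idn-label = Restricted-A-Label.
    )                     # End top level domain.
    """,
    re.VERBOSE | re.IGNORECASE,
)


def is_dns_named_host(host: str) -> bool:
    """Return True if the whole string pseudo-validates as a DNS named host."""
    # fullmatch() plays the role of the ^ and $ anchors in the PHP version.
    return DNS_NAMED_HOST.fullmatch(host) is not None


if __name__ == "__main__":
    for candidate in ("www.example.com", "example.xn--p1ai",
                      "com", "-bad.example.com", "john@example.com"):
        print(candidate, is_dns_named_host(candidate))
```

As the note above says, a bare TLD such as "com" is rejected because at least one subdomain is required. Like the original, this sketch only checks the shape of the top-level domain and does not look it up in the IANA list.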

Update 2014-08-12: Added simplified expression for subdomain which does not require alternation.

Update 2016-08-20: Modified DNS host name regex to (more generally) match the new vast number of valid top level domains. Also, trimmed out unnecessary material from answer.
