URL 中允许的字符 [英] Characters allowed in a URL

查看:29
本文介绍了URL 中允许的字符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

有谁知道可以在 GET 中使用而无需编码的完整字符列表?目前我正在使用 A-Z a-z 和 0-9...但我正在寻找完整列表.

Does anyone know the full list of characters that can be used within a GET without being encoded? At the moment I am using A-Z a-z and 0-9... but I am looking to find out the full list.

我也很想知道是否有针对即将添加的中文、阿拉伯文网址发布的规范(显然这将对我的问题产生重大影响)

I am also interested into if there is a specification released for the up coming addition of Chinese, Arabic url's (as obviously that will have a big impact on my question)

推荐答案

正如@Jukka K. Korpela 正确指出的那样,RFC 1738 已由 RFC 3986.这对主机有效的字符进行了扩展和澄清,不幸的是它不容易复制和粘贴,但我会尽力.

As @Jukka K. Korpela correctly points out, RFC 1738 was updated by RFC 3986. This has expanded and clarified the characters valid for host, unfortunately it's not easily copied and pasted, but I'll do my best.

按第一个匹配顺序:

host        = IP-literal / IPv4address / reg-name

IP-literal  = "[" ( IPv6address / IPvFuture  ) "]"

IPvFuture   = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )

IPv6address =         6( h16 ":" ) ls32
                  /                       "::" 5( h16 ":" ) ls32
                  / [               h16 ] "::" 4( h16 ":" ) ls32
                  / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                  / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                  / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                  / [ *4( h16 ":" ) h16 ] "::"              ls32
                  / [ *5( h16 ":" ) h16 ] "::"              h16
                  / [ *6( h16 ":" ) h16 ] "::"

ls32        = ( h16 ":" h16 ) / IPv4address
                  ; least-significant 32 bits of address

h16         = 1*4HEXDIG 
               ; 16 bits of address represented in hexadecimal

IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet

dec-octet   = DIGIT                 ; 0-9
              / %x31-39 DIGIT         ; 10-99
              / "1" 2DIGIT            ; 100-199
              / "2" %x30-34 DIGIT     ; 200-249
              / "25" %x30-35          ; 250-255

reg-name    = *( unreserved / pct-encoded / sub-delims )

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"     <---This seems like a practical shortcut, most closely resembling original answer

reserved    = gen-delims / sub-delims

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

pct-encoded = "%" HEXDIG HEXDIG

来自 RFC 1738 规范的原始答案:

Original answer from RFC 1738 specification:

因此,只有字母数字、特殊字符$-_.+!*'()、"和可以使用用于其保留目的的保留字符在 URL 中未编码.

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

^ 自 1998 年起过时.

^ obsolete since 1998.

这篇关于URL 中允许的字符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆