What characters are allowed in HTTP header values?


Question

After studying the HTTP/1.1 standard (RFC 2616), specifically page 31 and the related sections, I came to the conclusion that any 8-bit octet can be present in an HTTP header value, i.e. any character with a code in the [0, 255] range.

And yet the HTTP servers I tried refuse to accept anything with a code > 127 (or most US-ASCII non-printable characters).

Here is a condensed excerpt of the grammar used in the standard:

message-header = field-name ":" [ field-value ]
field-name     = token
field-value    = *( field-content | LWS )
field-content  = <the OCTETs making up the field-value and consisting of
                  either *TEXT or combinations of token, separators, and
                  quoted-string>

CR             = <US-ASCII CR, carriage return (13)>
LF             = <US-ASCII LF, linefeed (10)>
SP             = <US-ASCII SP, space (32)>
HT             = <US-ASCII HT, horizontal-tab (9)>
CRLF           = CR LF
LWS            = [CRLF] 1*( SP | HT )
OCTET          = <any 8-bit sequence of data>
CHAR           = <any US-ASCII character (octets 0 - 127)>
CTL            = <any US-ASCII control character (octets 0 - 31) and DEL (127)>
TEXT           = <any OCTET except CTLs, but including LWS>

token          = 1*<any CHAR except CTLs or separators>
separators     = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | "\"
               | <"> | "/" | "[" | "]" | "?" | "=" | "{" | "}" | SP | HT

quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
qdtext         = <any TEXT except <">>
quoted-pair    = "\" CHAR

As you can see, field-content can be a quoted-string, which is a quoted sequence of TEXT (i.e. any 8-bit octet except " and values from the [0-8, 11-12, 14-31, 127] range) or quoted-pair (\ followed by any value from the [0, 127] range). In other words, any 8-bit character sequence can be passed by quoting it and prefixing special symbols with \.
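To make that reading concrete, here is a minimal Python sketch of the quoting it implies. The function name `quote_rfc2616` is hypothetical, and the sketch only mirrors the RFC 2616 grammar excerpt above as interpreted in this question; it is not a statement of what real servers accept, and escaping the backslash itself is an extra assumption so that the result can be unquoted unambiguously.

```python
# Control characters per the CTL rule above: octets 0-31 and DEL (127).
CTLS = set(range(0, 32)) | {127}

def quote_rfc2616(octets: bytes) -> bytes:
    """Wrap octets in a quoted-string, following the question's reading of RFC 2616.

    Octets >= 128 are TEXT and pass through unchanged; the double quote,
    the backslash, and every CTL except HT are emitted as a quoted-pair.
    """
    out = bytearray(b'"')
    for b in octets:
        if b in (0x22, 0x5C) or (b in CTLS and b != 0x09):
            out += b"\\"  # quoted-pair prefix
        out.append(b)
    out += b'"'
    return bytes(out)
```

Under this reading, `quote_rfc2616("é".encode("utf-8"))` leaves the two UTF-8 octets untouched inside the quotes, which is exactly the kind of value the servers mentioned above reject.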

(Note that the standard doesn't treat the NUL (0x00) character in any special way.)

But evidently either all the servers I tried are non-conforming, or the standard has changed since 1999, or I am not reading it properly.

So... which characters are allowed in HTTP header values, and why?

P.S. The reason behind all of this: I am looking for a way to pass a UTF-8-encoded sequence in an HTTP header value (without additional encoding, if possible).

Recommended answer

RFC 2616 is obsolete (see https://www.rfc-editor.org/info/rfc2616); the relevant part has been replaced by RFC 7230 (see https://www.greenbytes.de/tech/webdav/rfc7230.html#rfc.section.A.2.p.9):


The NUL octet is no longer allowed in comment and quoted-string text, and handling of backslash-escaping in them has been clarified. The quoted-pair rule no longer allows escaping control characters other than HTAB. Non-US-ASCII content in header fields and the reason phrase has been obsoleted and made opaque (the TEXT rule was removed). (Section 3.2.6)
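As a rough illustration of what that change means for a sender, the following Python sketch classifies a field value against the RFC 7230 rules (field-vchar = VCHAR / obs-text, with SP and HTAB allowed only between visible characters). The function name `classify_field_value` and the three labels are hypothetical, and the regular expressions approximate the intent of the ABNF rather than transcribing it literally; obs-fold (line folding) is deliberately not handled.

```python
import re

# Visible US-ASCII (VCHAR, 0x21-0x7E), with SP / HTAB permitted in the middle only.
_ASCII_ONLY = re.compile(rb"[\x21-\x7e]([\t \x21-\x7e]*[\x21-\x7e])?")
# Same shape, but also admitting obs-text (0x80-0xFF), which RFC 7230 treats as opaque.
_WITH_OBS_TEXT = re.compile(
    rb"[\x21-\x7e\x80-\xff]([\t \x21-\x7e\x80-\xff]*[\x21-\x7e\x80-\xff])?"
)

def classify_field_value(value: bytes) -> str:
    """Return 'ascii', 'obs-text', or 'invalid' for an already-trimmed field value."""
    if value == b"" or _ASCII_ONLY.fullmatch(value):
        return "ascii"      # safe to send
    if _WITH_OBS_TEXT.fullmatch(value):
        return "obs-text"   # grammatically tolerated, but obsolete and opaque
    return "invalid"        # contains NUL, CR, LF, or another control character
```

A raw UTF-8 value such as `"café".encode("utf-8")` lands in the `obs-text` bucket: the grammar still tolerates the octets, but a recipient is not obliged to interpret them, and some implementations reject them outright.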

In essence, RFC 2616 defaulted to ISO-8859-1, which was both insufficient and not interoperable anyway. RFC 7230 has therefore deprecated non-ASCII octets in field values. The recommendation is to apply an escaping mechanism on top of the ASCII-only value (such as the one defined in RFC 8187, or plain URI percent-encoding).
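Both escaping options can be sketched with Python's standard `urllib.parse` module. The helper names `percent_encode` and `rfc8187_ext_value` are hypothetical, and the second one only builds the ext-value part (charset, empty language tag, percent-encoded text); whether a given header actually accepts that syntax depends on the header's own definition.

```python
from urllib.parse import quote, unquote

def percent_encode(text: str) -> str:
    """Plain URI percent-encoding of the UTF-8 bytes; the output is pure ASCII."""
    return quote(text, safe="", encoding="utf-8")

def rfc8187_ext_value(text: str) -> str:
    """Build an RFC 8187 ext-value: charset, empty language tag, encoded value.

    quote() already leaves ALPHA, DIGIT and "-._~" unescaped; the `safe`
    argument adds the remaining attr-char symbols from RFC 8187.
    """
    return "UTF-8''" + quote(text, safe="!#$&+^`|", encoding="utf-8")

# Round trip: the receiver has to undo the escaping explicitly.
encoded = percent_encode("héllo wörld")
decoded = unquote(encoded, encoding="utf-8")
```

Either way the value on the wire stays within visible US-ASCII, so it passes through servers that reject octets above 127; the cost is that both ends must agree on the escaping, since HTTP itself will not decode it.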
