What if the form-data boundary is contained in the attached file?


Question

Let's take the example of multipart/form-data taken from w3.com:

Content-Type: multipart/form-data; boundary=AaB03x

--AaB03x
Content-Disposition: form-data; name="submit-name"

Larry
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain

... contents of file1.txt ...
--AaB03x--

It's pretty straightforward, but let's say you are writing code that implements this and creates such a request from scratch. Let's assume file1.txt is created by a user, and we have no control over its contents.

What if the text file file1.txt contains the string --AaB03x? You likely generated the boundary AaB03x randomly, but let's assume a "million monkeys entering a million web forms" scenario.
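To make the problem concrete, here is a minimal Python sketch that assembles the example request body by hand and counts how many lines start with the delimiter. The helper names build_body and count_delimiter_lines are hypothetical, chosen only for this illustration; the field names and boundary come from the example above.

```python
def build_body(boundary: str, text_value: str, file_bytes: bytes) -> bytes:
    """Assemble the two-part example body with no collision checking at all."""
    delim = b"--" + boundary.encode("ascii")
    parts = [
        delim,
        b'Content-Disposition: form-data; name="submit-name"',
        b"",
        text_value.encode("utf-8"),
        delim,
        b'Content-Disposition: form-data; name="files"; filename="file1.txt"',
        b"Content-Type: text/plain",
        b"",
        file_bytes,
        delim + b"--",  # closing delimiter
    ]
    return b"\r\n".join(parts) + b"\r\n"


def count_delimiter_lines(body: bytes, boundary: str) -> int:
    """Count lines that a parser would take for a delimiter."""
    delim = b"--" + boundary.encode("ascii")
    return sum(1 for line in body.split(b"\r\n") if line.startswith(delim))


harmless = build_body("AaB03x", "Larry", b"hello")
hostile = build_body("AaB03x", "Larry", b"line one\r\n--AaB03x\r\nline two")

print(count_delimiter_lines(harmless, "AaB03x"))  # 3: two parts plus the closing line
print(count_delimiter_lines(hostile, "AaB03x"))   # 4: the file content adds a bogus one
```

With the second file, a receiver would see an extra delimiter line and split the upload into the wrong parts.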

Is there a standard way of dealing with this improbable but still possible situation?

Should the text/plain (or even, potentially, something like image/jpeg or application/octet-stream) be "encoded", or some of the information within it "escaped", in some sort of way?

Or should the developer always search the contents of the file for the boundary, and then keep picking a new randomly generated boundary until the chosen string cannot be found within the file?

Recommended answer

HTTP delegates to the MIME RFCs for defining the multipart/ types here. The rules are laid out in RFC 2046 section 5.1.

The RFC simply states the boundary must not appear:

The boundary delimiter MUST NOT appear inside any of the encapsulated parts, on a line by itself or as the prefix of any line. This implies that it is crucial that the composing agent be able to choose and specify a unique boundary parameter value that does not contain the boundary parameter value of an enclosing multipart as a prefix.

NOTE: Because boundary delimiters must not appear in the body parts being encapsulated, a user agent must exercise care to choose a unique boundary parameter value. The boundary parameter value in the example above could have been the result of an algorithm designed to produce boundary delimiters with a very low probability of already existing in the data to be encapsulated without having to prescan the data. Alternate algorithms might result in more "readable" boundary delimiters for a recipient with an old user agent, but would require more attention to the possibility that the boundary delimiter might appear at the beginning of some line in the encapsulated part. The simplest boundary delimiter line possible is something like "---", with a closing boundary delimiter line of "-----".
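If you do want to prescan the data, as the question suggests, the check the RFC describes (the delimiter appearing at the start of a line) could be sketched roughly as follows. The names boundary_collides and pick_boundary are hypothetical, the 32-character hex boundary is an arbitrary choice, and a cautious implementation might simply reject any occurrence of the delimiter anywhere in the data rather than only at line starts.

```python
import secrets


def boundary_collides(boundary: str, payloads) -> bool:
    """Return True if "--boundary" starts any line of any payload."""
    delimiter = b"--" + boundary.encode("ascii")
    for data in payloads:
        # Start of the payload, or right after a line feed (covers CRLF and bare LF).
        if data.startswith(delimiter) or (b"\n" + delimiter) in data:
            return True
    return False


def pick_boundary(payloads, max_tries: int = 10) -> str:
    """Draw random boundaries until one does not collide with the payloads."""
    for _ in range(max_tries):
        candidate = secrets.token_hex(16)  # 32 hex characters, 128 random bits
        if not boundary_collides(candidate, payloads):
            return candidate
    raise RuntimeError("could not find a collision-free boundary")


file_bytes = b"line one\r\n--AaB03x\r\nline two"
print(boundary_collides("AaB03x", [file_bytes]))  # True
print(pick_boundary([file_bytes]))                # a fresh random boundary
```

In practice the loop almost never runs more than once, which is why most software skips the scan entirely.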

Most MIME software simply generates a random boundary such that the probability of that boundary appearing in the parts is negligible; a collision could happen, but the probability is so low that it can be ignored in practice. UUID values rely on the same principle: if you generate a few trillion UUIDs in a year, the probability of generating two identical values is about the same as that of someone being hit by a meteorite, roughly 1 in 17 billion.
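As a rough illustration of how small that probability is, the sketch below draws a random boundary and computes a simple upper bound on the chance that it occurs somewhere in a file of a given size. The function names and the 128-bit boundary length are assumptions made for this example, not something the RFC prescribes; the bound is a plain union bound over every position in the file.

```python
import secrets


def random_boundary(nbytes: int = 16) -> str:
    """Return a boundary built from nbytes of cryptographic randomness (hex encoded)."""
    return secrets.token_hex(nbytes)


def collision_upper_bound(file_size: int, nbytes: int = 16) -> float:
    """Upper bound on P(boundary occurs in the file).

    Each position in the file matches a uniformly random boundary with
    probability at most 2**-(8 * nbytes); summing over all positions
    gives file_size / 2**(8 * nbytes).
    """
    return min(1.0, file_size / 2 ** (8 * nbytes))


print(random_boundary())
print(collision_upper_bound(10**9))  # 1 GB file: on the order of 1e-30
```

Even for a gigabyte of attacker-independent data, the bound is many orders of magnitude below the meteorite odds quoted above.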

Note that you usually encode binary data to some form of ASCII-safe encoding like base64. The standard base64 alphabet doesn't include dashes, and every boundary delimiter starts with two dashes, so base64-encoded data can never contain the boundary.
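A quick way to convince yourself of the base64 point is to encode data that is full of delimiter-like strings and check the result. This is only a sketch using Python's standard base64 module; the helper name encode_part is hypothetical. Note that the URL-safe variant (base64.urlsafe_b64encode) does use "-" and would not give the same guarantee.

```python
import base64


def encode_part(data: bytes) -> bytes:
    """Encode a part body with the standard base64 alphabet (A-Z, a-z, 0-9, +, /, =)."""
    return base64.b64encode(data)


raw = b"--AaB03x\r\n" * 100   # worst case: the raw data is nothing but delimiters
encoded = encode_part(raw)

print(b"--AaB03x" in raw)      # True
print(b"-" in encoded)         # False
```

The receiver decodes the part back to the original bytes, so nothing is lost by the encoding apart from the roughly one-third size increase.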

As such, the standard way to deal with the possibility is to simply make it so unlikely as to be next to nothing. If the computer storing the email has a greater chance of being hit by a meteorite, why worry about the MIME boundary?
