Why should I use urlencode?


Question

I am writing a web application and learning how to urlencode HTML links...

All the urlencode questions here (see tag below) are "How to...?" questions.

My question is not "How?" but "Why?".

Even the Wikipedia article (http://en.wikipedia.org/wiki/Urlencode) only addresses the mechanics of it, not why I should use urlencode in my application at all.

What are the security implications of using (or rather not using) urlencode?

How can a failure to use urlencode be exploited?

What kind of bugs or failures can crop up with unencoded URLs?

I'm asking because even without urlencode, a link to my application dev web site like the following works as expected: http://myapp/my%20test/ée/ràé

Why should I use urlencode?

Or, to put it another way:

When should I use urlencode? In what kind of situations?

Recommended answer

Update: There is an even better explanation (imo) further up in the same RFC:

A URI is represented as a sequence of characters, not as a sequence of octets. That is because URI might be "transported" by means that are not through a computer network, e.g., printed on paper, read over the radio, etc.

For original character sequences that contain non-ASCII characters, however, the situation is more difficult. Internet protocols that transmit octet sequences intended to represent character sequences are expected to provide some way of identifying the charset used, if there might be more than one [RFC2277]. However, there is currently no provision within the generic URI syntax to accomplish this identification. An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used.
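To make the charset problem concrete, here is a minimal sketch using Python's standard `urllib.parse`. It takes the `ée` segment from the question's example URL and assumes, purely for illustration, that one side uses UTF-8 and the other Latin-1. The same characters become different octets, and nothing in the URI itself says which charset was meant.

```python
from urllib.parse import quote, unquote

segment = "ée"  # path segment taken from the question's example URL

# The same characters map to different octets under different charsets.
as_utf8 = quote(segment, encoding="utf-8")
as_latin1 = quote(segment, encoding="latin-1")

print(as_utf8)    # %C3%A9e
print(as_latin1)  # %E9e

# A receiver that assumes the wrong charset silently gets different text.
misread = unquote(as_utf8, encoding="latin-1")
print(misread == segment)  # False
```

An unencoded link may still work on a dev site because browser and server happen to agree on the charset; the escaped form at least makes the transmitted octets explicit.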


Because it is stated in the RFC:

2.4. Escape Sequences

Data must be escaped if it does not have a representation using an unreserved character; this includes data that does not correspond to a printable character of the US-ASCII coded character set, or that corresponds to any US-ASCII character that is disallowed, as explained below.
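As a rough illustration of that rule, the sketch below uses Python's `urllib.parse.quote` with `safe=""` so that only unreserved characters are left alone. The sample strings are made up for the example (two are borrowed from the question's URL), and the default UTF-8 encoding is an assumption, not something the RFC mandates.

```python
from urllib.parse import quote

# With safe="", quote() leaves only unreserved characters unescaped.
space = quote("my test", safe="")        # a disallowed ASCII character
non_ascii = quote("ràé", safe="")        # non-ASCII data, encoded as UTF-8
delims = quote('<">', safe="")           # ASCII characters not allowed in a URI
unreserved = quote("a-b_c.d~e", safe="") # unreserved characters pass through

print(space)       # my%20test
print(non_ascii)   # r%C3%A0%C3%A9
print(delims)      # %3C%22%3E
print(unreserved)  # a-b_c.d~e  (on Python 3.7+, where "~" is never quoted)
```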

2.4.2. When to Escape and Unescape

A URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics. Normally, the only time escape encodings can safely be made is when the URI is being created from its component parts; each component may have its own set of characters that are reserved, so only the mechanism responsible for generating or interpreting that component can determine whether or not escaping a character will change its semantics. Likewise, a URI must be separated into its components before the escaped characters within those components can be safely decoded.
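The following sketch shows both halves of that paragraph with Python's `urllib.parse`: escaping each component while the URL is assembled, and splitting before decoding. The host `myapp` comes from the question; the `folder` and `title` values and the `title` parameter name are hypothetical. It also hints at the "bugs or failures" asked about: unescaped data containing `/`, `&` or `=` changes the structure of the URL.

```python
from urllib.parse import quote, unquote, urlencode, urlsplit, parse_qs

# Hypothetical data that happens to contain reserved characters.
folder = "reports/2024"  # meant as ONE path segment
title = "Q&A=fun"        # meant as ONE query value

# Escape each component while building the URL.
url = "http://myapp/" + quote(folder, safe="") + "?" + urlencode({"title": title})
print(url)  # http://myapp/reports%2F2024?title=Q%26A%3Dfun

# Correct order: split into components first, then decode each one.
parts = urlsplit(url)
segments = [unquote(s) for s in parts.path.split("/")[1:]]
params = parse_qs(parts.query)
print(segments)  # ['reports/2024']
print(params)    # {'title': ['Q&A=fun']}

# Wrong order: decoding the completed URL changes its meaning.
broken = urlsplit(unquote(url))
broken_segments = broken.path.split("/")[1:]
broken_params = parse_qs(broken.query)
print(broken_segments)  # ['reports', '2024']
print(broken_params)    # {'title': ['Q'], 'A': ['fun']}
```

In the broken case the data has injected an extra path level and an extra parameter `A`, which is the kind of behavior an attacker can exploit when user input is placed into links without escaping.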

In some cases, data that could be represented by an unreserved character may appear escaped; for example, some of the unreserved "mark" characters are automatically escaped by some systems. If the given URI scheme defines a canonicalization algorithm, then unreserved characters may be unescaped according to that algorithm. For example, "%7e" is sometimes used instead of "~" in an http URL path, but the two are equivalent for an http URL.
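One way to picture such a canonicalization step is the sketch below. `normalize_unreserved` is a hypothetical helper written for this example, not a standard-library function, and the set of unreserved characters (letters, digits, `-._~`) is an assumption based on the newer RFC 3986 definition. It unescapes only sequences that stand for unreserved characters and leaves every other escape in place.

```python
import re
import string

# Assumed unreserved set: letters, digits and "-", ".", "_", "~".
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def normalize_unreserved(path: str) -> str:
    """Unescape %XX only when it encodes an unreserved character."""
    def replace(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # Keep other escapes, but write their hex digits in uppercase.
        return char if char in UNRESERVED else match.group(0).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", replace, path)

tilde = normalize_unreserved("/%7euser/my%20test")
slash = normalize_unreserved("/a%2fb")

print(tilde)  # /~user/my%20test
print(slash)  # /a%2Fb  (the escaped "/" stays escaped)
```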

Because the percent "%" character always has the reserved purpose of being the escape indicator, it must be escaped as "%25" in order to be used as data within a URI. Implementers should be careful not to escape or unescape the same string more than once, since unescaping an already unescaped string might lead to misinterpreting a percent data character as another escaped character, or vice versa in the case of escaping an already escaped string.
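A short sketch of the double-escaping pitfall, again with Python's `urllib.parse`; the sample strings are invented for the illustration.

```python
from urllib.parse import quote, unquote

# A literal "%" in the data must itself be escaped as %25.
once = quote("100%", safe="")
print(once)   # 100%25

# Escaping an already escaped string escapes the "%" again.
twice = quote(once, safe="")
print(twice)  # 100%2525

# Data that is literally the three characters "%", "4", "1".
literal = "%41"
encoded = quote(literal, safe="")
decoded_once = unquote(encoded)
decoded_twice = unquote(decoded_once)

print(encoded)        # %2541
print(decoded_once)   # %41  (the original data)
print(decoded_twice)  # A    (a second unescape corrupts it)
```

Escaping exactly once when a component is inserted into the URL, and unescaping exactly once when it is read back, avoids both problems.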
