Safe HTML form accept charset?


Problem description

I ran into a parameter encoding issue when submitting a form with the GET method (I can't use POST). Some accented characters were not escaped as expected in the URL because my page is UTF-8, so the Spring controller received garbled characters instead.

I solved this by setting accept-charset="ISO-8859-1" on my form, but now I am wondering which charset is safe for all server/browser combinations. Is there a recommended charset for my forms and GET URLs?

Recommended answer

This is frustrating (to put it mildly) with servlets. Standard URL encoding must use UTF-8, yet servlets not only default to ISO-8859-1 but also offer no way to change that in code.

Sure, you can call req.setCharacterEncoding("UTF-8") before you read anything, but for some ungodly reason this only affects the request body, not query string parameters. There is nothing in the servlet request interface to specify the encoding used for query string parameters.
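To see why the decoding charset matters for a query string, here is a minimal sketch that uses only the JDK's URLDecoder instead of a servlet container. The class and method names are made up for illustration; the point is only that the same percent-encoded bytes yield different strings depending on the charset the decoder assumes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of the servlet API.
class QueryDecodingDemo {
    // Decodes a percent-encoded query value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "ä" as a UTF-8 page would send it in a GET query string.
        String encoded = "%C3%A4";

        String asUtf8 = decode(encoded, "UTF-8");
        String asLatin1 = decode(encoded, "ISO-8859-1");

        System.out.println(asUtf8.equals("\u00E4"));         // true: one character, ä
        System.out.println(asLatin1.equals("\u00C3\u00A4")); // true: two characters, Ã¤
    }
}
```

The second result is the garbling described in the question: two bytes that belong to one UTF-8 character are each turned into a separate ISO-8859-1 character.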

Using ISO-8859-1 in your form is a hack. This ancient encoding will certainly cause more problems than it solves, especially since browsers do not actually support ISO-8859-1 and always treat it as Windows-1252, whereas servlets treat ISO-8859-1 as real ISO-8859-1. The two disagree on bytes 0x80 to 0x9F, so you will be screwed beyond belief if you go with this.
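The mismatch between the two encodings can be illustrated with a short sketch. It assumes the JDK in use ships the windows-1252 charset (it is common, but not one of the charsets every Java platform is required to provide), and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing how one byte decodes under each charset.
class Latin1VersusWindows1252 {
    // Decodes a single byte with the given charset and returns the resulting code point.
    static int decodeByte(int value, Charset charset) {
        return new String(new byte[] {(byte) value}, charset).codePointAt(0);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 is available in this JDK.
        Charset windows1252 = Charset.forName("windows-1252");
        Charset latin1 = StandardCharsets.ISO_8859_1;

        // 0x80 is the euro sign in Windows-1252 but a C1 control character in ISO-8859-1.
        System.out.printf("U+%04X%n", decodeByte(0x80, windows1252)); // U+20AC
        System.out.printf("U+%04X%n", decodeByte(0x80, latin1));      // U+0080

        // Outside 0x80..0x9F the two agree, for example 0xE4 is ä in both.
        System.out.println(decodeByte(0xE4, windows1252) == decodeByte(0xE4, latin1)); // true
    }
}
```

So a euro sign typed into an accept-charset="ISO-8859-1" form would reach a Java ISO-8859-1 decoder as a control character rather than as €.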

To change this in Tomcat, for example, you can use the URIEncoding attribute on the <Connector> element:

<Connector ... URIEncoding="UTF-8" ... />

If you don't use a container that has such a setting, can't change its settings, or have some other issue, you can still make it work, because ISO-8859-1 decoding retains the full information of the original bytes:

String correct = new String(request.getParameter("test").getBytes("ISO-8859-1"), "UTF-8");
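In practice the one-liner above throws a NullPointerException when the parameter is missing, since getParameter returns null in that case. A small helper along these lines might be more convenient; the class and method names are hypothetical, and it should only be applied to values that really were mis-decoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the servlet API.
class ParameterRecoder {
    // Re-interprets a string that was wrongly decoded as ISO-8859-1 as UTF-8.
    // Returns null for null input, mirroring getParameter for a missing parameter.
    static String latin1ToUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null;
        }
        // Recover the original bytes, then decode them with the charset the browser used.
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(latin1ToUtf8("\u00C3\u00A4").equals("\u00E4")); // true: Ã¤ becomes ä
        System.out.println(latin1ToUtf8(null) == null);                    // true
    }
}
```

In a servlet this would be called as ParameterRecoder.latin1ToUtf8(request.getParameter("test")). Using StandardCharsets constants also avoids the checked UnsupportedEncodingException that the string-named variants declare.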

So let's say test=ä. If everything is set up correctly, the browser encodes it as test=%C3%A4. Your servlet will incorrectly decode it as ISO-8859-1 and give you the resulting string "Ã¤". If you apply the correction, you get ä back:

System.out.println(new String("Ã¤".getBytes("ISO-8859-1"), "UTF-8").equals("ä"));
// true
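The correction works because ISO-8859-1 maps each of the 256 possible byte values to exactly one character, so decoding and re-encoding loses nothing. The sketch below checks that claim and contrasts it with UTF-8, where invalid input is replaced during decoding; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for the round-trip property.
class Latin1RoundTrip {
    // Returns true if decoding and re-encoding with the charset reproduces the bytes.
    static boolean roundTrips(byte[] bytes, Charset charset) {
        byte[] back = new String(bytes, charset).getBytes(charset);
        return Arrays.equals(bytes, back);
    }

    // Builds an array holding every possible byte value once.
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        // Every byte survives an ISO-8859-1 round trip.
        System.out.println(roundTrips(allByteValues(), StandardCharsets.ISO_8859_1)); // true

        // A lone 0xE4 is not valid UTF-8, so decoding replaces it and the byte is lost.
        System.out.println(roundTrips(new byte[] {(byte) 0xE4}, StandardCharsets.UTF_8)); // false
    }
}
```

The second check also hints at a caveat: applying the correction to a value that was already decoded correctly would destroy it, so the fix belongs only where the container is known to decode with ISO-8859-1.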
