Broken UTF-8 URI Encoding in JSPs


Question

I ran into a strange issue with wrong URI encoding and would appreciate any help!

The project uses JSPs, Servlets, jQuery, and Tomcat 6.

The charset in the JSPs is set to UTF-8, all Tomcat connectors use URIEncoding=UTF-8, and I also use a character encoding filter as described here. I also set the contentType in the meta tag, and my browser detects it correctly.

In Ajax calls with jQuery I use encodeURIComponent() on the terms I want to use as URL parameters and then serialize the whole parameter set with $.param(). In the called servlet these parameters are decoded correctly with java.net.URLDecoder.decode(term, "UTF-8").

In some places I generate URLs for href elements from a parameter map in the JSPs. Each parameter value is encoded with java.net.URLEncoder.encode(value, "UTF-8") on the JSP side, but decoding it the same way as before results in broken special characters. Instead, I have to encode it as "ISO-8859-2" in the JSP, and it is then decoded correctly as "UTF-8" in the servlet.

An example to clarify: the term "überfall" is URI-encoded via JavaScript (%C3%BCberfall) and sent to the servlet for decoding and processing, which works. After passing it back to a JSP I encode it as UTF-8 and build the URL, which results for instance in:

<a href="/myWebapp/servletPath?term=%C3%BCberfall">Click here</a>

However, clicking this link sends the parameter to the servlet as "%C3%83%C2%BCberfall", which decodes to "Ã¼berfall". The same happens when no encoding takes place.
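The byte sequence %C3%83%C2%BC is what you get when the UTF-8 bytes of "ü" are read as ISO-8859-1 and the result is encoded as UTF-8 again. The following is a minimal sketch of that effect in plain Java. The class and helper names are made up for illustration, and it does not claim to reproduce the exact step in the application where the misreading happened.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class MojibakeDemo {
    // Percent-decodes a value, interpreting the raw bytes in the given charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-encodes a value using the bytes of the given charset.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sent = "%C3%BCberfall";

        // Bytes C3 BC read as ISO-8859-1 become the two characters U+00C3 U+00BC.
        String misread = decode(sent, "ISO-8859-1");

        // Encoding those two characters as UTF-8 yields four bytes instead of two.
        System.out.println(encode(misread, "UTF-8")); // %C3%83%C2%BCberfall

        // With a consistent charset the value round-trips unchanged.
        System.out.println(encode(decode(sent, "UTF-8"), "UTF-8")); // %C3%BCberfall
    }
}
```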

When using "ISO-8859-2" for encoding I get:

<a href="/myWebapp/servletPath?term=%FCberfall">Click here</a>

When clicking this link I can observe in Wireshark that %C3%BCberfall is sent as the parameter, which again decodes to "überfall"!

Can anyone tell me what I am missing?

While observing the Network tab in Firebug I realized that by using

$.param({term : encodeURIComponent(term)}); 

the term is UTF-8 encoded twice, resulting in "%25C3%25BCberfall", i.e. the percent signs are themselves percent-encoded. Analogously, it works for me if I call encode(term, "UTF-8") twice on each value from the parameter map.
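A minimal sketch of why double encoding can mask a charset mismatch: after the second encoding pass the value contains only ASCII characters, so the first decoding step gives the same result in any ASCII-compatible charset, and only the second, manual decoding step needs to use UTF-8. The class and helper names below are illustrative only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class DoubleEncodingDemo {
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String term = "\u00FCberfall"; // "überfall", escaped to avoid source-encoding issues

        String once = encode(term, "UTF-8");
        String twice = encode(once, "UTF-8");
        System.out.println(once);  // %C3%BCberfall
        System.out.println(twice); // %25C3%25BCberfall

        // First decode (as a container might do it, here with ISO-8859-1):
        // only %25 is resolved, so the charset makes no difference.
        String afterContainer = decode(twice, "ISO-8859-1");
        System.out.println(afterContainer); // %C3%BCberfall

        // Second, manual decode with UTF-8 restores the original term.
        System.out.println(decode(afterContainer, "UTF-8").equals(term)); // true
    }
}
```

This only explains why the workaround appears to work. It does not fix the underlying mismatch.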

Encoding once and not decoding the String results in "Ã¼berfall" again.

Recommended answer

I think I have now definitely fixed the problem.

Following Jontro's comment, I encoded all URL parameter values once and removed the manual servlet-side decoding.

Sending an ü should look like %C3%BC in Firebug's Network tab, and it did, yet I got Ã¼ in the servlet. Java was definitely set to "UTF-8" internal encoding with the -Dfile.encoding parameter. I traced the problem to the request.getParameter() method as shown below. request.getQueryString() was fine, but extracting the actual parameters fails:

request.getCharacterEncoding() => UTF-8
request.getContentType() => null
request.getQueryString() => from=0&resultCount=10&sortAsc=true&searchType=quick&term=%C3%BC
request.getParameter("term") => Ã¼
Charset.defaultCharset() => UTF-8
OutputStreamWriter.getEncoding() => UTF8
new String(request.getParameter("term").getBytes(), "UTF-8") => Ã¼
System.getProperty("file.encoding") => UTF-8
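Note that with a UTF-8 default charset, new String(s.getBytes(), "UTF-8") is a plain round trip and cannot repair anything, which matches the output above. A commonly used workaround, sketched below, is to take the bytes back out as ISO-8859-1 before reinterpreting them as UTF-8. This is not the fix adopted in this answer, it only illustrates that the parameter was decoded as ISO-8859-1. The class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReinterpretDemo {
    // Re-reads the bytes of s (taken in the assumed charset) as UTF-8.
    static String reinterpretAsUtf8(String s, Charset assumed) {
        return new String(s.getBytes(assumed), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The two characters U+00C3 U+00BC, i.e. "ü" misread as ISO-8859-1.
        String broken = "\u00C3\u00BC";

        // UTF-8 out, UTF-8 in: a no-op, the string stays broken.
        System.out.println(reinterpretAsUtf8(broken, StandardCharsets.UTF_8).equals(broken)); // true

        // ISO-8859-1 out, UTF-8 in: recovers the single character U+00FC.
        System.out.println(reinterpretAsUtf8(broken, StandardCharsets.ISO_8859_1).equals("\u00FC")); // true
    }
}
```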

By looking into the sources of Tomcat and Coyote, which implement request.getParameter(), I found the problem: the URIEncoding from the connector was always null, and in that case it defaults to org.apache.coyote.Constants.DEFAULT_CHARACTER_ENCODING, which is "ISO-8859-1", as Wolfram said.

Long story short: my mistake was editing the server.xml in Tomcat's conf directory, which is loaded into Eclipse only ONCE, when a new server is created in the Servers view! After that, the separate server.xml in the Servers project has to be edited. Once I did that, the connector setting was loaded correctly and everything works as it should.

Thanks for the comments! Hope this helps someone...
