How to pass Unicode characters as JSP/Servlet request.getParameter?
Problem description
After a lot of trial and error I still can't figure out the problem. The JSP, servlet, and database are all set to accept UTF-8 encoding, but whenever I use request.getParameter on anything that contains multi-byte characters such as the em dash, they come back scrambled as broken characters.
I've made manual submissions to the database and it accepts these characters without any problem. And if I pull the text from the database in a servlet and print it in my JSP page's form, it displays correctly.
The only time it comes back as broken characters is when I try to display it elsewhere after retrieving it using request.getParameter.
Has anyone else had this problem? How can I fix it?
That can happen if request and/or response encoding isn't properly set at all.
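To make the failure mode concrete, here is a minimal sketch of what happens when UTF-8 bytes are decoded with ISO-8859-1, which is the default a servlet container falls back to when no request encoding is set. The class and method names (MojibakeDemo, garble, repair) are made up for illustration, and the string is written as a \u escape so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Decode UTF-8 bytes with the wrong charset, as happens when the
    // request encoding is not configured and ISO-8859-1 is assumed.
    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 maps every byte to exactly one char, so the original
    // bytes can be recovered and decoded again as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String emDash = "\u2014"; // one character, three bytes in UTF-8
        String garbled = garble(emDash);
        System.out.println(garbled.length());              // prints 3
        System.out.println(repair(garbled).equals(emDash)); // prints true
    }
}
```

The repair step is only useful as a diagnostic to confirm that a wrong decoding is the cause. The actual fix is the configuration described in the rest of this answer.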
For GET requests, you need to configure it at the servlet container level. It's unclear which one you're using, but in Tomcat, for example, that's done with the URIEncoding attribute of the <Connector> element in its /conf/server.xml:
<Connector ... URIEncoding="UTF-8">
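The effect of that charset setting can be illustrated with plain JDK classes. This is only a sketch standing in for the container's query string decoding, not Tomcat's actual code; QueryDecodingDemo and decode are hypothetical names, and the Charset overloads of URLEncoder and URLDecoder assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class QueryDecodingDemo {
    // Percent-decode a raw query parameter value using the given charset,
    // which is the role the container's URI encoding setting plays.
    static String decode(String rawValue, Charset charset) {
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        // What a browser sends for an em dash on a UTF-8 page.
        String raw = URLEncoder.encode("\u2014", StandardCharsets.UTF_8);
        System.out.println(raw); // prints %E2%80%94

        // Matching charset: the original character comes back.
        System.out.println(decode(raw, StandardCharsets.UTF_8).equals("\u2014")); // prints true

        // Mismatched charset: three unrelated characters instead of one.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1).length()); // prints 3
    }
}
```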
For POST requests, you need to create a filter which is mapped on the desired URL pattern covering all those POST requests, e.g. *.jsp or even /*. Do the following job in doFilter():
request.setCharacterEncoding("UTF-8");
chain.doFilter(request, response);
For HTML responses and client-side encoding of submitted HTML form input values, you need to set the JSP page encoding. Add this to the top of the JSP (you've probably already done it properly, given that displaying UTF-8 straight from the DB works fine).
<%@page pageEncoding="UTF-8" %>
Or, to avoid copy-pasting this into every single JSP, configure it once in web.xml:
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
For source code files and stdout (IDE console), you need to set the IDE workspace encoding. It's unclear which one you're using, but in Eclipse, for example, that's done by setting Window > Preferences > General > Workspace > Text File Encoding to UTF-8.
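As a rough illustration of why the console encoding matters, the sketch below prints the same character through two PrintStreams with different charsets and captures the bytes that would reach the console. ConsoleEncodingDemo and printedBytes are hypothetical names, and the PrintStream constructor taking a Charset assumes Java 10 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ConsoleEncodingDemo {
    // Print text through a PrintStream that uses the given charset and
    // return the bytes it produced.
    static byte[] printedBytes(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, charset);
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String emDash = "\u2014";

        // UTF-8 can represent the em dash: three bytes are written.
        System.out.println(printedBytes(emDash, StandardCharsets.UTF_8).length); // prints 3

        // ISO-8859-1 cannot: the character is replaced by a single '?'.
        byte[] latin1 = printedBytes(emDash, StandardCharsets.ISO_8859_1);
        System.out.println(latin1.length + " " + (char) latin1[0]); // prints 1 ?
    }
}
```

So a question mark in the console does not necessarily mean the string itself is damaged; it may only mean the console's charset cannot represent it.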
Do note that HTML <meta http-equiv> tags are ignored when the page is served over HTTP, because the Content-Type response header takes precedence. They are only considered when the page is opened from the local disk file system via file://. Also, specifying <form accept-charset> is unnecessary, as it already defaults to the response encoding used when serving the HTML page with the form. See also the W3 HTML specification.
See also:
- Unicode - How to get the characters right?
- Why does POST not honor charset, but an AJAX request does? tomcat 6
- HTML : Form does not send UTF-8 format inputs
- Unicode characters in servlet application are shown as question marks
- Bad UTF-8 encoding when writing to database (reading is OK)