Cannot get Servlet to process request content as UTF-8

Question

I'm converting a legacy app from ISO-8859-1 to UTF-8, and I've used a number of resources to determine what I need to set to get this to work. However, after several configuration, code, and environment changes, my Servlet (in Tomcat 5) doesn't seem to process submitted HTML form content as UTF-8.

Here's what I've set up for configuration.

  • System properties
[user@server ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

  • tomcat5 server.xml
<Connector protocol="HTTP/1.1"
        ...
        URIEncoding="UTF-8"
        useBodyEncodingForURI="true"/>

  • JSP file
<%@ page language="java" pageEncoding="UTF-8" contentType="text/html;charset=UTF-8" %>
...
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8">

  • Servlet filter
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
{
    if(request.getCharacterEncoding() == null)
    {
        request.setCharacterEncoding("UTF-8");
    }
    ...

With some debug logs I know the following:

System.getProperty("file.encoding"): "UTF-8"
java.nio.charset.Charset.defaultCharset(): "UTF-8"
new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding(): "UTF8"

However, when I submit my form with an input containing "Бить баклуши", I see the following (from my logs):

request.getParameter("myParameter") = Ð\221иÑ\202Ñ\214 баклÑ\203Ñ\210Ð

I know that the request content type was null, so it was explicitly set to "UTF-8" in my servlet filter. Also, I'm viewing my logs from a terminal, whose encoding I know is set to UTF-8 as well.
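
For reference, the garbage in that log line is consistent with UTF-8 bytes being decoded as ISO-8859-1: "Б" is the byte pair D0 91 in UTF-8, which reads as "Ð" followed by the control character 0x91 (octal \221) in ISO-8859-1. A minimal standalone sketch of that effect follows. The class and method names (MojibakeDemo, misdecode, repair) are made up for illustration and are not part of the application.

```java
import java.nio.charset.StandardCharsets;

// Standalone illustration; all names here are hypothetical.
class MojibakeDemo {

    // Decode the UTF-8 bytes of the text as ISO-8859-1, which is what
    // happens when the request encoding is never applied to the parameters.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 maps every byte to exactly one char, so the original
    // bytes can be recovered and decoded again as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Бить баклуши", written with escapes so the source file
        // encoding does not matter.
        String input = "\u0411\u0438\u0442\u044C \u0431\u0430\u043A\u043B\u0443\u0448\u0438";
        String garbled = misdecode(input);

        // 11 Cyrillic letters at 2 bytes each plus one space: 23 chars.
        System.out.println(garbled.length());
        // First two chars are U+00D0 and U+0091, i.e. "Ð" and octal \221.
        System.out.println(Integer.toHexString(garbled.charAt(0)));
        System.out.println(Integer.toHexString(garbled.charAt(1)));
        System.out.println(repair(garbled).equals(input));
    }
}
```

Re-decoding like repair() is only a diagnostic here. It confirms where the bytes went wrong, but the actual fix is to get the request encoding applied before the parameters are parsed.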

What am I missing here? What else do I need to set for the Servlet to correctly process my input as UTF-8? If more information will help, I'll be glad to add more debugging and update this question with it.

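  • I'm not using a Windows terminal (I'm using PuTTY), so I'm fairly sure the problem isn't what I'm viewing the logs with. This is backed up by the fact that when I send the submitted content back to the browser in the response, the output is the same garbage as above.
  • The form is being submitted from IE8.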

My web.xml definition for my CharsetFilter was too far down (below my servlet configurations and other filters). I moved the filter definition to the very top of the web.xml document and everything worked correctly. See the accepted answer below.

Recommended answer

Edit4 (the final and corrected answer as requested)

Your servlet filter gets applied too late.
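
Why the order matters: per the Servlet API, setCharacterEncoding() has to be called before any request parameter is read (or getReader() is called); after that it has no effect. The sketch below models this with plain Java. FakeRequest is a hypothetical stand-in, not Tomcat's implementation. It assumes the spec's ISO-8859-1 default when no encoding is set, and it skips URL decoding for brevity. Whether another filter actually read a parameter first in the asker's application is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Simplified model of a servlet request. Hypothetical; not Tomcat code.
class LateEncodingDemo {

    static class FakeRequest {
        private final byte[] body;           // raw "name=value" bytes from the browser
        private Charset encoding;            // null until someone sets it
        private Map<String, String> params;  // parsed lazily, exactly once

        FakeRequest(byte[] body) {
            this.body = body;
        }

        String getCharacterEncoding() {
            return encoding == null ? null : encoding.name();
        }

        void setCharacterEncoding(String name) {
            // Once the parameters have been parsed, the call is ignored.
            if (params == null) {
                encoding = Charset.forName(name);
            }
        }

        String getParameter(String name) {
            if (params == null) {
                // Assumed default when no encoding has been set.
                Charset cs = (encoding != null) ? encoding : StandardCharsets.ISO_8859_1;
                String text = new String(body, cs);
                int eq = text.indexOf('=');
                params = new HashMap<>();
                params.put(text.substring(0, eq), text.substring(eq + 1));
            }
            return params.get(name);
        }
    }

    // Runs the charset check from the question either before or after
    // some other component has already read a parameter.
    static String submit(String value, boolean charsetFilterFirst) {
        byte[] body = ("myParameter=" + value).getBytes(StandardCharsets.UTF_8);
        FakeRequest request = new FakeRequest(body);
        if (!charsetFilterFirst) {
            request.getParameter("myParameter"); // an earlier filter touches the parameters
        }
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding("UTF-8");
        }
        return request.getParameter("myParameter");
    }

    public static void main(String[] args) {
        // "Бить баклуши" written with escapes.
        String input = "\u0411\u0438\u0442\u044C \u0431\u0430\u043A\u043B\u0443\u0448\u0438";
        System.out.println(input.equals(submit(input, true)));  // prints true
        System.out.println(input.equals(submit(input, false))); // prints false
    }
}
```

In the second call the parameters were already decoded with the default before the charset check ran, so setting UTF-8 afterwards changes nothing.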

A possible proper order in web.xml would be as follows:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE web-app
    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
    "http://java.sun.com/j2ee/dtds/web-app_2.3.dtd">

<web-app>
    <!--CharsetFilter start-->
    <filter>
        <filter-name>Charset Filter</filter-name>
        <filter-class>CharsetFilter</filter-class>
        <init-param>
            <param-name>requestEncoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
    </filter>
    <!-- The rest is omitted -->
