Parameters charset conversion in Struts2


Question

I have a Struts2 web application that accepts both POST and GET requests in many different charsets, converts them to UTF-8, displays the correct UTF-8 characters on screen, and then writes them to a UTF-8 database.

I have tried at least 5 different methods for doing a simple lossless charset conversion from windows-1250 to UTF-8 to start with, and none of them worked. UTF-8 being the "larger set", it should work without a problem (at least that is my understanding).

Can you suggest how to convert from windows-1250 to UTF-8? And is it possible that Struts2 is doing something odd with the parameter charset, which would explain why I can't seem to get it right?

Here is my latest attempt:

    String inputData = getSimpleParamValue("some_input_param_from_get");
    Charset inputCharset = Charset.forName("windows-1250");
    Charset utfCharset = Charset.forName("UTF-8");

    CharsetDecoder decoder = inputCharset.newDecoder();
    CharsetEncoder encoder = utfCharset.newEncoder();

    String decodedData = "";
    try {
        ByteBuffer inputBytes = ByteBuffer.wrap(inputData.getBytes()); // I've tried putting UTF-8 here as well, with no luck
        CharBuffer chars = decoder.decode(inputBytes);

        ByteBuffer utfBytes = encoder.encode(chars);
        decodedData = new String(utfBytes.array());

    } catch (CharacterCodingException e) {
        logger.error(e);
    }

Any ideas on what to try to get this working?

Thank you and best regards,

Bozo

Recommended answer

I'm not sure I understand your scenario. In Java, a String is always Unicode; charset conversion only comes into play when converting between a String and a binary representation. In your example, by the time getSimpleParamValue("some_input_param_from_get") returns, inputData should already hold the "correct" String: the conversion from the byte stream (sent by the client browser to the web server) to a String should already have taken place, which is the responsibility of the web server and the web layer of your application. For this reason I enforce UTF-8 for web transmission by placing a filter in web.xml (before Struts), for example:

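import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CharsetFilter implements Filter {

    public void init(FilterConfig filterConfig) throws ServletException {}

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse res = (HttpServletResponse) response;
        // Must run before any parameter is read; applies to the request body (POST).
        // Query-string (GET) decoding is usually configured in the container itself.
        req.setCharacterEncoding("UTF-8");

        chain.doFilter(req, res);

        // Only takes effect if the response is not yet committed and getWriter() was not called.
        String contentType = res.getContentType();
        if (contentType != null && contentType.startsWith("text/html")) {
            res.setCharacterEncoding("UTF-8");
        }
    }

    public void destroy() {}
}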
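
As a side illustration of the point that a String is already Unicode, here is a minimal sketch of what "converting windows-1250 to UTF-8" amounts to when you really do hold raw bytes. The class and method names (CharsetBoundaryDemo, transcode) are made up for this example and are not part of Struts2 or the servlet API; the sketch also assumes the JVM ships the windows-1250 charset, as standard JDKs do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class CharsetBoundaryDemo {
    static final Charset WINDOWS_1250 = Charset.forName("windows-1250");

    /** Re-encodes bytes from one charset to another by going through a String. */
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String text = new String(input, from); // bytes -> Unicode String
        return text.getBytes(to);              // Unicode String -> bytes
    }

    public static void main(String[] args) {
        // 0xE8 is the windows-1250 encoding of U+010D (c with caron).
        byte[] cp1250 = { (byte) 0xE8 };
        byte[] utf8 = transcode(cp1250, WINDOWS_1250, StandardCharsets.UTF_8);

        // The same character needs two bytes in UTF-8: 0xC4 0x8D.
        for (byte b : utf8) {
            System.out.printf("0x%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The charset only matters in the two calls that cross the byte/String boundary. Once the text is a String there is nothing left to "convert", which is probably why re-encoding an already decoded parameter inside the action does not help.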

If you cannot enforce UTF-8 this way, and your getSimpleParamValue() "errs" in the charset conversion (e.g. it assumed the byte stream was UTF-8 when it was actually windows-1250), you now have an "incorrect" String. You must then try to recover it by undoing and redoing the byte-to-String conversion, which means you must know both the wrong and the correct charset and, worse, deal with the possibility of lost characters (if the bytes were interpreted as UTF-8, the decoder may have hit illegal byte sequences). If you have to deal with this inside a Struts2 action, I'd say you are in trouble; you should handle it explicitly before or after it (in the upper web layer, in the database driver, in the file encoding, or wherever the mismatch arises).
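
The "undo and redo" recovery described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (MisdecodedStringRepair, redecode) are hypothetical, and which charset the container wrongly assumed is something you would have to find out for your own setup. The sketch contrasts a wrong guess of ISO-8859-1, which maps every byte to a character and can therefore be reversed, with a wrong guess of UTF-8, where malformed bytes are replaced by U+FFFD and the original data is gone.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class MisdecodedStringRepair {
    static final Charset WINDOWS_1250 = Charset.forName("windows-1250");

    /** Undoes a decoding done with the wrong charset and redoes it with the right one. */
    static String redecode(String garbled, Charset assumed, Charset actual) {
        byte[] originalBytes = garbled.getBytes(assumed); // undo the wrong decoding
        return new String(originalBytes, actual);         // redo it with the correct charset
    }

    public static void main(String[] args) {
        String original = "\u010Dau"; // c with caron, then "au"
        byte[] wire = original.getBytes(WINDOWS_1250); // bytes as the client sent them

        // Case 1: bytes were wrongly decoded as ISO-8859-1. No byte is lost, so this is reversible.
        String asLatin1 = new String(wire, StandardCharsets.ISO_8859_1);
        String repaired = redecode(asLatin1, StandardCharsets.ISO_8859_1, WINDOWS_1250);
        System.out.println(repaired.equals(original));

        // Case 2: bytes were wrongly decoded as UTF-8. The lone byte 0xE8 is malformed
        // UTF-8 and becomes U+FFFD, so the original byte cannot be recovered.
        String asUtf8 = new String(wire, StandardCharsets.UTF_8);
        String attempt = redecode(asUtf8, StandardCharsets.UTF_8, WINDOWS_1250);
        System.out.println(attempt.equals(original));
    }
}
```

The second case is the reason the fix belongs at the point where bytes are first decoded: once a replacement character has been substituted, no later re-encoding can bring the original byte back.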
