Servlet request parameter character encoding
Question
I have a Java servlet that receives data from an upstream system via an HTTP GET request. This request includes a parameter named "text". If the upstream system sets this parameter to:
TEST3 please ignore:
It appears in the logs of the upstream system as:
00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c //TEST3 pl
00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e //ease ign
00 6f 00 72 00 65 00 3a //ore:
(The // comments do not actually appear in the logs.)
In my servlet I read this parameter with:
String text = request.getParameter("text");
If I print the value of text to the console, it appears as:
T E S T 3 p l e a s e i g n o r e :
If I inspect the value of text in the debugger, it appears as:
\u000T\u000E\u000S\u000T\u0003\u0000 \u000p\u000l\u000e\u000a\u000s\u000e\u0000
\u000i\u000g\u000n\u000o\u000r\u000e\u000:
So it seems that there is a problem with the character encoding. The upstream system is supposed to use UTF-16. My guess is that the servlet is assuming UTF-8 and is therefore reading twice as many characters as it should. For the message "TEST3 please ignore:", the first byte of each character is 00. The servlet seems to interpret this byte as a space, which would explain the space that appears before each character when the servlet logs the message.
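The symptom can be reproduced outside a servlet container. The sketch below is only an illustration: it assumes the logged bytes are UTF-16 big-endian without a byte order mark, and the class and method names are made up for this example. It decodes the first ten logged bytes ("TEST3") with three charsets and compares the results.

```java
import java.nio.charset.StandardCharsets;

public class EncodingSymptomDemo {
    // First ten bytes from the upstream log, i.e. "TEST3" (assumed UTF-16BE, no BOM).
    static final byte[] LOGGED = {0x00, 0x54, 0x00, 0x45, 0x00, 0x53, 0x00, 0x54, 0x00, 0x33};

    // Decoding as UTF-8 treats every byte below 0x80 as one character.
    static String decodeAsUtf8() {
        return new String(LOGGED, StandardCharsets.UTF_8);
    }

    // ISO-8859-1 always maps one byte to one character.
    static String decodeAsLatin1() {
        return new String(LOGGED, StandardCharsets.ISO_8859_1);
    }

    // UTF-16BE combines each pair of bytes into one character.
    static String decodeAsUtf16() {
        return new String(LOGGED, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(decodeAsUtf8().length());   // 10: a U+0000 char before each letter
        System.out.println(decodeAsLatin1().length()); // 10: same result for these bytes
        System.out.println(decodeAsUtf16());           // TEST3
    }
}
```

With either single-byte interpretation, every 00 byte becomes its own U+0000 character, which many consoles render as a blank. That matches the spaced-out output shown above.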
Obviously my goal is simply to get the message "TEST3 please ignore:" when I read the text request parameter. My guess is that I could achieve this by specifying the character encoding of the request parameter, but I don't know how to do this.
Recommended answer
Use it like this:
new String(req.getParameter("<my request value>").getBytes("ISO-8859-1"),"UTF-8")
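Note that the one-liner re-decodes the bytes as UTF-8. For the bytes shown in the question, which look like UTF-16 big-endian, the same round trip would presumably need a UTF-16 target charset instead. The following is a minimal sketch under two assumptions: the container decoded each incoming byte into exactly one char (true for ISO-8859-1, and also for UTF-8 as long as every byte is below 0x80), and the upstream bytes are UTF-16BE without a byte order mark. The class name and the redecode helper are hypothetical, and the servlet container is simulated with a plain byte array.

```java
import java.nio.charset.StandardCharsets;

public class Utf16ParamDemo {
    // Hypothetical helper: undo the container's byte-per-char decoding,
    // then decode the recovered bytes as UTF-16BE (an assumption about the upstream system).
    static String redecode(String raw) {
        byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        // Bytes of "TEST3" as they appear in the upstream log.
        byte[] wire = {0x00, 0x54, 0x00, 0x45, 0x00, 0x53, 0x00, 0x54, 0x00, 0x33};

        // Stand-in for request.getParameter("text"): one char per byte.
        String raw = new String(wire, StandardCharsets.ISO_8859_1);

        System.out.println(raw.length());  // 10
        System.out.println(redecode(raw)); // TEST3
    }
}
```

In a servlet, the same helper would be applied to the value returned by request.getParameter("text"). The round trip is only lossless if the container's decoding really was byte-per-char, so it is worth checking which URI encoding the container is configured to use before relying on it.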