DataInputStream and UTF-8
Question
I'm a fairly new programmer, and I'm having a couple of problems with the code I'm working on.
Basically, the code receives a form from another JSP, reads the bytes using DataInputStream, parses the data, and submits the results to SalesForce.
//Getting the parameters from request
String contentType = request.getContentType();
DataInputStream in = new DataInputStream(request.getInputStream());
int formDataLength = request.getContentLength();
//System.out.println(formDataLength);
byte dataBytes[] = new byte[formDataLength];
int byteRead = 0;
int totalBytesRead = 0;
while (totalBytesRead < formDataLength)
{
    // Ask only for the bytes still missing, so offset + length never exceeds the array
    byteRead = in.read(dataBytes, totalBytesRead, formDataLength - totalBytesRead);
    if (byteRead == -1) break; // stream ended before the declared length
    totalBytesRead += byteRead;
}
It works fine, but only as long as the data contains plain ASCII characters. Whenever it has to handle special characters (such as the French characters àâäæçéèêëîïôùûü), I get the following gibberish as a result:
ÃÃÃâÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃà ¨ÃªÃ«ÃÃÃÃÃÃÃÃÃÃ¥Ã
à âäæçéèêëîïôùûü
I understand it could be an issue with DataInputStream and the fact that it doesn't return UTF-8 encoded text. Do you have any suggestions on how to tackle this issue?
All the .jsp files include <%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%> and Tomcat's settings are fine (URI = UTF-8, etc). I tried adding:
request.setCharacterEncoding("UTF-8");
and
response.setCharacterEncoding("UTF-8");
to no avail.
Here's an example of how it parses the data:
//Getting the notes for the Case
String notes = new String(dataBytes);
System.out.println(notes);
String savenotes = casetype.substring(notes.indexOf("notes"));
//savenotes = savenotes.substring(savenotes.indexOf("\n"), savenotes.indexOf("---"));
savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
savenotes = savenotes.substring(0,savenotes.indexOf("name=\"datafile"));
savenotes = savenotes.substring(0,savenotes.lastIndexOf("\n------"));
savenotes = savenotes.trim();
Thanks in advance.
Recommended answer
The problem is not in the input streams, since they don't handle characters, only bytes. Your problem is at the point where you convert those bytes to characters. Without an explicit charset, new String(dataBytes) decodes with the platform default encoding, which is evidently not UTF-8 here. In this particular case, you need to specify the proper encoding in the String constructor.
String notes = new String(dataBytes, "UTF-8");
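To illustrate why the charset argument matters, here is a minimal sketch that decodes the same UTF-8 bytes two ways. The class and method names (DecodeDemo, decodeUtf8, decodeLatin1) are made up for this example, and ISO-8859-1 is only an assumed stand-in for whatever the platform default happens to be on the asker's server.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Decodes raw bytes as UTF-8, independent of the platform default charset.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Decodes the same bytes as ISO-8859-1 (assumed wrong charset for illustration).
    // Every byte becomes its own character, so multi-byte sequences turn into garbage.
    static String decodeLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u00e9\u00e8" is "éè"; escapes avoid depending on the source file encoding.
        byte[] data = "\u00e9\u00e8".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(data).length());   // two characters
        System.out.println(decodeLatin1(data).length()); // four characters, one per byte
    }
}
```

Using the StandardCharsets constant instead of the "UTF-8" string also avoids the checked UnsupportedEncodingException, though both forms decode identically.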
See also:
- Unicode - How to get characters right?
By the way, the DataInputStream adds no value in this particular code snippet. You can just keep it as a plain InputStream.
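As a rough sketch of reading the body through a plain InputStream, the helper below fills a buffer of a known length. The names ReadBody and readFully are hypothetical, and the sketch assumes the length is known and non-negative (a servlet's getContentLength() can return -1 when the length is unknown, which this code does not handle).

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReadBody {
    // Reads exactly `length` bytes from `in`, or throws if the stream ends early.
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int total = 0;
        while (total < length) {
            // Request only the remaining bytes; read() may return fewer than asked.
            int n = in.read(buffer, total, length - total);
            if (n == -1) {
                throw new EOFException("Stream ended after " + total + " of " + length + " bytes");
            }
            total += n;
        }
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayInputStream stands in for request.getInputStream() here.
        byte[] body = "notes=caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] read = readFully(new ByteArrayInputStream(body), body.length);
        System.out.println(new String(read, StandardCharsets.UTF_8).length());
    }
}
```

In the servlet, the same helper could be called with request.getInputStream() and request.getContentLength(), followed by decoding the result with an explicit UTF-8 charset.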