DataInputStream and UTF-8

Question

I'm kind of a new programmer, and I'm having a couple of problems with the code I'm handling.

Basically what the code does is receive a form from another JSP, read the bytes, parse the data, and submit the results to SalesForce, using DataInputStream.

 // Getting the parameters from the request
 String contentType = request.getContentType();
 DataInputStream in = new DataInputStream(request.getInputStream());
 int formDataLength = request.getContentLength();

 //System.out.println(formDataLength);
 byte dataBytes[] = new byte[formDataLength];
 int byteRead = 0;
 int totalBytesRead = 0;
 while (totalBytesRead < formDataLength)
 {
  // Request only the bytes still missing; asking for formDataLength
  // again would overrun the array after a partial read.
  byteRead = in.read(dataBytes, totalBytesRead, formDataLength - totalBytesRead);
  if (byteRead == -1) break; // stream ended before the declared length
  totalBytesRead += byteRead;
 }

It works fine, but only as long as the input contains plain ASCII characters. Whenever it has to handle special characters (such as the French characters àâäæçéèêëîïôùûü), I get the following gibberish as a result:

ÃÃÃâÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃÃà ¨ÃªÃ«ÃÃÃÃÃÃÃÃÃÃ¥Ã

à âäæçéèêëîïôùûü

I understand it could be an issue of DataInputStream, and how it doesn't return UTF-8 encoded text. Do you guys offer any suggestions on how to tackle this issue?

All the .jsp files include <%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8"%> and Tomcat's settings are fine (URI = UTF-8, etc). I tried adding:

request.setCharacterEncoding("UTF-8");

response.setCharacterEncoding("UTF-8");

to no avail.

Here's an example of how it parses the data:

 // Getting the notes for the Case
 String notes = new String(dataBytes);
 System.out.println(notes);
 String savenotes = casetype.substring(notes.indexOf("notes"));
 //savenotes = savenotes.substring(savenotes.indexOf("\n"), savenotes.indexOf("---"));
 savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
 savenotes = savenotes.substring(savenotes.indexOf("\n")+1);
 savenotes = savenotes.substring(0,savenotes.indexOf("name=\"datafile"));
 savenotes = savenotes.substring(0,savenotes.lastIndexOf("\n------"));
 savenotes = savenotes.trim();

Thanks in advance.

Recommended answer

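The problem is not in the input streams, since they don't handle characters, only bytes. Your problem is at the point where you convert those bytes to characters. Without an explicit charset, new String(byte[]) decodes using the platform default charset, which on your server is evidently not UTF-8. In this particular case, you need to specify the proper encoding in the String constructor.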

String notes = new String(dataBytes, "UTF-8");
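To illustrate why the charset argument matters, here is a minimal, self-contained sketch. The class and method names (DecodeDemo, decodeUtf8, decodeLatin1) are hypothetical and not part of the original code, and ISO-8859-1 is only an assumed stand-in for whatever default charset the server actually uses. It decodes the same UTF-8 bytes twice: once correctly, and once with a single-byte charset, which yields one garbage character per byte, much like the output in the question.

```java
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Decodes raw bytes as UTF-8, independent of the platform default charset.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Simulates the mistake: UTF-8 bytes decoded with a single-byte charset
    // (ISO-8859-1 is an assumption made for this illustration).
    static String decodeLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00e9\u00e0\u00fc"; // the characters é, à, ü
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Each of these characters takes two bytes in UTF-8.
        System.out.println(utf8.length);

        // Decoding with the matching charset restores the text.
        System.out.println(decodeUtf8(utf8).equals(original));

        // Decoding with the wrong charset turns every byte into its own character.
        System.out.println(decodeLatin1(utf8).length());
    }
}
```

Passing a Charset constant rather than the string "UTF-8" has the same effect and avoids the checked UnsupportedEncodingException.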



See also:

  • Unicode - How to get characters right?

By the way, the DataInputStream adds no value in this particular code snippet. You can just keep it as a plain InputStream.
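As a rough sketch of that suggestion, the request body can be read from a plain InputStream and then decoded explicitly. The names BodyReader and readBody are hypothetical, and a ByteArrayInputStream stands in for request.getInputStream() so that the example runs outside a servlet container.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BodyReader {
    // Reads up to `length` bytes from `in`; returns fewer if the stream ends early.
    static byte[] readBody(InputStream in, int length) throws IOException {
        byte[] data = new byte[length];
        int total = 0;
        while (total < length) {
            // Only ask for the bytes that are still missing.
            int n = in.read(data, total, length - total);
            if (n == -1) {
                break; // end of stream before the declared length
            }
            total += n;
        }
        return total == length ? data : Arrays.copyOf(data, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "notes=caf\u00e9".getBytes(StandardCharsets.UTF_8);
        // Stand-in for request.getInputStream() (assumption for this demo).
        InputStream in = new ByteArrayInputStream(body);

        // Bytes become characters only here, with the charset stated explicitly.
        String text = new String(readBody(in, body.length), StandardCharsets.UTF_8);
        System.out.println(text.equals("notes=caf\u00e9"));
    }
}
```

No DataInputStream wrapper is needed because only read(byte[], int, int) is used, and that method is already defined on InputStream.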
