How to read an InputStream with UTF-8?
Question
Hi everyone,
I'm developing a Java app that calls a PHP script over the internet, and the script returns an XML response.
The response contains the word "Próximo", but when I parse the XML nodes and read the value into a String variable, I get "Pr&oacute;ximo" instead.
I'm sure the problem is that the Java app uses a different encoding from the PHP script. I suppose I must set the encoding to the same one the PHP script uses for its XML, UTF-8.
This is the code I'm using to get the XML file from the PHP script.
What should I change in this code to set the encoding to UTF-8? (Note that I'm not using a BufferedReader; I'm using an InputStream.)
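import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
// The statements below belong inside a method.
InputStream in = null;
String url = "http://www.myurl.com";
try {
    URL formattedUrl = new URL(url);
    URLConnection connection = formattedUrl.openConnection();
    HttpURLConnection httpConnection = (HttpURLConnection) connection;
    httpConnection.setAllowUserInteraction(false);
    httpConnection.setInstanceFollowRedirects(true);
    httpConnection.setRequestMethod("GET");
    httpConnection.connect();
    if (httpConnection.getResponseCode() == HttpURLConnection.HTTP_OK) {
        in = httpConnection.getInputStream();
        // Parse only when a stream was actually obtained, so in is never null here.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(in);
        doc.getDocumentElement().normalize();
        NodeList myNodes = doc.getElementsByTagName("myNode");
    }
} catch (Exception e) {
    // The original snippet ends before its catch block; this is a minimal placeholder.
    e.printStackTrace();
}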
Recommended answer
When you get your InputStream, read byte[]s from it. When you create your Strings, pass in the Charset for "UTF-8". Example:
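// contentLength is assumed to be defined elsewhere (the expected size in bytes).
byte[] buffer = new byte[contentLength];
// read() may fill only part of the buffer, and returns -1 at end of stream.
int bytesRead = inputStream.read(buffer);
String page = new String(buffer, 0, bytesRead, "UTF-8");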
Note that you're probably going to want to make your buffer some sane size (like 1024) and call inputStream.read(buffer) repeatedly until the stream is exhausted.
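The repeated-read approach described above might look like the following sketch. The class name Utf8StreamReader and the helper readAllAsUtf8 are hypothetical names chosen for illustration, and the ByteArrayInputStream in main only stands in for the HTTP response stream. The sketch collects every chunk first and decodes once at the end, on the assumption that the whole response fits in memory; decoding each 1024-byte chunk separately could split a multi-byte UTF-8 character across two chunks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class Utf8StreamReader {

    // Hypothetical helper: drains the stream in 1024-byte chunks, then decodes once.
    static String readAllAsUtf8(InputStream inputStream) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        // read() returns -1 at end of stream and may fill only part of the buffer.
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            collected.write(buffer, 0, bytesRead);
        }
        // Decode after all bytes are collected so no multi-byte character is split.
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the network stream: the UTF-8 bytes of "Próximo".
        byte[] sample = "Pr\u00f3ximo".getBytes(StandardCharsets.UTF_8);
        String text = readAllAsUtf8(new ByteArrayInputStream(sample));
        System.out.println(text.equals("Pr\u00f3ximo")); // prints "true"
    }
}
```

In the question's code, the stream returned by httpConnection.getInputStream() would be passed to the helper in place of the sample stream.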
@Amir Pashazadeh
Yes, you can also use an InputStreamReader; try changing the parse() line to:
Document doc = db.parse(new InputSource(new InputStreamReader(in, "UTF-8")));
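To try the InputStreamReader variant without a network call, something like the sketch below could be used. The class name Utf8XmlParse, the helper firstNodeText, and the sample XML are assumptions made for illustration; only the parse() line and the "myNode" tag name come from the thread. It uses StandardCharsets.UTF_8 instead of the "UTF-8" string, which is equivalent here and avoids the checked UnsupportedEncodingException.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class Utf8XmlParse {

    // Hypothetical helper: parses the stream as UTF-8 text and returns the
    // text content of the first element with the given tag name.
    static String firstNodeText(InputStream in, String tagName) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        // The reader decodes the bytes as UTF-8 before the parser sees them.
        Document doc = db.parse(new InputSource(new InputStreamReader(in, StandardCharsets.UTF_8)));
        doc.getDocumentElement().normalize();
        return doc.getElementsByTagName(tagName).item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document standing in for the PHP script's response.
        String xml = "<root><myNode>Pr\u00f3ximo</myNode></root>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(firstNodeText(in, "myNode").equals("Pr\u00f3ximo")); // prints "true"
    }
}
```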