Decode a string encoded in UTF-8 format in Android
I have a string that arrives in an XML document and contains German text. The German-specific characters are encoded as UTF-8, and I need to decode the string before displaying it.
I have tried the following:
try {
    BufferedReader in = new BufferedReader(
            new InputStreamReader(
                    new ByteArrayInputStream(nodevalue.getBytes()), "UTF8"));
    event.attributes.put("title", in.readLine());
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
I have also tried this:
try {
    event.attributes.put("title", URLDecoder.decode(nodevalue, "UTF-8"));
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
Neither of them works. How do I decode the German string?
Thank you in advance.
UPDATE:
@Override
public void characters(char[] ch, int start, int length)
        throws SAXException {
    // TODO Auto-generated method stub
    super.characters(ch, start, length);
    if (nodename != null) {
        String nodevalue = String.copyValueOf(ch, 0, length);
        if (nodename.equals("startdat")) {
            if (event.attributes.get("eventid").equals("187")) {
            }
        }
        if (nodename.equals("startscreen")) {
            imageaddress = nodevalue;
        }
        else {
            if (nodename.equals("title")) {
                // try {
                //     BufferedReader in = new BufferedReader(
                //             new InputStreamReader(
                //                     new ByteArrayInputStream(nodevalue.getBytes()), "UTF8"));
                //     event.attributes.put("title", in.readLine());
                // } catch (UnsupportedEncodingException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // } catch (IOException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // }
                // try {
                //     event.attributes.put("title",
                //             URLDecoder.decode(nodevalue, "UTF-8"));
                // } catch (UnsupportedEncodingException e) {
                //     // TODO Auto-generated catch block
                //     e.printStackTrace();
                // }
                event.attributes.put("title", StringEscapeUtils
                        .unescapeHtml(new String(ch, start, length).trim()));
            } else
                event.attributes.put(nodename, nodevalue);
        }
    }
}
You could use the String constructor with the charset parameter:
try
{
    final String s = new String(nodevalue.getBytes(), "UTF-8");
}
catch (UnsupportedEncodingException e)
{
    Log.e("utf8", "conversion", e);
}
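As a minimal sketch of the constructor above: decoding is only meaningful on the raw bytes as they arrived, because calling getBytes() with no argument on a String that has already been decoded re-encodes it with the platform default charset. The class name Utf8DecodeSketch and the sample bytes are assumptions made for illustration, and StandardCharsets is assumed to be available (Java 7+, Android API 19+).

```java
import java.nio.charset.StandardCharsets;

class Utf8DecodeSketch {
    /** Decodes raw UTF-8 bytes into a String. */
    static String decodeUtf8(byte[] raw) {
        // The Charset overload does not throw UnsupportedEncodingException.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample input (assumed): "Grüße" as UTF-8 bytes, ü = C3 BC, ß = C3 9F.
        byte[] raw = {'G', 'r', (byte) 0xC3, (byte) 0xBC, (byte) 0xC3, (byte) 0x9F, 'e'};
        System.out.println(decodeUtf8(raw));
    }
}
```

If the text already looks wrong once it is a String, the damage usually happened earlier, when the bytes were first turned into characters, so that is the place to specify UTF-8.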
Also, since you get the data from an XML document, which I assume is encoded as UTF-8, the problem is probably in how it is parsed.
You should use InputStream/InputSource instead of an XMLReader implementation, because it carries the encoding with it. So if you are getting this data from an HTTP response, you could either use both InputStream and InputSource:
try
{
    HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    Reader reader = new InputStreamReader(in, "UTF-8");
    InputSource is = new InputSource(reader);
    is.setEncoding("UTF-8");
    parser.parse(is, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
or just the InputStream:
try
{
    HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    parser.parse(in, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
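To try the InputStream variant without a network call, the following sketch feeds the parser an in-memory byte stream in place of the HTTP entity content. The class name SaxUtf8Sketch, the sample document, and the anonymous handler (standing in for XmlHandler) are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class SaxUtf8Sketch {
    /** Parses the stream and returns all character data, trimmed. */
    static String parseText(InputStream in) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser has already decoded the bytes into chars here.
                text.append(ch, start, length);
            }
        });
        return text.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (assumed); the declaration tells the parser the encoding.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><title>Gr\u00fc\u00dfe</title>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.println(parseText(in));
    }
}
```

Because the parser reads raw bytes and decodes them itself, the handler receives correctly decoded characters and no further UTF-8 conversion is needed in characters().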
Update 1
Here is a sample of a complete request and response handling:
try
{
    final DefaultHttpClient client = new DefaultHttpClient();
    final HttpPost httppost = new HttpPost("http://example.location.com/myxml");
    final HttpResponse response = client.execute(httppost);
    final HttpEntity entity = response.getEntity();
    final InputStream in = entity.getContent();
    final SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
    final XmlHandler handler = new XmlHandler();
    parser.parse(in, handler);
    //TODO: get the data from your handler
}
catch (final Exception e)
{
    Log.e("ParseError", "Error parsing xml", e);
}
Update 2
Since the problem is not the encoding but that the source XML has been escaped to HTML entities, the best solution (besides correcting the PHP so that it does not escape the response) is to use the very handy static StringEscapeUtils class from the apache.commons.lang library.
After importing the library, put the following in your XML handler's characters method:
@Override
public void characters(final char[] ch, final int start, final int length)
        throws SAXException
{
    // This variable will hold the correct unescaped value
    final String elementValue = StringEscapeUtils.
            unescapeHtml(new String(ch, start, length).trim());
    [...]
}
Update 3
In your latest code, the problem is the initialization of the nodevalue variable, which is copied from offset 0 instead of start and is never unescaped. It should be:
String nodevalue = StringEscapeUtils.unescapeHtml(
        new String(ch, start, length).trim());
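The sketch below shows the unescaping step end to end, under some assumptions: the server is taken to send entities such as &amp;uuml; (so the parser hands back the literal text &uuml;), and unescapeGermanEntities is a hypothetical, deliberately tiny stand-in for StringEscapeUtils.unescapeHtml so that the example runs without the commons-lang dependency. It also buffers the text before unescaping, because SAX is allowed to deliver one element's text in several characters() calls, which could split an entity across calls.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

class EntityUnescapeSketch {
    // Hypothetical stand-in for StringEscapeUtils.unescapeHtml:
    // it only knows a handful of German entities.
    private static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&auml;", "\u00e4");
        ENTITIES.put("&ouml;", "\u00f6");
        ENTITIES.put("&uuml;", "\u00fc");
        ENTITIES.put("&Auml;", "\u00c4");
        ENTITIES.put("&Ouml;", "\u00d6");
        ENTITIES.put("&Uuml;", "\u00dc");
        ENTITIES.put("&szlig;", "\u00df");
    }

    static String unescapeGermanEntities(String s) {
        String out = s;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        return out;
    }

    /** Parses the XML, collects its character data, then unescapes it. */
    static String parseAndUnescape(String xml) throws Exception {
        final StringBuilder buffer = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        // Read from start, not 0: ch is the parser's shared buffer.
                        buffer.append(ch, start, length);
                    }
                });
        // Unescape only after all chunks have been collected.
        return unescapeGermanEntities(buffer.toString().trim());
    }

    public static void main(String[] args) throws Exception {
        // Sample input (assumed): entities escaped by the server before being put into the XML.
        System.out.println(parseAndUnescape("<title>Gr&amp;uuml;&amp;szlig;e</title>"));
    }
}
```

In a real handler the buffer would typically be reset in startElement and read in endElement, with the library's unescapeHtml replacing the hypothetical helper.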