JSON Jackson + HTTPClient with German umlauts

I have a problem with a JSON string that I fetch with the Apache HTTP client and that contains German umlauts.

The mapping of the JSON string only works if the string does not contain any German umlauts; otherwise I get a "JsonMappingException: Can not deserialize instance of [...] out of START_ARRAY".

The Apache HTTP client sends "Accept-Charset: utf-8" and uses HTTP.UTF_8 as its content charset, but in the result I always get e.g. "\u00fc" instead of "ü". When I manually replace e.g. "\u00fc" with "ü", the mapping works perfectly.

How can I get a UTF-8 encoded JSON response from the Apache HTTP client? Or is the server output the problem?

params.setParameter(HttpProtocolParams.USE_EXPECT_CONTINUE, false);
HttpProtocolParams.setVersion(params, HttpVersion.HTTP_1_1);
HttpProtocolParams.setContentCharset(params, HTTP.UTF_8);
httpclient = new DefaultHttpClient(params);
HttpGet httpGetContentLoad = new HttpGet(url);
httpGetContentLoad.setHeader("Accept-Charset", "utf-8");
httpGetContentLoad.setParams(params);
response = httpclient.execute(httpGetContentLoad);
entity = response.getEntity();
String loadedContent = null;
if (entity != null)
{
   loadedContent = EntityUtils.toString(entity, HTTP.UTF_8);
   entity.consumeContent();
}
if (HttpStatus.SC_OK != response.getStatusLine().getStatusCode())
{
    throw new Exception("Loading content failed");
}
closeConnection();
return loadedContent;

The JSON is then mapped here:

String jsonMetaData = loadGetRequestContent(getLatestEditionUrl(newspaperEdition));
Newspaper loadedNewspaper = mapper.readValue(jsonMetaData, Newspaper.class);
loadedNewspaper.setEdition(newspaperEdition);

Update 1: jsonMetaData is a String containing the fetched JSON.

Update 2:

This is the code I use to transform the JSON output into the shape I need:

public static String convertJsonLatestEditionMeta(String jsonCode)
{
    jsonCode = jsonCode.replaceFirst("\\[\"[A-Za-z0-9-[:blank:]]+\",\\{", "{\"edition\":\"an-a1\",");
    jsonCode = jsonCode.replaceFirst("\"pages\":\\{", "\"pages\":\\[");
    jsonCode = Helper.replaceLast(jsonCode, "}}}]", "}]}");
    jsonCode = jsonCode.replaceAll("\"[\\d]*\"\\:\\{\"", "\\{\"");
    return jsonCode;
}

Update 3: JSON conversion example:

JSON before conversion:

["Newspaper title",
    {
        "date":"20130103",
        "pages":
        {
            "1":{"ressort":"ressorttitle1","pdfpfad":"pathToPdf1","number":1,"size":281506},
            "2":{"ressort":"ressorttitle2","pdfpfad":"pathToPdf2","number":2,"size":281533},
            [...]
        }
    }
]

JSON after conversion:

{
    "edition":"Newspaper title",
    "date":"20130103",
    "pages":
    [
        {"ressort":"Resorttitle1","pdfpfad":"pathToPdf1","number":1,"size":281506},
        {"ressort":"Resorttitle2","pdfpfad":"pathToPdf2","number":2,"size":281533},
        [...]
    ]
}

Solution: I started using GSON as @Boris suggested, and the umlaut problem is gone! Furthermore, GSON really seems to be faster than Jackson.

A workaround would be to replace the characters manually, following this table:

Sign        Unicode representation

Ä, ä        \u00c4, \u00e4
Ö, ö        \u00d6, \u00f6
Ü, ü        \u00dc, \u00fc
ß           \u00df
€           \u20ac
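
As a quick cross-check of the table, the escape for a character can be derived from its code point. The following is a JDK-only sketch; the class name EscapeTable and the method escape are hypothetical names chosen for illustration, and the method only covers characters in the Basic Multilingual Plane (which includes every sign listed above).

```java
final class EscapeTable {
    // Formats a single char as a backslash, 'u' and four lowercase hex digits.
    static String escape(char c) {
        return String.format("\\u%04x", (int) c);
    }

    public static void main(String[] args) {
        // The signs from the table, written as escapes so the source file stays ASCII.
        char[] signs = {'\u00c4', '\u00e4', '\u00d6', '\u00f6', '\u00dc', '\u00fc', '\u00df', '\u20ac'};
        for (char sign : signs) {
            System.out.println(sign + "\t" + escape(sign));
        }
    }
}
```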

Answer

Try parsing like this:

entity = response.getEntity();
Newspaper loadedNewspaper = mapper.readValue(entity.getContent(), Newspaper.class);

There is no reason to go through a String; Jackson parses InputStreams directly. Jackson will also detect the encoding automatically if you use this approach.
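
To illustrate why the charset chosen when building the intermediate String matters, here is a small JDK-only sketch (Java 9+ for readAllBytes). It does not use Jackson or HttpClient: a ByteArrayInputStream stands in for entity.getContent(), and the class name CharsetDemo and the sample JSON bodies are made up for the example. It also shows that an escaped umlaut is plain ASCII, so it decodes identically whatever charset is used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetDemo {
    // Reads the whole stream and decodes its bytes with the given charset.
    static String decode(InputStream in, Charset charset) {
        try {
            return new String(in.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Raw umlaut: encoded as two bytes (0xC3 0xBC) in UTF-8.
        byte[] rawBody = "{\"ressort\":\"K\u00fcche\"}".getBytes(StandardCharsets.UTF_8);
        // Escaped umlaut: a backslash, 'u' and four hex digits, all plain ASCII.
        byte[] escapedBody = "{\"ressort\":\"K\\u00fcche\"}".getBytes(StandardCharsets.UTF_8);

        // Correct charset: the umlaut survives.
        System.out.println(decode(new ByteArrayInputStream(rawBody), StandardCharsets.UTF_8));
        // Wrong charset: the two UTF-8 bytes turn into two unrelated characters.
        System.out.println(decode(new ByteArrayInputStream(rawBody), StandardCharsets.ISO_8859_1));
        // Escaped form: unaffected by the charset, the escape is still in the text.
        System.out.println(decode(new ByteArrayInputStream(escapedBody), StandardCharsets.ISO_8859_1));
    }
}
```

Handing the stream to the parser avoids picking a charset by hand at all; the escaped form in the last line is something the JSON parser, not the charset decoder, has to resolve.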

EDIT By the way, consider using the GSON JSON parsing library. It is even faster than Jackson and easier to use. However, Jackson recently started parsing XML too, which is a point in its favor.

EDIT2 Given the details you have added, I suspect the problem lies in the server-side implementation of the service: umlauts do not need to be Unicode-escaped in the JSON, since UTF-8 is its native encoding. Instead of replacing e.g. "\u00fc" with "ü" manually, why not do it via regex?
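
A regex-based replacement along those lines could look roughly like the sketch below. It uses only the JDK; the class name UnicodeUnescaper and the method unescape are hypothetical, and the approach is deliberately naive: it assumes every backslash-u sequence in the text is a real escape and that no decoded character (such as a double quote) needs to stay escaped for the JSON to remain valid.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnicodeUnescaper {
    // Matches a literal backslash, 'u' and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Replaces every matched escape with the character it encodes.
    static String unescape(String input) {
        Matcher m = ESCAPE.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and '\' in the decoded text literal.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample input with an escaped umlaut, as the server sends it.
        String raw = "{\"ressort\":\"K\\u00fcche\"}";
        System.out.println(unescape(raw));
    }
}
```

Because of the limitations named above, this is better treated as a stopgap for known input than as a general JSON preprocessing step.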
