Apache HttpClient - POST request to ETools.ch with UTF-8 chars in the query

Question

The code works fine as long as the query contains no non-ASCII (UTF-8) characters. As soon as there is one, ETools returns results I do not expect. For example, for "trees" I get correct results, but for "bäume" (the German word for trees) I get strange results. It looks as if ETools receives the query as "b%C3%A4ume" and searches for exactly that string, with exactly those characters, rather than for "bäume". I think the problem might be solved by setting some header parameters, but I don't know which parameters are possible there.

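import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import org.apache.http.NameValuePair;
import org.apache.http.client.HttpClient;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;

String query = "some+query+with+utf8+chars";

HttpClient client = new DefaultHttpClient();
HttpPost request = new HttpPost();

// Form fields sent in the POST body
List<NameValuePair> parameters = new ArrayList<NameValuePair>();
parameters.add(new BasicNameValuePair("query", query));
parameters.add(new BasicNameValuePair("country", "web"));
parameters.add(new BasicNameValuePair("language", "all"));
parameters.add(new BasicNameValuePair("dataSourceResults", String.valueOf(40)));
parameters.add(new BasicNameValuePair("pageResults", String.valueOf(40)));
// Encode the fields as a form body, percent-encoding the values as UTF-8
request.setEntity(new UrlEncodedFormEntity(parameters, "UTF-8"));
request.setHeader("Content-Type", "application/x-www-form-urlencoded");
// setURI expects a java.net.URI, not a String
request.setURI(URI.create("http://www.etools.ch/searchAdvancedSubmit.do?page=2"));

// MyResponse and myResponseHandler are the asker's own types (not shown)
MyResponse myResponse = client.execute(request, myResponseHandler);

// Release the request and the connection resources
request.reset();
client.getConnectionManager().shutdown();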
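
For reference, the string "b%C3%A4ume" mentioned in the question is simply the UTF-8 percent-encoding of "bäume". The sketch below uses only JDK classes (Java 10 or later is assumed for the Charset overloads) to show how that string comes about and what a receiver might see if it decodes with a different charset or if the value gets encoded twice. The class and method names are made up for illustration, and nothing here claims to reproduce what ETools actually does on its side.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustration only: class and method names are hypothetical.
public class PercentEncodingDemo {

    // Percent-encode a form value using its UTF-8 bytes.
    public static String encodeUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Percent-encode a form value using its ISO-8859-1 (Latin-1) bytes.
    public static String encodeLatin1(String value) {
        return URLEncoder.encode(value, StandardCharsets.ISO_8859_1);
    }

    // Decode a percent-encoded value, interpreting the bytes as Latin-1.
    public static String decodeLatin1(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "b\u00e4ume"; // "bäume", written with an escape to avoid source-encoding issues

        // UTF-8: the character U+00E4 becomes the two bytes C3 A4
        System.out.println(encodeUtf8(word));             // b%C3%A4ume

        // Latin-1: the same character is the single byte E4
        System.out.println(encodeLatin1(word));           // b%E4ume

        // Encoding an already encoded value escapes the '%' signs as well
        System.out.println(encodeUtf8(encodeUtf8(word))); // b%25C3%25A4ume

        // UTF-8 bytes decoded as Latin-1 give 'b', U+00C3, U+00A4, "ume"
        System.out.println(decodeLatin1("b%C3%A4ume"));
    }
}
```

If the server ends up searching for the literal text "b%C3%A4ume", one thing that may be worth checking is whether the query value is already percent-encoded before it is handed to UrlEncodedFormEntity, since that class encodes the values itself.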

Recommended answer

At a minimum, you should add your charset to the Content-Type header (the default is latin1):

request.setHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
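
To see the whole request shape in one place without depending on HttpClient, here is a rough sketch that builds (but does not send) an equivalent POST with the JDK's own java.net.http API, which is not what the question uses. It assumes Java 11 or later; the class name, helper names, and the reduced parameter set are invented for illustration, and whether ETools honours the charset parameter is not something this snippet can confirm.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Illustration only: class and method names are hypothetical.
public class Utf8FormPost {

    static final String FORM_UTF8 = "application/x-www-form-urlencoded; charset=UTF-8";

    // Build "name=value&name=value", percent-encoding each part as UTF-8 exactly once.
    public static String formBody(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            body.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Build the POST request; rawQuery is the plain, not yet encoded search text.
    public static HttpRequest buildRequest(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("query", rawQuery);
        params.put("country", "web");
        params.put("language", "all");

        return HttpRequest.newBuilder(URI.create("http://www.etools.ch/searchAdvancedSubmit.do?page=2"))
                // Declare the charset used to encode the body
                .header("Content-Type", FORM_UTF8)
                .POST(HttpRequest.BodyPublishers.ofString(formBody(params), StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("b\u00e4ume"); // "bäume"
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Content-Type").orElse("(none)"));
    }
}
```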

If that doesn't work, it could be a server bug. You may want to try submitting the form as multipart/form-data (RFC 2388) instead of URL encoded. There is already a StackOverflow answer with an example that you can use.
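
As a rough idea of what a multipart/form-data body looks like on the wire, the sketch below assembles one by hand with plain JDK classes. In this format the values are sent as raw text rather than percent-encoded, and each part can state its own charset. The class name, method name, and boundary string are hypothetical; the sketch assumes ASCII field names and a boundary that does not occur in any value, and it is not known whether ETools accepts multipart submissions. In practice a library helper would normally build this for you.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: class and method names are hypothetical.
public class MultipartSketch {

    // Assemble a multipart/form-data body with one text part per field.
    public static String multipartBody(Map<String, String> fields, String boundary) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("--").append(boundary).append("\r\n");
            sb.append("Content-Disposition: form-data; name=\"").append(e.getKey()).append("\"\r\n");
            // Each part declares its own charset; the value itself is not percent-encoded
            sb.append("Content-Type: text/plain; charset=UTF-8\r\n");
            sb.append("\r\n");
            sb.append(e.getValue()).append("\r\n");
        }
        // The closing delimiter carries two extra trailing hyphens
        sb.append("--").append(boundary).append("--\r\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("query", "b\u00e4ume"); // "bäume"
        fields.put("country", "web");

        String boundary = "----SketchBoundary12345"; // arbitrary, must not appear in any value
        byte[] body = multipartBody(fields, boundary).getBytes(StandardCharsets.UTF_8);

        // The request would then carry this header together with the bytes above
        System.out.println("Content-Type: multipart/form-data; boundary=" + boundary);
        System.out.println(body.length + " bytes");
    }
}
```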
