Reading the content of a web page

Views: 180 Published: 2016/11/19 14:20:06 Tags: java character-encoding inputstreamreader


Question

Hi, I want to read the content of a web page that contains German characters using Java. Unfortunately, the German characters appear as strange characters. Any help, please? Here is my code:

String link = "some german link";

URL url = new URL(link);
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String inputLine;
while ((inputLine = in.readLine()) != null) {
    System.out.println(inputLine);
}

Recommended answer

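You have to set the correct encoding. An InputStreamReader created without an explicit charset decodes bytes with the platform default, which may not match the page. You can find the page's encoding in the HTTP header: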

Content-Type: text/html; charset=ISO-8859-1
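
As a minimal sketch of that idea, the following reads the charset parameter out of a Content-Type value and passes it to InputStreamReader explicitly. The class name CharsetAwareReader and both helper methods are hypothetical names chosen for illustration, and the UTF-8 fallback is an assumption, not something the answer prescribes. The example decodes an in-memory byte array so that it runs without network access.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetAwareReader {

    // Extracts the charset parameter from a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Returns the fallback if the header
    // is missing, has no charset parameter, or names an unknown charset.
    public static Charset charsetFromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // malformed or unsupported charset name
                }
            }
        }
        return fallback;
    }

    // Reads the whole stream as text, decoding the bytes with the given charset
    // instead of the platform default. Each line is terminated with '\n'.
    public static String readAll(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed fallback: UTF-8 when the header does not name a charset.
        Charset cs = charsetFromContentType("text/html; charset=ISO-8859-1", StandardCharsets.UTF_8);
        // "Gruesse" with u-umlaut and sharp s, encoded as ISO-8859-1 bytes.
        byte[] body = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.ISO_8859_1);
        System.out.print(readAll(new ByteArrayInputStream(body), cs));
    }
}
```

With a real page, the header value and the stream would come from the connection instead, for example via url.openConnection() followed by getContentType() and getInputStream() on the returned URLConnection.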

This may be overridden in the (X)HTML document itself; see "Character encodings in HTML" (http://en.wikipedia.org/wiki/Character_encodings_in_HTML).
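
To illustrate how a charset declared inside the document could be detected, here is a deliberately simplified sketch that scans the first bytes of a page for a meta tag carrying a charset. The class MetaCharsetSniffer and its sniff method are hypothetical names; the regular expression is an assumption that covers only the two common meta forms and is not the full detection algorithm that browsers implement. It also assumes the markup itself is ASCII-compatible, so it would not work for UTF-16 pages.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaCharsetSniffer {

    // Matches charset=... inside a <meta ...> tag. This covers both
    // <meta charset="utf-8"> and the http-equiv="Content-Type" form.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta[^>]*?charset\\s*=\\s*[\"']?([A-Za-z0-9_.:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Looks for a charset declaration in the leading bytes of an HTML document.
    // Returns the fallback if no declaration is found or the name is unknown.
    public static Charset sniff(byte[] head, Charset fallback) {
        // ISO-8859-1 maps every byte to exactly one char, so the ASCII markup
        // can be searched before the real encoding is known.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(text);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                return fallback; // malformed or unsupported charset name
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        byte[] head = "<html><head><meta charset=\"utf-8\"></head>"
                .getBytes(StandardCharsets.US_ASCII);
        // The fallback here stands in for the charset taken from the HTTP header.
        System.out.println(sniff(head, StandardCharsets.ISO_8859_1));
    }
}
```

In practice one would buffer the start of the response, run a check like this, and then decode the whole body with the resulting charset.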

I can imagine that you have to consider many additional issues to parse a web page without errors. However, there are several HTTP client libraries available for Java, e.g. org.apache.httpcomponents. The code will look like this:

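import java.io.IOException;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet httpGet = new HttpGet("http://www.spiegel.de");

try
{
  HttpResponse response = httpclient.execute(httpGet);
  HttpEntity entity = response.getEntity();
  if (entity != null)
  {
    // Decodes the body with the charset declared by the entity (ISO-8859-1 if none)
    System.out.println(EntityUtils.toString(entity));
  }
}
catch (ClientProtocolException e) {e.printStackTrace();}
catch (IOException e) {e.printStackTrace();}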

This is the Maven artifact:

<dependency>
  <groupId>org.apache.httpcomponents</groupId>
  <artifactId>httpclient</artifactId>
  <version>4.1.1</version>
  <type>jar</type>
  <scope>compile</scope>
</dependency>
