How can I detect the charset of a web page?

Views: 86  Published: 2021/4/21 20:24:43  Tags: java, encoding, character-encoding, webpage

Question

I want to fetch the source of a web page in Java and decode it with the correct character encoding. I can already fetch the content, but for some pages it comes back with garbled characters, so I need to detect the page's charset.

From a bit of research I found that the jChardet library can do this, but I couldn't import it into my project. Can someone please help?

By the way, this is the code I use to read the web page content:

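  // Needs java.io.BufferedReader and java.io.InputStreamReader; fURL is a java.net.URL
  StringBuilder builder = new StringBuilder();
  // try-with-resources closes the reader and the underlying stream even if reading fails
  try (BufferedReader buffer = new BufferedReader(
      new InputStreamReader(fURL.openStream(), encodingType))) {
    int charRead; // read() returns one decoded char (or -1 at end of stream), not a byte
    while ((charRead = buffer.read()) != -1) {
      builder.append((char) charRead);
    }
  }

  return builder;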
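
For context, one common way to obtain a value for encodingType without a detection library is to use the charset the server declares in the HTTP Content-Type header (or that the page declares in a meta tag). The following is only a minimal sketch of that idea, not jChardet and not statistical detection. The names CharsetSniffer, sniff and readPage are hypothetical, UTF-8 as the fallback is an assumption, and readAllBytes assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: picks up a *declared* charset, it does not guess from the bytes.
public class CharsetSniffer {
    // Matches "charset=..." as found in a Content-Type header or a <meta> tag
    private static final Pattern CHARSET =
        Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in text, or fallback if none is named or it is unsupported. */
    public static Charset sniff(String text, Charset fallback) {
        if (text == null) {
            return fallback;
        }
        Matcher m = CHARSET.matcher(text);
        if (m.find()) {
            try {
                return Charset.forName(m.group(1));
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: fall through to the fallback
            }
        }
        return fallback;
    }

    /** Reads a page, decoding it with the charset from the Content-Type header (UTF-8 assumed otherwise). */
    public static String readPage(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        Charset cs = sniff(conn.getContentType(), StandardCharsets.UTF_8);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), cs); // readAllBytes requires Java 9+
        }
    }
}
```

When neither the header nor the page declares a charset, a declared-charset approach like this cannot help, and a byte-level detector such as the jChardet library mentioned above would be needed instead.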
