How to set encoding when using HtmlAgilityPack.HtmlDocument.LoadHtml

Views: 49 Published: 2019/6/13 8:47:45 C# C#5


Problem description

I already have the source of the HTML page, so I am using:

string html_page_source="some page source crawled before";
HtmlDocument hdMyDoc = new HtmlDocument();
hdMyDoc.LoadHtml(html_page_source);

However, I see undecoded characters such as:

  
içerisinde 
göründüğünden çok
.
.

So how can I make HtmlDocument decode these automatically?

How can I set a default encoding to solve this problem?

And would the method below be good practice?

hdMyDoc.LoadHtml(HttpUtility.HtmlDecode(html_page_source));

C#, .NET 4.5 (latest), WPF application

Recommended answer

The Html Agility Pack is equipped with a utility class called HtmlEntity. It has a static method with the following signature:

/// <summary>
/// Replace known entities by characters.
/// </summary>
/// <param name="text">The source text.</param>
/// <returns>The result text.</returns>
public static string DeEntitize(string text)

It supports well-known named entities (like &nbsp;) as well as numerically encoded characters such as &#039;.
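As a rough illustration of that method, the sketch below calls HtmlEntity.DeEntitize on a literal string. The sample text, the class name DeEntitizeSketch, and the Decode wrapper are made up for this example; only HtmlEntity.DeEntitize comes from the library, and the HtmlAgilityPack package is assumed to be referenced by the project.

```csharp
using System;
using HtmlAgilityPack;

public static class DeEntitizeSketch
{
    // Hypothetical helper: a thin wrapper around the library call.
    public static string Decode(string text)
    {
        return HtmlEntity.DeEntitize(text);
    }

    public static void Main()
    {
        // Made-up sample: one numeric reference (&#039;) and one named entity (&nbsp;).
        string encoded = "it&#039;s&nbsp;here";
        string decoded = Decode(encoded);

        // &#039; should become an apostrophe and &nbsp; a no-break space (U+00A0).
        Console.WriteLine(decoded.Contains("'"));
        Console.WriteLine(decoded.Contains("\u00A0"));
    }
}
```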

Once you've extracted a string from the document, use this method to convert the HTML-encoded entities back to text characters.

Don't HTML-decode the source before loading the document: an escaped sequence such as &lt;b&gt; would turn into a real tag, and you would completely change the meaning of the markup.
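The two points above can be sketched as follows, under a few assumptions. The sample markup, the class name LoadThenDecodeSketch, and both helper methods are hypothetical. WebUtility.HtmlDecode from System.Net stands in for the HttpUtility.HtmlDecode call in the question so that the example does not need a System.Web reference. The sketch loads the source untouched, decodes only the extracted text, and then counts b elements to show how decoding the source first would alter the parsed structure.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class LoadThenDecodeSketch
{
    // Counts <b> elements; SelectNodes returns null when nothing matches.
    public static int CountBoldElements(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//b");
        return nodes == null ? 0 : nodes.Count;
    }

    // Loads the markup as-is, then decodes entities only in the extracted text.
    public static string ExtractDecodedText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // no HTML-decoding before this call
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText);
    }

    public static void Main()
    {
        // Made-up sample: an escaped tag plus a numeric character reference.
        string source = "<p>Use &lt;b&gt; for bold, i&#231;erisinde</p>";

        // Text extracted from the untouched source, decoded afterwards.
        Console.WriteLine(ExtractDecodedText(source, "//p"));

        // The untouched source contains no real <b> element.
        Console.WriteLine(CountBoldElements(source));

        // Decoding first turns the escaped &lt;b&gt; into a real tag the parser sees.
        Console.WriteLine(CountBoldElements(WebUtility.HtmlDecode(source)));
    }
}
```

Keeping the decoding step at the point where text leaves the document means the parser always sees the markup the page author wrote, and only display strings are converted.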
