Can't download HTML data from https URL using htmlagilitypack

Views: 23  Published: 2021/12/17 14:07:28  Tags: c# html https web-scraping html-agility-pack

Problem description

I have a "small" problem with htmlagilitypack (HAP). When I try to get data from a website, I get this error:

An unhandled exception of type 'System.ArgumentException' occurred in mscorlib.dll

Additional information: 'gzip' is not a supported encoding name. For information on defining a custom encoding, see the documentation for the Encoding.RegisterProvider method.

I'm using this piece of code to get the data from the website:

HtmlWeb page = new HtmlWeb();
var url = "https://kat.cr/";
var data = page.Load(url);

Running this code produces the error above. I tried everything I could find on Google, but nothing helped.

Can someone tell me how to resolve this problem?

Thanks

Recommended answer

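The error comes from the gzip-compressed response rather than from https itself: HtmlWeb appears to pass the response's 'gzip' content encoding to Encoding.GetEncoding, which throws. One workaround is to download the page with WebClient, with a small modification so that GZip is decompressed automatically: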

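using System;
using System.Net;

// WebClient subclass that asks the underlying HTTP request to decompress gzip/deflate responses.
class MyWebClient : WebClient
{
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        // Only HTTP(S) requests expose AutomaticDecompression; other schemes are returned unchanged.
        if (request is HttpWebRequest httpRequest)
        {
            httpRequest.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
        }
        return request;
    }
}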

Then use HtmlDocument.LoadHtml() to populate your HtmlDocument instance from the HTML string:

var url = "https://kat.cr/";
var data = new MyWebClient().DownloadString(url);
var doc = new HtmlDocument();
doc.LoadHtml(data);
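
As an alternative to subclassing WebClient, the same automatic decompression can be requested through HttpClient. This is only a sketch of a different technique, not part of the original answer: the class name GzipAwareDownloader and its two methods are hypothetical, and it assumes a framework where System.Net.Http is available.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper (not from the original answer): downloads a page with
// gzip/deflate decompression handled by HttpClientHandler.
static class GzipAwareDownloader
{
    // Builds a handler that transparently decompresses gzip and deflate bodies.
    public static HttpClientHandler CreateHandler()
    {
        return new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
    }

    // Returns the response body as an already-decompressed string.
    public static async Task<string> DownloadHtmlAsync(string url)
    {
        using (var client = new HttpClient(CreateHandler()))
        {
            return await client.GetStringAsync(url);
        }
    }
}
```

The string returned by DownloadHtmlAsync can then be passed to HtmlDocument.LoadHtml() in the same way as the WebClient result.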
