How to extract a URL's title, images and description using the HTML Agility utility

Views: 68 | Published: 2020/11/24 19:42:18 | Tags: c# asp.net webforms html-agility-pack


Problem description

I want to extract the title, description and images from a URL using the HTML Agility utility. So far I have not been able to find an example that is easy to understand and that helps me do this.

I would appreciate it if someone could help me with an example, so that I can extract the title and description and let the user choose an image from the page's series of images (similar to what Facebook does when we share a link).

Update:

I have placed labels for the title and description, plus a button and a textbox, on the .aspx page, and I run the following code in the button's click event. It returns null for all values, so I may be doing something wrong.

I used the following sample URL: http://edition.cnn.com/2012/10/31/world/asia/india/index.html?hpt=hp_t2

protected void btnGetURLDetails_Click(object sender, EventArgs e)
{
    HtmlDocument doc = new HtmlDocument();
    var response = txtURL.Text;
    doc.LoadHtml(response);

    String title = (from x in doc.DocumentNode.Descendants()
                    where x.Name.ToLower() == "title"
                    select x.InnerText).FirstOrDefault();

    String desc = (from x in doc.DocumentNode.Descendants()
                   where x.Name.ToLower() == "description"
                   select x.InnerText).FirstOrDefault();

    List<String> imgs = (from x in doc.DocumentNode.Descendants()
                         where x.Name.ToLower() == "img"
                         select x.Attributes["src"].Value).ToList<String>();

    lblTitle.Text = title;
    lblDescription.Text = desc;
}

The code above gives me null for every variable.

If I modify the code like this:

HtmlDocument doc = new HtmlDocument();
var url = txtURL.Text;

var webGet = new HtmlWeb();
doc = webGet.Load(url);

In this case I only get a value for the title; the description is null again.

Recommended answer

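// Requires: using System; using System.Collections.Generic; using System.IO;
//           using System.Linq; using System.Net; using HtmlAgilityPack;
protected void btnGetURLDetails_Click(object sender, EventArgs e)
{
    // Download the page markup first: LoadHtml expects HTML text, not a URL.
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(new Uri(txtURL.Text));
    request.Method = WebRequestMethods.Http.Get;

    String responseString;
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (StreamReader reader = new StreamReader(response.GetResponseStream()))
    {
        responseString = reader.ReadToEnd();
    }

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(responseString);

    // The page title is the text of the <title> element.
    String title = (from x in doc.DocumentNode.Descendants()
                    where x.Name.ToLower() == "title"
                    select x.InnerText).FirstOrDefault();

    // The description is not an element of its own; it is stored in
    // <meta name="description" content="...">.
    String desc = (from x in doc.DocumentNode.Descendants()
                   where x.Name.ToLower() == "meta"
                   && x.Attributes["name"] != null
                   && x.Attributes["name"].Value.ToLower() == "description"
                   && x.Attributes["content"] != null
                   select x.Attributes["content"].Value).FirstOrDefault();

    // Collect the src of every <img>, skipping images that have no src attribute.
    // This list holds the candidate images to offer to the user.
    List<String> imgs = (from x in doc.DocumentNode.Descendants()
                         where x.Name.ToLower() == "img"
                         && x.Attributes["src"] != null
                         select x.Attributes["src"].Value).ToList<String>();

    lblTitle.Text = title;
    lblDescription.Text = desc;
}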
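
The same extraction can probably be written more compactly with HTML Agility Pack's XPath helpers. The following is only a rough sketch, not part of the original answer: the class name PagePreview, the method FromHtml and the pageUri parameter are made up for illustration. It assumes the markup has already been downloaded, and it resolves relative image paths against the page address so they can be shown to the user for selection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper type, introduced here for illustration only.
public sealed class PagePreview
{
    public string Title { get; set; }
    public string Description { get; set; }
    public List<string> ImageUrls { get; set; }

    // html: markup that was already downloaded; pageUri: the address it came from.
    public static PagePreview FromHtml(string html, Uri pageUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches.
        HtmlNode titleNode = doc.DocumentNode.SelectSingleNode("//title");

        // SelectNodes returns null (not an empty collection) when nothing matches.
        IEnumerable<HtmlNode> metas =
            doc.DocumentNode.SelectNodes("//meta[@name and @content]")
            ?? Enumerable.Empty<HtmlNode>();
        IEnumerable<HtmlNode> imgs =
            doc.DocumentNode.SelectNodes("//img[@src]")
            ?? Enumerable.Empty<HtmlNode>();

        // Compare the meta name case-insensitively ("Description" also matches).
        string description = metas
            .Where(m => string.Equals(m.GetAttributeValue("name", ""),
                                      "description",
                                      StringComparison.OrdinalIgnoreCase))
            .Select(m => HtmlEntity.DeEntitize(m.GetAttributeValue("content", "")))
            .FirstOrDefault();

        // Turn each src into an absolute URL and drop duplicates.
        var images = new List<string>();
        foreach (HtmlNode img in imgs)
        {
            string src = img.GetAttributeValue("src", "").Trim();
            Uri absolute;
            if (src.Length > 0
                && Uri.TryCreate(pageUri, src, out absolute)
                && !images.Contains(absolute.AbsoluteUri))
            {
                images.Add(absolute.AbsoluteUri);
            }
        }

        return new PagePreview
        {
            Title = titleNode == null
                ? null
                : HtmlEntity.DeEntitize(titleNode.InnerText).Trim(),
            Description = description,
            ImageUrls = images
        };
    }
}
```

In the click handler this could be called as PagePreview.FromHtml(responseString, new Uri(txtURL.Text)), after which ImageUrls can be bound to whatever control lets the user pick an image.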
