Get HTML source code of table with specific attribute


Problem description

I am trying to get the HTML source code of a table with a specific attribute. The code below shows what I have so far.

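// Required using directives (place at the top of the file):
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading.Tasks;

public static async Task GetCldInfos()
{
    string sURL = "https://www.investing.com/economic-calendar/";
    using (HttpClient clientduplicate = new HttpClient())
    {
        // Send a browser-like User-Agent header with the request.
        clientduplicate.DefaultRequestHeaders.Add("User-Agent",
            "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.2; WOW64; Trident/6.0)");

        using (HttpResponseMessage responseduplicate = await clientduplicate.GetAsync(sURL))
        using (HttpContent contentduplicate = responseduplicate.Content)
        {
            try
            {
                // Read the whole response body as a string.
                string resultduplicate = await contentduplicate.ReadAsStringAsync();

                //var websiteduplicate = new HtmlDocument();
                //websiteduplicate.LoadHtml(resultduplicate);
                Debug.WriteLine(resultduplicate);
            }
            catch (Exception ex1)
            {
                // Rethrow with "throw;" to keep the original stack trace.
                // "throw ex1.InnerException;" fails when InnerException is null.
                Debug.WriteLine(ex1.Message);
                throw;
            }
        }
    }
}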

When we visit this page, there is an option to set the time frame, and the table changes according to the selection. When I make an HTTP request to get the source, it automatically gives me the default, which is GMT -5:00.

How can I get the source with, for example, GMT 0:00?

Solution

With HTML Agility Pack, you can use the following extension method to get the elements of a given tag that have a specific attribute value:

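    using System.Collections.Generic;
    using System.Linq;
    using HtmlAgilityPack;

    // Extension methods must live in a static class; the class name here is arbitrary.
    public static class HtmlDocumentExtensions
    {
        // Returns every <tag> element that has an attribute named attributeName
        // whose value is exactly attributeValue.
        public static IEnumerable<HtmlNode> GetNodesByAttr(this HtmlDocument htmlDoc, string tag, string attributeName, string attributeValue)
        {
            var allTags = htmlDoc.DocumentNode.Descendants(tag);

            return (from htmlNode in allTags
                    select htmlNode.Attributes
                        into attrs
                        from attr in attrs
                        where attr.Name == attributeName && attr.Value == attributeValue
                        select attr).Select(attr => attr.OwnerNode).ToList();
        }
    }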

For example, if you want to find the table with class "gmt0", you can call the extension method like this:

var websiteduplicate = new HtmlDocument();
websiteduplicate.LoadHtml(resultduplicate);

var myElement = websiteduplicate.GetNodesByAttr("table", "class", "gmt0").FirstOrDefault();
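
As an alternative to the extension method, the same lookup can be sketched with HTML Agility Pack's built-in XPath support, which also makes it easy to read the table's own markup through `OuterHtml`. This is a minimal sketch, not part of the original answer: the names `TableSourceExample` and `ExtractTableHtml` are hypothetical, the inline HTML is a stand-in for the downloaded page, and whether the real page contains a table with class "gmt0" is an assumption carried over from the example above. The XPath matches "gmt0" as one token of the class list, so it would still match if the element carried several classes, whereas the exact comparison `attr.Value == attributeValue` would not.

```csharp
using System;
using HtmlAgilityPack;

public static class TableSourceExample
{
    // Returns the HTML source of the first <table> whose class list contains
    // className, or null if there is none. Assumes className is a plain token
    // without quotes or spaces, since it is inserted into the XPath as-is.
    public static string ExtractTableHtml(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Pad the class attribute with spaces so that className is matched
        // as a whole token rather than as a substring of another class name.
        string xpath =
            "//table[contains(concat(' ', normalize-space(@class), ' '), ' " + className + " ')]";

        // SelectSingleNode returns null when nothing matches.
        HtmlNode table = doc.DocumentNode.SelectSingleNode(xpath);
        return table == null ? null : table.OuterHtml;
    }

    public static void Main()
    {
        // Stand-in HTML; in the question this would be resultduplicate.
        const string html =
            "<html><body>" +
            "<table class=\"gmt0 calendar\"><tr><td>08:30</td></tr></table>" +
            "<table class=\"other\"><tr><td>03:30</td></tr></table>" +
            "</body></html>";

        string tableHtml = ExtractTableHtml(html, "gmt0");
        Console.WriteLine(tableHtml ?? "No matching table found.");
    }
}
```

Checking the result for null before using it matters here, because a page that does not contain the expected table would otherwise cause a `NullReferenceException` on the first property access.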
