How to extract all anchor tags using HtmlAgilityPack?


Problem description



I would like to extract all of the anchor tags that appear under the heading 'picks' in the bottom input string. However, my current function only extracts the last a href tag.

For example, here is the input being fed into the bottom variable:

string bottom = @"<P align=justify>picks<BR>
<A href=""/Article.aspx?article=froth&PUB=250&ISS=22787&SID=51610"" name="""" target=_blank>froth</A>
: Text...<BR>
<A href=""/Article.aspx?article=Bouncing back&PUB=250&ISS=22787&SID=51603"" name="""" target=_blank>Bouncing back</A>
: Text...<BR></P>";



Updated server code:

    var result = "";

    while (reader.Read())
    {
        string bottom = reader.GetString(1);

        string price = "";
        if (bottom.ToLower().Contains("picks"))
        {
            price = bottom.Substring(bottom.IndexOf("picks"), 5);

            HtmlDocument html = new HtmlDocument();

            html.LoadHtml(bottom);
            var anchors = html.DocumentNode.Descendants("A");

            foreach (var a in anchors)
            {
                result = a.OuterHtml;
            }
        }

        result.ToString();
    }
    return article = result;
}





The following line of code does not let me call the All() method; it keeps giving me this error: No overload for method 'All' takes 0 arguments

if (anchors.All())



Any further guidance as to where I may be going wrong would be very much appreciated.
Thanks

Solution

The All method verifies that all elements in the source collection match a particular predicate. It doesn't make any sense to call it without a predicate, so there isn't an overload to do that.

It's possible that you meant to call the Any method, which does have an overload without a predicate. However, given the code you've posted, you don't need to call either method; just use the foreach loop:

var anchors = html.DocumentNode.Descendants("A");
foreach (var a in anchors)
{
   ...
}
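
To make the difference between All and Any concrete, here is a minimal sketch that uses only System.Linq on a plain list of strings standing in for the anchors. The class name AllVersusAnyDemo and its helper methods are hypothetical and not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllVersusAnyDemo
{
    // Hypothetical helper: Any() with no predicate is true when the
    // sequence contains at least one element.
    public static bool HasAnchors(IEnumerable<string> anchors)
    {
        return anchors.Any();
    }

    // Hypothetical helper: All() always needs a predicate; it is true
    // only when every element satisfies it.
    public static bool AllHaveHref(IEnumerable<string> anchors)
    {
        return anchors.All(a => a.Contains("href="));
    }

    public static void Main()
    {
        var anchors = new List<string>
        {
            "<a href=\"/one\">one</a>",
            "<a name=\"top\">top</a>" // no href attribute
        };

        Console.WriteLine(HasAnchors(anchors));  // True
        Console.WriteLine(AllHaveHref(anchors)); // False

        // All() is vacuously true for an empty sequence.
        Console.WriteLine(AllHaveHref(new List<string>())); // True
    }
}
```

Neither call is needed just to iterate: a foreach over an empty sequence simply runs zero times.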



EDIT:
To return all of the anchors, you'll either need to change the method to return a list of strings, or combine the strings using a separator:

var result = new List<string>();

while (reader.Read())
{
    string bottom = reader.GetString(1);
    int index = bottom.IndexOf("picks", StringComparison.OrdinalIgnoreCase);
    if (index != -1)
    {
        var html = new HtmlDocument();
        html.LoadHtml(bottom);

        var anchors = html.DocumentNode.Descendants("A");
        foreach (var a in anchors)
        {
            result.Add(a.OuterHtml);
        }
    }
}

// To return a list of strings:
return result;

// To return a single string with each result on a new line (.NET 4.0 or higher):
return string.Join(Environment.NewLine, result);

// To return a single string with each result on a new line (.NET 3.5 or earlier):
return string.Join(Environment.NewLine, result.ToArray());
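
If only the link targets are needed rather than the full anchor markup, the same loop can be written with LINQ. The following is a sketch rather than a drop-in replacement: it assumes the HtmlAgilityPack NuGet package is referenced, and the AnchorExtractor class and ExtractHrefs method are hypothetical names introduced for illustration. The input is the sample fragment from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class AnchorExtractor
{
    // Hypothetical helper: returns the href value of every anchor in the fragment.
    public static List<string> ExtractHrefs(string fragment)
    {
        var html = new HtmlDocument();
        html.LoadHtml(fragment);

        return html.DocumentNode
            .Descendants("a") // HtmlAgilityPack exposes element names in lower case
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => href.Length > 0) // skip anchors without an href
            .ToList();
    }

    public static void Main()
    {
        const string bottom = @"<P align=justify>picks<BR>
<A href=""/Article.aspx?article=froth&PUB=250&ISS=22787&SID=51610"" name="""" target=_blank>froth</A>
: Text...<BR>
<A href=""/Article.aspx?article=Bouncing back&PUB=250&ISS=22787&SID=51603"" name="""" target=_blank>Bouncing back</A>
: Text...<BR></P>";

        // Print one link target per line.
        foreach (string href in ExtractHrefs(bottom))
        {
            Console.WriteLine(href);
        }
    }
}
```

Returning a list keeps the results separate, so the caller can still combine them with string.Join if a single string is required.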

