HTML Agility Pack: parsing an href tag

Problem description

How can I efficiently parse the href attribute value from this markup?

<tr>
<td rowspan="1" colspan="1">7</td>
<td rowspan="1" colspan="1">
<a class="undMe" href="/ice/player.htm?id=8475179" rel="skaterLinkData" shape="rect">D. Kulikov</a>
</td>
<td rowspan="1" colspan="1">D</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
<td rowspan="1" colspan="1">0</td>
[...]

I want to extract the player ID, which is 8475179 in this row. Here is the code I have so far:

// Iterate all rows (players)
for (int i = 1; i < rows.Count; ++i)
{
    HtmlNodeCollection cols = rows[i].SelectNodes(".//td");

    // new player
    Dim_Player player = new Dim_Player();

    // Iterate all columns in this row
    for (int j = 1; j < 6; ++j)
    {
        switch (j) {
            case 1: player.Name = cols[j].InnerText;
                    player.Player_id = Int32.Parse(/* this is where I want to parse the href value */);
                    break;
            case 2: player.Position = cols[j].InnerText; break;
            case 3: stats.Goals = Int32.Parse(cols[j].InnerText); break;
            case 4: stats.Assists = Int32.Parse(cols[j].InnerText); break;
            case 5: stats.Points = Int32.Parse(cols[j].InnerText); break;
        }
    }

Solution

Based on your example, this worked for me:

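using System;
using System.Linq;
using HtmlAgilityPack;

// Load the saved page and take the first <a> element whose class is "undMe".
HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.Load("test.html");
var link = htmlDoc.DocumentNode
                  .Descendants("a")
                  .First(x => x.Attributes["class"] != null
                           && x.Attributes["class"].Value == "undMe");

// The href is "/ice/player.htm?id=8475179"; the ID is the text after '='.
string hrefValue = link.Attributes["href"].Value;
long playerId = Convert.ToInt64(hrefValue.Split('=')[1]);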
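
The lookup above searches the whole document, while the loop in the question works one row at a time. A minimal sketch of scoping the same lookup to a single cell might look like the following. The class name `RowLinkExample`, the helper `GetHref`, and the inline sample markup are hypothetical and exist only for illustration; the library calls used are `SelectSingleNode`, `SelectNodes`, `LoadHtml`, and the `Attributes` indexer.

```csharp
using System;
using HtmlAgilityPack;

public static class RowLinkExample
{
    // Hypothetical helper: returns the raw href of the first <a> inside the
    // given cell, or null when the cell has no link or the link has no href.
    public static string GetHref(HtmlNode cell)
    {
        // ".//a" searches only below this cell, not the whole document.
        HtmlNode anchor = cell.SelectSingleNode(".//a");
        if (anchor == null)
        {
            return null;
        }

        // The Attributes indexer yields null when the attribute is absent.
        return anchor.Attributes["href"]?.Value;
    }

    public static void Main()
    {
        // Sample markup shortened from the row in the question.
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<table><tr>" +
            "<td>7</td>" +
            "<td><a class=\"undMe\" href=\"/ice/player.htm?id=8475179\">D. Kulikov</a></td>" +
            "<td>D</td>" +
            "</tr></table>");

        HtmlNodeCollection cols = doc.DocumentNode.SelectNodes("//tr/td");

        // cols[1] is the name cell, matching case 1 in the question's switch.
        Console.WriteLine(GetHref(cols[1]));
    }
}
```

In the question's loop, case 1 could pass `cols[j]` to a helper like this and then extract the ID from the returned string.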

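For real use you need to add error checking: First throws if no matching link is found, and the href attribute may be missing or may not contain '='.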
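
One way to add that checking to the ID extraction is sketched below. It uses only the .NET base library and assumes the ID always arrives as a query parameter named `id`, as in the sample href. The class name `PlayerIdParser` and the method `TryParsePlayerId` are hypothetical names chosen for this example.

```csharp
using System;

public static class PlayerIdParser
{
    // Tries to read the numeric "id" query parameter from an href such as
    // "/ice/player.htm?id=8475179". Returns false instead of throwing when
    // the href is null, has no query string, or the id is not a number.
    public static bool TryParsePlayerId(string href, out long playerId)
    {
        playerId = 0;
        if (string.IsNullOrEmpty(href))
        {
            return false;
        }

        int queryStart = href.IndexOf('?');
        if (queryStart < 0)
        {
            return false;
        }

        // Drop a trailing "#fragment", then walk the "key=value" pairs.
        string query = href.Substring(queryStart + 1).Split('#')[0];
        foreach (string pair in query.Split('&'))
        {
            string[] parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2 && parts[0] == "id")
            {
                return long.TryParse(parts[1], out playerId);
            }
        }

        return false;
    }

    public static void Main()
    {
        long id;
        if (TryParsePlayerId("/ice/player.htm?id=8475179", out id))
        {
            Console.WriteLine(id); // prints 8475179
        }

        Console.WriteLine(TryParsePlayerId("/ice/player.htm", out id)); // prints False
    }
}
```

Unlike `Split('=')[1]`, this looks up the parameter by name, so it should keep working if the URL later gains other query parameters. The caller still has to decide what to do with a row whose ID cannot be read.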
