C#: using HtmlAgilityPack to get data from an HTML table

Question

I am trying to get information out of an HTML table by parsing the HTML with HtmlAgilityPack.

Here is what the HTML looks like:

...
...
...
<tbody>
                    <tr>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">AA00857</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div></div>
                            <div class="style_20">TPRCF</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21"></div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21">16908/2</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">&nbsp;ETG_C</div>
                        </td>
                    </tr>
                    <tr>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">AA01231</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div></div>
                            <div class="style_20">TPRCF</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21"></div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21">16909/19</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">&nbsp;ETG_C</div>
                        </td>
                    </tr>
                    <tr>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">AA01233</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div></div>
                            <div class="style_20">TPRCF</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21"></div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_21">16907/7</div>
                        </td>
                        <td class="style_19" style="vertical-align: baseline;">
                            <div class="style_18">&nbsp;ETG_C</div>
                        </td>
                    </tr>
...
...

I need to extract these values from the above:

AA00857, TPRCF, 16908/2, ETG_C

So far, all I have is this:

HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = hw.Load(@"http://www.some123123site.com/index");

if (htmlDoc.DocumentNode != null)
{
    HtmlAgilityPack.HtmlNode bodyNode = htmlDoc.DocumentNode.SelectSingleNode("//tbody");

    if (bodyNode != null)
    {
        // Do something with bodyNode
    }
}

Please help!

Recommended answer

Try this:

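// Requires: using System; using HtmlAgilityPack;
HtmlWeb hw = new HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = hw.Load(@"http://www.some123123site.com/index");

// SelectNodes returns null when nothing matches, so check before iterating.
HtmlNodeCollection textNodes = htmlDoc.DocumentNode.SelectNodes("//tr/td/div/text()");
if (textNodes != null)
{
    foreach (HtmlNode text in textNodes)
    {
        // Decode entities such as &nbsp; and trim the surrounding whitespace.
        Console.WriteLine(HtmlEntity.DeEntitize(text.InnerText).Trim());
    }
}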
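
The XPath above prints every text node in one flat stream, so the row boundaries are lost. If the values need to stay grouped per table row (for example AA00857, TPRCF, 16908/2, ETG_C together), something along these lines may help. This is only a sketch: the class name TableRowExtractor, the method ExtractRows, and the inline sample HTML are made up for illustration, and it assumes each value sits inside a td that is a direct child of its tr, as in the question's markup. It parses a string with LoadHtml so that it can run without the real site.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of HtmlAgilityPack.
public static class TableRowExtractor
{
    // Returns one list of non-empty cell values per <tr> found under a <tbody>.
    public static List<List<string>> ExtractRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<List<string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection trNodes = doc.DocumentNode.SelectNodes("//tbody/tr");
        if (trNodes == null)
        {
            return rows;
        }

        foreach (HtmlNode tr in trNodes)
        {
            List<string> cells = tr.Elements("td")
                // InnerText keeps entities such as &nbsp;, so decode them first,
                // then trim the indentation and the non-breaking space.
                .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                // Skip cells whose divs are empty.
                .Where(value => value.Length > 0)
                .ToList();

            rows.Add(cells);
        }

        return rows;
    }
}

public static class Program
{
    public static void Main()
    {
        // Cut-down copy of the first row from the question (attributes omitted).
        const string sampleHtml = @"
<table><tbody>
  <tr>
    <td><div>AA00857</div></td>
    <td><div></div><div>TPRCF</div></td>
    <td><div></div></td>
    <td><div>16908/2</div></td>
    <td><div>&nbsp;ETG_C</div></td>
  </tr>
</tbody></table>";

        foreach (List<string> row in TableRowExtractor.ExtractRows(sampleHtml))
        {
            Console.WriteLine(string.Join(", ", row));
        }
    }
}
```

To run it against the live page instead, the document returned by hw.Load(...) could be queried with the same "//tbody/tr" expression. Note that empty cells are dropped here, so the position of a value in the list does not always match its column index. If column positions matter, remove the Where filter and keep the empty strings.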
