How do I extract li tag data using HtmlAgilityPack


Question

I am trying to capture the following values from the html below:

--> DURACELL
--> 2345268
--> 5000394002906

<div id="productDescription">
    <ul>
        <li>
            <strong>Manufacturer:</strong>
            <a href="http://uk.Company.com/duracell">
                DURACELL
            </a>
        </li>
        <li>
            <strong>Order Code:</strong>
            2345268
        </li>
        <li>
            <strong>Manufacturer Part No</strong>
            5000394002906
        </li>
    </ul>
</div>










The code below gets the data, but all the formatting is still present (tabs, line breaks, etc.). I can see in HAPExplorer that the values can be captured on their own, so I know there must be a better solution than mine.

IEnumerable<HtmlNode> liContent = document.DocumentNode.SelectNodes("//div[@id='productDescription']/ul/li");

foreach (HtmlNode l in liContent)
{
    Console.WriteLine("InnerText: " + l.InnerText);
}








Thanks.

Recommended answer


Judging from the provided HTML, it seems that you want the last non-empty piece of inner text in each <li> element.
In that case you can use the following:
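// For each <li>, walk its child nodes from last to first and print the
// first one whose trimmed text is non-empty.
foreach (HtmlNode l in liContent)
{
    for (int i = l.ChildNodes.Count - 1; i >= 0; i--)
    {
        string lastInnerText = l.ChildNodes[i].InnerText.Trim();
        if (!string.IsNullOrEmpty(lastInnerText))
        {
            Console.WriteLine("InnerText: " + lastInnerText);
            break; // stop at the last non-empty text for this <li>
        }
    }
}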
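
For reference, the loop above can be wrapped into a small self-contained program. This is only a sketch: the LiExtractor class and ExtractValues method are hypothetical names introduced for illustration, the HTML is a shortened copy of the question's markup with single-quoted attributes, and the null check is added on the assumption that SelectNodes returns null when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper class, not part of HtmlAgilityPack.
public static class LiExtractor
{
    // Returns the last non-blank piece of text found in each <li>
    // under <div id="productDescription">.
    public static List<string> ExtractValues(string html)
    {
        var document = new HtmlDocument();
        document.LoadHtml(html);

        var values = new List<string>();
        var items = document.DocumentNode
            .SelectNodes("//div[@id='productDescription']/ul/li");
        if (items == null)
        {
            return values; // no matching <li> elements
        }

        foreach (HtmlNode li in items)
        {
            // Trim each child's text and keep the last non-empty one.
            string value = li.ChildNodes
                .Select(n => n.InnerText.Trim())
                .LastOrDefault(t => t.Length > 0);
            if (value != null)
            {
                values.Add(value);
            }
        }
        return values;
    }
}

public static class Program
{
    public static void Main()
    {
        string html =
            "<div id='productDescription'><ul>" +
            "<li><strong>Manufacturer:</strong>\n" +
            "  <a href='http://uk.Company.com/duracell'>\n    DURACELL\n  </a>\n</li>" +
            "<li><strong>Order Code:</strong>\n  2345268\n</li>" +
            "<li><strong>Manufacturer Part No</strong>\n  5000394002906\n</li>" +
            "</ul></div>";

        foreach (string value in LiExtractor.ExtractValues(html))
        {
            Console.WriteLine(value);
        }
    }
}
```

Returning a list instead of writing to the console keeps the extraction reusable, and trimming each child node separately is what removes the tabs and line breaks that InnerText on the whole <li> would keep.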

