Get specific element inside other element with HtmlAgilityPack in C#

Question

I'm working on a project where I need to parse a lot of HTML files. I need to get every <p> inside one <div class="story-body">.

So far I have the code below and it does what I want, but I was wondering how to do this with an XPath expression. I tried this:

textBody.SelectNodes ("What to put here? I tried //p but it gives every p in document not inside the one div")

But without success. Any ideas?

public void Parse(){
   HtmlNode title = doc.DocumentNode.SelectSingleNode ("//h1[(@class='story-header')]");
   HtmlNode textBody = doc.DocumentNode.SelectSingleNode ("//div[(@class='story-body')]");

   XmlText textT;
   XmlText textS;

   string story = "";

   if(title != null){
     textT = xmlDoc.CreateTextNode(title.InnerText);
     titleElement.AppendChild(textT);
     Console.WriteLine(title.InnerText);
   }

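   // textBody is null when the page has no <div class="story-body">.
   if (textBody != null) {
      foreach (HtmlNode node in textBody.ChildNodes) {
         // Keep paragraphs and cross-head spans; skip everything else.
         if (node.Name == "p" || (node.Name == "span" && node.GetAttributeValue("class", "") == "cross-head")) {
            story += node.InnerText + "\n\n";
            Console.WriteLine(node.InnerText);
         }
      }
   }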

   textS = xmlDoc.CreateTextNode (story);

   storyElement.AppendChild (textS);

   try
   {
        xmlDoc.Save("test.xml");            
   }
   catch (Exception e)
   {
        Console.WriteLine(e.Message);
   }
}

Recommended answer

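That's rather simple to do: add a leading . to the expression, as in .//p. The expression is then evaluated relative to the current node, so you get only the <p> elements that are descendants of that node instead of every <p> in the document.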
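As a rough illustration of the relative expression, here is a minimal sketch. The sample markup and the class name RelativeXPathDemo are made up for the example and do not come from the question's files; it assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class RelativeXPathDemo
{
    static void Main()
    {
        // Hypothetical sample markup, not taken from the original HTML files.
        const string html = @"
<html><body>
  <p>outside</p>
  <div class='story-body'>
    <p>first</p>
    <div><p>nested</p></div>
  </div>
</body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode textBody = doc.DocumentNode.SelectSingleNode("//div[@class='story-body']");
        if (textBody == null) return; // no matching div in this document

        // ".//p" starts at textBody; a plain "//p" would start at the document root.
        HtmlNodeCollection paragraphs = textBody.SelectNodes(".//p");
        if (paragraphs == null) return; // SelectNodes returns null when nothing matches

        foreach (HtmlNode p in paragraphs)
            Console.WriteLine(p.InnerText);
    }
}
```

With this markup the paragraph outside the div should be skipped, while both the direct child and the nested paragraph are selected, because // matches descendants at any depth.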

Another way is to call SelectNodes on the document node with the full path, which selects the <p> elements that are direct children of the div:

doc.DocumentNode.SelectNodes("//div[(@class='story-body')]/p");
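The loop in the question also keeps <span class="cross-head"> elements. One possible way to cover both cases in a single query is the XPath union operator (|); the sketch below assumes the same structure as in the question and uses made-up sample markup and a hypothetical class name StoryBodyDemo.

```csharp
using System;
using System.Text;
using HtmlAgilityPack;

class StoryBodyDemo
{
    static void Main()
    {
        // Hypothetical sample markup, not taken from the original HTML files.
        const string html = @"<div class='story-body'>
  <p>Intro</p>
  <span class='cross-head'>Heading</span>
  <p>More</p>
  <span class='other'>skip</span>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Union of two paths: direct <p> children and direct cross-head <span> children.
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(
            "//div[@class='story-body']/p | //div[@class='story-body']/span[@class='cross-head']");

        var story = new StringBuilder();
        if (nodes != null) // null means neither path matched anything
        {
            foreach (HtmlNode node in nodes)
                story.Append(node.InnerText).Append("\n\n");
        }

        Console.Write(story.ToString());
    }
}
```

Note that /p and /span here match only direct children of the div, which mirrors the ChildNodes loop in the question; replace / with // before p or span if nested elements should be included as well.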
