Parse webpage using HtmlAgilityPack and Json

Views: 105 Published: 2021/5/15 18:36:16 Tags: c# json web-scraping html-agility-pack

Question

I am trying to parse the HTML from Hotpads and am confused about how to extract the script tag and map part of it into a JSON object. Using HtmlAgilityPack, I have loaded an example URL, but the code breaks where it looks for that tag. I plan on deserializing the JSON afterwards.

Main method:

   private static void ParseSite()
    {
        var url = "https://hotpads.com/308-s-9th-dr-ponte-vedra-beach-fl-32082-syw3eh/building";
        var web = new HtmlWeb();
        var doc = web.Load(url);

        var link = doc.DocumentNode.SelectSingleNode("//a[contains(.,'window.__PRELOADED_STATE__')]");

        if (link != null)
        {
            Console.WriteLine(link.InnerText);
        }
        Console.ReadLine();
    }

Script tag:

<script>
 window.__PRELOADED_STATE__ = {{SOME JSON HERE}}
</script>

Model:

public class Contact
{
    public string DATA_MODEL { get; set; }
    public string companyName { get; set; }
    public string contactName { get; set; }
    public string contactPhone { get; set; }
}

Recommended answer

I think you just forgot to replace the 'a' tag with the 'script' tag in your XPath expression. I can't verify this in code at the moment, but you can test XPath expressions in Chrome dev tools by going to Inspect and entering the expression in the search window of the Elements panel.

I modified it to use the script tag instead, and it worked for me in Chrome dev tools. This is the XPath I tried on the page:

//script[contains(.,'window.__PRELOADED_STATE__')]
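
For the deserializing step the question mentions, here is a minimal sketch of how the matched script node could be turned into the `Contact` model. It rests on several assumptions: the sample HTML and the `contact` property name are made up for illustration (the real layout of the Hotpads JSON is not shown in the question), `PreloadedStateParser` and `ExtractStateJson` are hypothetical names, and `System.Text.Json` is used as one possible JSON library. The page is loaded from a string with `LoadHtml` so the sketch runs without network access.

```csharp
using System;
using System.Text.Json;
using HtmlAgilityPack;

public class Contact
{
    public string DATA_MODEL { get; set; }
    public string companyName { get; set; }
    public string contactName { get; set; }
    public string contactPhone { get; set; }
}

public static class PreloadedStateParser
{
    private const string Marker = "window.__PRELOADED_STATE__";

    // Returns the JSON text assigned to window.__PRELOADED_STATE__,
    // or null when no matching script tag is found.
    public static string ExtractStateJson(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Same XPath as in the answer: select the script tag, not an anchor.
        var script = doc.DocumentNode.SelectSingleNode(
            "//script[contains(.,'" + Marker + "')]");
        if (script == null) return null;

        // Take everything from the first '{' after the marker to the last '}'.
        // Assumption: the script contains nothing but this one assignment.
        var text = script.InnerText;
        var start = text.IndexOf('{', text.IndexOf(Marker, StringComparison.Ordinal));
        var end = text.LastIndexOf('}');
        if (start < 0 || end < start) return null;
        return text.Substring(start, end - start + 1);
    }

    public static void Main()
    {
        // Hypothetical sample page; the "contact" property is an assumption.
        var html = "<html><body><script>window.__PRELOADED_STATE__ = "
                 + "{\"contact\":{\"companyName\":\"Acme\",\"contactName\":\"Jane\","
                 + "\"contactPhone\":\"555-0100\"}};</script></body></html>";

        var json = ExtractStateJson(html);
        using var state = JsonDocument.Parse(json);

        // Deserialize only the sub-object that maps onto the model.
        var contact = JsonSerializer.Deserialize<Contact>(
            state.RootElement.GetProperty("contact").GetRawText());
        Console.WriteLine(contact.companyName);
    }
}
```

With a live page, the string passed to `ExtractStateJson` could be replaced by the document that `HtmlWeb.Load` returns in the question's `ParseSite` method, and the `GetProperty` path would need to follow wherever the contact data actually sits in the preloaded state.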
