Parsing JavaScript in HTML using HTML Agility Pack

Question

I have the following HTML that I'm trying to parse using the HTML Agility Pack.

Here is the HTML snippet:

<body id="station_page" class="">
...
<div>....</div>
<script type="text/javascript"> 
if (Blablabla == undefined) { var Blablabla = {}; }
Blablabla .Data1= "I want this data";
Blablabla .BlablablaData = 
{  "Data2":"I want this data",
"Blablabla":"",
"Blablabla":0   }
{   "Blablabla":123,
"Data3":"I want this data",
"Blablabla":123}
    Blablabla .Data4= I want this data;
</script>...

I'm trying to get those four data values (Data1, Data2, Data3, Data4). First, I tried to find the script:

doc.DocumentNode.SelectSingleNode("//script[@type='text/javascript']").InnerHtml

How can I check that it's really the right script? After finding the relevant script, how can I get those four data values (Data1, Data2, Data3, Data4)?

Recommended answer

You can't parse JavaScript with HTML Agility Pack; it only supports HTML parsing. You can get to the script you need with an XPath expression that matches on the script's content, like this:

doc.DocumentNode.SelectSingleNode("//script[contains(text(), 'Blablabla')]").InnerHtml

But you'll need to parse the JavaScript itself with another method (regex, a JS grammar, etc.).
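
As a rough illustration of the regex route, here is a minimal sketch in C#. It assumes the script text has already been read from the selected node's InnerHtml, that the values contain no escaped quotes, and that each key appears only once. The names `ScriptValueExtractor` and `TryExtract` are hypothetical helpers made up for this example; they are not part of HTML Agility Pack.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: pulls simple values out of raw script text with regexes.
public static class ScriptValueExtractor
{
    // Tries to read the value assigned to `key`; returns false when no match is found.
    public static bool TryExtract(string script, string key, out string value)
    {
        string name = Regex.Escape(key);

        // Quoted form: `.Data1= "value"` or `"Data2":"value"`.
        Match quoted = Regex.Match(script, @"\b" + name + @"""?\s*[:=]\s*""([^""]*)""");
        if (quoted.Success)
        {
            value = quoted.Groups[1].Value;
            return true;
        }

        // Unquoted form: `.Data4= value;` (read up to the semicolon).
        Match bare = Regex.Match(script, @"\b" + name + @"\s*=\s*([^;""]+);");
        value = bare.Success ? bare.Groups[1].Value.Trim() : string.Empty;
        return bare.Success;
    }

    public static void Main()
    {
        // Shortened stand-in for the script text from the question.
        string script =
            "Blablabla .Data1= \"I want this data\";\n" +
            "Blablabla .BlablablaData = { \"Data2\":\"I want this data\", \"Blablabla\":0 }\n" +
            "Blablabla .Data4= I want this data;";

        foreach (string key in new[] { "Data1", "Data2", "Data4" })
        {
            bool found = TryExtract(script, key, out string value);
            Console.WriteLine(key + ": " + (found ? value : "(not found)"));
        }
    }
}
```

This kind of pattern matching is brittle: it will break on escaped quotes, nested objects, or repeated keys. For scripts more complex than the one in the question, a real JavaScript or JSON parser is likely the safer choice.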
