Can XPath be used to search a <script> block?
Question
I'm reasonably skilled at selecting all sorts of HTML content. So, feeling confident, I set out to write some code to scrape content from a site, and then stumbled across some strange JavaScript code in which the page source stores its prices.
<script>
var productConfig = {"attributes":{"178":{"id":"178","code":"bp_flavour","label":"Smaak","options":[{"id":"28","label":"Aardbeien","oldPrice":"0","products":["2292","2294","2296","2702"]}
.... more gibberish, and then 4 of each product variation (so about 80 different lines like this):
,"childProducts":{
"2292":{"price":"64.99","finalPrice":"64.99","no_of_servings":"166","178":"27","179":"34"},
"2292":{"price":"17.99","finalPrice":"17.99","no_of_servings":"33","178":"28","179":"25"}
}
</script>
Apparently 2292 is the id of the product at hand. I would like to read out the "finalPrice".
My PHP code:
$file = $this->curl_get_file_contents($url);
$doc = new DOMDocument();
@$doc->loadHTML($file);
$doc->preserveWhiteSpace = false;
$finder = new DomXPath($doc);
$price_query = $finder->query("//script[contains(.,'finalPrice')]");
$price_raw = $price_query->item(0)->nodeValue;
However, my query //script[contains(.,'finalPrice')] returns the whole script, and I can't find a way to dig deeper and select something more specific inside the JavaScript. Does anyone know more, or could someone give me a hint?
Recommended answer
So here is what I did: read out the script with the XPath query above, then used strstr until I had the JSON part I wanted. Next up was PHP's json_decode function, which puts the data in an array; then I searched the array for what I wanted. This is my code for the parsing:
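// Locate the <script> block that contains the price data.
$price_query = $finder->query("//script[contains(.,'finalPrice')]");
$price_raw = $price_query->item(0)->nodeValue;
// Cut the script text down to the value of "childProducts".
$price_1 = strstr($price_raw, "childProducts");
$price_2 = str_replace('childProducts":', '', $price_1);
$price_3 = strstr($price_2, ',"priceFromLabel"', true);
// Decode the remaining JSON object into an associative array.
$price_data = json_decode($price_3, true);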
It looks ugly with all the strstr calls, but it works. Thanks all for your thoughts. json_decode ftw!
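For anyone who wants to avoid the chain of strstr calls, here is a minimal sketch of the same idea that decodes the whole object in one go. XPath only finds the <script> element, since its body is a single text node to the DOM. A regular expression then captures the object literal and json_decode does the rest. The sample page, the function name extractProductConfig, and the second product id are assumptions for illustration; the real script may contain other statements after the object, in which case the pattern would need tightening.

```php
<?php
// Hypothetical sample page; the real markup and JSON layout may differ.
$html = <<<'HTML'
<html><body>
<script>
var productConfig = {"attributes":{"178":{"id":"178","code":"bp_flavour"}},"childProducts":{"2292":{"price":"64.99","finalPrice":"64.99"},"2294":{"price":"17.99","finalPrice":"17.99"}},"priceFromLabel":"From"};
</script>
</body></html>
HTML;

/**
 * Returns the decoded productConfig object from the first matching
 * <script> element, or null if it cannot be found or decoded.
 * (Illustrative helper, not part of the original answer.)
 */
function extractProductConfig(string $html): ?array
{
    libxml_use_internal_errors(true);   // collect HTML parse warnings instead of printing them
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // XPath can only select the element; the JavaScript inside is plain text.
    $script = $xpath->query("//script[contains(., 'finalPrice')]")->item(0);
    if ($script === null) {
        return null;
    }

    // Greedy match from the first "{" after the assignment to the last "}"
    // in the script. Assumes no later statement in the script contains "}".
    if (!preg_match('/var\s+productConfig\s*=\s*(\{.*\})/s', $script->nodeValue, $m)) {
        return null;
    }

    $config = json_decode($m[1], true);
    return is_array($config) ? $config : null;
}

$config = extractProductConfig($html);
// Look up the price by product id; null if the id is missing.
$finalPrice = $config['childProducts']['2292']['finalPrice'] ?? null;
var_dump($finalPrice);
```

Decoding the full object means the lookup does not depend on which key happens to follow "childProducts", which the strstr version relies on.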