Parsing a JSON var inside a script tag

Views: 115  Posted: 2020/5/4 8:27:31  Tags: python, lxml

Question

I'm currently trying to scrape the JSON output of the following page: 'https://sports.bovada.lv/soccer/premier-league'

Its source contains the following:

<script type="text/javascript">var swc_market_lists = {"items":[{"description":"Game Lines","id":"23", ... </script>

I'm trying to get the contents of the swc_market_lists var.

The issue is that when I use the following code:

import requests
from lxml import html

url = 'https://sports.bovada.lv/soccer/premier-league'
r = requests.get(url)
tree = html.fromstring(r.content)
var = tree.xpath('//script')
print(var)

I get an empty var value.

I have also tried saving r.text and viewing it, but I don't see the script tags in there.

What am I missing?

Recommended answer

You need to pass a User-Agent header to make it work:

r = requests.get(url, headers={"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.103 Safari/537.36"})

To get the desired script, parse the new response into tree as before, then check for the presence of swc_market_lists in the script text:

script = tree.xpath('//script[contains(., "swc_market_lists")]/text()')[0]
print(script)
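
Indexing with [0] raises IndexError when no script matches, for example if the site still returns a stripped-down page. Below is a minimal sketch of a guarded lookup, run against a small inline HTML sample rather than the live site. The helper name find_script_text and the sample markup are assumptions made for illustration; they are not part of the original answer.

```python
from lxml import html

# Hypothetical stand-in for the downloaded page; the real page is much larger.
SAMPLE_HTML = """
<html><head>
<script type="text/javascript">var other = 1;</script>
<script type="text/javascript">var swc_market_lists = {"items":[{"description":"Game Lines","id":"23"}]};</script>
</head><body></body></html>
"""


def find_script_text(page_source, needle):
    """Return the text of the first <script> containing `needle`, or None."""
    tree = html.fromstring(page_source)
    # $needle is an XPath variable; lxml fills it from the keyword argument.
    matches = tree.xpath('//script[contains(., $needle)]/text()', needle=needle)
    return matches[0] if matches else None


script = find_script_text(SAMPLE_HTML, "swc_market_lists")
print(script)
```

Returning None instead of raising lets the caller decide how to handle a page that lacks the variable.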

To extract the swc_market_lists variable value:

import re

data = re.search(r"var swc_market_lists = (.*?);$", script).group(1)
print(data)
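
The pattern above relies on `$`, which by default matches only at the end of the string, so it assumes the assignment is the last statement in the script. If the script holds further statements on later lines, passing re.MULTILINE lets `$` match at the end of each line. A sketch on a made-up sample follows; the script text and the second variable are assumptions, not taken from the real page.

```python
import re

# Hypothetical script text with a second statement after the one we want.
script = (
    'var swc_market_lists = {"items":[{"description":"Game Lines","id":"23"}]};\n'
    'var something_else = 1;\n'
)

pattern = r"var swc_market_lists = (.*?);$"

# Without re.MULTILINE, "$" only matches at the end of the string (or just
# before a final newline), so the pattern finds nothing in this sample.
print(re.search(pattern, script))

# With re.MULTILINE, "$" also matches at the end of each line.
match = re.search(pattern, script, re.MULTILINE)
data = match.group(1) if match else None
print(data)
```

Checking the match before calling group(1) avoids an AttributeError when the variable is absent.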

Then, to make it easy to work with, load it into a Python dictionary with json.loads():

import json
data = json.loads(data)
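
Once loaded, the value behaves like ordinary nested dicts and lists. A short sketch of walking the result, assuming the "items" / "description" / "id" layout shown in the question; the sample string is a hypothetical, closed-off version of the truncated source:

```python
import json

# Hypothetical captured value; the real one is elided in the question.
data = '{"items":[{"description":"Game Lines","id":"23"}]}'

market_lists = json.loads(data)

# "items" is a list of dicts; print each entry's id and description.
for item in market_lists["items"]:
    print(item["id"], item["description"])
```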
