How to fetch JavaScript contents in Python

Views: 2428 Published: 2016/8/5 19:01:03 Tags: javascript python html python-2.7 beautifulsoup

Question

I have a website where the data I want to fetch is stored inside a JavaScript block. How do I fetch it?

The code is here: http://pastebin.com/zhdWT5HM

I want to fetch from the "var playersData" line. Specifically, I want the value in "playerId":"showsPlayer", that is, showsPlayer (without the quotes, obviously). How do I do that?

I've tried Beautiful Soup. My current script looks like this:

import requests
from bs4 import BeautifulSoup

q = requests.get('websitelink')
soup = BeautifulSoup(q.text)

searching = soup.findAll('script',{'type':'text/javascript'})
for playerId in searching:
  x = playerId.find_all('var playersData', limit=1)
  print(x)

I'm getting [] as my output, and I can't figure out what I'm doing wrong. Please help out, guys and gals :)

Recommended answer

BeautifulSoup only helps with locating the desired script tag. From there you have several options: you can extract the data with a JavaScript parser such as slimit, or use regular expressions:

import re

from bs4 import BeautifulSoup

page = """
<script type="text/javascript">
            var logged = true;
            var video_id = 59374;
            var item_type = 'official';

            var debug = false;
            var baseUrl = 'http://www.example.com';
            var base_url = 'http://www.example.com/';
            var assetsBaseUrl = 'http://www.example.com/assets';
            var apiBaseUrl = 'http://www.example.com/common';
            var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
</script><script type="text/javascript" >
"""
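# Name the parser explicitly to avoid bs4's "no parser specified" warning.
soup = BeautifulSoup(page, "html.parser")

# Non-greedy capture of the quoted value that follows "playerId".
pattern = re.compile(r'"playerId":"(.*?)"', re.MULTILINE | re.DOTALL)
# Find the first <script> tag whose text matches the pattern
# (bs4 4.4.0+ also accepts string= in place of text=).
script = soup.find("script", text=pattern)

# print(...) with a single argument behaves the same on Python 2 and 3.
print(pattern.search(script.text).group(1))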

Prints:

showsPlayer
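
If more than one field is needed from playersData, a regular expression per field gets clumsy. A possible alternative, not part of the original answer, is to capture the whole array literal and hand it to the standard json module. The sketch below assumes the literal is also valid JSON (double-quoted keys, no trailing commas) and sits on a single line. The script_text sample and the extract_players_data helper are hypothetical names, modelled on the sample page with the array's brackets balanced so that it parses.

```python
import json
import re

# Hypothetical script text, modelled on the sample page in the answer
# (brackets balanced so the array literal is valid JSON).
script_text = """
var debug = false;
var playersData = [{"playerId":"showsPlayer","userId":true,"solution":"flash","playlist":[{"itemId":"5090","itemAK":"Movie"}]}];
"""


def extract_players_data(js_source):
    """Return the list assigned to `var playersData`, or None if it is absent."""
    # Greedy match from the opening "[" to the last "]" before the ";".
    # "." does not cross newlines here, so the match stays on one line.
    match = re.search(r"var\s+playersData\s*=\s*(\[.*\])\s*;", js_source)
    if match is None:
        return None
    # Assumption: the JavaScript literal is also valid JSON.
    return json.loads(match.group(1))


players = extract_players_data(script_text)
print(players[0]["playerId"])
```

In a real scrape, the string passed to the helper would be the text of the script tag that BeautifulSoup located. If the page's JavaScript uses single quotes or unquoted keys, json.loads will raise an error, and a JavaScript parser is the safer route.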
