Is there a Python library that allows you to screen-scrape a web site that relies heavily on JavaScript?
Question
Possible Duplicate:
What's a good tool to screen-scrape with Javascript support?
I’m trying to do some screen-scraping of my bank’s website. (I know, I’m probably onto a loser, but bear with me.)
The site seems to be setting several cookies, with varying session-related values, via JavaScript, and then redirecting to the home page if it can’t find those values.
I’ve been trying to figure out a way to spot the values of those cookies by searching the HTML/JavaScript code of the pages, but the relevant code looks very obfuscated, so I’m having a hard time doing it.
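For illustration, the kind of source search described above might look like the sketch below. It assumes the page sets cookies with a plain string literal assigned to `document.cookie`; the function name and sample markup are hypothetical, and obfuscated code (as on the site in question) will usually defeat this kind of pattern matching.

```python
import re

# Matches only the simplest form: document.cookie = "name=value; ..."
# Obfuscated or dynamically built assignments will NOT be found.
COOKIE_ASSIGNMENT = re.compile(r"""document\.cookie\s*=\s*(["'])(.*?)\1""")


def find_literal_cookies(page_source):
    """Return {name: value} for cookies assigned as plain string literals."""
    found = {}
    for match in COOKIE_ASSIGNMENT.finditer(page_source):
        # Keep only the "name=value" part, dropping attributes such as path.
        pair = match.group(2).split(";", 1)[0]
        name, sep, value = pair.partition("=")
        if sep:
            found[name.strip()] = value.strip()
    return found


# Hypothetical sample markup, not taken from any real site.
sample = '<script>document.cookie = "session=abc123; path=/";</script>'
print(find_literal_cookies(sample))
```

When the script builds the cookie string at run time, nothing short of executing the JavaScript will reveal the value, which is what motivates the question below.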
Is there a Python library that simulates a web browser with JavaScript enabled? I was thinking something like mechanize that also:
- parses the HTML page returned (e.g. with something like lxml)
- parses any JavaScript on the HTML page
- sets any cookies set by the JavaScript
- amends the parsed HTML page with any DOM modifications made by the JavaScript
Basically a web browser that’s programmable in Python. Failing that, a solution in any other language.
Answer
I answered a similar question here: Click on a javascript link within python?
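The linked answer is not reproduced here. As one possible illustration of a "browser that's programmable in Python", the sketch below drives a real browser through Selenium WebDriver, so the browser itself runs the JavaScript, sets the cookies, and updates the DOM. Selenium is an assumption of this example and is not named in this thread; it needs the third-party `selenium` package plus a local Firefox install, and the function names and URL are placeholders.

```python
def cookie_header(cookies):
    """Build a Cookie header value from Selenium-style cookie dicts."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


def fetch_rendered(url):
    """Load url in a real browser; return (HTML after JavaScript ran, cookies).

    Assumes the third-party selenium package and Firefox are installed.
    """
    from selenium import webdriver  # imported lazily: optional dependency

    driver = webdriver.Firefox()
    try:
        driver.get(url)  # the browser executes the page's JavaScript
        # page_source reflects DOM changes made by scripts so far;
        # get_cookies() includes cookies that were set from JavaScript.
        return driver.page_source, driver.get_cookies()
    finally:
        driver.quit()


# Usage (placeholder URL; opens a browser window, so left commented out):
# html, cookies = fetch_rendered("https://example.com/")
# print(cookie_header(cookies))
```

The returned HTML can then be handed to a parser such as lxml, and the cookie header reused in ordinary HTTP requests, though session cookies on a bank site may be tied to the browser session and expire quickly.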