How can I render JavaScript HTML to HTML in Python?

Question

I have looked around and only found solutions that render a URL to HTML. However, I need a way to render a webpage that I already have (and that contains JavaScript) to proper HTML.

Want: Webpage (with JavaScript) ---> HTML

Not: URL ---> Webpage (with JavaScript) ---> HTML

I couldn't figure out how to make the other code work the way I wanted.

This is the code I was using that renders URLs: http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/

For clarity, the code above takes the URL of a webpage in which some parts are rendered by JavaScript. If I scrape that page normally, say with urllib2, I won't get the links and other content that only appear after the JavaScript has run.

However, I want to be able to scrape a page, say again with urllib2, and then render that page and get the resulting HTML. (This differs from the code above, which takes a URL as its argument.)

Any help is appreciated, thanks guys :)

Recommended answer

You can pip install selenium from a command line, and then run something like:

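from urllib.request import urlopen  # on Python 2: from urllib2 import urlopen

from selenium import webdriver

url = 'http://www.google.com'
# Use an .html extension so the browser renders the file
# instead of displaying it as plain text.
file_name = 'C:/Users/Desktop/test.html'

# Download the raw, unrendered HTML.
conn = urlopen(url)
data = conn.read()
conn.close()

# Save the downloaded bytes to a local file.
with open(file_name, 'wb') as f:
    f.write(data)

# Load the local file in a real browser so its JavaScript runs,
# then read the rendered DOM back as HTML.
browser = webdriver.Firefox()
browser.get('file:///' + file_name)
html = browser.page_source
browser.quit()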
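
The answer above still starts from a URL. If the page is already in memory as a string, one possible variation is to write it to a temporary .html file and point the browser at that file. The sketch below assumes Python 3 and a working Selenium setup with Firefox; `save_html_to_temp_file` and `render_html` are hypothetical helper names chosen for illustration, not part of Selenium.

```python
import tempfile
from pathlib import Path


def save_html_to_temp_file(html_text):
    """Write html_text to a temporary .html file; return (path, file URI)."""
    # delete=False keeps the file on disk after the handle is closed,
    # so the browser can open it afterwards.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".html", delete=False, encoding="utf-8"
    ) as f:
        f.write(html_text)
    path = Path(f.name)
    return path, path.as_uri()


def render_html(html_text):
    """Load html_text in Firefox and return the HTML after scripts have run."""
    # Imported here so the helper above works without Selenium installed.
    from selenium import webdriver

    path, uri = save_html_to_temp_file(html_text)
    try:
        browser = webdriver.Firefox()
        try:
            browser.get(uri)
            return browser.page_source
        finally:
            browser.quit()
    finally:
        path.unlink()  # remove the temporary file
```

Calling `render_html(page)` with a string you scraped earlier should return the markup as the browser sees it once the page has loaded. Because the page is loaded from a `file://` address, relative links to scripts or stylesheets will resolve against the temporary directory rather than the original site, so pages that depend on them may not render fully.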
