Access Chrome DOM tree with Python
Problem description
Chrome DevTools shows the DOM tree of a page. Is there a way to access that tree and pull it out using Python?
Solution
The best way I found is to use selenium.webdriver:
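import selenium.webdriver as webdriver
import lxml.html as lh
import lxml.html.clean as clean  # newer lxml releases (5.2+) ship this module separately as lxml_html_clean

browser = webdriver.Chrome()  # start a local Chrome session
browser.get("http://www.webpage.com")  # load the page
content = browser.page_source  # serialized HTML of the page as the browser currently sees it
browser.quit()  # close the browser once the source has been captured

cleaner = clean.Cleaner()  # default settings strip scripts and other active content
content = cleaner.clean_html(content)
doc = lh.fromstring(content)  # parse the cleaned HTML into an lxml element tree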
doc then holds the DOM as an lxml.html.HtmlElement. Note that this tree reflects the cleaned HTML rather than the raw page.
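Once doc is available, one way to inspect it is to walk the tree and print an indented outline, roughly like the Elements panel in DevTools. The following is only a minimal sketch: dom_outline is a hypothetical helper name, not part of selenium or lxml, and the demo at the bottom uses the standard library's xml.etree.ElementTree as a stand-in, on the assumption that lxml elements expose the same tag / get / iteration interface.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in, used only for the demo below


def dom_outline(element, depth=0):
    """Return an indented list of labels for element and its descendants.

    Hypothetical helper: works on any element that exposes .tag, .get()
    and iteration over children (lxml.html.HtmlElement is assumed to).
    """
    # Comments and processing instructions have a non-string tag; skip them.
    if not isinstance(element.tag, str):
        return []
    label = element.tag
    if element.get("id"):
        label += "#" + element.get("id")
    if element.get("class"):
        label += "." + ".".join(element.get("class").split())
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(dom_outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    # Small hand-written fragment standing in for a real page.
    sample = ET.fromstring('<div id="main"><p class="intro lead">Hi</p><span/></div>')
    print("\n".join(dom_outline(sample)))
```

With the selenium snippet above, the same helper would be called as print("\n".join(dom_outline(doc))). For targeted lookups rather than a full dump, lxml elements also offer doc.xpath(...), which may be a better fit on large pages.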