Using Selenium with Python to extract JavaScript-generated HTML? Firebug?
Problem description
Python newbie here.
I have a data harvesting problem. I'm on this website, and when I inspect the element I want with Firebug, it shows source containing the information I need. However, the regular page source (viewed without Firebug) doesn't include this info, which means I can't get the data with the normal Selenium HTML grabbing either.
I'm wondering if there is a way for Selenium to grab this data the way Firebug does. I'm guessing this is HTML that is generated by JavaScript or jQuery after the page loads.
Here is a picture: http://i.imgur.com/CXLOHYx.png
You can see that the info I want is 'greyed out', unlike most of the other HTML there. Maybe that is a good clue as to what kind of data it really is.
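One way to check the guess about script-generated HTML is to ask the browser for its live DOM rather than the source it originally downloaded. The sketch below is only an illustration: `rendered_html` is a hypothetical helper name, and it assumes `driver` is an already-created Selenium WebDriver (or any object exposing `execute_script`) that has loaded the page.

```python
def rendered_html(driver):
    """Return the browser's current DOM as an HTML string.

    `driver` is assumed to be a Selenium WebDriver that has already
    loaded the page; the helper name is made up for this example.
    """
    # execute_script runs JavaScript inside the page and returns the result,
    # so this serializes the DOM as it exists right now, after any scripts ran.
    return driver.execute_script("return document.documentElement.outerHTML;")
```

With a real browser this would be called after `driver.get(...)`, for example `html = rendered_html(driver)`. It only reflects the DOM at the moment of the call, so if the page's scripts have not finished yet, the element may still be missing and some form of waiting may be needed.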
Recommended answer
Try the following code and see if it works. It waits up to 10 seconds for the element to appear instead of reading the page immediately.
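from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
# `driver` is your existing WebDriver. The selector is CSS syntax, not XPath,
# so it is passed as a CSS selector. If fImageMap is an element id, write "#fImageMap".
element = WebDriverWait(driver, 10).until(
    lambda d: d.find_element(By.CSS_SELECTOR, "fImageMap > area:nth-child(2)")
)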
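To make the idea behind the wait more concrete, here is a simplified pure-Python sketch of the polling loop that an explicit wait performs. It is not Selenium's actual implementation: `wait_until` is a hypothetical helper, and `LookupError` stands in for the "element not found" error that a real wait would ignore while retrying.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5, ignored=(LookupError,)):
    """Call condition() repeatedly until it returns a truthy value.

    Exceptions listed in `ignored` are treated as "not ready yet".
    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # the lookup failed this time; try again after a pause
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

The point of the design is that the lookup is retried for a bounded time rather than attempted once, which is what gives script-generated elements a chance to show up.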