Selenium scraping JavaScript
Question
I'm planning to build a website that scrapes a lot of daily-updated URLs (rendered by JavaScript) from many websites. I did some research, found Selenium, and have already written some code that extracts a URL from one website:
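from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_path = r"C:\Users\hessien\Desktop\chromedriver_win32\chromedriver.exe"
# Selenium 4 takes the driver path through a Service object.
driver = webdriver.Chrome(service=Service(chrome_path))
driver.get("http://example.com")
# Click the link in the page header.
driver.find_element(By.XPATH, '//*[@id="header"]/div/div[2]/div[3]/ul/li/label/a').click()
# Type the query into the #s input and click the #searchform button.
element = driver.find_element(By.XPATH, '//*[@id="s"]')
element.send_keys("example")
driver.find_element(By.XPATH, '//*[@id="searchform"]/button/span').click()
# Click the article link, then the #playex element.
driver.find_element(By.XPATH, '//*[@id="contenedor"]/div/div[2]/div[1]/div[2]/article/div[2]/div[1]/a').click()
driver.find_element(By.XPATH, '//*[@id="playex"]/div[1]').click()
# Read the media URL from the <video> element's src attribute.
elem = driver.find_element(By.XPATH, '//*[@id="mediaplayer_media"]/video').get_attribute("src")
print(elem)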
But after some searching I found that Selenium is mainly used as a testing framework, not for scraping and crawling. My question is: can Selenium do the job? If yes, how can I run the Python code from an HTML button? I'm also using Django. If not, could you recommend something that can do the task?
If you really want to build a scraper, I recommend Beautiful Soup, a Python library for pulling data out of HTML and XML files. You can integrate the Python script with Django so that it is triggered on a click. Here is the link:
https://pypi.python.org/pypi/beautifulsoup4
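As a rough illustration of that suggestion, here is a minimal sketch of pulling the video URL out of a page with Beautiful Soup. The `SAMPLE_HTML` string, the `extract_video_src` helper, and the clip URL are made up for this example; only the `mediaplayer_media` id comes from the question. Beautiful Soup only parses the markup it is handed and does not run JavaScript, so for a script-rendered page the HTML would have to come from something that does, for instance Selenium's `driver.page_source`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the HTML of a fetched page.
SAMPLE_HTML = """
<div id="mediaplayer_media">
  <video src="http://example.com/media/clip.mp4"></video>
</div>
"""


def extract_video_src(html):
    """Return the src of the first <video> inside #mediaplayer_media, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # Locate the player container by its id, as the XPath in the question does.
    container = soup.find(id="mediaplayer_media")
    if container is None:
        return None
    video = container.find("video")
    if video is None:
        return None
    # .get() returns None instead of raising if the attribute is missing.
    return video.get("src")


if __name__ == "__main__":
    print(extract_video_src(SAMPLE_HTML))
```

In a Django project, a function like this could be called from the view that the button's form or link points to; that wiring is not shown here.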