Scrolling infinite page with Python/PhantomJS/Selenium
Question
I'm trying to scrape this one (infinite) page (www.mydealz.de), but I cannot get my webdriver to scroll down the page. I'm using Python (3.5), Selenium (3.6) and PhantomJS. I have already tried several approaches, but the webdriver just won't scroll; it only gives me the first page.
1st approach (the usual scrolling approach):
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height
2nd approach (just pressing the down key several times and releasing it; I tried waiting in between presses, too):
ActionChains(driver).key_down(Keys.ARROW_DOWN).perform()
ActionChains(driver).key_up(Keys.ARROW_DOWN).perform()
3rd approach (find the last element in the "scrolling list" and scroll it into view to force scrolling):
posts = driver.find_elements_by_css_selector("div.threadGrid")
driver.execute_script("arguments[0].scrollIntoView();", posts[-1])
Nothing has worked so far. Does anybody know if there is another approach, or where I made an error?
Recommended answer
To scroll through the webpage until the URL is mydealz.de/?page=3, you can use the following block of code:
from selenium import webdriver
driver = webdriver.PhantomJS(executable_path=r'C:\\Utility\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe')
driver.set_window_size(1400,1000)
driver.get("https://www.mydealz.de")
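# Keep scrolling to the bottom until the URL reports the third page.
# Matching "page=3" rather than a bare "3" avoids stopping on an unrelated digit in the URL.
while "page=3" not in driver.current_url:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    print(driver.current_url)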
driver.quit()
Console output:
https://www.mydealz.de/?page=3
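The loop above runs forever if the URL never changes, for example when the site stops loading new items. A minimal sketch of a bounded variant follows. It combines the URL check from the answer with the height comparison from the question's first approach. The function name scroll_until and its parameters stop_marker, pause and max_scrolls are hypothetical names chosen for illustration. It only assumes a driver object that offers execute_script and current_url, as Selenium webdrivers do.

```python
import time


def scroll_until(driver, stop_marker="page=3", pause=1.0, max_scrolls=30):
    """Scroll to the bottom repeatedly until stop_marker appears in the URL.

    Hypothetical helper, not part of Selenium. Returns True if the marker
    was reached, False if the page stopped growing or max_scrolls ran out.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_scrolls):
        if stop_marker in driver.current_url:
            return True
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height and stop_marker not in driver.current_url:
            return False  # nothing new was loaded, so give up
        last_height = new_height
    return stop_marker in driver.current_url
```

With the driver from the answer, a call such as scroll_until(driver, "page=3") would replace the while loop. The fixed pause is a simplification; a slow connection may need a larger value.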