Scrolling infinite page with Python/PhantomJS/Selenium


Question

I'm trying to scrape this one (infinite) page (www.mydealz.de), but I cannot get my webdriver to scroll down the page. I'm using Python (3.5), Selenium (3.6) and PhantomJS. I have already tried several approaches, but the webdriver just won't scroll; it only gives me the first page.

1st approach (the usual scrolling approach):

import time

# Scroll to the bottom until the page height stops growing
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

2nd approach (just pressing the down key several times and releasing it; I tried waiting in between presses, too):

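from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys
ActionChains(driver).key_down(Keys.ARROW_DOWN).perform()
ActionChains(driver).key_up(Keys.ARROW_DOWN).perform()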

3rd approach (find the last element in the "scrolling list" and scroll it into view to force scrolling):

posts = driver.find_elements_by_css_selector("div.threadGrid")
driver.execute_script("arguments[0].scrollIntoView();", posts[-1])

Nothing has worked so far. Does anybody know if there is another approach, or where I made an error?

Recommended answer

To scroll through the webpage until the URL is mydealz.de/?page=3, you can use the following block of code:

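from selenium import webdriver

# Raw string, so single backslashes are enough for the Windows path
driver = webdriver.PhantomJS(executable_path=r'C:\Utility\phantomjs-2.1.1-windows\bin\phantomjs.exe')
# Set an explicit viewport size for the headless browser
driver.set_window_size(1400, 1000)
driver.get("https://www.mydealz.de")
# Keep scrolling to the bottom until the URL reports page 3
while "page=3" not in driver.current_url:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
print(driver.current_url)
driver.quit()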

Console output:

https://www.mydealz.de/?page=3
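
The loop above never terminates if the URL stops changing, and a plain substring test on the URL can match more than intended. A minimal sketch of a bounded variant follows. It assumes, as the console output suggests, that the site exposes the current page as a `page` query parameter; the helper names `current_page` and `scroll_to_page` and the parameters `max_scrolls` and `pause` are hypothetical, not part of Selenium.

```python
import time
from urllib.parse import urlparse, parse_qs


def current_page(url):
    """Return the `page` query parameter as an int, or 1 if it is absent or invalid."""
    query = parse_qs(urlparse(url).query)
    try:
        return int(query["page"][0])
    except (KeyError, ValueError):
        return 1


def scroll_to_page(driver, target_page, max_scrolls=50, pause=1.0):
    """Scroll to the bottom repeatedly until the URL reports target_page.

    `driver` is any object with `current_url` and `execute_script`, such as a
    Selenium WebDriver. Returns True if the target page was reached, or False
    if max_scrolls was used up first.
    """
    for _ in range(max_scrolls):
        if current_page(driver.current_url) >= target_page:
            return True
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # assumed delay to let the next batch of items load
    return current_page(driver.current_url) >= target_page
```

With the driver from the answer, this could be called as `scroll_to_page(driver, 3)` in place of the `while` loop; a `False` return value signals that the page stopped advancing before the target was reached.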
