Python Selenium screen capture not getting whole page

Problem description

I am trying to create a generic web crawler that goes to a site and takes a screenshot. I am using Python, Selenium, and PhantomJS. The problem is that the screenshot does not capture all the images on a page. For example, if I go to YouTube, it doesn't capture the images below the main page image. (I don't have enough reputation to post a screenshot.) I think this may have something to do with dynamic content, but I have already tried the wait functions, such as the implicitly_wait and set_page_load_timeout methods. Because this is a generic crawler, I can't wait for a specific event (I want to crawl hundreds of sites).

Is it possible to create a generic web crawler that can take the kind of screenshot I am after? The code I am using is:

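from selenium import webdriver

phantom = webdriver.PhantomJS()
phantom.set_page_load_timeout(30)
phantom.get(response.url)  # response is defined elsewhere in the crawler (not shown)
img = phantom.get_screenshot_as_png()  # raw PNG data as bytes, not a base64 string
phantom.quit()  # the parentheses are needed to actually call quit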
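
The return value of get_screenshot_as_png is easy to confuse with that of get_screenshot_as_base64: the former gives raw bytes, the latter a base64-encoded string. As a minimal sketch, the helper below normalizes either form to raw PNG bytes before the image is stored. The name to_png_bytes is hypothetical and not part of Selenium or of the original question.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def to_png_bytes(data):
    """Return raw PNG bytes from either raw bytes or a base64 string.

    Hypothetical helper for illustration only.
    """
    if isinstance(data, str):
        # A str is assumed to be base64 text, as returned by
        # get_screenshot_as_base64().
        data = base64.b64decode(data)
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("data does not look like a PNG image")
    return data
```

The result can then be written to disk by opening a file in binary mode ("wb") and writing the returned bytes.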

Recommended answer

Your suggestion solved the problem. I scroll through the page before taking the screenshot, using the following code (borrowed in part from an answer to another question):

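from selenium import webdriver

driver = webdriver.PhantomJS()
driver.maximize_window()
driver.get('http://youtube.com')

# Scroll through the page in small steps so content that loads on scroll can load.
# The divisor grows from 0.1 to 9.9, so the scroll target starts at (or past) the
# bottom of the page and moves up to about a tenth of the page height.
scheight = .1
while scheight < 9.9:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
    scheight += .01
driver.save_screenshot('screenshot.png')
driver.quit()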
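
The loop above issues roughly a thousand scroll commands and moves from the bottom of the page toward the top. As a minimal sketch of the same idea, the helper below scrolls from top to bottom in a small number of steps, pausing briefly after each one. The name scroll_through_page and its steps and pause parameters are assumptions made for illustration and are not part of the original answer. The helper relies only on execute_script, so it should work with any Selenium WebDriver instance.

```python
import time


def scroll_through_page(driver, steps=10, pause=0.2):
    """Scroll down the page in `steps` equal fractions, then return to the top.

    Hypothetical helper: `driver` is any object that provides Selenium's
    execute_script(script, *args) method.
    """
    for i in range(1, steps + 1):
        # arguments[0] receives the fraction of the page height to scroll to.
        driver.execute_script(
            "window.scrollTo(0, document.body.scrollHeight * arguments[0]);",
            i / steps,
        )
        # Give content that loads on scroll a moment to arrive.
        time.sleep(pause)
    # Return to the top so the screenshot starts at the page header.
    driver.execute_script("window.scrollTo(0, 0);")
```

A call such as scroll_through_page(driver) would go between driver.get(...) and driver.save_screenshot(...). Whether the default pause is long enough depends on the site, so treat both defaults as starting points. Newer Selenium releases no longer ship the PhantomJS driver, in which case a headless Chrome or Firefox driver can be passed in instead.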
