PhantomJS acts differently than Firefox webdriver

Problem description

I'm working on some code that uses the Selenium web driver with Firefox. Most things seem to work, but when I switch the browser to PhantomJS, it starts to behave differently.

The page I'm processing has to be scrolled slowly to load more and more results, and that's probably the problem.

Here is the code that works with the Firefox webdriver but doesn't work with PhantomJS:

from time import sleep

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def get_url(destination, start_date, end_date):  # dates are formatted as %Y-%m-%d
    return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS', destination, destination, 'CVIE%20BUD%20BTS', start_date, end_date)


# Method of a class that keeps the webdriver in self.driver. deb(), __THRESHOLD__
# and wait_for_more_than_n_elements are defined elsewhere and not shown here.
def load_whole_page(self, destination, start_date, end_date):
    deb()

    url = get_url(destination, start_date, end_date)

    self.driver.maximize_window()
    self.driver.get(url)

    # Wait for both loading indicators to disappear before scrolling.
    wait = WebDriverWait(self.driver, 60)
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
                                                   u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
    i = 0
    old_driver_html = ''
    while True:
        i += 1

        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print(len(results))
        if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
            break
        try:
            # Scroll to the first and then the last result to trigger lazy loading.
            self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
            self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
        except Exception:
            # Number the screenshot with the iteration counter so files are not overwritten.
            self.driver.save_screenshot('screen_before_' + str(i) + '.png')
            sleep(2)

            print('EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<')
            continue

        # Stop once scrolling no longer changes the page source.
        new_driver_html = self.driver.page_source
        if new_driver_html == old_driver_html:
            print('END OF PAGE')
            break
        old_driver_html = new_driver_html

        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
    sleep(10)

To detect when the page is fully loaded, I compare the previous copy of the HTML with the new one. That is probably not what I'm supposed to do, but with Firefox it is sufficient.
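Comparing whole page_source strings can be brittle, because any change anywhere in the DOM makes two snapshots differ. As a minimal sketch of an alternative stop rule, assuming the number of results is the signal of interest, the loop below stops once the count has not grown for a few consecutive scrolls. The names scroll_until_stable, count_results, scroll, max_idle_rounds and pause are hypothetical and not part of Selenium or of the original code.

```python
import time


def scroll_until_stable(count_results, scroll, max_idle_rounds=3, pause=1.0):
    """Scroll until the result count stops growing; return the final count.

    count_results: callable returning the current number of results
    scroll: callable performing one scroll step
    max_idle_rounds: consecutive scrolls without growth before giving up
    pause: seconds to wait after each scroll so new results can load
    """
    best = count_results()
    idle = 0
    while idle < max_idle_rounds:
        scroll()
        time.sleep(pause)
        current = count_results()
        if current > best:
            best = current
            idle = 0      # progress was made, reset the idle counter
        else:
            idle += 1     # nothing new appeared after this scroll
    return best


# Hypothetical wiring with a Selenium driver (not executed here):
# scroll_until_stable(
#     lambda: len(driver.find_elements_by_css_selector("div.flightbox")),
#     lambda: driver.execute_script(
#         "window.document.body.scrollTop = document.body.scrollHeight;"),
# )
```

Passing the counting and scrolling steps in as callables keeps the stop rule independent of the browser, so the same loop can be tried unchanged against Firefox and PhantomJS.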

Here is a PhantomJS screenshot taken when loading stops:

With Firefox it keeps loading more and more results, but with PhantomJS it gets stuck at, for example, 10 results.

Any ideas? What are the differences between these two drivers?

Solution

Two key things that helped me to solve it:

  • do not use the custom wait I helped you with before
  • set window.document.body.scrollTop first to 0 and then, immediately afterwards, to document.body.scrollHeight

Working code:

results = []
while len(results) < 200:
    results = driver.find_elements_by_css_selector("div.flightbox")

    print(len(results))

    # scroll: first result, top of the page, bottom of the page, last result
    driver.execute_script("arguments[0].scrollIntoView();", results[0])
    driver.execute_script("window.document.body.scrollTop = 0;")
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
    driver.execute_script("arguments[0].scrollIntoView();", results[-1])


Version 2 (endless loop, stop if there is nothing loaded on scroll anymore):

results = []
while True:
    # Stop when no additional results appear before the wait times out.
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print(len(results))

    # scroll
    for _ in range(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print("DONE. Result count: %d" % len(results))

Note that I've changed the comparison in the wait_for_more_than_n_elements expected condition. Replaced:

return count >= self.count

with:

return count > self.count
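The source of wait_for_more_than_n_elements is not shown here, so the following is only a sketch of what such an expected condition might look like, assuming it is a callable class that takes a locator tuple and a count. The strict comparison matters because the loops above pass len(results) as the count: with >= the condition is already true for the results on the page, so the wait never times out and the loop never ends, while with > it times out as soon as scrolling stops producing new results.

```python
class wait_for_more_than_n_elements(object):
    """Expected condition: true once more than `count` elements match `locator`.

    Assumed shape only; the original implementation is not shown in this thread.
    """

    def __init__(self, locator, count):
        self.locator = locator  # e.g. (By.CSS_SELECTOR, "div.flightbox")
        self.count = count

    def __call__(self, driver):
        # WebDriverWait calls this repeatedly until it returns a truthy value.
        found = len(driver.find_elements(*self.locator))
        return found > self.count
```

WebDriverWait.until accepts any callable that takes the driver, so an instance of this class can be passed to wait.until in the same way as the built-in expected conditions.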


Version 3 (scrolling from header to footer multiple times):

header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    # Stop when no additional results appear before the wait times out.
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print(len(results))

    # scroll from the header down to the footer, pausing between passes
    for _ in range(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)
