PhantomJS acts differently than Firefox webdriver
I'm working on some code that uses the Selenium web driver with Firefox. Most things seem to work, but when I switch the browser to PhantomJS, it starts to behave differently.
The page I'm processing needs to be scrolled slowly to load more and more results, and that's probably the problem.
Here is the code, which works with the Firefox webdriver but not with PhantomJS:
from time import sleep

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def get_url(destination, start_date, end_date):  # the dates are formatted like %Y-%m-%d
    return "https://www.pelikan.sk/sk/flights/listdfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % (
        'CVIE%20BUD%20BTS', destination, destination, 'CVIE%20BUD%20BTS', start_date, end_date)


def load_whole_page(self, destination, start_date, end_date):
    deb()
    url = get_url(destination, start_date, end_date)
    self.driver.maximize_window()
    self.driver.get(url)
    wait = WebDriverWait(self.driver, 60)
    # wait until both loading indicators are gone
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
        u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
    i = 0
    old_driver_html = ''
    end = False
    while not end:
        i += 1
        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print len(results)
        if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
            break
        try:
            # scroll to the first result and then to the last one
            self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
            self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
        except:
            self.driver.save_screenshot('screen_before_' + str(i) + '.png')
            sleep(2)
            print 'EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<'
            continue
        new_driver_html = self.driver.page_source
        if new_driver_html == old_driver_html:
            print 'END OF PAGE'
            break
        old_driver_html = new_driver_html
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
        sleep(10)
To detect when the page is fully loaded, I compare the previous copy of the HTML with the new one. That is probably not what I'm supposed to do, but with Firefox it is sufficient.
Here is the PhantomJS screenshot taken when the loading stopped:
With Firefox, it loads more and more results, but with PhantomJS it gets stuck at, for example, 10 results.
Any ideas? What are the differences between these two drivers?
Two key things helped me solve it:
- do not use that custom wait I've helped you with before
- set window.document.body.scrollTop first to 0 and then to document.body.scrollHeight, one right after the other
Working code:
results = []
while len(results) < 200:
    results = driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    driver.execute_script("arguments[0].scrollIntoView();", results[0])
    driver.execute_script("window.document.body.scrollTop = 0;")
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
    driver.execute_script("arguments[0].scrollIntoView();", results[-1])
Version 2 (endless loop that stops once scrolling no longer loads anything):
results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print "DONE. Result count: %d" % len(results)
Note that I've changed the comparison in the wait_for_more_than_n_elements expected condition. Replaced:
    return count >= self.count
with:
    return count > self.count
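The wait_for_more_than_n_elements condition itself is not shown in this answer. Below is a minimal sketch of what such a custom expected condition might look like. Only the name, the (locator, count) call shape and the `count > self.count` comparison come from the text above; the `locator` attribute and the return value are assumptions. It relies only on `driver.find_elements(by, value)`, so it can be passed to `WebDriverWait.until`, which keeps calling it until it returns something truthy.

```python
class wait_for_more_than_n_elements(object):
    """Expected condition: truthy once MORE than `count` elements match `locator`."""

    def __init__(self, locator, count):
        self.locator = locator  # assumed shape: (By.CSS_SELECTOR, "div.flightbox")
        self.count = count

    def __call__(self, driver):
        # find_elements returns an empty list (it does not raise) when nothing matches
        elements = driver.find_elements(*self.locator)
        count = len(elements)
        # strict comparison, as described above: wait for new results to appear
        if count > self.count:
            return elements
        return False
```

With the strict comparison, a call such as `wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))` raises TimeoutException when no additional results show up, which is what lets the loop in Version 2 terminate.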
Version 3 (scrolling from header to footer multiple times):
header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    try:
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print len(results)

    # scroll
    for _ in xrange(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)
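The same scroll-until-nothing-new idea can also be written without the custom wait, by watching the result count directly. The sketch below is only an illustration: the function name `load_all_results` and its parameters are hypothetical, the scroll script reuses the two `scrollTop` assignments from the working code above, and `"css selector"` is the string value of `By.CSS_SELECTOR`. It has not been verified against the page from the question.

```python
import time


def load_all_results(driver, css="div.flightbox", pause=1.0, max_idle_rounds=3):
    """Scroll top-to-bottom until the number of matching elements stops growing."""
    count = 0
    idle_rounds = 0
    while idle_rounds < max_idle_rounds:
        # jump to the top and then to the bottom in a single script call
        driver.execute_script(
            "window.document.body.scrollTop = 0;"
            "window.document.body.scrollTop = document.body.scrollHeight;"
        )
        time.sleep(pause)  # give the page time to fetch the next batch
        new_count = len(driver.find_elements("css selector", css))
        if new_count > count:
            count = new_count
            idle_rounds = 0  # progress was made, reset the idle counter
        else:
            idle_rounds += 1  # nothing new appeared in this round
    return count
```

A call like `load_all_results(self.driver, pause=2.0)` would return the final number of `div.flightbox` elements. Compared with comparing `page_source` snapshots, counting elements is less sensitive to unrelated changes in the markup, at the cost of a few extra idle rounds before the loop gives up.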