Selenium returns different HTML source than viewed in browser

Problem description

I'm trying to use Selenium to load the next page of results by clicking the Load More button on this site. However, the source of the HTML page loaded by Selenium does not contain the actual products that are visible when browsing. Here is my code:

from selenium import webdriver      
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import os
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

#browser = webdriver.Firefox()#Chrome('./chromedriver.exe')
URL = "https://thekrazycouponlady.com/coupons-for/costco"
PATIENCE_TIME = 60
LOAD_MORE_BUTTON_XPATH = '//button[@class = "kcl-btn ng-scope"]/span' 
caps = DesiredCapabilities.PHANTOMJS
# driver = webdriver.Chrome(r'C:\Python3\selenium\webdriver\chromedriver_win32\chromedriver.exe')
caps["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
driver = webdriver.PhantomJS(r'C:\Python3\selenium\webdriver\phantomjs-2.1.1-windows\bin\phantomjs.exe',service_log_path=os.path.devnull,desired_capabilities=caps)
driver.get(URL)

while True:
    try:
        time.sleep(20)
        html = driver.page_source.encode('utf-8')
        print(html)
        loadMoreButton = driver.find_element_by_xpath(LOAD_MORE_BUTTON_XPATH)


        loadMoreButton.click()

    except Exception as e:
        print (e)
        break
print ("Complete")

driver.quit()

I'm not sure whether I can attach a sample HTML file here for reference. In any case, what is the problem, and how do I load exactly the same page with Selenium as I see in the browser?

Recommended answer

It might be due to the use of PhantomJS, which is no longer maintained and has been deprecated since Selenium 3.8.1. Use headless Chrome instead.

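from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window
driver = webdriver.Chrome(CHROMEDRIVER_PATH, options=options)  # CHROMEDRIVER_PATH: path to your chromedriver binary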
