How to paginate through the page numbers when href contains javascript:__doPostBack()


Question

I'm trying to scrape this website: http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae. When I try to go to the next page I can't, because the URL doesn't change. The page links look like this:

href="javascript:__doPostBack('sb$grd','Page$1')"

Here is the code I tried. It only goes to page 2 and then raises an error: stale element reference: element is not attached to the page document

from selenium import webdriver
url = 'http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae'
driver = webdriver.Chrome()
driver.get(url)
btn = [w for w in driver.find_elements_by_xpath('//*[@id="sb_grd"]/tbody/tr[26]/td/table/tbody/tr/td/a')]
for b in btn:
    driver.execute_script("arguments[0].click();", b)

Recommended answer

To paginate through the page numbers when the href attribute is "javascript:__doPostBack('sb$grd','Page$2')", you need to induce WebDriverWait for element_to_be_clickable(). Each click triggers a postback that re-renders the page, so links located before the click become stale; the next link has to be located again after every click. You can use the following locator strategy:

  • Code Block:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

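# Chrome options: start maximized and switch off the automation flags
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is the Selenium 3 way to pass the driver path; Selenium 4 uses a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("http://www.mfa.gov.tr/sub.ar.mfa?dcabec54-44b3-4aaa-a725-70d0caa8a0ae")
while True:
    try:
        # Locate the link again on every pass: the first <a> after the pager's <span>
        # (the XPath assumes the <span> marks the current page), so no stale reference is reused
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table[@id='sb_grd']//table/tbody/tr//td/span//following::td[1]/a"))).click()
        print("Next page clicked")
    except TimeoutException:
        # No clickable link appeared within 20 seconds: treat this as the last page
        print("No more pages")
        break
driver.quit()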

  • Console Output:

    Next page clicked
    Next page clicked
    Next page clicked
    .
    .
    .
    No more pages
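
As a possible alternative to clicking the pager links, the postback can be triggered directly from the values already present in the href. The sketch below is only an illustration under stated assumptions: `parse_postback` and `go_to_page` are hypothetical helper names that do not appear in the original answer, and the page is assumed to define the standard ASP.NET `__doPostBack(eventTarget, eventArgument)` function that its own links call. The only Selenium call used is `driver.execute_script`, which passes extra arguments to the script as `arguments[0]`, `arguments[1]`, and so on.

```python
import re

# Matches href values such as: javascript:__doPostBack('sb$grd','Page$1')
POSTBACK_RE = re.compile(r"__doPostBack\(\s*'([^']*)'\s*,\s*'([^']*)'\s*\)")


def parse_postback(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = POSTBACK_RE.search(href or "")
    if match is None:
        return None
    return match.group(1), match.group(2)


def go_to_page(driver, href):
    """Trigger the postback described by href through the driver (hypothetical helper).

    Assumes the loaded page defines __doPostBack, as the links on it suggest.
    """
    parsed = parse_postback(href)
    if parsed is None:
        raise ValueError("not a __doPostBack link: %r" % (href,))
    target, argument = parsed
    # The values are passed as script arguments, so no quoting or escaping is needed
    driver.execute_script("__doPostBack(arguments[0], arguments[1]);", target, argument)
```

For example, `go_to_page(driver, "javascript:__doPostBack('sb$grd','Page$3')")` would request page 3 without holding on to any element reference. The postback still re-renders the page, so a wait (for instance the WebDriverWait used above) is still needed before reading the new rows.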
    

  • You can find a relevant detailed discussion in:
