How to open each product within a website in a new tab for scraping using Selenium through Python

Problem description

I am scraping a website using Selenium: "https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03"

For a single page and a single product, I am able to scrape by passing the product URL. I am trying to do the same with Selenium, i.e. automatically select the products on a page one by one and then move to the next page. After a product details page opens, it should be scraped, which is done with Beautiful Soup. Here is a product URL from the base URL: "https://www.medline.com/product/SensiCare-Powder-Free-Nitrile-Exam-Gloves/SensiCare/Z05-PF00342?question=&index=P1&indexCount=1"

Here is my code:

chromeOptions = webdriver.ChromeOptions()
chromeOptions.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(executable_path='C:/Users/ptiwar34/Documents/chromedriver.exe', chrome_options=chromeOptions, desired_capabilities=chromeOptions.to_capabilities())
driver.get("https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03")

while True:
    try:  
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[contains(@class, 'resultGalleryViewRow')]//div[@class='medGridProdTitle']//a[contains(@href]"))).click()
        print("Clicked for next page")
    except TimeoutException:
        print("No more pages")
        break
driver.quit()

No error is thrown here.

It does not open the page for each product. I want to open each product in a new tab and, after scraping it, close that tab and open a new tab for the next product.

Recommended answer

To open each product from the webpage https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03 in a new tab and scrape it, you have to induce WebDriverWait for number_of_windows_to_be(2). You can use the following Locator Strategies:

  • Code Block:

  from selenium import webdriver
  from selenium.webdriver.support.ui import WebDriverWait
  from selenium.webdriver.common.by import By
  from selenium.webdriver.support import expected_conditions as EC
  import time

  chrome_options = webdriver.ChromeOptions() 
  chrome_options.add_argument("start-maximized")
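  # Selenium 3 style call; recent Selenium 4 releases expect a Service object instead of executable_path
  driver = webdriver.Chrome(options=chrome_options, executable_path=r'C:\WebDrivers\chromedriver.exe')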

  driver.get("https://www.medline.com/catalog/category-products.jsp?itemId=Z05-CA02_03&N=111079+4294770643&iclp=Z05-CA02_03")
  my_hrefs = [my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 5).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[contains(@class, 'resultGalleryViewRow')]//div[@class='medGridProdTitle']//a")))]
  windows_before = driver.current_window_handle # store the parent window handle for later use
  for my_href in my_hrefs:
      driver.execute_script("window.open(arguments[0]);", my_href) # pass the URL as an argument rather than concatenating it into the script
      WebDriverWait(driver, 10).until(EC.number_of_windows_to_be(2)) # wait until the new tab exists
      windows_after = driver.window_handles
      new_window = [x for x in windows_after if x != windows_before][0] # identify the newly opened window
      driver.switch_to.window(new_window) # switch to the new window
      time.sleep(3) # crude pause to let the product page load
      print(driver.title) # print the page title, or perform your web scraping here
      driver.close() # close the product tab
      driver.switch_to.window(windows_before) # switch back to the parent window
  driver.quit() # end the session and close the browser

  • Console Output:

      SensiCare Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      MediGuard Vinyl Synthetic Exam Gloves | Medline Industries, Inc.
      CURAD Stretch Vinyl Exam Gloves | Medline Industries, Inc.
      CURAD Nitrile Exam Gloves | Medline Industries, Inc.
      SensiCare Ice Blue Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      MediGuard Synthetic Exam Gloves | Medline Industries, Inc.
      Accutouch Synthetic Exam Gloves | Medline Industries, Inc.
      Aloetouch Ice Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      Aloetouch 3G Powder-Free Synthetic Exam Gloves | Medline Industries, Inc.
      SensiCare Powder-Free Stretch Vinyl Sterile Exam Gloves | Medline Industries, Inc.
      CURAD Powder-Free Textured Latex Exam Gloves | Medline Industries, Inc.
      Accutouch Chemo Nitrile Exam Gloves | Medline Industries, Inc.
      Aloetouch 12" Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      Ultra Stretch Synthetic Exam Gloves | Medline Industries, Inc.
      Generation Pink 3G Synthetic Exam Gloves | Medline Industries, Inc.
      SensiCare Extended Cuff Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      Eudermic MP High-Risk Powder-Free Latex Exam Gloves | Medline Industries, Inc.
      Aloetouch Powder-Free Latex Exam Gloves | Medline Industries, Inc.
      CURAD Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      Medline Sterile Powder-Free Latex Exam Gloves | Medline Industries, Inc.
      SensiCare Silk Powder-Free Nitrile Exam Gloves | Medline Industries, Inc.
      Medline Sterile Powder-Free Latex Exam Glove Pairs | Medline Industries, Inc.
      MediGuard 2.0 Nitrile Exam Gloves | Medline Industries, Inc.
      Designer Boxed Vinyl Exam Gloves | Medline Industries, Inc.
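
The open, switch, scrape, close cycle from the code block above can also be packaged as a reusable function. The following is a minimal sketch and not part of the original answer: the names find_new_handle and scrape_in_new_tabs and the scrape callback are hypothetical, and a plain polling loop stands in for WebDriverWait with EC.number_of_windows_to_be(2) so that the sketch needs only the standard library. It assumes driver is a Selenium WebDriver, or any object exposing current_window_handle, window_handles, execute_script, switch_to.window and close.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_new_handle(handles: Sequence[str], parent: str) -> str:
    """Return the first window handle that is not the parent's handle."""
    for handle in handles:
        if handle != parent:
            return handle
    raise RuntimeError("no new window was opened")


def scrape_in_new_tabs(driver, hrefs: Sequence[str],
                       scrape: Callable[[object], T],
                       timeout: float = 10.0) -> List[T]:
    """Open each href in its own tab, run scrape(driver) there, then close the tab."""
    parent = driver.current_window_handle
    results: List[T] = []
    for href in hrefs:
        # Pass the URL as a script argument instead of concatenating it into JS.
        driver.execute_script("window.open(arguments[0]);", href)
        # Poll until a second window exists (assumed stand-in for WebDriverWait).
        deadline = time.monotonic() + timeout
        while len(driver.window_handles) < 2:
            if time.monotonic() > deadline:
                raise TimeoutError(f"tab for {href} did not open")
            time.sleep(0.1)
        driver.switch_to.window(find_new_handle(driver.window_handles, parent))
        try:
            results.append(scrape(driver))
        finally:
            # Always close the product tab and return to the listing page.
            driver.close()
            driver.switch_to.window(parent)
    return results
```

With the driver and my_hrefs from the answer, a call such as scrape_in_new_tabs(driver, my_hrefs, lambda d: d.title) would collect the page titles. The try/finally means that a scrape that raises still closes its tab and returns to the parent window before the error propagates.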
    
