Scrape website with dynamic mouseover event
Problem description
I am trying to scrape data which is generated dynamically from mouseover events.
I want to capture the information from the Hash Rate Distribution chart from
https://slushpool.com/stats/?c=btc, which is generated when you hover over each circle.
The code below gets the html data from the website, and returns the table which is filled once the mouse passes over a circle. However, I have not been able to figure out how to trigger the mouseover event for each circle to fill the table.
from lxml import etree
from xml.etree import ElementTree
from selenium import webdriver
driver_path = "#Firefox web driver"
browser = webdriver.Firefox(executable_path=driver_path)
browser.get("https://slushpool.com/stats/?c=btc")
page = browser.page_source  # get the page HTML
tree = etree.HTML(page)  # create the etree
table_Xpath = '/html/body/div[1]/div/div/div/div/div[5]/div[1]/div/div/div[2]/div[2]/div[2]/div/table'
table = tree.xpath(table_Xpath)  # get the table using the XPath
print(ElementTree.tostring(table[0]))  # returns an empty table
# Should return the data from each mouseover event
Is there a way to trigger the mouseover event for each circle and then extract the generated data?
Thank you in advance for the help!
To trigger the mouseover event for each circle, use WebDriverWait with visibility_of_all_elements_located() to wait until the circles are visible, then move the mouse to each one with ActionChains.
You can use the following locator strategy:
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("start-maximized")
chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=chrome_options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://slushpool.com/stats/?c=btc")
# Scroll the "Distribution" heading into view so the chart is on screen
driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1//span[text()='Distribution']"))))
# Wait for every circle of the SVG chart to become visible
elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//h1//span[text()='Distribution']//following::div[1]/*[name()='svg']//*[name()='g']//*[name()='g' and @class='paper']//*[name()='circle']")))
# Hover over each circle in turn to fire its mouseover event
for element in elements:
    ActionChains(driver).move_to_element(element).perform()
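The loop above only fires the mouseover events; it does not yet read the data each hover produces. One possible way to capture it is to re-read the table after every hover. This is a rough sketch, not a tested solution for this site: hover_and_collect, parse_table and TableRows are hypothetical helper names, and it assumes the table is still reachable through the XPath from the question and that it is filled synchronously by the hover (if it is filled with a delay, an explicit wait would be needed before reading it).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text fragments of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return the table as a list of rows, each a list of cell strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def hover_and_collect(driver, circles, table_xpath):
    """Hover over each circle and snapshot the table it fills (hypothetical helper)."""
    # Imported here so parse_table() stays usable without Selenium installed.
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By

    results = []
    for circle in circles:
        ActionChains(driver).move_to_element(circle).perform()
        # Assumption: the table sits at table_xpath and is already filled here.
        table = driver.find_element(By.XPATH, table_xpath)
        results.append(parse_table(table.get_attribute("outerHTML")))
    return results
```

With the names from the two code blocks above, it could be called as hover_and_collect(driver, elements, table_Xpath), which would return one list of rows per circle.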