Parse BeautifulSoup element into Selenium
Question
I want to get the source code of a website using Selenium, find a particular element in it using BeautifulSoup, and then turn that element back into a selenium.webdriver.remote.webelement object so that I can interact with it. Like so:
driver.get("https://www.google.com")
soup = BeautifulSoup(driver.page_source, "html.parser")
element = soup.find(title="Search")
element = Selenium.webelement(element)  # pseudocode: this is the conversion I am looking for
element.click()
How can I achieve this?
Recommended answer
A general solution that worked for me is to compute the XPath of the bs4 element and then use that XPath to locate the same element in Selenium:
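from selenium.webdriver.common.by import By
xpath = xpath_soup(soup_element)
# find_element_by_xpath was removed in Selenium 4.3; find_element(By.XPATH, ...) replaces it
selenium_element = driver.find_element(By.XPATH, xpath)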
...
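import itertools

def xpath_soup(element):
    """
    Generate xpath of soup element
    :param element: bs4 text or node
    :return: xpath as string
    """
    components = []
    # A text node has no tag name, so start from its parent tag instead.
    child = element if element.name else element.parent
    for parent in child.parents:  # each parent is a bs4.element.Tag
        # Siblings that come before child. Compare by identity: bs4 tags with
        # identical markup compare equal, so contents.index(child) can pick the wrong one.
        previous = itertools.takewhile(lambda sibling: sibling is not child, parent.children)
        xpath_tag = child.name
        # XPath positions are 1-based and count only siblings with the same tag name.
        xpath_index = sum(1 for i in previous if i.name == xpath_tag) + 1
        components.append(xpath_tag if xpath_index == 1 else '%s[%d]' % (xpath_tag, xpath_index))
        child = parent
    components.reverse()
    return '/%s' % '/'.join(components)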
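To see what kind of path this produces, here is a minimal sketch that runs the idea against a small hand-written HTML snippet. The markup, the variable names and the to_webelement helper are made up for illustration and are not part of the original answer. The sketch carries its own compact variant of xpath_soup so that it runs by itself, and it only needs bs4; the Selenium lookup sits in a helper that is defined but not called, since it requires a live driver.

```python
from bs4 import BeautifulSoup


def xpath_soup(element):
    """Compact variant of the function above: absolute XPath of a bs4 node."""
    components = []
    child = element if element.name else element.parent
    for parent in child.parents:
        # Same-named direct children of the parent, in document order.
        siblings = parent.find_all(child.name, recursive=False)
        # Identity comparison, so identical-looking tags are not confused.
        position = next(i for i, s in enumerate(siblings, 1) if s is child)
        components.append(child.name if position == 1 else '%s[%d]' % (child.name, position))
        child = parent
    components.reverse()
    return '/' + '/'.join(components)


def to_webelement(driver, soup_element):
    """Hypothetical helper: find the bs4 node again in a live Selenium driver."""
    from selenium.webdriver.common.by import By  # lazy import; needs selenium installed
    return driver.find_element(By.XPATH, xpath_soup(soup_element))


# Illustrative markup, not taken from any real page.
html = """
<html><body>
  <div><p>intro</p></div>
  <div><a title="Home" href="/">Home</a><a title="Search" href="/s">Search</a></div>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")
search_link = soup.find(title="Search")
search_xpath = xpath_soup(search_link)
print(search_xpath)  # /html/body/div[2]/a[2]
```

With a real page, the call would look like to_webelement(driver, soup.find(title="Search")).click(). One caveat: the path is computed from the parsed page source, so it may not match the live DOM if the browser repaired the markup differently from the parser or if JavaScript changed the page after the source was read.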