How to get text from text nodes separated by whitespace using Selenium and Python


Problem description

I am on this page:

https://fantasy.premierleague.com/statistics

When you click on any "i" icon next to a player, a popup window appears. I then want to get the player's surname. This is what "inspect element" shows ("whitespace" actually appears inside a box):

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Kevin
 whitespace
 De Bruyne

What I want to do is take the text that appears after the whitespace. I can get the full text (i.e. both first name and surname) using this:

player_full_name = driver.find_element_by_xpath('//*[@class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ"]').text

But how can I get only the surname (i.e. what appears after the whitespace)? Note that for other players it could look like this:

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Gabriel Fernando
 whitespace
 de Jesus

or like this:

<h2 class="ElementDialog__ElementHeading-gmefnd-2 ijAScJ">
 Dean
 whitespace
 Henderson

In other words, splitting the text and taking the last one or two elements will not work.
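
To make the problem concrete, here is a small sketch of why a fixed split fails on the three example names. It assumes `.text` returns the name parts joined by single spaces; `last_n_words` is a hypothetical helper introduced only for this illustration.

```python
# Hypothetical helper: take the last n whitespace-separated words of a name.
def last_n_words(full_name, n):
    return " ".join(full_name.split()[-n:])

# (full name as assumed to be returned by .text, surname we actually want)
examples = [
    ("Kevin De Bruyne", "De Bruyne"),
    ("Gabriel Fernando de Jesus", "de Jesus"),
    ("Dean Henderson", "Henderson"),
]

for full_name, surname in examples:
    one = last_n_words(full_name, 1)
    two = last_n_words(full_name, 2)
    print(f"{full_name!r}: last 1 -> {one!r}, last 2 -> {two!r}, wanted {surname!r}")
```

Neither "last one word" nor "last two words" matches the wanted surname for all three players, so the split point has to come from the page structure rather than from the string.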

Recommended answer

The surname of the player is the second or last text node within its parent WebElement. So, to extract the surname (e.g. De Bruyne from Kevin De Bruyne), you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR, lastChild and strip():

    driver.get("https://fantasy.premierleague.com/statistics")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table//tbody/tr/td/button"))).click()
    print( driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.ElementDialog__ElementHeading-gmefnd-2")))).strip())

  • Console Output:

    De Bruyne
    

  • Using CSS_SELECTOR, lastChild and splitlines():

    driver.get("https://fantasy.premierleague.com/statistics")
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//table//tbody/tr/td/button"))).click()
    print( driver.execute_script('return arguments[0].lastChild.textContent;', WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h2.ElementDialog__ElementHeading-gmefnd-2")))).splitlines())
    

  • Console Output:

    ['De Bruyne']
    

  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
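
If `lastChild` ever turns out not to be the surname (for example, if a trailing whitespace-only node follows it), a slightly more defensive variant is to collect every direct text node and keep the last non-blank one. The following is only a sketch under that assumption, not part of the original answer: `text_node_values` and `surname_of` are hypothetical helper names, and the code assumes the name parts are direct text-node children of the `<h2>`.

```python
# JavaScript run in the browser: return the text of every direct child
# text node of the element passed in as arguments[0].
TEXT_NODES_JS = """
return Array.from(arguments[0].childNodes)
    .filter(function (node) { return node.nodeType === Node.TEXT_NODE; })
    .map(function (node) { return node.textContent; });
"""


def text_node_values(driver, element):
    """Return the stripped, non-blank text of element's direct text nodes.

    `driver` is expected to be a Selenium WebDriver (anything exposing
    execute_script works); `element` is the located <h2> WebElement.
    """
    raw = driver.execute_script(TEXT_NODES_JS, element)
    return [text.strip() for text in raw if text and text.strip()]


def surname_of(driver, element):
    """Hypothetical helper: the last non-blank text node, or None."""
    values = text_node_values(driver, element)
    return values[-1] if values else None
```

With the heading located as in the snippets above (via `WebDriverWait(...).until(EC.visibility_of_element_located(...))`), calling `surname_of(driver, heading)` would return the last non-blank text node, which the answer identifies as the surname.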
    

