Selenium WebDriver get_attribute returns truncated value of href attribute when value has entities
Question
I am trying to get the href attribute value from an anchor tag on a page in my application using Selenium WebDriver (Python), and the result returned has part of the value stripped off.
Here is the HTML snippet:
<a class="nla-row-text" href="/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0" data-reactid="790">
Here is the code I am using:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
driver = webdriver.Firefox()
driver.get("xxxx")
url_from_attr = driver.find_element(By.XPATH,"(//div[@class='nla-children mfr']/div/div/a)[1]").get_attribute("href")
url_from_attr_raw = "%r"%url_from_attr
print(" URL from attribute -->> " + url_from_attr)
print(" Raw string -->> " + url_from_attr_raw)
The output I get is:
/shopping/brands?search=kamera&page=0
instead of either:
/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0
or
/shopping/brands?search=kamera&nm=Canon&page=0
Is this because of the entity representation in the URL? The part between the entities appears to be what gets stripped. Any help or pointers would be great.
Recommended answer
As per the given HTML, there is an issue with the locator strategy you have tried. You have used an index [1] along with find_element, which is error-prone. An index such as [1] can be applied when a list is returned through find_elements. In this use case, an optimized expression would be:
url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']/div/div/a[@class='nla-row-text']").get_attribute("href")
The locator strategy can be optimized further as follows:
url_from_attr = driver.find_element(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']").get_attribute("href")
Update A
As per your comment, since you still need to use indexing, the optimized locator strategy can be:
url_from_attr = driver.find_elements(By.XPATH,"//div[@class='nla-children mfr']//a[@class='nla-row-text']")[0].get_attribute("href")
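Because find_elements returns a list rather than a single element, get_attribute has to be called on one item of that list, and the list may be empty. A minimal sketch of that pattern follows; first_href is a hypothetical helper name, and the string "xpath" is used because it is the value of By.XPATH.

```python
def first_href(driver, xpath):
    """Return the href of the first element matching `xpath`, or None.

    `first_href` is an illustrative helper, not part of Selenium.
    """
    # "xpath" is the string value of selenium's By.XPATH constant.
    matches = driver.find_elements("xpath", xpath)
    if not matches:
        # find_elements returns an empty list instead of raising when nothing matches.
        return None
    return matches[0].get_attribute("href")
```

It could be called as first_href(driver, "//div[@class='nla-children mfr']//a[@class='nla-row-text']").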
get_attribute(attribute_name)
As per the Python API source:
def get_attribute(self, name):
    """Gets the given attribute or property of the element.

    This method will first try to return the value of a property with the
    given name. If a property with that name doesn't exist, it returns the
    value of the attribute with the same name. If there's no attribute with
    that name, ``None`` is returned.

    Values which are considered truthy, that is equals "true" or "false",
    are returned as booleans. All other non-``None`` values are returned
    as strings. For attributes or properties which do not exist, ``None``
    is returned.

    :Args:
        - name - Name of the attribute/property to retrieve.

    Example::

        # Check if the "active" CSS class is applied to an element.
        is_active = "active" in target_element.get_attribute("class")
    """
    attributeValue = ''
    if self._w3c:
        attributeValue = self.parent.execute_script(
            "return (%s).apply(null, arguments);" % getAttribute_js,
            self, name)
    else:
        resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
        attributeValue = resp.get('value')
        if attributeValue is not None:
            if name != 'value' and attributeValue.lower() in ('true', 'false'):
                attributeValue = attributeValue.lower()
    return attributeValue
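Since get_attribute tries the property before the attribute, it can help when debugging to look at both views of href side by side. The sketch below is only an illustration: href_views is a hypothetical helper, and it assumes a Selenium 4 WebElement, which also offers get_dom_attribute and get_property. The second half uses only the standard library to show what entity decoding alone does to the href from the question, assuming the page source writes it with &amp; entities as the title suggests.

```python
import html
from urllib.parse import parse_qs, urlsplit


def href_views(element):
    """Return the href of a WebElement as seen through three accessors.

    `href_views` is an illustrative helper, not part of Selenium.
    """
    return {
        # Property first, falling back to the attribute (see the docstring above).
        "get_attribute": element.get_attribute("href"),
        # Only the attribute as declared in the HTML markup.
        "get_dom_attribute": element.get_dom_attribute("href"),
        # Only the DOM property.
        "get_property": element.get_property("href"),
    }


# Assumed form of the href in the page source (entities not yet decoded).
source_value = "/shopping/brands?search=kamera&amp;nm=Canon&amp;page=0"
decoded = html.unescape(source_value)          # turns each "&amp;" into "&"
params = parse_qs(urlsplit(decoded).query)     # query parameters after decoding
print(decoded)
print(params)
```

If decoding were the only transformation involved, the nm parameter would still be present afterwards, so comparing the three values returned by href_views for the real element may help narrow down where it disappears.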
Update B
As you mentioned in your comment, the URL value returned by the method is not present anywhere on the page, which implies that you are trying to access the href attribute too early. So there can be two solutions:
- Traverse the DOM tree and construct a locator that uniquely identifies the element, induce WebDriverWait with expected_conditions set to element_to_be_clickable, and then extract the href attribute.
- For debugging purposes, you can add time.sleep(10) so that the element gets rendered properly in the HTML DOM, and then try to extract the href attribute.
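The first option could be sketched roughly as follows. href_when_clickable, its parameters and the 10-second default are assumptions made for illustration; WebDriverWait, expected_conditions.element_to_be_clickable and By.XPATH are the Selenium names referred to above.

```python
def href_when_clickable(driver, xpath, timeout=10):
    """Wait until the element located by `xpath` is clickable, then return its href.

    `href_when_clickable` is an illustrative helper, not part of Selenium.
    The wait raises selenium's TimeoutException if the element does not
    become clickable within `timeout` seconds.
    """
    # Imported inside the function so this module can be loaded
    # even where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    element = WebDriverWait(driver, timeout).until(
        EC.element_to_be_clickable((By.XPATH, xpath))
    )
    return element.get_attribute("href")
```

With the locator from earlier in this answer, the call would be href_when_clickable(driver, "//div[@class='nla-children mfr']//a[@class='nla-row-text']"). Unlike time.sleep(10), the wait returns as soon as the condition holds.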