How to access tags (get values like the text of tags) inside a tag? How to get the value of an h1 tag inside a (paragraph) p tag?


Question

I am working with Selenium and Python to solve a problem. I want to extract information inside a paragraph (p tag). I am using "find_elements_by_tag_name" to locate all the p tags in the page. But how can I access the tags that are nested inside that paragraph? For example, there is an HTML file which has code like

<p> This is a paragraph <h1> but this is a h1 tag </h1></p>

I have used Selenium to open the page like this:

br=webdriver.Chrome()
br.get('file:///C:/Users/Shady/Desktop/New%20Text%20Document.html')

I am able to locate the p tag with

p_tags=br.find_elements_by_tag_name('p')

It returns only one element, and when I do

print(p_tags[0].text)

it shows only

This is a paragraph

How can I access the h1 tag inside the p tag? Would XPath work? If yes, could you please share the code?

Recommended answer

In the markup as written, the <h1> tag is a descendant of the <p> tag. So in your code trials you identified the <p> tag and extracted its text, which correctly gave This is a paragraph.

So to extract the text but this is a h1 tag, you have to reach the descendant <h1>. You can use either of the following Locator Strategies (driver here is the WebDriver instance, named br in the question):

  • Using css_selector:

    print(driver.find_element_by_css_selector("p>h1").get_attribute("innerHTML"))

  • Using xpath:

    print(driver.find_element_by_xpath("//p/h1").get_attribute("innerHTML"))
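
The parent-to-child lookup behind both locators can be tried without a browser. The sketch below is not Selenium: it assumes the snippet from the question is parsed as XML with Python's standard-library xml.etree.ElementTree, which supports a limited XPath subset. The wrapping body element and the variable names are illustrative assumptions. A real browser's HTML parser may restructure this markup, because HTML does not permit an h1 inside a p, so results in Chrome can differ from what is shown here.

```python
import xml.etree.ElementTree as ET

# The snippet from the question, parsed as XML (an assumption: a browser's
# HTML parser may not keep the h1 nested inside the p).
SNIPPET = "<p> This is a paragraph <h1> but this is a h1 tag </h1></p>"

# Wrap the snippet in a hypothetical <body> root so the p is a descendant
# that can be searched for.
root = ET.fromstring(f"<body>{SNIPPET}</body>")

# Step 1: locate every paragraph, similar in spirit to
# find_elements_by_tag_name('p') in the question.
p_tags = root.findall(".//p")

# Step 2: search relative to the first paragraph for its direct h1 child,
# the same relationship the "p>h1" and "//p/h1" locators express.
h1 = p_tags[0].find("h1")

# .text holds only the text that appears before the first child element.
paragraph_text = (p_tags[0].text or "").strip()
h1_text = (h1.text or "").strip()

print(paragraph_text)
print(h1_text)
```

Searching from the already located paragraph, rather than from the document root, keeps the lookup scoped to that one element, which is useful when a page contains many paragraphs.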
    
