How do I ignore tags while getting the .string of a Beautiful Soup element?

Views: 138 | Published: 2016/8/5 18:59:09 | Tags: python, dom, html-parsing, beautifulsoup

Problem description

I'm working with HTML elements that have child tags, which I want to "ignore" or remove so that only the text remains. Right now, if I call .string on any element that contains child tags, all I get is None.

import bs4

# Name the parser explicitly so bs4 does not have to guess one.
soup = bs4.BeautifulSoup("""
    <div id="main">
      <p>This is a paragraph.</p>
      <p>This is a paragraph <span class="test">with a tag</span>.</p>
      <p>This is another paragraph.</p>
    </div>
""", "html.parser")

main = soup.find(id='main')
for child in main.children:
    print(child.string)

Output:

This is a paragraph.
None
This is another paragraph.

I want the second line to be "This is a paragraph with a tag." How do I do this?

Solution

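for child in soup.find(id='main'):
    # Skip the whitespace strings between tags; only Tag objects matter here.
    if isinstance(child, bs4.Tag):
        # .text joins the text of every descendant, ignoring the tags themselves.
        print(child.text)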

And you'll get:

This is a paragraph.
This is a paragraph with a tag.
This is another paragraph.
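
For reference, here is a self-contained Python 3 sketch of the same idea. It is not part of the original answer: the names HTML and paragraph_texts are made up for illustration, and the choice of the built-in "html.parser" is an assumption. It also shows why .string was None in the question: .string only returns a value when a tag has exactly one child, while get_text() (the method behind .text) concatenates the text of all descendants.

```python
import bs4

HTML = """
<div id="main">
  <p>This is a paragraph.</p>
  <p>This is a paragraph <span class="test">with a tag</span>.</p>
  <p>This is another paragraph.</p>
</div>
"""


def paragraph_texts(html):
    """Return the flattened text of each tag directly under #main."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    main = soup.find(id="main")
    # find_all(True, recursive=False) yields only the direct child tags,
    # so the whitespace strings between the <p> elements are skipped.
    return [child.get_text() for child in main.find_all(True, recursive=False)]


soup = bs4.BeautifulSoup(HTML, "html.parser")
second = soup.find_all("p")[1]
# This <p> has three children (text, <span>, text), so .string is None.
mixed_string = second.string

texts = paragraph_texts(HTML)
for text in texts:
    print(text)
```

If the pieces of text need a separator or trimmed whitespace, get_text() also accepts a separator string and a strip=True keyword.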
