Test if a child tag exists in BeautifulSoup


Question

I have an XML file with a defined structure but a varying number of tags, for example:

file1.xml:

<document>
  <subDoc>
    <id>1</id>
    <myId>1</myId>
  </subDoc>
</document>

file2.xml:

<document>
  <subDoc>
    <id>2</id>
  </subDoc>
</document>

Now I would like to check whether the tag myId exists. So I did the following:

from bs4 import BeautifulSoup

data = open("file1.xml", 'r').read()
xml = BeautifulSoup(data)

# Three attempts at detecting the <myId> tag
hasAttrBs = xml.document.subdoc.has_attr('myID')
hasAttrPy = hasattr(xml.document.subdoc, 'myID')
hasType = type(xml.document.subdoc.myid)

The result for file1.xml:

hasAttrBs -> False
hasAttrPy -> True
hasType ->   <class 'bs4.element.Tag'>

file2.xml:

hasAttrBs -> False
hasAttrPy -> True
hasType -> <type 'NoneType'>

Okay, so <myId> is not an attribute of <subDoc>.

But how can I test whether a sub-tag exists?

By the way: I would rather not iterate through the whole subDoc, because that would be very slow. I am hoping to find a way to address or query that element directly.

Recommended answer

The simplest way to check whether a child tag exists is:

childTag = xml.find('childTag')
if childTag is not None:
    # do stuff with childTag
    pass
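As a minimal, self-contained sketch of that check: the inline XML strings and the `has_tag` helper below are illustrative assumptions, not part of the original answer, and `html.parser` is chosen only because it ships with Python. Comparing the result of `find()` against `None` makes the intent explicit.

```python
from bs4 import BeautifulSoup

# Illustrative inputs mirroring file1.xml and file2.xml from the question
FILE1 = "<document><subDoc><id>1</id><myId>1</myId></subDoc></document>"
FILE2 = "<document><subDoc><id>2</id></subDoc></document>"


def has_tag(markup, name):
    """Return True if a tag called `name` occurs anywhere in `markup`."""
    soup = BeautifulSoup(markup, "html.parser")
    # html.parser lowercases tag names, so lowercase the name being searched
    return soup.find(name.lower()) is not None


print(has_tag(FILE1, "myId"))  # True
print(has_tag(FILE2, "myId"))  # False
```

The helper parses the markup on every call, which is fine for a demonstration; with real files you would probably parse once and reuse the soup.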


More specifically, for the OP's question:

If you don't know the structure of the XML document, you can use the soup's .find() method. Something like this:

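from bs4 import BeautifulSoup

with open("file1.xml", 'r') as data, open("file2.xml", 'r') as data2:
    # An HTML parser lowercases tag names, so <myId> is stored as "myid"
    xml = BeautifulSoup(data.read(), "html.parser")
    xml2 = BeautifulSoup(data2.read(), "html.parser")

    hasAttrBs = xml.find("myid")    # a Tag for file1.xml
    hasAttrBs2 = xml2.find("myid")  # None for file2.xml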
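The question's `has_attr('myID')` call returned False for both files because `has_attr()` looks at attributes written inside a tag's opening bracket, not at nested tags. The sketch below illustrates the difference; the `kind="a"` attribute is a hypothetical addition that does not appear in the original files.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the kind="a" attribute is added only for illustration
MARKUP = '<document><subDoc kind="a"><myId>1</myId></subDoc></document>'

soup = BeautifulSoup(MARKUP, "html.parser")
subdoc = soup.find("subdoc")

# has_attr() inspects attributes written inside the opening tag
has_kind_attribute = subdoc.has_attr("kind")
# A child tag is not an attribute, so has_attr() reports False for it
has_myid_attribute = subdoc.has_attr("myid")
# find() searches the tags nested inside subdoc instead
has_myid_child = subdoc.find("myid") is not None

print(has_kind_attribute, has_myid_attribute, has_myid_child)  # True False True
```

Python's built-in `hasattr()` is not a useful test here either. It reported True for both files in the question, which is consistent with BeautifulSoup treating an unknown attribute name as a tag lookup that yields `None` instead of raising an error.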

If you do know the structure, you can get the desired element by accessing the tag names as attributes, like this: xml.document.subdoc.myid. The whole thing would look something like this:

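from bs4 import BeautifulSoup

with open("file1.xml", 'r') as data, open("file2.xml", 'r') as data2:
    xml = BeautifulSoup(data.read(), "html.parser")
    xml2 = BeautifulSoup(data2.read(), "html.parser")

    # Attribute-style access returns the first matching tag, or None
    hasAttrBs = xml.document.subdoc.myid
    hasAttrBs2 = xml2.document.subdoc.myid
    print(hasAttrBs)
    print(hasAttrBs2)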

This prints:

<myid>1</myid>
None
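Both `find()` and attribute-style access search all descendants of a tag. If you only want to know whether a tag is a direct child, one option is to pass `recursive=False` to `find()`. The following is a sketch under the assumption that the markup matches file1.xml from the question; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Inline copy of file1.xml from the question
MARKUP = "<document><subDoc><id>1</id><myId>1</myId></subDoc></document>"

soup = BeautifulSoup(MARKUP, "html.parser")
document = soup.find("document")
subdoc = document.find("subdoc")

# myid is a direct child of subdoc, so a non-recursive search finds it
in_subdoc = subdoc.find("myid", recursive=False)
# myid is only a grandchild of document, so the same search returns None
in_document = document.find("myid", recursive=False)

print(in_subdoc)    # <myid>1</myid>
print(in_document)  # None
```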
