BeautifulSoup doesn't parse XML loaded from local file


Question

My Python script, which uses BeautifulSoup, gets None when it tries to find an element in XML loaded from a local file:

from bs4 import BeautifulSoup

xmlData = None

with open('conf//test2.xml', 'r') as xmlFile:
    xmlData = xmlFile.read()

# this creates a soup object out of xmlData,
# which is properly loaded from file above
xmlSoup = BeautifulSoup(xmlData, "html.parser")

# this resolves to None
subElemX = xmlSoup.root.singleelement.find('subElementX', recursive=False)

The file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>

I also have a REST GET service that returns the same XML, but when I read it using requests.get, it is parsed fine:

import requests

resp = requests.get(serviceURL, headers=headers)

respXML = resp.content.decode("utf-8")

restSoup = BeautifulSoup(respXML, "html.parser")

Why does it work with the REST response but not with the data read from a local file?

UPDATE: I understand that Python is case-sensitive and that singleelement != singleElement, but the case is disregarded when parsing the web service response.

Recommended answer

Two things are needed to make it work:

  • change the features argument from html.parser to xml (you are parsing XML data, and XML != HTML; the HTML parsers convert tag names to lowercase, while the xml parser preserves their case)
  • change singleelement to singleElement

Changes applied (works for me):

xmlSoup = BeautifulSoup(xmlData, "xml")

subElemX = xmlSoup.root.singleElement.find('subElementX', recursive=False)
print(subElemX)  # prints <subElementX>XYZ</subElementX>
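
The case issue can be reproduced without BeautifulSoup at all. The "html.parser" feature is built on Python's standard-library html.parser module, which reports tag names in lowercase, whereas a real XML parser keeps them as written. The sketch below is only an illustration under a few assumptions: TagCollector is a hypothetical helper, xml.etree.ElementTree stands in for the "xml" feature (which in BeautifulSoup is backed by lxml), and the XML declaration from the file is left out for brevity.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Same document as in the question, minus the XML declaration.
XML_DATA = """<root>
    <singleElement>
        <subElementX>XYZ</subElementX>
    </singleElement>
    <repeatingElement id="1"/>
    <repeatingElement id="2"/>
</root>"""


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag name the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes the tag name already converted to lowercase.
        self.tags.append(tag)


collector = TagCollector()
collector.feed(XML_DATA)
collector.close()
html_tags = collector.tags

# An XML parser preserves the original case of element names.
xml_tags = [elem.tag for elem in ET.fromstring(XML_DATA).iter()]

print(html_tags)  # ['root', 'singleelement', 'subelementx', 'repeatingelement', 'repeatingelement']
print(xml_tags)   # ['root', 'singleElement', 'subElementX', 'repeatingElement', 'repeatingElement']
```

With the HTML parser there is no element named subElementX in the tree at all, only subelementx, which is why a case-sensitive find('subElementX') comes back as None.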
