python: error with basic XML parsing (with lxml)

Views: 49 Posted: 2021/10/1 20:48:20 Tags: python xml xml-parsing

Problem description

I am trying to parse an XML file with Python using lxml, but I get an error on even a basic attempt. I used this post and the lxml tutorials to get started.

My XML file is basically built from records like the one below (I trimmed it down so that it is easier to read):

<?xml version="1.0" ?>
<?xml-stylesheet href="file:///usr/share/nmap/nmap.xsl" type="text/xsl"?>
<nmaprun scanner="nmap" args="nmap -sV -p135,12345 -oX 10.232.0.0.16.xml 10.232.0.0/16" start="1340201347" startstr="Wed Jun 20 16:09:07 2012" version="5.21" xmloutputversion="1.03">
<host>
  <hostnames>
    <hostname name="host1.example.com" type="PTR"/>
  </hostnames>
</host>
</nmaprun>

I run it through this complicated script:

from lxml import etree

d = etree.parse("myfile.xml")
for host in d.findall("host"):
    aa = host.find("hostnames/hostname")
    print(aa.attrib["name"])

I get AttributeError: 'NoneType' object has no attribute 'attrib' on the print line. I checked the values of d, host and aa, and they are all defined as Elements.

Apologies up front if this is something obvious (and it probably is).

I added the header of the XML file as requested (I am still reading and rereading the answers :))

Thanks!

Recommended answer

Though it would make more sense to use XPath, your code already works fine on its own, as long as it handles the case where a host has no hostname (find() returns None for such a host):

import lxml.etree

doc = lxml.etree.XML("""
  <nmaprun>
    <host>
      <hostnames>
        <hostname name="host1.example.com" type="PTR"/>
      </hostnames>
    </host>
  </nmaprun>""")
for host in doc.findall('host'):
  host_el = host.find('hostnames/hostname')
  # find() returns None when the host has no hostname, so check before use
  if host_el is not None:
    print(host_el.attrib['name'])

With XPath (doc.xpath() rather than doc.find() or doc.findall()), one could do better, filtering only for hostnames with a name and thus avoiding the faulty records altogether:

host[hostnames/hostname/@name] will find hosts which have at least one hostnames element containing a hostname with a name attribute.
//hostnames/hostname/@name will directly return only the names themselves (if using lxml, exposing these as strings).
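
As a rough sketch of the two expressions above, assuming lxml is installed: the sample document is hypothetical and adds a second host with no hostname to stand in for a faulty record. The variable names named_hosts and names are illustrative only.

```python
from lxml import etree

# Hypothetical sample: the second host has no hostname entry,
# standing in for the kind of record that makes find() return None.
doc = etree.XML("""
<nmaprun>
  <host>
    <hostnames>
      <hostname name="host1.example.com" type="PTR"/>
    </hostnames>
  </host>
  <host>
    <hostnames/>
  </host>
</nmaprun>""")

# Keep only the host elements that contain a hostname with a name attribute.
# The path is relative to doc, which is the root nmaprun element here.
named_hosts = doc.xpath("host[hostnames/hostname/@name]")

# Select the name attribute values directly, anywhere in the document.
names = doc.xpath("//hostnames/hostname/@name")

print(len(named_hosts))
print(names)
```

With this sample, only the first host should be selected, and names should hold the single string host1.example.com. Note that when the document comes from etree.parse(), the result is an ElementTree rather than an Element, so the relative host[...] path may need to be run against d.getroot() or written as /nmaprun/host[...].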

