Simple DOM traversing in Python using xml.etree.ElementTree

Question

For example, consider parsing a pom.xml file:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">

    <parent>
        <groupId>com.parent</groupId>
        <artifactId>parent</artifactId>
        <version>1.0-SNAPSHOT</version>
        <relativePath>../pom.xml</relativePath>
    </parent>

    <modelVersion>2.0.0</modelVersion>
    <groupId>com.parent.somemodule</groupId>
    <artifactId>some_module</artifactId>
    <packaging>jar</packaging>
    <version>1.0-SNAPSHOT</version>
    <name>Some Module</name>
    ...

Code:

import xml.etree.ElementTree as ET

tree = ET.parse(pom)  # pom: path to the pom.xml file shown above
root = tree.getroot()

groupId = root.find("groupId")
artifactId = root.find("artifactId")

Both groupId and artifactId are None. Why, when they are direct children of the root? I tried replacing root with tree (groupId = tree.find("groupId")), but that didn't change anything.

Recommended answer

The problem is that the root has no child named groupId; it has a child named {http://maven.apache.org/POM/4.0.0}groupId. ElementTree does not ignore XML namespaces: because the project element declares a default namespace, every unprefixed child element belongs to it, and ElementTree exposes each tag as a "universal name" of the form {namespace-uri}local-name. See Working with Namespaces and Qualified Names in the effbot docs.
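
As a minimal sketch of the point above, the lookup can be made namespace-aware in two ways: by spelling out the universal name, or by passing a prefix map to find(). The inline POM_XML string is a shortened stand-in for the pom.xml in the question, and the names POM_NS, ns, and the prefix "m" are illustrative choices, not part of the original post.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the pom.xml from the question (assumption: only the
# elements needed for the demonstration are kept).
POM_XML = """<project xmlns="http://maven.apache.org/POM/4.0.0">
    <groupId>com.parent.somemodule</groupId>
    <artifactId>some_module</artifactId>
</project>"""

root = ET.fromstring(POM_XML)

# The unqualified name does not match, because the element's real tag
# includes the default namespace declared on <project>.
unqualified = root.find("groupId")

# Option 1: use the universal name "{namespace-uri}local-name" directly.
POM_NS = "http://maven.apache.org/POM/4.0.0"
group_id = root.find(f"{{{POM_NS}}}groupId")

# Option 2: map a prefix of your choosing to the namespace URI and pass the
# mapping as the second argument to find().
ns = {"m": POM_NS}
artifact_id = root.find("m:artifactId", ns)

print(unqualified)       # None
print(root.tag)          # {http://maven.apache.org/POM/4.0.0}project
print(group_id.text)     # com.parent.somemodule
print(artifact_id.text)  # some_module
```

The prefix in the mapping is local to your code; it does not need to match any prefix used in the XML file, only the namespace URI has to agree.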
