Conditional XML parsing in Python

Question

I would like to select the information of all child elements in a very large XML file when their parent carries certain information. For example, as in the sample below, if an sn node has the attribute elliptic="yes", then select the v node and retrieve its attribute values (e.g. wd="vulgui").

 <sentence>
<sadv arg="argM" func="cc" tem="tmp">
  <sadv>
    <grup.adv>
      <r lem="després" pos="rg" wd="Després"/>
      <sp>
        <prep>
          <s lem="de" pos="sps00" postype="preposition" wd="de"/>
        </prep>
        <sn entityref="nne">
          <spec gen="m" num="p">
            <z lem="15" ne="number" wd="15"/>
          </spec>
          <grup.nom gen="m" num="p">
            <n gen="m" lem="any" num="p" pos="ncmp000" postype="common" sense="16:10917509" wd="anys"/>
            <sp>
              <prep>
                <s lem="de" pos="sps00" postype="preposition" wd="de"/>
              </prep>
              <sn entityref="nne">
                <spec gen="f" num="s">
                  <d coreftype="ident" entity="entity3" entityref="nne" gen="f" lem="el_seu" num="s" person="3" pos="dp3fs0" postype="possessive" wd="la_seva"/>
                </spec>
                <grup.nom gen="f" num="s">
                  <n gen="f" lem="creació" num="s" pos="ncfs000" postype="common" sense="16:00583085" wd="creació"/>
                </grup.nom>
              </sn>
            </sp>
          </grup.nom>
        </sn>
      </sp>
    </grup.adv>
  </sadv>
  <f lem="," pos="fc" punct="comma" wd=","/>
</sadv>
<sn arg="arg0" coreftype="ident" elliptic="yes" entity="entity3" entityref="nne" func="suj" tem="agt"/>
<grup.verb>
  <v lem="presentar" lss="A32.ditransitive-patient-benefactive" mood="indicative" num="p" person="3" pos="vmip3p0" postype="main" tense="present" wd="presenten"/>
</grup.verb>
<sn arg="arg1" entityref="spec" func="cd" tem="pat">
  <spec gen="m" num="s">
    <d gen="m" lem="un" num="s" pos="di0ms0" postype="indefinite" wd="un"/>
  </spec>
  <grup.nom gen="m" num="s">
    <s.a gen="m" num="s">
      <grup.a gen="m" num="s">
        <a gen="m" lem="nou" num="s" pos="aq0ms0" postype="qualificative" wd="nou"/>
      </grup.a>
    </s.a>
    <n gen="m" lem="disc" num="s" pos="ncms000" postype="common" sense="16:03112307" wd="disc"/>
    <sn entityref="ne" ne="other">
      <f lem="," pos="fc" punct="comma" wd=","/>
      <grup.nom>
        <f lem="'" pos="fz" punct="mathsign" wd="'"/>
        <n lem="Electroretard" ne="other" pos="np0000a" postype="proper" sense="16:cs1" wd="Electroretard"/>
        <f lem="'" pos="fz" punct="mathsign" wd="'"/>
      </grup.nom>
    </sn>
  </grup.nom>
</sn>
<f lem="." pos="fp" punct="period" wd="."/>
</sentence>

I have not been able to come up with a solution:

for sn in root.iter('sn'):
    rank = sn.get('elliptic')
    if rank == 'yes':
        pass  # unsure how to continue from here

How could I continue this code? I was thinking of something like:

"iterate through all children whose parent has @elliptic='yes'"

Recommended answer

As I understand it, the simplest way is to build an XPath expression and use it inside a try/except block with an if check:

xpath = '(//sn[@elliptic="yes"])[1]'

Now write an if statement that checks whether this element exists in your XML. If it does, do what you need with it; for example, use further XPath expressions or other tools to extract the required values.
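
The check described above could be sketched as follows. This is only an illustration under assumptions, not the answerer's exact code: as far as I know, the standard-library xml.etree.ElementTree accepts only a limited XPath subset and rejects the absolute, parenthesized form shown above, so the sketch uses the relative path .//sn[@elliptic='yes'] with find() (a full XPath engine such as lxml would be needed for the original expression). It also assumes, from the sample in the question, that the wanted v node sits in the first grup.verb sibling that follows the elliptic sn. The SAMPLE string is a trimmed-down copy of the question's XML.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the sample sentence from the question.
SAMPLE = """
<sentence>
  <sn arg="arg0" elliptic="yes" func="suj"/>
  <grup.verb>
    <v lem="presentar" wd="presenten"/>
  </grup.verb>
  <sn arg="arg1" func="cd"/>
</sentence>
"""

root = ET.fromstring(SAMPLE)

# find() returns the first match in document order, or None if there is none.
first = root.find(".//sn[@elliptic='yes']")

verb_words = []
if first is not None:
    # ElementTree elements have no parent pointer, so search for the parent.
    parent = next(p for p in root.iter() if first in list(p))
    siblings = list(parent)
    following = siblings[siblings.index(first) + 1:]
    # Assumption: the verb belongs to the first <grup.verb> after the <sn>.
    verb_group = next((s for s in following if s.tag == "grup.verb"), None)
    if verb_group is not None:
        verb_words = [v.get("wd") for v in verb_group.iter("v")]

print(verb_words)  # ['presenten']
```

Because find() returns None when nothing matches, the if check alone is enough here and no try/except is needed for the missing-element case.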

P.S. The [1] means that you are selecting the first matching element in the XML (XPath positions start at 1). If there is more than one match, the lookup may break without it. To visit every match, create an iterator i and substitute it into the XPath: (//sn[@elliptic="yes"])[i]
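
As an alternative to incrementing an index i inside the XPath string, one could loop over all matches directly. The sketch below does that and, since the question mentions a very large file, streams the input with xml.etree.ElementTree.iterparse so that only one sentence is held in memory at a time. Several things here are assumptions for illustration: the corpus root wrapping many sentence elements, the helper name elliptic_verbs, the word "canta" in the test data, and the reading that the wanted v node is in the first grup.verb following an elliptic sn under the same parent.

```python
import io
import xml.etree.ElementTree as ET


def elliptic_verbs(scope):
    """Yield the wd attribute of each <v> inside the first <grup.verb>
    that follows an <sn elliptic="yes"> under the same parent
    (assumed layout, based on the sample in the question)."""
    for parent in scope.iter():
        pending = False  # True once an elliptic <sn> was seen in this parent
        for child in parent:
            if child.tag == "sn" and child.get("elliptic") == "yes":
                pending = True
            elif pending and child.tag == "grup.verb":
                for v in child.iter("v"):
                    yield v.get("wd")
                pending = False


# Hypothetical corpus layout: a <corpus> root wrapping many sentences.
CORPUS = b"""<corpus>
  <sentence>
    <sn elliptic="yes"/>
    <grup.verb><v wd="presenten"/></grup.verb>
  </sentence>
  <sentence>
    <sn/>
    <grup.verb><v wd="canta"/></grup.verb>
  </sentence>
  <sentence>
    <sn elliptic="yes"/>
    <grup.verb><v wd="vulgui"/></grup.verb>
  </sentence>
</corpus>"""

words = []
# iterparse reports each element once its end tag has been read; for a real
# file, pass the file path instead of the BytesIO object.
for _event, elem in ET.iterparse(io.BytesIO(CORPUS), events=("end",)):
    if elem.tag == "sentence":
        words.extend(elliptic_verbs(elem))
        elem.clear()  # discard the processed sentence's children

print(words)  # ['presenten', 'vulgui']
```

Clearing each sentence after it has been processed keeps memory use roughly proportional to the largest sentence rather than to the whole file.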
