Issue in reading text in XML using Python
Problem description
I am trying to read an XML file with the following content:
<tu creationdate="20100624T160543Z" creationid="SYSTEM" usagecount="0">
<prop type="x-source-tags">1=A,2=B</prop>
<prop type="x-target-tags">1=A,2=B</prop>
<tuv xml:lang="EN">
<seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>
</tuv>
<tuv xml:lang="DE">
<seg>Modifizierter <ut x="1"/>Denver<ut x="2"/>-Score</seg>
</tuv>
</tu>
Using the following code:
import xml.etree.ElementTree as ET

tree = ET.parse(tmx)  # tmx is the path to the XML file
root = tree.getroot()
seg = root.findall('.//seg')
for n in seg:
    print(n.text)
It gives the following output:
Modified
Modifizierter
What I expect is:
Modified Denver Score
Modifizierter Denver -Score
Can someone explain why only part of seg is displayed?
Recommended answer
You need to understand the view of XML described at http://infohost.nmt.edu/tcc/help/pubs/pylxml/web/etree-view.html.
"Denver" is the tail of the first <ut> element and " Score" is the tail of the second <ut> element. These strings are not part of the text of the <seg> element, which is why n.text returns only the characters that come before the first <ut>.
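To make the text/tail split concrete, here is a minimal sketch that parses the English segment from the question with the standard-library ElementTree and prints each piece. The variable names are illustrative only and are not part of the original answer.

```python
import xml.etree.ElementTree as ET

# The English <seg> element, copied from the question.
seg = ET.fromstring('<seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>')

# .text holds only the characters before the first child element.
print(repr(seg.text))  # 'Modified '

# Characters that follow a child element are stored on that child as .tail.
for ut in seg:
    print(ut.tag, ut.get("x"), repr(ut.tail))
# ut 1 'Denver'
# ut 2 ' Score'
```

Concatenating seg.text with each child's tail, in document order, reproduces the full segment string.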
In addition to the solution provided by kgbplus (which works with both ElementTree and lxml), with lxml you can also use the following methods to get the wanted output:
xpath()
for n in seg:
    print("".join(n.xpath("text()")))
itertext()
for n in seg:
    print("".join(n.itertext()))
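For reference, a self-contained sketch of the itertext() approach using only the standard library (itertext() is available on ElementTree elements as well as in lxml). The XML is inlined from the question so the snippet runs without a file; the names TMX_SNIPPET and segments are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question, inlined instead of read from disk.
TMX_SNIPPET = """<tu creationdate="20100624T160543Z" creationid="SYSTEM" usagecount="0">
<prop type="x-source-tags">1=A,2=B</prop>
<prop type="x-target-tags">1=A,2=B</prop>
<tuv xml:lang="EN">
<seg>Modified <ut x="1"/>Denver<ut x="2"/> Score</seg>
</tuv>
<tuv xml:lang="DE">
<seg>Modifizierter <ut x="1"/>Denver<ut x="2"/>-Score</seg>
</tuv>
</tu>"""

root = ET.fromstring(TMX_SNIPPET)

# itertext() yields the element's text and every descendant's text and tail
# in document order, so joining the pieces gives the whole segment.
segments = ["".join(seg.itertext()) for seg in root.findall(".//seg")]

for s in segments:
    print(s)
# Modified Denver Score
# Modifizierter Denver-Score
```

Note that the German segment prints as "Denver-Score" with no space before the hyphen, because the source XML contains none at that position.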