python lxml tree, line[] creating multiple lines, desire single line output

Problem description

I am creating an xml file with python using lxml. I am parsing through a file line by line, looking for a string, and if that string exists, I create a SubElement. I am assigning the SubElement a value which exists in the parsed file after the string I'm searching for.

Question: how do I get all the xml output onto one line in the output.xml file? Using line[12:] appears to be the problem. See the details below.

Example file content per line:

[testclass] unique_value_horse
[testclass] unique_value_cat
[testclass] unique_value_bird

Python code:

When I hardcode a string such as below, the output xml is one continuous line for the xml tree. Perfect! See below.

with open(file) as openfile:
    for line in openfile:
        if "[testclass]" in line:
            tagxyz = etree.SubElement(subroot, "tagxyz")
            tagxyz.text = "hardcodevalue"

When I try to assign everything from the 13th character onward as the value, I get a new line in the output xml per SubElement. This is causing errors for the receiver of the output xml file. See below.

with open(file) as openfile:
    for line in openfile:
        if "[testclass]" in line:
            tagxyz = etree.SubElement(subroot, "tagxyz")
            tagxyz.text = line[12:]

I thought making the assignment on the same line might help, but it does not seem to matter. See below.

with open(file) as openfile:
    for line in openfile:
        if "[testclass]" in line:
            etree.SubElement(subroot, "tagxyz").text = line[12:]

I have tried to employ etree.XMLParser(remove_blank_text=True), and parse the output xml file AFTER the fact and recreate the file, but that doesn't seem to help. I understand this should help, but either I'm using it wrong, or it won't actually solve my problem. See below.

with open("output.xml", 'w') as f:
    f.write(etree.tostring(project))

parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse("output.xml", parser)

with open("output2.xml", 'w') as fl:
    fl.write(etree.tostring(tree))

Solution

Each line read from the file still ends with the line separator, \n, and the slice line[12:] keeps it. You can strip it with str.rstrip():

with open(file) as openfile:
    for line in openfile:
        if "[testclass]" in line:
            # keep the slice from the question, then drop the trailing newline
            etree.SubElement(subroot, "tagxyz").text = line[12:].rstrip('\n')
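
To see the whole round trip in one place, here is a minimal, self-contained sketch. It assumes the sample lines from the question, an in-memory stand-in for the input file, and root elements named project and subroot (the question uses those variable names but does not show how they are created, so the tag names here are guesses). The fallback import is only there so the snippet also runs where lxml is not installed; the calls used exist in both modules.

```python
import io

try:
    from lxml import etree
except ImportError:
    # Fallback so the sketch runs without lxml; Element, SubElement and
    # tostring are available in the standard library module as well.
    import xml.etree.ElementTree as etree

# Stand-in for open(file): iterating yields each line WITH its trailing "\n".
sample = io.StringIO(
    "[testclass] unique_value_horse\n"
    "[testclass] unique_value_cat\n"
    "some unrelated line\n"
    "[testclass] unique_value_bird\n"
)

# Hypothetical tree layout; the question does not show how these are built.
project = etree.Element("project")
subroot = etree.SubElement(project, "subroot")

for line in sample:
    if "[testclass]" in line:
        # "[testclass] " is 12 characters long, so line[12:] is the value
        # plus the newline; rstrip("\n") removes only that newline.
        etree.SubElement(subroot, "tagxyz").text = line[12:].rstrip("\n")

output = etree.tostring(project)  # bytes, serialized without pretty-printing
print(output)
```

Note that line.rstrip() with no argument would also remove trailing spaces, tabs and carriage returns, which may or may not be what the receiver of the file expects.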

In future, use the repr() function to debug such issues; you'll readily see the newline represented by its Python escape sequence:

>>> line = '[testclass] unique_value_horse\n'
>>> print(line)
[testclass] unique_value_horse

>>> print(repr(line))
'[testclass] unique_value_horse\n'
>>> print(repr(line.rstrip('\n')))
'[testclass] unique_value_horse'
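
As for the etree.XMLParser(remove_blank_text=True) attempt in the question: as far as I understand lxml's documentation, that option only discards whitespace-only text between tags, so a newline that is part of an element's own text is left alone. A minimal sketch, assuming a one-element document made up for illustration (the fallback import exists only so the snippet runs without lxml; the standard-library parser has no such option):

```python
try:
    from lxml import etree
    # lxml-specific option: drop whitespace-only text nodes between tags.
    parser = etree.XMLParser(remove_blank_text=True)
except ImportError:
    # Fallback for environments without lxml; no remove_blank_text here.
    import xml.etree.ElementTree as etree
    parser = etree.XMLParser()

# Hypothetical output similar to what the question's code produces:
# the newline sits inside the element text, next to real characters.
xml = b"<project><tagxyz>unique_value_horse\n</tagxyz></project>"

root = etree.fromstring(xml, parser)
text = root[0].text

# The text node is not blank, so the parser keeps it, newline included.
print(repr(text))
```

The newline therefore has to be removed before the text is assigned, as in the answer above; re-parsing the finished file does not take it out.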
