How to convert an XML file to CSV output in Python?


Problem description


I have a basic XML file that is being pulled from a database outside of my control.

<?xml version="1.0" encoding="utf-8"?>
<data>
<Job1Start><Time>20200202055415725</Time></Job1Start>
<Job1End><Time>20200202055423951</Time></Job1End>
<Job2Start><Time>20200202055810390</Time></Job2Start>
<Job3Start><Time>20200202055814687</Time></Job3Start>
<Job2End><Time>20200202055819000</Time></Job2End>
<Job3End><Time>20200202055816708</Time></Job3End>
</data>

I'm looking to get the following output in a CSV file:

Task    Start               Finish
Job1    20200202055415725   20200202055423951
Job2    20200202055810390   20200202055819000
Job3    20200202055814687   20200202055816708

I have tried a few methods. The one below is the closest I have gotten to the correct output, but even this isn't working correctly:

import xml.etree.ElementTree as ET
import csv

tree = ET.parse('Jobs.xml')
root = tree.getroot()

with open('Output.csv', 'w') as csv_file:
        writer = csv.writer(csv_file, delimiter=',')
        for TaskName in root.findall('Job1Start'):
            starttime = TaskName.find('Time').text
            task = "Job1"
            writer.writerows(zip(task, starttime))
            print("Job1", starttime)

The output I get from this is incorrectly formatted, and I've only been able to look up the start time for Job1.

Anyone have experience with a similar problem?

Solution

Using writerows instead of writerow causes the single-character problem. csv.writer's writerows method expects a list of lists (or, more accurately, an iterable of iterables). Strings are iterable, so a sequence of strings meets that requirement, but each item of the inner "list" is then a single character.
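To make the difference concrete, here is a minimal sketch that reproduces the question's call with the Job1 values from the XML above. It writes to an in-memory io.StringIO buffer instead of Output.csv, which is an assumption made only to keep the example self-contained.

```python
import csv
import io

task = "Job1"
starttime = "20200202055415725"

# writerows treats each item of its argument as one row. zip() pairs the
# two strings character by character, so every row holds two single characters.
buggy = io.StringIO(newline="")
csv.writer(buggy).writerows(zip(task, starttime))
buggy_rows = buggy.getvalue().splitlines()

# writerow takes a single row: a list whose items become the fields.
fixed = io.StringIO(newline="")
csv.writer(fixed).writerow([task, starttime])
fixed_rows = fixed.getvalue().splitlines()

print(buggy_rows)  # ['J,2', 'o,0', 'b,2', '1,0']
print(fixed_rows)  # ['Job1,20200202055415725']
```

Because zip() stops at the shorter string, the buggy version also silently drops most of the timestamp.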

The file passed to csv.writer should also be opened with newline='', per the documentation. On Windows, omitting this parameter shows up as extra blank lines between rows when the CSV is opened in Excel.
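The effect can be seen on any platform with a small sketch. Passing newline='\r\n' to open() here is an assumption used to imitate the newline translation that Windows applies by default; the helper write_row and the file name demo.csv are hypothetical names for illustration.

```python
import csv
import os
import tempfile

def write_row(path, newline):
    # newline is passed straight through to open(); csv.writer itself
    # ends each row with '\r\n' by default.
    with open(path, "w", newline=newline) as f:
        csv.writer(f).writerow(["a", "b"])
    with open(path, "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.csv")
    # newline='\r\n' imitates the Windows default: the '\n' that csv.writer
    # emits is expanded again, leaving a stray '\r' in the file.
    translated = write_row(path, "\r\n")
    # newline='' switches translation off, as the csv docs recommend.
    untranslated = write_row(path, "")

print(translated)    # b'a,b\r\r\n'
print(untranslated)  # b'a,b\r\n'
```

The extra '\r' before each line ending is what Excel renders as a blank row.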

Here's a solution:

import xml.etree.ElementTree as ET
import csv

tree = ET.parse('Jobs.xml')
root = tree.getroot()

# Use newline='' per the csv docs. This fixes the blank lines in your output.
with open('Output.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    writer.writerow('Task Start Finish'.split())
    for job in range(1, 4):
        start = root.find(f'Job{job}Start/Time').text
        end = root.find(f'Job{job}End/Time').text
        # Use writerow, not writerows; the latter expects a list of lists.
        writer.writerow([f'Job{job}', start, end])

Output:

Task,Start,Finish
Job1,20200202055415725,20200202055423951
Job2,20200202055810390,20200202055819000
Job3,20200202055814687,20200202055816708
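The solution above hard-codes range(1,4), so it only covers Job1 to Job3. If the number of jobs is not known in advance, one possible variation is to discover the job names from the element tags. The following is a sketch under assumptions, not part of the original answer: it assumes every tag has the form Job<number>Start or Job<number>End, embeds the sample XML as a string, and writes to an in-memory buffer. The names XML, tag_pattern and jobs are made up for illustration.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Sample data copied from the question (XML declaration omitted).
XML = """<data>
<Job1Start><Time>20200202055415725</Time></Job1Start>
<Job1End><Time>20200202055423951</Time></Job1End>
<Job2Start><Time>20200202055810390</Time></Job2Start>
<Job3Start><Time>20200202055814687</Time></Job3Start>
<Job2End><Time>20200202055819000</Time></Job2End>
<Job3End><Time>20200202055816708</Time></Job3End>
</data>"""

root = ET.fromstring(XML)
# Assumed tag layout: "Job<number>" followed by "Start" or "End".
tag_pattern = re.compile(r"(Job\d+)(Start|End)")

# Map each job name to its times, e.g. {"Job1": {"Start": ..., "End": ...}}.
jobs = {}
for element in root:
    match = tag_pattern.fullmatch(element.tag)
    if match is None:
        continue  # skip elements that do not follow the assumed naming
    name, kind = match.groups()
    jobs.setdefault(name, {})[kind] = element.findtext("Time", default="")

out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerow(["Task", "Start", "Finish"])
for name, times in jobs.items():
    # A missing Start or End is written as an empty field.
    writer.writerow([name, times.get("Start", ""), times.get("End", "")])

print(out.getvalue())
```

Rows come out in the order each job first appears in the file, so the Start and End elements do not need to be adjacent. To write a real file, replace the buffer with open('Output.csv', 'w', newline='').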
