解析所有XML文件并将其转换为一个CSV文件 [英] Parse all the XML files and convert them to one CSV file

查看:315
本文介绍了解析所有XML文件并将其转换为一个CSV文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试编写代码,在其中搜索目录中的所有XML文件,然后解析这些XML并将一些数据保存到CSV文件.我在该目录中有50多个XML文件.每当我运行代码时,都会创建一个CSV文件,但它仅输出最后一个xml文件的数据.如何将所有XML文件的数据打印到CSV文件中?请帮助 这是我的代码:

I am trying to write a code where it search all the XML files in directory then parse those XML and save some data to a CSV file. I have 50 plus XML files in that directory. Whenever I run my code a CSV file created but it only prints data of the last xml file. How can i print all the XML file's data to a CSV file?Please help Here is my code :

from xml.dom.minidom import parse
import csv
import os

def writeToCSV(frelation):
    csvfile = open('data.csv', 'w')
    fieldnames = ['sub', 'sup']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    relation = frelation.getElementsByTagName("predicate")
    for elem in relation:
        sub = elem.attributes['sub'].value
        for elem1 in elem.getElementsByTagName("sup"):
            sup = elem1.attributes['name'].value
            writer.writerow({'sub': sub, 'sup': sup})


for root, dirs, files in os.walk('data/frames'):
    for file in files:
        if (file.endswith('.xml')):
            xmldoc = parse(os.path.join(root, file))
            frelation = xmldoc.getElementsByTagName("frameset")[0]
            relation = frelation.getElementsByTagName("predicate")
            writeToCSV(frelation)

推荐答案

U正在WriteToCSV中一次又一次覆盖同一文件,如下所示:

U are overwriting the same file again and again in the WriteToCSV , may be a little change as below:

def writeToCSV(frelation,file_id):
    csvfile = open('data'+str(file_id)+'.csv', 'w')
    fieldnames = ['sub', 'sup']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    relation = frelation.getElementsByTagName("predicate")
    for elem in relation:
        sub = elem.attributes['sub'].value
        for elem1 in elem.getElementsByTagName("sup"):
            sup = elem1.attributes['name'].value
            writer.writerow({'sub': sub, 'sup': sup})

file_id=1;
for root, dirs, files in os.walk('data/frames'):
    for file in files:
        if (file.endswith('.xml')):
            xmldoc = parse(os.path.join(root, file))
            frelation = xmldoc.getElementsByTagName("frameset")[0]
            relation = frelation.getElementsByTagName("predicate")
            writeToCSV(frelation,file_id)
            file_id+=1

如果只需要一个CSV文件,则需要以附加模式打开该文件,如果该文件不存在,则a +模式表示创建文件:

if you want only one CSV file, u need to open the file in append mode, a+ mode indicates create file if does not exist.:

def writeToCSV(frelation):
        csvfile = open('data.csv', 'a+')
        fieldnames = ['sub', 'sup']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        relation = frelation.getElementsByTagName("predicate")
        for elem in relation:
            sub = elem.attributes['sub'].value
            for elem1 in elem.getElementsByTagName("sup"):
                sup = elem1.attributes['name'].value
                writer.writerow({'sub': sub, 'sup': sup})

无需其他代码即可进行更改.

No changes required in other code.

这篇关于解析所有XML文件并将其转换为一个CSV文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆