Parse all the XML files and convert them to one CSV file
Question
I am trying to write code that finds all the XML files in a directory, parses them, and saves some of the data to a CSV file. I have more than 50 XML files in that directory. Whenever I run my code, a CSV file is created, but it only contains the data from the last XML file. How can I write the data from all the XML files to the CSV file? Here is my code:
from xml.dom.minidom import parse
import csv
import os

def writeToCSV(frelation):
    csvfile = open('data.csv', 'w')
    fieldnames = ['sub', 'sup']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    relation = frelation.getElementsByTagName("predicate")
    for elem in relation:
        sub = elem.attributes['sub'].value
        for elem1 in elem.getElementsByTagName("sup"):
            sup = elem1.attributes['name'].value
            writer.writerow({'sub': sub, 'sup': sup})

for root, dirs, files in os.walk('data/frames'):
    for file in files:
        if (file.endswith('.xml')):
            xmldoc = parse(os.path.join(root, file))
            frelation = xmldoc.getElementsByTagName("frameset")[0]
            relation = frelation.getElementsByTagName("predicate")
            writeToCSV(frelation)
Recommended answer
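You are overwriting the same file again and again in writeToCSV: every call reopens data.csv in 'w' mode, which truncates it, so only the last XML file's rows survive. One small change is to write a separate CSV file for each XML file, as below: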
def writeToCSV(frelation, file_id):
    csvfile = open('data' + str(file_id) + '.csv', 'w')
    fieldnames = ['sub', 'sup']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    relation = frelation.getElementsByTagName("predicate")
    for elem in relation:
        sub = elem.attributes['sub'].value
        for elem1 in elem.getElementsByTagName("sup"):
            sup = elem1.attributes['name'].value
            writer.writerow({'sub': sub, 'sup': sup})

file_id = 1
for root, dirs, files in os.walk('data/frames'):
    for file in files:
        if (file.endswith('.xml')):
            xmldoc = parse(os.path.join(root, file))
            frelation = xmldoc.getElementsByTagName("frameset")[0]
            relation = frelation.getElementsByTagName("predicate")
            writeToCSV(frelation, file_id)
            file_id += 1
If you want only one CSV file, open the file in append mode instead. The 'a+' mode creates the file if it does not exist:
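def writeToCSV(frelation):
    # Write the header only when data.csv is new or empty, so that it
    # appears once instead of once per XML file.
    need_header = not os.path.exists('data.csv') or os.path.getsize('data.csv') == 0
    csvfile = open('data.csv', 'a+')
    fieldnames = ['sub', 'sup']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    if need_header:
        writer.writeheader()
    relation = frelation.getElementsByTagName("predicate")
    for elem in relation:
        sub = elem.attributes['sub'].value
        for elem1 in elem.getElementsByTagName("sup"):
            sup = elem1.attributes['name'].value
            writer.writerow({'sub': sub, 'sup': sup})
    # Close the file so the rows are flushed before the next call checks its size.
    csvfile.close()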
No changes are required in the rest of the code.
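As an alternative to append mode, the CSV file can be opened once, before the directory walk, and the same writer reused for every XML file. The following is only a minimal sketch under a few assumptions: Python 3, the same frameset/predicate/sup layout as in the question, and a hypothetical helper name xml_dir_to_csv that does not appear in the original code.

```python
import csv
import os
from xml.dom.minidom import parse


def xml_dir_to_csv(src_dir, out_path):
    """Collect (sub, sup) pairs from every .xml file under src_dir into one CSV.

    Hypothetical helper, not part of the original answer.
    Returns the number of data rows written.
    """
    rows = 0
    # Open the output once; 'w' is safe here because it happens a single time.
    # newline='' is what the csv module recommends on Python 3.
    with open(out_path, 'w', newline='') as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=['sub', 'sup'])
        writer.writeheader()
        for root, dirs, files in os.walk(src_dir):
            for name in sorted(files):
                if not name.endswith('.xml'):
                    continue
                xmldoc = parse(os.path.join(root, name))
                # As in the question, only the first <frameset> is read.
                frameset = xmldoc.getElementsByTagName('frameset')[0]
                for pred in frameset.getElementsByTagName('predicate'):
                    sub = pred.getAttribute('sub')
                    for sup_el in pred.getElementsByTagName('sup'):
                        writer.writerow({'sub': sub,
                                         'sup': sup_el.getAttribute('name')})
                        rows += 1
    return rows
```

Calling xml_dir_to_csv('data/frames', 'data.csv') would then replace both writeToCSV and the os.walk loop. Because the file is opened in 'w' mode exactly once, the header is written a single time and rows left over from an earlier run do not accumulate, which can happen with append mode.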