Python: Extracting XML to DataFrame (Pandas)
Question
I have an XML file that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<comments>
<row Id="1" PostId="2" Score="0" Text="(...)" CreationDate="2011-08-30T21:15:28.063" UserId="16" />
<row Id="2" PostId="17" Score="1" Text="(...)" CreationDate="2011-08-30T21:24:56.573" UserId="27" />
<row Id="3" PostId="26" Score="0" Text="(...)" UserId="9" />
</comments>
What I'm trying to do is extract the Id, Text and CreationDate columns into a pandas DataFrame. I've tried the following:
import xml.etree.cElementTree as et
import pandas as pd
path = '/.../...'
dfcols = ['ID', 'Text', 'CreationDate']
df_xml = pd.DataFrame(columns=dfcols)
root = et.parse(path)
rows = root.findall('.//row')
for row in rows:
    ID = row.find('Id')
    text = row.find('Text')
    date = row.find('CreationDate')
    print(ID, text, date)
    df_xml = df_xml.append(pd.Series([ID, text, date], index=dfcols), ignore_index=True)
print(df_xml)
But the output is: None None None
Could you please tell me how to fix this? Thanks.
Recommended answer
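The likely cause is that row.find('Id') searches for a child element named Id, while in this file Id, Text and CreationDate are attributes of each row element, so find() returns None every time. Reading them from row.attrib should give the expected values. Below is a minimal sketch, not a definitive implementation. It assumes the XML shown in the question (embedded here as a string so the example runs on its own; with a real file you would call ET.parse(path).getroot() instead), and the helper name rows_to_dataframe is made up for illustration. It also collects the rows in a list and builds the DataFrame once, because DataFrame.append was removed in pandas 2.0, and it imports xml.etree.ElementTree because cElementTree was removed in Python 3.9.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Sample data copied from the question. For a file on disk, use
# ET.parse(path).getroot() instead of ET.fromstring(...).
XML_DATA = """<?xml version="1.0" encoding="utf-8"?>
<comments>
<row Id="1" PostId="2" Score="0" Text="(...)" CreationDate="2011-08-30T21:15:28.063" UserId="16" />
<row Id="2" PostId="17" Score="1" Text="(...)" CreationDate="2011-08-30T21:24:56.573" UserId="27" />
<row Id="3" PostId="26" Score="0" Text="(...)" UserId="9" />
</comments>
"""


def rows_to_dataframe(root):
    """Hypothetical helper: build a DataFrame from the attributes of each <row>."""
    records = [
        {
            # attrib.get() returns None when an attribute is missing,
            # as CreationDate is on the third row.
            "ID": row.attrib.get("Id"),
            "Text": row.attrib.get("Text"),
            "CreationDate": row.attrib.get("CreationDate"),
        }
        for row in root.iter("row")
    ]
    # Build the frame once instead of appending row by row.
    return pd.DataFrame(records, columns=["ID", "Text", "CreationDate"])


root = ET.fromstring(XML_DATA)
df_xml = rows_to_dataframe(root)
print(df_xml)
```

All values come back as strings. If typed columns are needed, pd.to_numeric(df_xml['ID']) and pd.to_datetime(df_xml['CreationDate']) are one way to convert them. On pandas 1.3 or later, pandas.read_xml may also be worth a look as an alternative to parsing by hand.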