pandas.DataFrame.from_dict not preserving order using OrderedDict
Question
I want to import OData XML data feeds from the Dutch Bureau of Statistics (CBS) into our database. I thought this would be straightforward using lxml and pandas. By using OrderedDict I want to preserve the order of the columns for readability, but somehow I can't get it right.
from collections import OrderedDict
from lxml import etree
import requests
import pandas as pd
# CBS URLs
base_url = 'http://opendata.cbs.nl/ODataFeed/odata'
datasets = ['/37296ned', '/82245NED']
feed = requests.get(base_url + datasets[1] + '/TypedDataSet')
root = etree.fromstring(feed.content)
# all record entries start at tag m:properties, parse into data dict
data = []
for record in root.iter('{{{}}}properties'.format(root.nsmap['m'])):
    row = OrderedDict()
    for element in record:
        # strip the '{namespace}' prefix from the tag to get the column name
        row[element.tag.split('}')[1]] = element.text
    data.append(row)
df = pd.DataFrame.from_dict(data)
df.columns
Inspecting data, each OrderedDict is in the right order. But looking at df.head(), the columns have been sorted alphabetically, with capitals first?
Any help?
Recommended answer
Something in your example seems to be inconsistent, as data is a list and not a dict, but assuming you really have an OrderedDict:
Try to explicitly specify your column order when you create your DataFrame:
# ... all your data collection
df = pd.DataFrame(data, columns=data.keys())
This should give you a DataFrame with the columns ordered exactly as they are in the OrderedDict (via the list generated by data.keys()).
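Since data in the question is actually a list of OrderedDict rows, data.keys() is not available there. A minimal sketch for that case follows, assuming the columns should appear in the order in which keys are first seen across the rows. The column names are made up for illustration, and ordered_columns is a hypothetical helper, not part of pandas. Newer pandas releases may already keep insertion order for a list of dicts, but passing columns explicitly does not depend on that behavior.

```python
from collections import OrderedDict

import pandas as pd


def ordered_columns(rows):
    """Collect keys from all rows in first-seen order (hypothetical helper)."""
    seen = OrderedDict()
    for row in rows:
        for key in row:
            seen.setdefault(key, None)
    return list(seen)


# Illustrative rows; the real ones would come from the parsed XML feed.
rows = [
    OrderedDict([("zebra", "1"), ("Alpha", "2"), ("beta", "3")]),
    OrderedDict([("zebra", "4"), ("Alpha", "5"), ("beta", "6"), ("extra", "7")]),
]

cols = ordered_columns(rows)
# Passing columns= fixes the column order regardless of any default sorting.
df = pd.DataFrame(rows, columns=cols)
print(list(df.columns))
```

Rows that lack a key (here, "extra" in the first row) get a missing value in that column.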