Pandas to D3: Serializing DataFrames to JSON
Question
I have a DataFrame with the following columns and no duplicate rows:
['region', 'type', 'name', 'value']
which can be viewed as a hierarchy, as follows:
grouped = df.groupby(['region','type', 'name'])
I would like to serialize this hierarchy as a JSON object.
If anyone is interested, the motivation behind this is to eventually put together a visualization like this one, which requires a JSON file.
To do so, I need to convert grouped into the following:
new_data['children'][i]['name'] = region
new_data['children'][i]['children'][j]['name'] = type
new_data['children'][i]['children'][j]['children'][k]['name'] = name
new_data['children'][i]['children'][j]['children'][k]['size'] = value
...
where region, type, and name correspond to different levels of the hierarchy (indexed by i, j, and k).
Is there an easy way in Pandas/Python to do this?
Recommended answer
Something along these lines might get you there.
import json
from collections import defaultdict
tree = lambda: defaultdict(tree)  # a recursive defaultdict
d = tree()
# assumes df has exactly the four columns, in this order
for _, (region, type_, name, value) in df.iterrows():
    d['children'][region]['name'] = region
    ...
json.dumps(d)
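One way to flesh out the recursive-defaultdict idea is sketched below. It is only an illustration, not the answerer's full code: the sample rows, the "root" label, and the helper to_children are all made up here. Values are first collected in dicts keyed by region, type, and name, and those dicts are then converted into the lists of children that the question asks for.

```python
import json
from collections import defaultdict

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    [
        ("Europe", "fruit", "apple", 10),
        ("Europe", "fruit", "pear", 5),
        ("Europe", "veg", "leek", 3),
        ("Asia", "fruit", "mango", 7),
    ],
    columns=["region", "type", "name", "value"],
)


def tree():
    """Recursive defaultdict: a missing key creates another nested tree."""
    return defaultdict(tree)


# Step 1: collect values in dicts keyed by region -> type -> name.
nested = tree()
for region, type_, name, value in df.itertuples(index=False):
    # Unwrap NumPy scalars (if any) so json.dumps can serialize them.
    nested[region][type_][name] = value.item() if hasattr(value, "item") else value


def to_children(node):
    """Hypothetical helper: turn keyed dicts into lists of child nodes."""
    children = []
    for key, sub in node.items():
        if isinstance(sub, dict):
            children.append({"name": key, "children": to_children(sub)})
        else:
            children.append({"name": key, "size": sub})  # leaf node
    return children


# Step 2: wrap everything in a root node ("root" is an arbitrary label).
new_data = {"name": "root", "children": to_children(nested)}
print(json.dumps(new_data, indent=2))
```

Keying by name first and converting to lists afterwards avoids having to track the i, j, k positions by hand.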
A vectorized solution would be better, ideally one that takes advantage of the speed of groupby, but I can't think of such a solution.
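For what it's worth, a groupby-based variant might look like the sketch below. It is not truly vectorized, since it still recurses in Python, but it lets groupby do the partitioning at each level. The sample rows, the "root" label, and the function build_children are assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    [
        ("Europe", "fruit", "apple", 10),
        ("Europe", "fruit", "pear", 5),
        ("Europe", "veg", "leek", 3),
        ("Asia", "fruit", "mango", 7),
    ],
    columns=["region", "type", "name", "value"],
)


def build_children(frame, levels, value_col="value"):
    """Hypothetical helper: group `frame` by each column in `levels` in turn."""
    level, rest = levels[0], levels[1:]
    if not rest:
        # Innermost level: one leaf per row; tolist() yields plain Python values.
        return [
            {"name": n, "size": v}
            for n, v in zip(frame[level].tolist(), frame[value_col].tolist())
        ]
    # sort=False keeps groups in order of first appearance.
    return [
        {"name": key, "children": build_children(group, rest, value_col)}
        for key, group in frame.groupby(level, sort=False)
    ]


result = {"name": "root", "children": build_children(df, ["region", "type", "name"])}
print(json.dumps(result, indent=2))
```

Because the levels are passed as a list, the same function handles a deeper or shallower hierarchy without changes.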
Also take a look at df.groupby(...).groups, which returns a dict.
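As a quick illustration of what that dict looks like, here is a small sketch on made-up sample rows. When grouping by two columns, the keys are (region, type) tuples and the values are the index labels of the matching rows.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    [
        ("Europe", "fruit", "apple", 10),
        ("Europe", "fruit", "pear", 5),
        ("Europe", "veg", "leek", 3),
        ("Asia", "fruit", "mango", 7),
    ],
    columns=["region", "type", "name", "value"],
)

groups = df.groupby(["region", "type"]).groups

# Each key is a (region, type) tuple; each value holds row index labels.
for (region, type_), idx in groups.items():
    print(region, type_, df.loc[idx, "name"].tolist())
```

Note that the tuple keys cannot be passed to json.dumps directly, so this dict is a starting point for building the nested structure rather than the final result.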
See also this answer.