Pandas to D3. Serializing dataframes to JSON


Question

I have a DataFrame with the following columns and no duplicates:

['region', 'type', 'name', 'value']

These columns can be seen as a hierarchy, as follows:

grouped = df.groupby(['region','type', 'name'])

I would like to serialize this hierarchy as a JSON object.

If anyone is interested, the motivation behind this is to eventually put together a visualization like this one, which requires a JSON file.

To do so, I need to convert grouped into the following:

new_data['children'][i]['name'] = region
new_data['children'][i]['children'][j]['name'] = type
new_data['children'][i]['children'][j]['children'][k]['name'] = name
new_data['children'][i]['children'][j]['children'][k]['size'] = value
...

where region, type, and name correspond to different levels of the hierarchy (indexed by i, j, and k).

Is there an easy way in Pandas/Python to do this?

Recommended answer

Something along these lines might get you there.

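import json
from collections import defaultdict

tree = lambda: defaultdict(tree)  # a recursive defaultdict
d = tree()
# Assumes df has exactly the columns region, type, name, value, in that order
for _, (region, type_, name, value) in df.iterrows():
    d['children'][region]['name'] = region
    ...  # remaining levels elided in the original answer

json.dumps(d)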
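One possible way to carry the recursive-defaultdict idea through to the list-based `children` layout from the question is sketched below. This is only an illustration, not the answerer's full code: the sample data, the helper `to_children`, and the root name `"root"` are all assumptions made for the example. It relies on the question's statement that there are no duplicate rows.

```python
import json
from collections import defaultdict

import pandas as pd


def tree():
    """Recursive defaultdict: missing keys create nested trees on access."""
    return defaultdict(tree)


# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {
        "region": ["EU", "EU", "EU", "US"],
        "type": ["a", "a", "b", "a"],
        "name": ["x", "y", "z", "w"],
        "value": [1, 2, 3, 4],
    }
)

# Fill the tree keyed by region -> type -> name, with value at the leaf.
# .tolist() converts NumPy scalars to plain Python values for json.dumps.
columns = ["region", "type", "name", "value"]
d = tree()
for region, type_, name, value in zip(*(df[c].tolist() for c in columns)):
    d[region][type_][name] = value


def to_children(node):
    """Turn a dict-of-dicts into a list of {'name', 'children'/'size'} nodes."""
    return [
        {"name": key, "children": to_children(child)}
        if isinstance(child, dict)
        else {"name": key, "size": child}
        for key, child in node.items()
    ]


new_data = {"name": "root", "children": to_children(d)}
print(json.dumps(new_data, indent=2))
```

Keying the intermediate tree by name keeps the loop to a single assignment per row; the conversion to lists happens once at the end.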

A vectorized solution would be better, and maybe something that takes advantage of the speed of groupby, but I can't think of such a solution.
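As a rough sketch of what a groupby-based version might look like, the example below walks the hierarchy one level at a time with nested `groupby` calls. It is not truly vectorized (it still loops in Python over groups), and the sample data, the function name `to_flare`, and the root name are assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {
        "region": ["EU", "EU", "EU", "US"],
        "type": ["a", "a", "b", "a"],
        "name": ["x", "y", "z", "w"],
        "value": [1, 2, 3, 4],
    }
)


def to_flare(frame, root_name="root"):
    """Build a nested name/children/size dict from the four columns."""
    root = {"name": root_name, "children": []}
    # sort=False keeps groups in order of first appearance.
    for region, region_df in frame.groupby("region", sort=False):
        region_node = {"name": region, "children": []}
        for type_, type_df in region_df.groupby("type", sort=False):
            # .tolist() yields plain Python values that json can serialize.
            leaves = [
                {"name": name, "size": value}
                for name, value in zip(
                    type_df["name"].tolist(), type_df["value"].tolist()
                )
            ]
            region_node["children"].append({"name": type_, "children": leaves})
        root["children"].append(region_node)
    return root


new_data = to_flare(df)
print(json.dumps(new_data, indent=2))
```

With this layout, `new_data['children'][i]['children'][j]['children'][k]` holds the `name` and `size` of one row, matching the indexing described in the question.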

Also take a look at df.groupby(...).groups, which returns a dict.
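To see what that dict looks like, here is a small sketch on made-up data (the sample values are an assumption; the column names follow the question). Each key is a tuple of group values, and each value is the index of the rows belonging to that group.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame(
    {
        "region": ["EU", "EU", "EU", "US"],
        "type": ["a", "a", "b", "a"],
        "name": ["x", "y", "z", "w"],
        "value": [1, 2, 3, 4],
    }
)

groups = df.groupby(["region", "type"]).groups

# Keys are (region, type) tuples; values are the row labels in each group.
for key, row_labels in groups.items():
    print(key, list(row_labels))

# The row labels can be fed back to .loc to fetch one group's rows.
eu_a_names = df.loc[groups[("EU", "a")], "name"].tolist()
```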

See also this answer.
