使用Python/pandas将CSV转换为嵌套JSON [英] CSV to nested JSON using Python/pandas

查看:232
本文介绍了使用Python/pandas将CSV转换为嵌套JSON的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试将平面CSV转换为嵌套JSON格式.这是我的数据:

I'm trying to convert a flat CSV to a nested JSON format. This is my data:

# data.csv
company_id,company_name,income_type,income_amt
1,"Foobar Inc","royalties",5000000
2,"ACME Corp","sales",3000000
2,"ACME Corp","rent",1000000

并且需要转换为以下JSON结构:

And need to convert to the following JSON structure:

{"data": [{
            "company_id": 1,
            "name": "Foobar Inc",
            "income": ["royalties": 5000000]
        }, 
        {
            "company_id": 2,
            "company_name": "ACME Corp",
            "income": [
                "sales": 3000000,
                "rent": 1000000
            ]
        }]
}

但是我当前的代码(基于,并使用Python和pandas库):

But my current code (based on this and using Python and the pandas library):

# script.py
import json
import pandas as pd

df = pd.read_csv('data.csv')

def get_nested_rec(key, grp):
rec = {}

    rec['company_id'] = key[0]
    rec['company_name'] = key[1]

    for field in ['income_type']:
        income_types = list(grp[field].unique())
        rec['income'] = income_types

    return rec

records = []

for key, grp in df.groupby(['company_id','company_name','income_type','income_amt']):
    rec = get_nested_rec(key, grp)
    records.append(rec)

records = dict(data = records)

print(json.dumps(records, indent=4))

输出以下格式:

{"data": [
        {
            "company_id": 1,
            "company_name": "Foobar Inc", 
            "income": [
                "royalties"
            ]
        }, 
        {
            "company_id": 2,
            "company_name": "ACME Corp",
            "income": [
                "sales"
            ]
        }, 
        {
            "company_id": 2,
            "company_name": "ACME Corp",
            "income": [
                "rent"
            ]
        }
    ]}

弄清楚如何将具有相同company_id的行组合到单个对象中并添加income_amt值.

Hitting a wall in figuring out how to combine rows with the same company_id into a single object and add in the income_amt values.

推荐答案

您可以这样做:

for key, grp in df.groupby('company_id'):
    records.append({
        "company_id": key,
        "company_name": grp.company_name.iloc[0],
        "income": {
            row.income_type: row.income_amt for row in grp.itertuples()
        }})

那给你:

[{'company_id': 1,
  'company_name': 'Foobar Inc',
  'income': {'royalties': 5000000}},
 {'company_id': 2,
  'company_name': 'ACME Corp',
  'income': {'rent': 1000000, 'sales': 3000000}}]

这篇关于使用Python/pandas将CSV转换为嵌套JSON的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆