Create nested JSON from CSV


Problem description

I already read "Create nested JSON from flat csv", but it didn't help in my case.

I have quite a big spreadsheet created with Google Docs consisting of 11 rows and 74 columns (some columns are not occupied).

I created an example on Google Drive. When exported as a CSV it looks like this:

id,name,email,phone,picture01,picture02,picture03,status
1,Alice,alice@gmail.com,2131232,"image01_01
[this is an image]",image01_02,image01_03,single
2,Bob,bob@gmail.com,2854839,image02_01,"image02_02
[description to image 2]",,married
3,Frank,frank@gmail.com,987987,image03_01,image03_02,,single
4,Shawn,shawn@gmail.com,,image04_01,,,single

Now I would like to have a JSON structure, which looks like this:

{
    "persons": [
        {
            "type": "config.profile",
            "id": "1",
            "email": "alice@gmail.com",
            "pictureId": "p01",
            "statusId": "s01"
        },
        {
            "type": "config.pictures",
            "id": "p01",
            "album": [
                {
                    "image": "image01_01",
                    "description": "this is an image"
                },
                {
                    "image": "image01_02",
                    "description": ""
                },
                {
                    "image": "image01_03",
                    "description": ""
                }
            ]
        },
        {
            "type": "config.status",
            "id": "s01",
            "status": "single"
        },
        {
            "type": "config.profile",
            "id": "2",
            "email": "bob@gmail.com",
            "pictureId": "p02",
            "statusId": "s02"
        },
        {
            "type": "config.pictures",
            "id": "p02",
            "album": [
                {
                    "image": "image02_01",
                    "description": ""
                },
                {
                    "image": "image02_02",
                    "description": "description to image 2"
                }
            ]
        },
        {
            "type": "config.status",
            "id": "s02",
            "status": "married"
        }
    ]
}

And so on for the other lines.

My theoretical approach would be to go through the CSV file row by row (here the first problem starts: usually every row is exactly one line, but sometimes a row spans several lines, so would I need to count the commas?). Each row corresponds to one config.profile block, including the id, email, pictureId, and statusId (the latter two are generated from the row number).

Then for each row a config.pictures block is generated with the same id as the one inserted in the config.profile block. The album is an array of as many elements as pictures are given.

Lastly each row has a config.status block, which, again, has the same id as the one given in config.profile, and one entry of status with the corresponding status.

I'm entirely clueless about how to create the nested and conditional JSON file.

I just got to the point where I convert the CSV to valid JSON, without any nesting and additional info, which are not directly given in the CSV, like the type, pictureId, statusId, and so on.

Any help is appreciated. If it is easier to program this in another scripting language (like Ruby), I would gladly switch.

Before someone thinks this is homework or whatnot: it is not. I just want to automate an otherwise very tiresome copy-and-paste task.

Solution

The csv module will handle the CSV reading nicely - including handling line breaks that are within quotes.

import csv
with open('my_csv.csv', newline='') as csv_file:
    for row in csv.reader(csv_file):
        print(row)  # do work with each row here

The csv.reader object is an iterator - you can iterate through the rows in the CSV by using a for loop. Each row is a list, so you can get each field as row[0], row[1], etc. Be aware that this will load the first row (which just contains field names in your case).
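To see the handling of quoted line breaks in action, here is a minimal sketch that feeds the first record of the export to csv.reader from an in-memory string. The SAMPLE variable and the use of io.StringIO are only for illustration and are not part of the original answer.

```python
import csv
import io

# First record of the question's export; the picture01 cell spans two lines.
SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '1,Alice,alice@gmail.com,2131232,"image01_01\n'
    '[this is an image]",image01_02,image01_03,single\n'
)

# newline='' keeps the embedded line break untouched for the csv module.
rows = list(csv.reader(io.StringIO(SAMPLE, newline='')))
header, first = rows

print(len(rows))       # 2 logical rows, although the text has 3 physical lines
print(repr(first[4]))  # 'image01_01\n[this is an image]'
```

So there is no need to count commas by hand: the quoted cell arrives as a single field that still contains its line break.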

As we have field names given to us in the first row, we can use csv.DictReader so that fields in each row can be accessed as row['id'], row['name'], etc. This will also skip the first row for us:

import csv
with open('my_csv.csv', newline='') as csv_file:
    for row in csv.DictReader(csv_file):
        print(row['id'], row['name'])  # do work with each row here
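Two details of csv.DictReader matter for this spreadsheet: every value comes back as a string, and an unoccupied cell comes back as an empty string. The following sketch, again using an in-memory SAMPLE purely for illustration, shows how the empty picture columns could be filtered out.

```python
import csv
import io

# Rows 3 and 4 of the question's export; both contain unoccupied cells.
SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '3,Frank,frank@gmail.com,987987,image03_01,image03_02,,single\n'
    '4,Shawn,shawn@gmail.com,,image04_01,,,single\n'
)

rows = list(csv.DictReader(io.StringIO(SAMPLE, newline='')))

# Empty cells are '' (not None), and the id is the string '4', not the number 4.
print(rows[1]['name'], repr(rows[1]['phone']), repr(rows[1]['id']))

# Keep only the picture columns that actually hold a value.
pictures = [rows[0][key] for key in ('picture01', 'picture02', 'picture03')
            if rows[0][key]]
print(pictures)
```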

For the JSON export, use the json module. json.dumps() will take Python data structures such as lists and dictionaries and return the appropriate JSON string:

import json
my_data = {'id': 123, 'name': 'Test User', 'emails': ['test@example.com', 'test@hotmail.com']}
my_data_json = json.dumps(my_data)
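json.dumps() also accepts an indent argument, which helps when the result should be as readable as the structure in the question. A small sketch, with a record made up from the question's sample data:

```python
import json

# One block in the shape the question asks for.
record = {'type': 'config.status', 'id': 's01', 'status': 'single'}

# indent=4 pretty-prints the JSON; without it everything ends up on one line.
pretty = json.dumps(record, indent=4)
print(pretty)

# The string parses back into an equal Python structure.
assert json.loads(pretty) == record
```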

If you want to generate JSON output exactly as you posted, you'd do something like:

import csv
import json

output = {'persons': []}
with open('my_csv.csv', newline='') as csv_file:
    for person in csv.DictReader(csv_file):
        output['persons'].append({
            'type': 'config.profile',
            'id': person['id'],
            # ...add other fields (email etc) here...
        })

        # ...do similar for config.pictures, config.status, etc...

output_json = json.dumps(output)

output_json will contain the JSON output that you want.
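For completeness, here is one way the remaining blocks could be filled in. This is a sketch under assumptions, not the definitive implementation: the helper names split_picture_cell and build_output are made up, the pictureId and statusId values are assumed to be the row number padded to two digits (p01, s01, ...), and a description is assumed to be whatever stands in square brackets inside a picture cell.

```python
import csv
import io
import json


def split_picture_cell(cell):
    # Hypothetical helper: separates the image name from an optional
    # "[description]" part that the export puts into the same cell.
    image, bracket, rest = cell.partition('[')
    description = rest.rsplit(']', 1)[0] if bracket else ''
    return image.strip(), description.strip()


def build_output(csv_file):
    # Builds the {'persons': [...]} structure from an open CSV file object.
    reader = csv.DictReader(csv_file)
    picture_columns = [name for name in (reader.fieldnames or [])
                       if name.startswith('picture')]
    output = {'persons': []}
    for number, person in enumerate(reader, start=1):
        picture_id = 'p{:02d}'.format(number)  # assumed scheme: p01, p02, ...
        status_id = 's{:02d}'.format(number)   # assumed scheme: s01, s02, ...
        album = []
        for column in picture_columns:
            cell = person[column] or ''        # short rows yield None
            if cell.strip():
                image, description = split_picture_cell(cell)
                album.append({'image': image, 'description': description})
        output['persons'].extend([
            {'type': 'config.profile', 'id': person['id'],
             'email': person['email'],
             'pictureId': picture_id, 'statusId': status_id},
            {'type': 'config.pictures', 'id': picture_id, 'album': album},
            {'type': 'config.status', 'id': status_id,
             'status': person['status']},
        ])
    return output


# The first two records of the question's export.
SAMPLE = (
    'id,name,email,phone,picture01,picture02,picture03,status\n'
    '1,Alice,alice@gmail.com,2131232,"image01_01\n'
    '[this is an image]",image01_02,image01_03,single\n'
    '2,Bob,bob@gmail.com,2854839,image02_01,"image02_02\n'
    '[description to image 2]",,married\n'
)

result = build_output(io.StringIO(SAMPLE, newline=''))
print(json.dumps(result, indent=4))
```

With a real file, build_output would be called inside the with open('my_csv.csv', newline='') block instead of on the in-memory sample.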

However, I'd suggest you carefully consider the structure of the JSON output that you're after - at the moment, you're defining an outer dictionary that serves no purpose, and you're adding all your 'config' data directly under 'persons' - you may want to reconsider this.
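As an illustration only, one possible alternative shape drops the outer dictionary and keeps each person's pictures and status inside a single object. The function name nest_person and the chosen keys are assumptions, not something the question or the answer prescribes.

```python
import json


def nest_person(row):
    # row is a dict shaped like the ones csv.DictReader produces.
    album = [value for key, value in row.items()
             if key.startswith('picture') and value]
    return {
        'id': row['id'],
        'email': row['email'],
        'status': row['status'],
        'album': album,
    }


# Row 3 of the question's export, written out as a dict.
row = {
    'id': '3', 'name': 'Frank', 'email': 'frank@gmail.com', 'phone': '987987',
    'picture01': 'image03_01', 'picture02': 'image03_02', 'picture03': '',
    'status': 'single',
}

persons = [nest_person(row)]  # a plain list, no wrapping dictionary
print(json.dumps(persons, indent=4))
```

Whether this is preferable depends on what consumes the JSON; if the consumer expects the flat config.* blocks, the original layout has to stay.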
