Convert CSV to JSON (in specific format) using Python

Question

I would like to convert a CSV file into a JSON file using Python 2.7. Below is the Python code I tried, but it is not giving me the expected result. I would also like to know whether there is a simpler version than mine. Any help is appreciated.

Here is the sample CSV file (SampleCsvFile.csv):

zipcode,date,state,val1,val2,val3,val4,val5
95110,2015-05-01,CA,50,30.00,5.00,3.00,3
95110,2015-06-01,CA,67,31.00,5.00,3.00,4
95110,2015-07-01,CA,97,32.00,5.00,3.00,6

Here is the expected JSON file (ExpectedJsonFile.json):

{
        "zipcode": "95110", 
        "state": "CA", 
        "subset": [
            {
                "date": "2015-05-01",
                "val1": "50", 
                "val2": "30.00", 
                "val3": "5.00", 
                "val4": "3.00", 
                "val5": "3"
            }, 
            {
                "date": "2015-06-01", 
                "val1": "67", 
                "val2": "31.00", 
                "val3": "5.00", 
                "val4": "3.00", 
                "val5": "4"
            }, 
            {
                "date": "2015-07-01", 
                "val1": "97", 
                "val2": "32.00", 
                "val3": "5.00", 
                "val4": "3.00", 
                "val5": "6"
            }
        ]

}

Here is the Python code I tried:

import pandas as pd
from itertools import groupby 
import json    

df = pd.read_csv('SampleCsvFile.csv')

names = df.columns.values.tolist()
data = df.values

master_list2 = [ (d["zipcode"], d["state"], d) for d in [dict(zip(names, d)) for d in data] ]
intermediate2 = [(k, [x[2] for x in list(v)]) for k,v in groupby(master_list2, lambda t: (t[0],t[1]) )]
nested_json2 = [dict(zip(names,(k[0][0], k[0][1], k[1]))) for k in [(i[0], i[1]) for i in intermediate2]]

#print json.dumps(nested_json2, indent=4)
with open('ExpectedJsonFile.json', 'w') as outfile:
     outfile.write(json.dumps(nested_json2, indent=4))

Recommended answer

Since you are using pandas already, I tried to get as much mileage as I could out of dataframe methods. I also ended up wandering fairly far afield from your implementation. I think the key here, though, is not to get too clever with your list and/or dictionary comprehensions. You can very easily confuse yourself and everyone who reads your code.

import pandas as pd
from collections import OrderedDict
import json    

df = pd.read_csv('SampleCsvFile.csv', dtype={
            "zipcode" : str,
            "date" : str,
            "state" : str,
            "val1" : str,
            "val2" : str,
            "val3" : str,
            "val4" : str,
            "val5" : str
        })

results = []

for (zipcode, state), bag in df.groupby(["zipcode", "state"]):
    contents_df = bag.drop(["zipcode", "state"], axis=1)
    subset = [OrderedDict(row) for i,row in contents_df.iterrows()]
    results.append(OrderedDict([("zipcode", zipcode),
                                ("state", state),
                                ("subset", subset)]))

print(json.dumps(results[0], indent=4))
#with open('ExpectedJsonFile.json', 'w') as outfile:
#    outfile.write(json.dumps(results[0], indent=4))
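
For comparison, the same grouping can be sketched with the standard library alone, which may be closer to the "simplified version" the question asks for. This is an illustrative alternative, not part of the answer's pandas code. It assumes Python 3.7 or later (so that rows keep the header's column order), and `CSV_TEXT` and `nest_rows` are hypothetical names introduced here. Because `csv.DictReader` returns every field as a string, the original formatting of values such as 30.00 is kept without any extra work.

```python
import csv
import io
import json
from collections import OrderedDict

# Inline copy of the sample rows so the sketch runs without SampleCsvFile.csv.
CSV_TEXT = """zipcode,date,state,val1,val2,val3,val4,val5
95110,2015-05-01,CA,50,30.00,5.00,3.00,3
95110,2015-06-01,CA,67,31.00,5.00,3.00,4
95110,2015-07-01,CA,97,32.00,5.00,3.00,6
"""


def nest_rows(rows, key_fields=("zipcode", "state")):
    """Group rows by key_fields; the remaining columns go under "subset"."""
    groups = OrderedDict()
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in groups:
            # First row of a new group: create the outer record.
            record = OrderedDict((field, row[field]) for field in key_fields)
            record["subset"] = []
            groups[key] = record
        # Copy every non-key column, in header order, into the subset list.
        groups[key]["subset"].append(
            OrderedDict((k, v) for k, v in row.items() if k not in key_fields)
        )
    return list(groups.values())


reader = csv.DictReader(io.StringIO(CSV_TEXT))
nested = nest_rows(reader)
print(json.dumps(nested[0], indent=4))
```

To read from disk instead, pass `csv.DictReader(open_file)` for a file opened with `newline=""`. Unlike `itertools.groupby`, this grouping does not require the rows to be sorted by zipcode and state.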

The simplest way to have all the JSON values written as strings, and to retain their original formatting, was to force read_csv to parse every column as a string. If, however, you need to do any numerical manipulation on the values before writing out the JSON, you will have to let read_csv parse them numerically and then coerce them back into the proper string format before converting to JSON.
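
The second case can be sketched roughly as follows. This is only an illustration under stated assumptions: it targets Python 3 with a reasonably recent pandas, the doubling of `val2` is a made-up manipulation, the two-decimal format is inferred from the sample data, and `CSV_TEXT` is a hypothetical inline copy of the sample file.

```python
import io
import json

import pandas as pd

# Inline copy of the sample rows so the sketch runs without SampleCsvFile.csv.
CSV_TEXT = """zipcode,date,state,val1,val2,val3,val4,val5
95110,2015-05-01,CA,50,30.00,5.00,3.00,3
95110,2015-06-01,CA,67,31.00,5.00,3.00,4
95110,2015-07-01,CA,97,32.00,5.00,3.00,6
"""

# Keep the identifying columns as strings; let pandas infer numeric types
# for the val columns so arithmetic is possible.
df = pd.read_csv(io.StringIO(CSV_TEXT),
                 dtype={"zipcode": str, "date": str, "state": str})

# Hypothetical manipulation, done while val2 is still numeric.
df["val2"] = df["val2"] * 2

# Coerce back to strings: two decimals for the float columns (assumed format),
# plain digits for the integer columns.
for col in ["val2", "val3", "val4"]:
    df[col] = df[col].map("{:.2f}".format)
for col in ["val1", "val5"]:
    df[col] = df[col].astype(str)

results = []
for (zipcode, state), bag in df.groupby(["zipcode", "state"]):
    # One dict per row, without the grouping columns.
    subset = bag.drop(["zipcode", "state"], axis=1).to_dict(orient="records")
    results.append({"zipcode": zipcode, "state": state, "subset": subset})

print(json.dumps(results[0], indent=4))
```

Formatting must happen before the JSON step: a float left as a number would be written as 60.0 rather than the string "60.00".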
