Splitting JSON files based on row counts
Question
UPDATE: I am converting a CSV file with roughly 1,000 rows (for example, 19-01-2018.csv) to a JSON file named 19-01-2018.json.
The requirement is that the JSON output be split into files of 30 rows each, so the target files will be named 19-01-2018_1.json, 19-01-2018_2.json, and so on.
The source JSON looks like this:
Each JSON file created needs to be split further into separate JSON files of 30 rows each, because I need to ingest them into Azure and there is a size constraint.
Below is the code I used to convert the CSV to JSON. I want to split the resulting JSON further into files of 30 rows each.
for i in files:
    csvfile = open(path+i, 'r')
    jsonfile = open(output+i.split('.')[0]+'.json', 'w')
    reader = csv.DictReader(csvfile)
    for row in reader:
        json.dump(row, jsonfile)
        jsonfile.write('\n')
Any help would be appreciated.
Thanks, Shyam
Recommended answer
Append each row to a list, and when the list size reaches 30, dump it to a file.
import csv
import json

def dump_list_to_json(rowlist, csv_filename, index):
    # e.g. 19-01-2018.csv -> 19-01-2018_0.json; index is an int, so convert it to str
    json_filename = csv_filename.replace('.csv', '_' + str(index) + '.json')
    with open(json_filename, 'w') as jsonfile:
        json.dump(rowlist, jsonfile)
        jsonfile.write('\n')

# files and path are the same variables used in the question's code
for i in files:
    out_index = 0
    with open(path+i, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        rowlist = []
        for row in reader:
            rowlist.append(row)
            if len(rowlist) == 30:
                dump_list_to_json(rowlist, path+i, out_index)
                rowlist = []
                out_index += 1
        # dump the last batch
        if len(rowlist) > 0:
            dump_list_to_json(rowlist, path+i, out_index)
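The same batching idea can also be written without a manual counter. The following is only a sketch under some assumptions: the function name split_csv_to_json, its out_dir and rows_per_file parameters, and the choice to start numbering at 1 (to match the 19-01-2018_1.json naming in the question) are illustrative and not part of the original answer. It uses itertools.islice to pull up to 30 rows at a time from the reader.

```python
import csv
import json
from itertools import islice
from pathlib import Path


def split_csv_to_json(csv_path, out_dir, rows_per_file=30):
    """Write the rows of csv_path to numbered JSON files.

    Hypothetical helper: each output file holds at most rows_per_file rows
    and is named <csv stem>_<n>.json, with n starting at 1.
    """
    csv_path = Path(csv_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        index = 1
        while True:
            # islice consumes the next rows_per_file rows from the reader
            batch = list(islice(reader, rows_per_file))
            if not batch:
                break
            target = out_dir / f"{csv_path.stem}_{index}.json"
            with open(target, 'w') as jsonfile:
                json.dump(batch, jsonfile)
            written.append(target)
            index += 1
    return written
```

Calling split_csv_to_json(path + i, output) inside the loop over files would write each batch to the output directory rather than next to the CSV. Each file then contains one JSON array of row objects, which differs from the one-object-per-line output in the question's code.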