Extracting information from multiple JSON files to a single CSV file in Python
Question
I have a JSON file with multiple dictionaries:
{
  "team1participants": [
    {
      "stats": {
        "item1": 3153,
        "totalScore": 0,
        ...
      }
    },
    {
      "stats": {
        "item1": 2123,
        "totalScore": 5,
        ...
      }
    },
    {
      "stats": {
        "item1": 1253,
        "totalScore": 1,
        ...
      }
    }
  ],
  "team2participants": [
    {
      "stats": {
        "item1": 1853,
        "totalScore": 2,
        ...
      }
    },
    {
      "stats": {
        "item1": 21523,
        "totalScore": 5,
        ...
      }
    },
    {
      "stats": {
        "item1": 12503,
        "totalScore": 1,
        ...
      }
    }
  ]
}
In other words, the JSON has multiple keys. Each key maps to a list containing the statistics of individual participants.
I have many such JSON files, and I want to extract them into a single CSV file. I could of course do this manually, but that is very tedious. I know of DictWriter, but it seems to work only for single dictionaries. I also know that dictionaries can be merged, but that would be a problem here because all the dictionaries have the same keys.
How can I efficiently extract this to a CSV file?
Recommended answer
You can make your data tidy, so that each row is a single observation (here, one participant).
# d is the dictionary parsed from one JSON file (e.g. with json.load).
teams = []
items = []
scores = []
for team in d:
    for item in d[team]:
        teams.append(team)
        items.append(item['stats']['item1'])
        scores.append(item['stats']['totalScore'])
# Using Pandas.
import pandas as pd
df = pd.DataFrame({'team': teams, 'item': items, 'score': scores})
>>> df
    item  score               team
0   1853      2  team2participants
1  21523      5  team2participants
2  12503      1  team2participants
3   3153      0  team1participants
4   2123      5  team1participants
5   1253      1  team1participants
You could also use a list comprehension instead of a loop.
results = [[team, item['stats']['item1'], item['stats']['totalScore']]
           for team in d for item in d[team]]
df = pd.DataFrame(results, columns=['team', 'item', 'score'])
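Both versions above assume that `d` already holds the parsed contents of one JSON file. A minimal sketch of obtaining it with the standard `json` module is shown below. The inline sample string only stands in for a real file (its values are copied from the question, with the elided fields left out), and the helper name `flatten` is hypothetical.

```python
import json

# Stand-in for the text of one file. With a real file you would call
# json.load(file_handle) on an open file instead of json.loads(text).
sample_text = '''
{"team1participants": [{"stats": {"item1": 3153, "totalScore": 0}},
                       {"stats": {"item1": 2123, "totalScore": 5}}],
 "team2participants": [{"stats": {"item1": 1853, "totalScore": 2}}]}
'''


def flatten(d):
    """Return one [team, item1, totalScore] row per participant."""
    return [[team, p['stats']['item1'], p['stats']['totalScore']]
            for team in d for p in d[team]]


d = json.loads(sample_text)
rows = flatten(d)
print(rows)
```

The resulting `rows` list has the same shape as `results` above, so it could be passed to `pd.DataFrame` in the same way.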
You can then build a pivot table, for example:
>>> df.pivot_table(values='score', index='team', columns='item', aggfunc='sum').fillna(0)
item                1253   1853   2123   3153  12503  21523
team
team1participants      1      0      5      0      0      0
team2participants      0      2      0      0      1      5
Also, now that it is a DataFrame, it is easy to save it as a CSV.
df.to_csv('my_file_name.csv')
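The answer covers a single parsed file, while the question asks about many files. One way to extend the same idea is sketched below using only the standard library: loop over the files, flatten each one, and write every participant as a row with `csv.DictWriter` (which handles any number of rows, one dictionary per row). The function name `json_files_to_csv`, the glob pattern, and the extra `file` column are assumptions made for illustration, not part of the original answer.

```python
import csv
import glob
import json
import os


def json_files_to_csv(pattern, csv_path):
    """Write one CSV row per participant found in the files matching pattern.

    Returns the number of data rows written.
    """
    fieldnames = ['file', 'team', 'item', 'score']
    count = 0
    with open(csv_path, 'w', newline='') as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        # sorted() makes the row order independent of the file system.
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                d = json.load(f)
            for team in d:
                for participant in d[team]:
                    stats = participant['stats']
                    writer.writerow({
                        'file': os.path.basename(path),  # which file the row came from
                        'team': team,
                        'item': stats['item1'],
                        'score': stats['totalScore'],
                    })
                    count += 1
    return count
```

A call might look like `json_files_to_csv('matches/*.json', 'all_matches.csv')`, where both paths are hypothetical. With pandas, the equivalent approach would be to build one DataFrame per file, combine them with `pd.concat`, and call `to_csv` once on the result.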