Convert JSON file to Pandas dataframe
Question
I would like to convert a JSON file to a Pandas dataframe.
My JSON looks like:
{
    "country1": {
        "AdUnit1": {
            "floor_price1": {
                "feature1": 1111,
                "feature2": 1112
            },
            "floor_price2": {
                "feature1": 1121
            }
        },
        "AdUnit2": {
            "floor_price1": {
                "feature1": 1211
            },
            "floor_price2": {
                "feature1": 1221
            }
        }
    },
    "country2": {
        "AdUnit1": {
            "floor_price1": {
                "feature1": 2111,
                "feature2": 2112
            }
        }
    }
}
I read the file from GCP using this code:
project = Context.default().project_id
sample_bucket_name = 'my_bucket'
sample_bucket_path = 'gs://' + sample_bucket_name
print('Object: ' + sample_bucket_path + '/json_output.json')
sample_bucket = storage.Bucket(sample_bucket_name)
sample_bucket.create()
sample_bucket.exists()
sample_object = sample_bucket.object('json_output.json')
list(sample_bucket.objects())
json = sample_object.read_stream()
My goal is to get a Pandas dataframe that looks like: [image of the desired table not preserved]
I tried using json_normalize, but didn't succeed.
Recommended answer
You can use:
def flatten_dict(d):
    """ Returns list of lists from given dictionary """
    l = []
    for k, v in sorted(d.items()):
        if isinstance(v, dict):
            # Flatten the nested dict, then prefix every row with this key
            flatten_v = flatten_dict(v)
            for my_l in reversed(flatten_v):
                my_l.insert(0, k)
            l.extend(flatten_v)
        elif isinstance(v, list):
            # One row per list element
            for l_val in v:
                l.append([k, l_val])
        else:
            # Leaf value: a [key, value] row
            l.append([k, v])
    return l
This function receives a dictionary (possibly nested, with values that may also be lists) and flattens it into a list of lists, one row per leaf value.
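To make the flattening behaviour concrete, here is a small sketch that runs the same function on a made-up dictionary (the name `nested` and its contents are illustrative assumptions, not data from the question). It shows that keys are visited in sorted order, that each list element becomes its own row, and that rows can have different lengths when the nesting depth varies.

```python
def flatten_dict(d):
    """Return a list of rows (lists) built from a nested dictionary."""
    rows = []
    for k, v in sorted(d.items()):
        if isinstance(v, dict):
            # Recurse, then prefix each nested row with the current key
            nested_rows = flatten_dict(v)
            for row in nested_rows:
                row.insert(0, k)
            rows.extend(nested_rows)
        elif isinstance(v, list):
            # Each list element gets its own [key, element] row
            for item in v:
                rows.append([k, item])
        else:
            rows.append([k, v])
    return rows


# Hypothetical input, chosen only to exercise all three branches
nested = {"b": {"x": 1, "y": [2, 3]}, "a": 0}
rows = flatten_dict(nested)
print(rows)
```

Because the top-level key "a" holds a plain value while "b" holds a nested dict, the resulting rows have two and three elements respectively.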
Then you can simply run:
import pandas as pd
df = pd.DataFrame(flatten_dict(my_dict))
where my_dict is your JSON object, already parsed into a Python dictionary.
Taking your example, running print(df) gives:
          0        1             2         3     4
0  country1  AdUnit1  floor_price1  feature1  1111
1  country1  AdUnit1  floor_price1  feature2  1112
2  country1  AdUnit1  floor_price2  feature1  1121
3  country1  AdUnit2  floor_price1  feature1  1211
4  country1  AdUnit2  floor_price2  feature1  1221
5  country2  AdUnit1  floor_price1  feature1  2111
6  country2  AdUnit1  floor_price1  feature2  2112
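The flattening step expects a Python dictionary rather than raw file contents. A minimal sketch of that parsing step follows, assuming the file contents are available as UTF-8 bytes; how the storage client actually returns them is not shown in the question, so `raw` here is a hypothetical stand-in. Note also that naming a variable `json`, as the question's snippet does, would hide the standard `json` module if both live in the same scope.

```python
import json

# Hypothetical stand-in for the bytes read from the bucket object
raw = b'{"country2": {"AdUnit1": {"floor_price1": {"feature1": 2111, "feature2": 2112}}}}'

# Decode the bytes and parse the JSON text into nested Python dicts
my_dict = json.loads(raw.decode("utf-8"))

print(type(my_dict).__name__)
print(my_dict["country2"]["AdUnit1"]["floor_price1"]["feature2"])
```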
When you create the dataframe, you can also name your columns and index.
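One possible way to do that is sketched below. The column names (`country`, `ad_unit`, `floor_price`, `feature`, `value`) are assumptions picked for illustration, and `rows` is a hand-copied subset of the flattened output above rather than the result of a live call.

```python
import pandas as pd

# A few rows in the shape produced by the flattening step (hand-copied subset)
rows = [
    ["country1", "AdUnit1", "floor_price1", "feature1", 1111],
    ["country1", "AdUnit1", "floor_price1", "feature2", 1112],
    ["country2", "AdUnit1", "floor_price1", "feature1", 2111],
]

# Hypothetical column names; pick whatever suits your data
columns = ["country", "ad_unit", "floor_price", "feature", "value"]
df = pd.DataFrame(rows, columns=columns)

# Optionally move the key columns into a hierarchical index
indexed = df.set_index(["country", "ad_unit", "floor_price", "feature"])
print(indexed)
```

With the key columns in the index, a single value can be looked up by its full path, for example `indexed.loc[("country1", "AdUnit1", "floor_price1", "feature2"), "value"]`.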