Convert JSON file to Pandas dataframe

Problem description

I would like to convert a JSON file to a Pandas dataframe.

My JSON looks like:

{ 
   "country1":{ 
      "AdUnit1":{ 
         "floor_price1":{ 
            "feature1":1111,
            "feature2":1112
         },
         "floor_price2":{ 
            "feature1":1121
         }
      },
      "AdUnit2":{ 
         "floor_price1":{ 
            "feature1":1211
         },
         "floor_price2":{ 
            "feature1":1221
         }
      }
   },
   "country2":{ 
      "AdUnit1":{ 
         "floor_price1":{ 
            "feature1":2111,
            "feature2":2112
         }
      }
   }
}

I read the file from GCP using this code:

project = Context.default().project_id
sample_bucket_name = 'my_bucket'
sample_bucket_path = 'gs://' + sample_bucket_name
print('Object: ' + sample_bucket_path + '/json_output.json')

sample_bucket = storage.Bucket(sample_bucket_name)
sample_bucket.create()
sample_bucket.exists()

sample_object = sample_bucket.object('json_output.json')
list(sample_bucket.objects())
json = sample_object.read_stream()

My goal is to get a Pandas dataframe that looks like:

I tried using json_normalize, but didn't succeed.

Recommended answer

You can use:

def flatten_dict(d):
    """ Returns list of lists from given dictionary """
    l = []
    for k, v in sorted(d.items()):
        if isinstance(v, dict):
            flatten_v = flatten_dict(v)
            for my_l in reversed(flatten_v):
                my_l.insert(0, k)

            l.extend(flatten_v)

        elif isinstance(v, list):
            for l_val in v:
                l.append([k, l_val])

        else:
            l.append([k, v])

    return l

This function receives a dictionary (including nesting, where values may also be lists) and flattens it into a list of lists. Each inner list holds the keys along one path through the dictionary, followed by the value found at the end of that path.

Then you can simply:

import pandas as pd

df = pd.DataFrame(flatten_dict(my_dict))

Here my_dict is your JSON object. Taking your example, what you get when you run print(df) is:

          0        1             2         3     4
0  country1  AdUnit1  floor_price1  feature1  1111
1  country1  AdUnit1  floor_price1  feature2  1112
2  country1  AdUnit1  floor_price2  feature1  1121
3  country1  AdUnit2  floor_price1  feature1  1211
4  country1  AdUnit2  floor_price2  feature1  1221
5  country2  AdUnit1  floor_price1  feature1  2111
6  country2  AdUnit1  floor_price1  feature2  2112
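
Note that my_dict has to be a parsed Python dictionary, not the raw text or bytes read from the bucket. A minimal sketch of that parsing step, assuming the file content is already in memory: raw_text below is a hypothetical stand-in, not the asker's actual file. In the question the stream is bound to the name json, which would shadow the standard json module in the same session, so a different name is used here.

```python
import json

# Hypothetical stand-in for the content read from the bucket
# (an assumption for illustration, not the asker's real file).
raw_text = '{"country2": {"AdUnit1": {"floor_price1": {"feature1": 2111, "feature2": 2112}}}}'

# If the content arrives as bytes rather than str, decode it first.
raw_bytes = raw_text.encode("utf-8")
my_dict = json.loads(raw_bytes.decode("utf-8"))

# my_dict is now a nested dict that can be passed to flatten_dict.
print(type(my_dict).__name__)
print(my_dict["country2"]["AdUnit1"]["floor_price1"]["feature1"])
```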

And when you create the dataframe, you can name your columns and index.
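
As a rough illustration of naming the columns and index, here is a small sketch. The two rows are copied from the output above, and the column names are hypothetical labels chosen for this example; the original answer does not specify any.

```python
import pandas as pd

# Two rows copied from the flattened output shown earlier.
rows = [
    ["country1", "AdUnit1", "floor_price1", "feature1", 1111],
    ["country2", "AdUnit1", "floor_price1", "feature2", 2112],
]

# Hypothetical column names, chosen here for illustration only.
column_names = ["country", "ad_unit", "floor_price", "feature", "value"]

# Name the columns at construction time.
df = pd.DataFrame(rows, columns=column_names)

# Optionally promote the key columns to a hierarchical index.
indexed = df.set_index(["country", "ad_unit", "floor_price", "feature"])
print(indexed)
```

Whether to keep the keys as ordinary columns or move them into the index depends on how the data will be queried afterwards; both forms hold the same values.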
