使用 json_normalize 将嵌套的 JSON 转换为 Dataframe [英] Convert Nested JSON in to Dataframe using json_normalize

查看:84
本文介绍了使用 json_normalize 将嵌套的 JSON 转换为 Dataframe的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

尝试将 JSON 数据展平为从 API 响应生成的 Pandas 数据帧.

Trying to flattened JSON data in to pandas dataframe generated from API response.

在下面的代码中,我使用 Json_normlize 将原始 json 数据展平到不同的列

In below code, i flattened raw json data in to different columns using Json_normlize

raw_json = json_normalize(data["data"],sep="_")

Pages 包含 col1 & 中提到的 JSON 数据col2.Using apply(pd.Series),将页面数据拆分为 2 个不同的列,名为 col1 &col2

Pages contains JSON data mentioned in col1 & col2.Using apply(pd.Series), split pages data in to 2 different columns named col1 & col2

raw_pages=raw_json['pages'].apply(pd.Series)  

现在,我需要帮助从 col1 & 中提取(id、choice_id、row_id、simple_text & 标题)col2.

Now, I need help in extracting (id, choice_id,row_id,simple_text & heading) from col1 & col2.

Col1:

{
  'id': '124550266',
  'questions': [
    {
      'id': '471362125',
      'answers': [
        {
          'tag_data': [
            {
              'hexcolor': '671E75',
              'label': 'sm_processed_q',
              'tag_type': 'sentiment'
            },
            {
              'hexcolor': '671E75',
              'label': 'sm_processed_q',
              'tag_type': 'sentiment'
            }
          ],
          'simple_text': '9999999999'
        }
      ],
      'family': 'open_ended',
      'subtype': 'single',
      'heading': 'Please confirm your registered number with us'
    },
    {
      'id': '471362121',
      'answers': [
        {
          'choice_id': '3114700230',
          'row_id': '3114700224',
          'simple_text': '5'
        }
      ],
      'family': 'matrix',
      'subtype': 'rating',
      'heading': 'How likely is it that you would recommend xxxxx to a friend or colleague?'
    }
  ]
}

COl2:

{
  'id': '124550267',
  'questions': [
    {
      'id': '471362124',
      'answers': [
        {
          'choice_id': '3114700246',
          'row_id': '3114700251'
          
        },
        {
          'choice_id': '3114700247',
          'row_id': '3114700254'
          
        },
        {
          'choice_id': '3114700248',
          'row_id': '3114700255'
          
        },
        {
          'choice_id': '3114700248',
          'row_id': '3114700257'
          
        },
        {
          'choice_id': '3114700249',
          'row_id': '3114700259'
          
        }
      ],
      'family': 'matrix',
      'subtype': 'rating',
      'heading': 'Dear customer how much would you rate us)'
    },
    {
      'id': '471362122',
      'answers': [
        {
          'tag_data': [
            {
              'hexcolor': '05467E',
              'label': 'sm_positive',
              'tag_type': 'sentiment'
            }
          ],
          'simple_text': 'More variety  should be addwd'
        }
      ],
      'family': 'open_ended',
      'subtype': 'essay',
      'heading': 'Dear Customer,<br>Kindly help us, by providing your valuable suggestions or feedback'
    }
  ]
}

推荐答案

试试这个,d1 和 d2 是你上面发布的字典:

Try this, d1 and d2 are the dicts you posted above:

df1 = pd.json_normalize(d1,
                        record_path=['questions', 'answers'],
                        meta=[['id'], ['questions', 'heading']],
                        meta_prefix='toplevel.').drop(columns=['tag_data'])
print(df1)

df2 = pd.json_normalize(d2,
                        record_path=['questions', 'answers'],
                        meta=[['id'], ['questions', 'heading']],
                        meta_prefix='toplevel.').drop(columns=['tag_data'])
print(df2)

df1:

  simple_text   choice_id      row_id toplevel.id                         toplevel.questions.heading
0  9999999999         NaN         NaN   124550266      Please confirm your registered number with us
1           5  3114700230  3114700224   124550266  How likely is it that you would recommend xxxx...

df2:

    choice_id      row_id                    simple_text toplevel.id                         toplevel.questions.heading
0  3114700246  3114700251                            NaN   124550267          Dear customer how much would you rate us)
1  3114700247  3114700254                            NaN   124550267          Dear customer how much would you rate us)
2  3114700248  3114700255                            NaN   124550267          Dear customer how much would you rate us)
3  3114700248  3114700257                            NaN   124550267          Dear customer how much would you rate us)
4  3114700249  3114700259                            NaN   124550267          Dear customer how much would you rate us)
5         NaN         NaN  More variety  should be addwd   124550267  Dear Customer,<br>Kindly help us, by providing...

这篇关于使用 json_normalize 将嵌套的 JSON 转换为 Dataframe的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆