将Pandas DataFrame转换为JSON [英] Converting Pandas DataFrame to JSON

查看:282
本文介绍了将Pandas DataFrame转换为JSON的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已将数据存储在pandas数据框中,我想将tat转换为JSON格式.可以使用以下代码复制示例数据

I've data stored in pandas dataframe and I want to convert tat into a JSON format. Example data can be replicated using following code

data = {'Product':['A', 'B', 'A'],
        'Zone':['E/A', 'A/N', 'E/A'],
        'start':['08:00:00', '09:00:00', '12:00:00'],
        'end':['12:30:00', '17:00:00', '17:40:00'],
        'seq':['0, 1, 2 ,3 ,4','0, 1, 2 ,3 ,4', '0, 1, 2 ,3 ,4'],
        'store':['Z',"'AS', 'S'", 'Z']
        }

df = pd.DataFrame(data)

我尝试使用以下代码将其转换为JSON格式

I've tried converting it into JSON format using following code

df_parsed = json.loads(df.to_json(orient="records"))

从上方生成的输出

[{'Product': 'A', 'Zone': 'E/A', 'start': '08:00:00', 'end': '17:40:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'Z'}, {'Product': 'B', 'Zone': 'A/N', 'start': '09:00:00', 'end': '17:00:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'AS'}, {'Product': 'A', 'Zone': 'E/A', 'start': '08:00:00', 'end': '17:40:00', 'seq': '0, 1, 2 ,3 ,4', 'store': 'Z'}]

所需结果:

{
'A': {'Zone': 'E/A', 
'tp': [{'start': [8, 0], 'end': [12, 0], 'seq': [0, 1, 2 ,3 ,4]},
      {'start': [12, 30], 'end': [17, 40], 'seq': [0, 1, 2 ,3 ,4]}],
      
'store': ['Z']
}, 
'B': {'Zone': 'A/N', 
'tp': [{'start': [9, 0], 'end': [17, 0], 'seq': [0, 1, 2 ,3 ,4]}],
      
'store': ['AS', 'S']
}
}

如果产品属于同一商店,则应将列startendseq的结果汇总,如所需输出所示.如果时间值是"09:00:00",只需表示小时和分钟,则开始时间和结束时间也应表示为[9,0],这样我们就可以从time列中丢弃秒值.

If a product belongs to same store the result for column start, end and seq should be clubbed as shown in desired output. Also start time and end time should be represented like [9,0] if value for time is "09:00:00" only hour and minute needs to be represented so we can discard value of seconds from time columns.

推荐答案

这会有点复杂.因此,您必须逐步进行操作:

This will be complicated a bit. So you have to do it step by step:

def funct(row):
    row['start'] = row['start'].str.split(':').str[0:2]
    row['end'] = row['end'].str.split(':').str[0:2]
    row['store'] = row['store'].str.replace("'", "").str.split(', ')

    d = (row.groupby('Zone')[row.columns[1:]]
        .apply(lambda x: x.to_dict(orient='record'))
        .reset_index(name='tp').to_dict(orient='row'))
    return d

di = df.groupby(['Product'])[df.columns[1:]].apply(funct).to_dict()


di:

{'A': [{'Zone': 'E/A',
   'tp': [{'start': ['08', '00'],
     'end': ['12', '30'],
     'seq': '0, 1, 2 ,3 ,4',
     'store': ['Z']},
    {'start': ['12', '00'],
     'end': ['17', '40'],
     'seq': '0, 1, 2 ,3 ,4',
     'store': ['Z']}]}],
 'B': [{'Zone': 'A/N',
   'tp': [{'start': ['09', '00'],
     'end': ['17', '00'],
     'seq': '0, 1, 2 ,3 ,4',
     'store': ['AS', 'S']}]}]}

说明:

  • 首先创建自己的自定义函数.
  • startend列更改为列表形式.
  • Zone分组,然后将to_dict应用于其余的列.
  • 重置索引并命名具有[{'start': ['08', '00'], 'end': ['12', '30'], 'seq': '0, 1, 2 ,3 ,4',的列 为tp.
  • 现在将to_dict应用于整个结果并将其返回.
  • Explanation:

    • 1st create your own custom function.
    • change the start, end column to a list form.
    • group by Zone and apply to_dict to rest of the columns.
    • reset index and name the column that are having [{'start': ['08', '00'], 'end': ['12', '30'], 'seq': '0, 1, 2 ,3 ,4', as tp.
    • now apply to_dict to the whole result and return it.
    • 最终,一旦您能够将数据框转换为以下格式,剩下的事情对您来说将变得很容易.

      Ultimately you need to convert your dataframe into this below format once you are able to do it the rest of the thing will become easy for you.

      Zone    tp
      E/A    [{'start': ['08', '00'], 'end': ['12', '30'], ...
      A/N    [{'start': ['09', '00'], 'end': ['17', '00'], ... 
      

      import pandas as pd
      import ast
      
      def funct(row):
          y = row['start'].str.split(':').str[0:-1]
          row['start'] = row['start'].str.split(':').str[0:2].apply(lambda x: list(map(int, x)))
          row['end'] = row['end'].str.split(':').str[0:2].apply(lambda x: list(map(int, x)))
          row['seq'] = row['seq'].apply(lambda x: list(map(int, ast.literal_eval(x))))
          row['store'] = row['store'].str.replace("'", "")
      
          d = (row.groupby('Zone')[row.columns[1:-1]]
              .apply(lambda x: x.to_dict(orient='record'))
              .reset_index(name='tp'))
          ######### For store create a different dataframe and then merge it to the other df ########
          d1 = (row.groupby('Zone').agg({'store': pd.Series.unique}))
          d1['store'] = d1['store'].str.split(",")
          d_merged = (pd.merge(d,d1, on='Zone', how='left')).to_dict(orient='record')[0]
          return d_merged
      
      di = df.groupby(['Product'])[df.columns[1:]].apply(funct).to_dict()
      


      di:

      {'A': {'Zone': 'E/A',
        'tp': [{'start': [8, 0], 'end': [12, 30], 'seq': [0, 1, 2, 3, 4]},
         {'start': [12, 0], 'end': [17, 40], 'seq': [0, 1, 2, 3, 4]}],
        'store': ['Z']},
       'B': {'Zone': 'A/N',
        'tp': [{'start': [9, 0], 'end': [17, 0], 'seq': [0, 1, 2, 3, 4]}],
        'store': ['AS', ' S']}}
       
      

      这篇关于将Pandas DataFrame转换为JSON的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆