Store complex dictionary in pandas dataframe

Views: 103  Posted: 2017/3/26 3:48:07  Tags: python, json, pandas, dictionary, dataframe


Problem description

This question follows on from my previous one, "store dictionary in pandas dataframe". The dictionary here is a parent dictionary that nests the kind of dictionary from that question, one per city.

I have a dictionary:

dictionary_example = {
    'New York': {
        1234: {'choice': 0, 'city': 'New York', 'choice_set': {0: {'A': 100, 'B': 200, 'C': 300}, 1: {'A': 200, 'B': 300, 'C': 300}, 2: {'A': 500, 'B': 300, 'C': 300}}},
        234: {'choice': 1, 'city': 'New York', 'choice_set': {0: {'A': 100, 'B': 400}, 1: {'A': 100, 'B': 300, 'C': 1000}}},
        1876: {'choice': 2, 'city': 'New York', 'choice_set': {0: {'A': 100, 'B': 400, 'C': 300}, 1: {'A': 100, 'B': 300, 'C': 1000}, 2: {'A': 600, 'B': 200, 'C': 100}}},
    },
    'London': {
        1534: {'choice': 0, 'city': 'London', 'choice_set': {0: {'A': 100, 'B': 400, 'C': 300}, 1: {'A': 200, 'B': 300, 'C': 300}, 2: {'A': 500, 'B': 300, 'C': 300}}},
        2134: {'choice': 1, 'city': 'London', 'choice_set': {0: {'A': 100, 'B': 600}, 1: {'A': 170, 'B': 300, 'C': 1000}}},
        1776: {'choice': 2, 'city': 'London', 'choice_set': {0: {'A': 100, 'B': 400, 'C': 500}, 1: {'A': 100, 'B': 300}, 2: {'A': 600, 'B': 200, 'C': 100}}},
    },
    'Paris': {
        1534: {'choice': 0, 'city': 'Paris', 'choice_set': {0: {'A': 100, 'B': 400, 'C': 300}, 1: {'A': 200, 'B': 300, 'C': 300}, 2: {'A': 500, 'B': 300, 'C': 300}}},
        2134: {'choice': 1, 'city': 'Paris', 'choice_set': {0: {'A': 100, 'B': 600}, 1: {'A': 170, 'B': 300, 'C': 1000}}},
        1776: {'choice': 1, 'city': 'Paris', 'choice_set': {0: {'A': 100, 'B': 400, 'C': 500}, 1: {'A': 100, 'B': 300}}},
    },
}

I want to turn it into a pandas DataFrame like this (some of the specific values below may not be exactly accurate):

id    choice  A_0  B_0  C_0  A_1  B_1  C_1   A_2  B_2  C_2  New York  London  Paris
1234  0       100  200  300  200  300  300   500  300  300  1         0       0
234   1       100  400  -    100  300  1000  -    -    -    1         0       0
1876  2       100  400  300  100  300  1000  600  200  100  1         0       0
1534  0       100  200  300  200  300  300   500  300  300  0         1       0
2134  1       100  400  -    100  300  1000  -    -    -    0         1       0
2006  2       100  400  300  100  300  1000  600  200  100  0         1       0
1264  0       100  200  300  200  300  300   500  300  300  0         0       1
1454  1       100  400  -    100  300  1000  -    -    -    0         0       1
1776  1       100  400  300  100  300  -     -    -    -    0         0       1

In the old question, a helpful answerer provided a way to handle the sub-dictionary:

import json

import pandas as pd

# One row per id; written for the sub-dictionary from the old question
df = pd.read_json(json.dumps(dictionary_example)).T


def to_s(r):
    # Flatten one choice_set dictionary into a single Series
    return pd.read_json(json.dumps(r)).unstack()

flattened_choice_set = df["choice_set"].apply(to_s)

flattened_choice_set.columns = ['_'.join((str(col[0]), col[1])) for col in flattened_choice_set.columns]

result = pd.merge(df, flattened_choice_set,
                  left_index=True, right_index=True).drop("choice_set", axis=1)

Is there a way to do the same for the larger, nested dictionary?

All the best, Kevin

Solution

The previously provided solution that you quote is not a very neat one. This one is more readable and solves your current problem. If possible, though, you should reconsider your data structure...
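The answer does not say what a better data structure would look like. One possible reading, shown here purely as an illustration, is a "long" layout with one record per (city, id, option). The `option` and `chosen` column names and the three sample records are assumptions made for this sketch; the values are taken from the New York entries of the question.

```python
import pandas as pd

# Hypothetical long layout: one record per (city, id, option).
records = [
    {'city': 'New York', 'id': 1234, 'choice': 0, 'option': 0, 'A': 100, 'B': 200, 'C': 300},
    {'city': 'New York', 'id': 1234, 'choice': 0, 'option': 1, 'A': 200, 'B': 300, 'C': 300},
    {'city': 'New York', 'id': 234, 'choice': 1, 'option': 0, 'A': 100, 'B': 400},
]
long_df = pd.DataFrame(records)

# Mark the rows whose option is the one that was actually chosen.
long_df['chosen'] = long_df['option'] == long_df['choice']
print(long_df)
```

With this shape, a missing attribute (such as C for id 234, option 0) simply becomes NaN, and no nested dictionaries have to be unpacked later.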

import pandas as pd

df = pd.DataFrame()
question_ids = [0, 1, 2]  # keys used inside each choice_set

Create a DataFrame with one row for every city/id combination, keeping the original dictionary in the choice_set column and adding one column per option (0, 1, 2) that holds that option's dictionary:

for _, city_value in dictionary_example.items():  # .iteritems() exists only in Python 2
    city_df = pd.DataFrame.from_dict(city_value).T
    city_df = city_df.join(pd.DataFrame(city_df["choice_set"].to_dict()).T)
    df = pd.concat([df, city_df])  # DataFrame.append was removed in pandas 2.0

Expand each option column into attribute columns with names such as A_0, B_0, C_0, and join them to your df:

for i in question_ids:
    choice_df = pd.DataFrame(df[i].to_dict()).T
    # A list is used instead of map(), which returns an iterator in Python 3
    choice_df.columns = ["{}_{}".format(x, i) for x in choice_df.columns]
    df = df.join(choice_df)
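The expansion step above can be tried in isolation. This is a minimal sketch, not the answer's own code: it builds a small Series of dictionaries by hand (the name `option_0` and its two entries are assumptions for illustration, with values taken from the New York data) and uses `add_suffix` in place of the explicit renaming loop.

```python
import pandas as pd

# Hypothetical stand-in for df[0]: one dictionary per id for option 0.
option_0 = pd.Series(
    [{'A': 100, 'B': 200, 'C': 300}, {'A': 100, 'B': 400}],
    index=[1234, 234],
)

# Each dictionary becomes one row; keys become columns, then get the option suffix.
expanded = pd.DataFrame(option_0.tolist(), index=option_0.index).add_suffix('_0')
print(expanded)
```

Attributes missing from a dictionary (C for id 234 here) come out as NaN, which matches the "-" cells in the desired table.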

Turn the city column into one indicator column per city and drop the intermediate columns:

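# dtype=int gives 1/0 indicator columns (recent pandas versions default to True/False)
df = pd.get_dummies(df, prefix="", prefix_sep="", columns=['city'], dtype=int)
df.drop(question_ids + ['choice_set'], axis=1, inplace=True)
# Optional: replace the NaN values of missing options:
# df = df.fillna(0)
df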
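For comparison, the same result can be approached by flattening each record in plain Python before pandas sees it. This is a sketch under stated assumptions, not the answerer's method: `sample` is a shortened stand-in with the same shape as `dictionary_example`, and `flatten_record` is a hypothetical helper introduced here.

```python
import pandas as pd

# Shortened stand-in with the same nesting as dictionary_example.
sample = {
    'New York': {
        1234: {'choice': 0, 'city': 'New York',
               'choice_set': {0: {'A': 100, 'B': 200, 'C': 300},
                              1: {'A': 200, 'B': 300, 'C': 300}}},
        234: {'choice': 1, 'city': 'New York',
              'choice_set': {0: {'A': 100, 'B': 400}}},
    },
    'London': {
        1234: {'choice': 0, 'city': 'London',
               'choice_set': {0: {'A': 100, 'B': 600}}},
    },
}


def flatten_record(city, record_id, record):
    """Turn one nested record into a flat dict with keys such as 'A_0'."""
    row = {'city': city, 'id': record_id, 'choice': record['choice']}
    for option, attributes in record['choice_set'].items():
        for name, value in attributes.items():
            row['{}_{}'.format(name, option)] = value
    return row


rows = [
    flatten_record(city, record_id, record)
    for city, records in sample.items()
    for record_id, record in records.items()
]
flat = pd.DataFrame(rows)

# One indicator column per city; dtype=int gives 1/0 rather than True/False.
flat = pd.get_dummies(flat, prefix='', prefix_sep='', columns=['city'], dtype=int)
print(flat)
```

Because every record becomes its own row, an id that appears under more than one city stays separate. That matters for the question's data, where the ids 1534, 2134 and 1776 occur under both London and Paris, so an index built from the id alone contains duplicates.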

