Build dict from list of tuples combining two multi-index DataFrames and column index

Problem description

I have two multi-index dataframes: mean and std

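import numpy as np
import pandas as pd

# Two index levels: 'id' (outer) and 'comp' (inner)
arrays = [['A', 'A', 'B', 'B'], ['Z', 'Y', 'X', 'W']]

# Mean values per time point; NaN marks a missing measurement
mean = pd.DataFrame(data={0.0: [np.nan, 2.0, 3.0, 4.0], 60.0: [5.0, np.nan, 7.0, 8.0], 120.0: [9.0, 10.0, np.nan, 12.0]},
                    index=pd.MultiIndex.from_arrays(arrays, names=('id', 'comp')))
mean.columns.name = 'Times'

# Standard deviations with the same index and columns as mean
std = pd.DataFrame(data={0.0: [10.0, 10.0, 10.0, 10.0], 60.0: [10.0, 10.0, 10.0, 10.0], 120.0: [10.0, 10.0, 10.0, 10.0]},
                   index=pd.MultiIndex.from_arrays(arrays, names=('id', 'comp')))
std.columns.name = 'Times'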

My task is to combine them into a nested dictionary: the first level is keyed by id, the second level by comp, and each comp maps to a list of tuples of the form (time point, mean, std). The result should look like this:

{'A': {
     'Z': [(60.0,5.0,10.0),
            (120.0,9.0,10.0)],
      'Y': [(0.0,2.0,10.0),
            (120.0,10.0,10.0)]
       },
  'B': {
     'X': [(0.0,3.0,10.0),
            (60.0,7.0,10.0)],
      'W': [(0.0,4.0,10.0),
            (60.0,8.0,10.0),
            (120.0,12.0,10.0)]
       }
 }

Additionally, when there is a NaN in the data, the triplet is left out. Here that drops A,Z at time 0, A,Y at time 60, and B,X at time 120.

How do I get there? I have already constructed a dict of dicts of lists of tuples for a single row:

iter=0
{mean.index[iter][0]:{mean.index[iter][1]:list(zip(mean.columns, mean.iloc[iter], std.iloc[iter]))}}
>{'A': {'Z': [(0.0, nan, 10.0), (60.0, 5.0, 10.0), (120.0, 9.0, 10.0)]}}

Now I need to extend this to the full dictionary by looping over each row (the inner dict) and grouping the rows under their ids (the outer dict). I started with iterrows() and a dict comprehension, but I have problems indexing with the iter value ('A', 'Z') that iterrows() returns, and with building the whole dict iteratively.

{mean.index[iter[1]]:list(zip(mean.columns, mean.loc[iter[1]], std.loc[iter[1]])) for (iter,row) in mean.iterrows()}

This raises an error, and even if it worked it would only give me the inner level:

KeyError: 'the label [Z] is not in the [index]'
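
The error most likely comes from what iterrows() returns for a MultiIndex: the first element of each pair is the full index tuple, such as ('A', 'Z'), so iter[1] is just 'Z', and mean.loc['Z'] then looks for 'Z' in the first level ('id'). A minimal sketch of the difference, rebuilding the frames from the question (the names key, row, lookup_failed and flat are illustrative, not from the original):

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question.
arrays = [['A', 'A', 'B', 'B'], ['Z', 'Y', 'X', 'W']]
index = pd.MultiIndex.from_arrays(arrays, names=('id', 'comp'))
mean = pd.DataFrame({0.0: [np.nan, 2.0, 3.0, 4.0],
                     60.0: [5.0, np.nan, 7.0, 8.0],
                     120.0: [9.0, 10.0, np.nan, 12.0]}, index=index)
std = pd.DataFrame({0.0: [10.0] * 4, 60.0: [10.0] * 4, 120.0: [10.0] * 4},
                   index=index)

# iterrows() yields (index label, row); with a MultiIndex the label is a tuple.
key, row = next(mean.iterrows())
print(key)

# Looking up only the second element searches the first level and fails.
try:
    mean.loc[key[1]]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Indexing with the whole tuple selects the matching row in both frames.
flat = {key: list(zip(mean.columns, row, std.loc[key]))
        for key, row in mean.iterrows()}
```

This gives a flat dict keyed by (id, comp) tuples and still keeps the NaN triplets; nesting by id and dropping NaN are separate steps.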

Thanks!

EDIT: I changed the numbers in this example to floats, because the earlier version generated integers, which was not consistent with my real data and would fail in the subsequent JSON dump.
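
The JSON failure mentioned in the edit is presumably the standard json module rejecting NumPy integer scalars. A small sketch of that behaviour and one possible workaround, converting each value to a built-in float (the triplet below is a made-up example, not data from the question):

```python
import json
import numpy as np

# Hypothetical triplet whose time point is a NumPy integer.
triplet = (np.int64(60), 5.0, 10.0)

# The standard encoder does not know how to serialize np.int64.
try:
    json.dumps(triplet)
    int_failed = False
except TypeError:
    int_failed = True

# Converting every element to a built-in float makes the triplet serializable.
as_floats = tuple(float(x) for x in triplet)
dumped = json.dumps(as_floats)
```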

Recommended answer

I found a compact way to build this nested dict with a nested dict comprehension:

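# Map each (id, comp) index tuple to its row; only the keys are used below.
mean_dict_items = mean.to_dict(orient='index').items()
# Outer level: one entry per id (k[0]). Inner level: every comp (u[1]) whose index tuple has that id.
{k[0]: {u[1]: list(zip(mean.columns, mean.loc[u], std.loc[u]))
        for u, v in mean_dict_items if (k[0], u[1]) == u}
 for k, l in mean_dict_items}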

This produces:

{'A': {'Y': [(0.0, 2.0, 10.0), (60.0, nan, 10.0), (120.0, 10.0, 10.0)],
  'Z': [(0.0, nan, 10.0), (60.0, 5.0, 10.0), (120.0, 9.0, 10.0)]},
 'B': {'W': [(0.0, 4.0, 10.0), (60.0, 8.0, 10.0), (120.0, 12.0, 10.0)],
  'X': [(0.0, 3.0, 10.0), (60.0, 7.0, 10.0), (120.0, nan, 10.0)]}}
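
The output above still contains the NaN triplets that the question asks to leave out. One possible way to drop them is sketched here, assuming mean and std share the same index and columns. This is a plain loop rather than the comprehension from the answer, and the names nested, id_ and comp are illustrative:

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question.
arrays = [['A', 'A', 'B', 'B'], ['Z', 'Y', 'X', 'W']]
index = pd.MultiIndex.from_arrays(arrays, names=('id', 'comp'))
mean = pd.DataFrame({0.0: [np.nan, 2.0, 3.0, 4.0],
                     60.0: [5.0, np.nan, 7.0, 8.0],
                     120.0: [9.0, 10.0, np.nan, 12.0]}, index=index)
std = pd.DataFrame({0.0: [10.0] * 4, 60.0: [10.0] * 4, 120.0: [10.0] * 4},
                   index=index)

nested = {}
for (id_, comp), row in mean.iterrows():
    # Pair each time point with its mean and std, skipping missing values.
    triplets = [
        (float(t), float(m), float(s))
        for t, m, s in zip(mean.columns, row, std.loc[(id_, comp)])
        if pd.notna(m) and pd.notna(s)
    ]
    # Create the inner dict for this id on first sight, then add the comp.
    nested.setdefault(id_, {})[comp] = triplets
```

Each row is visited once, and the float() calls keep the values as built-in floats, which matters if the result is later passed to json.dumps.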
