Pandas DataFrame from list of lists of dicts


Question

I have a data structure that is a list of lists of dicts:

[
    [{'Height': 86, 'Left': 1385, 'Top': 215, 'Width': 86},
     {'Height': 87, 'Left': 865, 'Top': 266, 'Width': 87},
     {'Height': 103, 'Left': 271, 'Top': 506, 'Width': 103}],
    ...
]

I can convert it to a data frame:

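import pandas as pd

detections[0:1]  # peek at the first inner list
df = pd.DataFrame(detections)  # one row per inner list, one dict per cell
pd.DataFrame(df.apply(pd.Series).stack())  # stacked, but cells still hold dicts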

This yields a data frame in which each cell still holds a dict (the output image was not preserved).

This is almost what I want, but how would I turn the dictionary in each of the cells into a row with columns 'Left', 'Top', 'Width', 'Height'?

Recommended answer

To add to Psidom's answer, the list can also be flattened using itertools.chain.from_iterable.

from itertools import chain
import pandas as pd

# Flatten the list of lists into one list of dicts, then build the frame
pd.DataFrame(list(chain.from_iterable(detections)))
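As a self-contained sketch of the flattening approach, the snippet below builds a small sample `detections` list (the values are taken from the question; splitting them into two groups is an assumption made for illustration) and converts it into a frame with one row per dict.

```python
from itertools import chain

import pandas as pd

# Sample data shaped like the question's input: a list of lists of dicts.
# The grouping into two inner lists is illustrative only.
detections = [
    [{'Height': 86, 'Left': 1385, 'Top': 215, 'Width': 86},
     {'Height': 87, 'Left': 865, 'Top': 266, 'Width': 87}],
    [{'Height': 103, 'Left': 271, 'Top': 506, 'Width': 103}],
]

# chain.from_iterable lazily concatenates the inner lists into one stream of dicts.
flat = list(chain.from_iterable(detections))

# pandas turns a list of dicts into one row per dict, one column per key.
df = pd.DataFrame(flat)
print(df)
```

Note that this drops the original grouping: the resulting index simply runs from 0 to the number of dicts minus one.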

In my experiments this was about twice as fast as a nested list comprehension for a large number of "chunks":

In [1]: %timeit [r for d in detections for r in d]
10000 loops, best of 3: 69.9 µs per loop

In [2]: %timeit list(chain.from_iterable(detections))
10000 loops, best of 3: 34 µs per loop
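To repeat a comparison like this outside IPython, a rough sketch using the standard-library `timeit` module might look as follows. The synthetic data, the helper names `flatten_comprehension` and `flatten_chain`, and the repetition count are all assumptions; the absolute numbers and the ratio will vary with Python version and data shape.

```python
import timeit
from itertools import chain

# Synthetic input (assumed shape): 1000 chunks of 3 dicts each.
detections = [
    [{'Height': h, 'Left': h, 'Top': h, 'Width': h} for h in range(3)]
    for _ in range(1000)
]

def flatten_comprehension(groups):
    # Nested list comprehension, as in the first timing above.
    return [record for group in groups for record in group]

def flatten_chain(groups):
    # itertools.chain.from_iterable, as in the second timing above.
    return list(chain.from_iterable(groups))

# Time each approach over the same number of repetitions.
t_comp = timeit.timeit(lambda: flatten_comprehension(detections), number=200)
t_chain = timeit.timeit(lambda: flatten_chain(detections), number=200)

print(f"comprehension: {t_comp:.4f} s, chain: {t_chain:.4f} s")
```

Both helpers return the same flat list, so the choice between them only affects speed, not the resulting frame.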


If you actually want the index in the final data frame to reflect the original grouping, you can accomplish this with

pd.DataFrame(detections).stack().apply(pd.Series)

       Height  Left  Top  Width
0   0      86  1385  215     86
    1      87   865  266     87
    2     103   271  506    103
1   0      86  1385  215     86
    1      87   865  266     87
    2     103   271  506    103
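An alternative sketch that keeps the same two-level grouping without `apply(pd.Series)` is to record each dict's position while flattening and build a `MultiIndex` from those positions. This is not part of the original answer; the sample data and the level names `group` and `item` are assumptions made for illustration.

```python
import pandas as pd

# Sample data shaped like the question's input (grouping is illustrative).
detections = [
    [{'Height': 86, 'Left': 1385, 'Top': 215, 'Width': 86},
     {'Height': 87, 'Left': 865, 'Top': 266, 'Width': 87}],
    [{'Height': 103, 'Left': 271, 'Top': 506, 'Width': 103}],
]

# Flatten while remembering (outer position, inner position) for every dict.
positions = []
records = []
for i, group in enumerate(detections):
    for j, record in enumerate(group):
        positions.append((i, j))
        records.append(record)

# The two-level index mirrors the original nesting.
index = pd.MultiIndex.from_tuples(positions, names=['group', 'item'])
df = pd.DataFrame(records, index=index)
print(df)
```

A whole original chunk can then be selected with `df.loc[0]`, and a single detection with `df.loc[(1, 0)]`.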

You were close, but you need to apply pd.Series after stacking the indices.
