Convert pandas series of lists to dataframe


Question

I have a series made of lists:

import pandas as pd
s = pd.Series([[1, 2, 3], [4, 5, 6]])

and I want a DataFrame in which each list becomes a column.

None of from_items, from_records, DataFrame, or Series.to_frame seems to work.

How can I do this?

Recommended answer

As @Hatshepsut pointed out in the comments, from_items is deprecated as of pandas 0.23 (and was removed in pandas 1.0). The link suggests using from_dict instead, so the old answer can be modified to:

pd.DataFrame.from_dict(dict(zip(s.index, s.values)))
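
For reference, here is a minimal sketch of the from_dict approach applied to the series from the question. The orient="index" variant is an addition made here for illustration, not part of the original answer; it is assumed to be a convenient way to get one row per list without transposing afterwards.

```python
import pandas as pd

s = pd.Series([[1, 2, 3], [4, 5, 6]])

# One column per list: the dict keys (the series index) become column labels.
df_cols = pd.DataFrame.from_dict(dict(zip(s.index, s.values)))

# One row per list: orient="index" turns the dict keys into row labels.
df_rows = pd.DataFrame.from_dict(dict(zip(s.index, s.values)), orient="index")

print(df_cols)
print(df_rows)
```

Both calls assume the lists have the same length, as in the question.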

--------------------------------------------------OLD ANSWER-------------------------------------------------------------

You can use from_items like this (assuming that your lists are all the same length):

pd.DataFrame.from_items(zip(s.index, s.values))

   0  1
0  1  4
1  2  5
2  3  6

pd.DataFrame.from_items(zip(s.index, s.values)).T

   0  1  2
0  1  2  3
1  4  5  6

depending on your desired output.

This can be much faster than using apply (as in @Wen's answer, which, however, also works for lists of different lengths). In the timings below it is roughly 1.4 to 2 times faster:

%timeit pd.DataFrame.from_items(zip(s.index, s.values))
1000 loops, best of 3: 669 µs per loop

%timeit s.apply(lambda x:pd.Series(x)).T
1000 loops, best of 3: 1.37 ms per loop

%timeit pd.DataFrame.from_items(zip(s.index, s.values)).T
1000 loops, best of 3: 919 µs per loop

%timeit s.apply(lambda x:pd.Series(x))
1000 loops, best of 3: 1.26 ms per loop

@Hatshepsut's answer is also quite fast (and also works for lists of different lengths):

%timeit pd.DataFrame(item for item in s)
1000 loops, best of 3: 636 µs per loop

%timeit pd.DataFrame(item for item in s).T
1000 loops, best of 3: 884 µs per loop
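
To illustrate the "lists of different lengths" case, here is a small sketch using the plain DataFrame constructor. The series named ragged is a hypothetical example, and the sketch passes ragged.tolist() rather than the generator expression timed above, which is assumed to be equivalent for this purpose.

```python
import pandas as pd

# Hypothetical series whose lists differ in length.
ragged = pd.Series([[1, 2, 3], [4, 5]])

# One row per list; the shorter list is padded with NaN on the right.
df_rows = pd.DataFrame(ragged.tolist(), index=ragged.index)

# Transpose to get one column per list instead.
df_cols = df_rows.T

print(df_rows)
print(df_cols)
```

Passing index=ragged.index keeps the original series labels as row labels, which matters when the series does not have a default integer index.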

The fastest solution seems to be @Abdou's answer (tested on Python 2; it also works for lists of different lengths; on Python 3, use itertools.zip_longest instead of izip_longest):

from itertools import izip_longest  # Python 2 only; Python 3 provides zip_longest

%timeit pd.DataFrame.from_records(izip_longest(*s.values))
1000 loops, best of 3: 529 µs per loop
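
A Python 3 version of the same idea might look like the sketch below. The series named ragged is a hypothetical example with lists of different lengths, and wrapping the iterator in list() is a choice made here for clarity, not something the original answer does.

```python
from itertools import zip_longest

import pandas as pd

# Hypothetical series whose lists differ in length.
ragged = pd.Series([[1, 2, 3], [4, 5]])

# zip_longest pairs up the i-th elements of every list and fills the
# missing slots of shorter lists with None, so each list becomes a column.
records = list(zip_longest(*ragged.values))
df = pd.DataFrame.from_records(records)

print(df)
```

The result has one column per list; append .T if one row per list is wanted instead.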

Another option:

pd.DataFrame(dict(zip(s.index, s.values)))

   0  1
0  1  4
1  2  5
2  3  6
