How do I turn a dataframe into a series of lists?
Question
I have had to do this several times and I'm always frustrated. I have a dataframe:
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], ['a', 'b'], ['A', 'B', 'C', 'D'])
print(df)
   A  B  C  D
a  1  2  3  4
b  5  6  7  8
I want to turn df into:
pd.Series([[1, 2, 3, 4], [5, 6, 7, 8]], ['a', 'b'])
a [1, 2, 3, 4]
b [5, 6, 7, 8]
dtype: object
I've tried
df.apply(list, axis=1)
which just gives me back the same df.
What is a convenient/effective way to do this?
Solution
If you need a faster solution, you can first convert the DataFrame to a NumPy array with values, then convert that array to a list with tolist, and finally create a new Series with the index from df:
print(pd.Series(df.values.tolist(), index=df.index))
a [1, 2, 3, 4]
b [5, 6, 7, 8]
dtype: object
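The steps above can be wrapped in a small helper. This is only a minimal sketch: the function name rows_to_list_series is hypothetical and not part of pandas, and the example frame is the one from the question. Note that values returns a single NumPy array, so a frame with mixed column dtypes is upcast to a common dtype before the rows become lists.

```python
import pandas as pd


def rows_to_list_series(frame):
    """Return a Series whose values are the rows of `frame` as Python lists.

    Hypothetical helper name, used here for illustration only.
    """
    # frame.values is a 2-D NumPy array; tolist() yields one list per row.
    return pd.Series(frame.values.tolist(), index=frame.index)


# Example frame from the question.
df = pd.DataFrame([[1, 2, 3, 4], [5, 6, 7, 8]], ['a', 'b'], ['A', 'B', 'C', 'D'])
result = rows_to_list_series(df)
print(result)
```

Each element of result is an ordinary Python list, so result['a'] can be indexed or appended to like any other list.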
Timings with a small DataFrame:
In [76]: %timeit (pd.Series(df.values.tolist(), index=df.index))
1000 loops, best of 3: 295 µs per loop
In [77]: %timeit pd.Series(df.T.to_dict('list'))
1000 loops, best of 3: 685 µs per loop
In [78]: %timeit df.T.apply(tuple).apply(list)
1000 loops, best of 3: 958 µs per loop
and with a large one:
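import numpy as np
import pandas as pd
from string import ascii_letters

# 52 upper- and lower-case letters give 52 ** 2 = 2704 rows and 52 columns
letters = list(ascii_letters)
df = pd.DataFrame(np.random.choice(range(10), (52 ** 2, 52)),
                  pd.MultiIndex.from_product([letters, letters]),
                  letters)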
In [71]: %timeit (pd.Series(df.values.tolist(), index=df.index))
100 loops, best of 3: 2.06 ms per loop
In [72]: %timeit pd.Series(df.T.to_dict('list'))
1 loop, best of 3: 203 ms per loop
In [73]: %timeit df.T.apply(tuple).apply(list)
1 loop, best of 3: 506 ms per loop
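Outside IPython, a comparison like the one above can be approximated with the standard timeit module. The following is a rough sketch under assumptions, not the original benchmark: the frame size, the function names via_values and via_to_dict, and the repeat count are all chosen for illustration, and absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Illustrative frame; the shape is an assumption, not the one timed above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(1000, 20)))


def via_values(frame):
    # NumPy array -> list of row lists -> Series
    return pd.Series(frame.values.tolist(), index=frame.index)


def via_to_dict(frame):
    # Transpose so rows become columns, then build a dict of lists
    return pd.Series(frame.T.to_dict('list'))


# Both approaches should produce the same rows before comparing speed.
same = via_values(df).tolist() == via_to_dict(df).tolist()

runs = 5
t_values = timeit.timeit(lambda: via_values(df), number=runs)
t_to_dict = timeit.timeit(lambda: via_to_dict(df), number=runs)
print(f"values.tolist(): {t_values / runs:.6f} s per call")
print(f"T.to_dict('list'): {t_to_dict / runs:.6f} s per call")
```

Checking that the two results match before timing them guards against comparing approaches that silently return different data.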