How to split a column of tuples in a pandas DataFrame?


Question

I have a pandas DataFrame (this is only a small piece of it):

>>> d1
   y norm test  y norm train  len(y_train)  len(y_test)  \
0    64.904368    116.151232          1645          549   
1    70.852681    112.639876          1645          549   

                                    SVR RBF  \
0   (35.652207342877873, 22.95533537448393)   
1  (39.563683797747622, 27.382483096332511)   

                                        LCV  \
0  (19.365430594452338, 13.880062435173587)   
1  (19.099614489458364, 14.018867136617146)   

                                   RIDGE CV  \
0  (4.2907610988480362, 12.416745648065584)   
1    (4.18864306788194, 12.980833914392477)   

                                         RF  \
0   (9.9484841581029428, 16.46902345373697)   
1  (10.139848213735391, 16.282141345406522)   

                                           GB  \
0  (0.012816232716538605, 15.950164822266007)   
1  (0.012814519804493328, 15.305745202851712)   

                                             ET DATA  
0  (0.00034337162272515505, 16.284800366214057)  j2m  
1  (0.00024811554516431878, 15.556506191784194)  j2m  
>>> 

I want to split all the columns that contain tuples. For example, I want to replace the column LCV with the columns LCV-a and LCV-b.

How can I do that?

Edit:

The proposed solution does not work. Why?

>>> d1['LCV'].apply(pd.Series)
                                          0
0  (19.365430594452338, 13.880062435173587)
1  (19.099614489458364, 14.018867136617146)
>>> 

Edit:

This seems to work:

>>> d1['LCV'].apply(eval).apply(pd.Series)
           0          1
0  19.365431  13.880062
1  19.099614  14.018867
>>> 
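The difference between the two edits above suggests that the LCV cells hold strings that merely look like tuples, which would explain why `apply(pd.Series)` alone leaves them in a single column. A minimal sketch of the same step using `ast.literal_eval` in place of `eval` follows; the small `d1` frame here is a hypothetical stand-in, assuming the values really are strings.

```python
import ast
import pandas as pd

# Hypothetical stand-in for the LCV column, assuming the tuples are stored as strings.
d1 = pd.DataFrame({
    "LCV": ["(19.365430594452338, 13.880062435173587)",
            "(19.099614489458364, 14.018867136617146)"],
})

# ast.literal_eval only parses Python literals, so unlike eval it cannot run arbitrary code.
parsed = d1["LCV"].apply(ast.literal_eval)

# Each parsed tuple becomes one row of the new frame, with columns 0 and 1.
lcv = parsed.apply(pd.Series)
print(lcv)
```

If the column already contains real tuple objects, the parsing step is unnecessary and would raise an error, so it is worth checking `type(d1['LCV'].iloc[0])` first.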


Recommended answer

You can do this by applying pd.Series to that column:

In [13]: df = pd.DataFrame({'a':[1,2], 'b':[(1,2), (3,4)]})

In [14]: df
Out[14]:
   a       b
0  1  (1, 2)
1  2  (3, 4)

In [16]: df['b'].apply(pd.Series)
Out[16]:
   0  1
0  1  2
1  3  4

In [17]: df[['b1', 'b2']] = df['b'].apply(pd.Series)

In [18]: df
Out[18]:
   a       b  b1  b2
0  1  (1, 2)   1   2
1  2  (3, 4)   3   4

This works because each tuple is turned into a Series, which is then treated as one row of the resulting DataFrame.
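The question asks for every tuple column to be replaced, with names such as LCV-a and LCV-b. One possible way to extend the answer is sketched below. The helper `split_tuple_columns` and the sample values are hypothetical, and the sketch assumes each tuple column holds real 2-tuples (not strings). It builds the new columns with `pd.DataFrame(column.tolist())` rather than `apply(pd.Series)`, a swapped-in variant that avoids creating one Series per row.

```python
import pandas as pd

# Hypothetical sample data: one plain column and two columns of 2-tuples.
df = pd.DataFrame({
    "a": [1, 2],
    "LCV": [(19.4, 13.9), (19.1, 14.0)],
    "RF": [(9.9, 16.5), (10.1, 16.3)],
})

def split_tuple_columns(frame, suffixes=("-a", "-b")):
    """Hypothetical helper: replace each column of 2-tuples with two columns."""
    out = frame.copy()
    for col in frame.columns:
        # Treat a column as a tuple column only if every value is a tuple.
        if not frame[col].map(lambda v: isinstance(v, tuple)).all():
            continue
        # Build a frame with one column per tuple position (0 and 1).
        parts = pd.DataFrame(frame[col].tolist(), index=frame.index)
        for i, suffix in enumerate(suffixes):
            out[col + suffix] = parts[i]
        # Drop the original tuple column once its parts are in place.
        out = out.drop(columns=col)
    return out

result = split_tuple_columns(df)
print(result)
```

The new columns are appended after the untouched ones, so `result` has the columns a, LCV-a, LCV-b, RF-a, RF-b in that order.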
