Apply pandas function to column to create multiple new columns?

Question

How to do this in pandas:

I have a function extract_text_features on a single text column, returning multiple output columns. Specifically, the function returns 6 values.

The function works, but there doesn't seem to be any return type (pandas DataFrame, numpy array, or Python list) that lets the output be assigned correctly with df.ix[:, 10:16] = df.textcol.map(extract_text_features)

So I think I need to drop back to iterating with df.iterrows(), as per this?

UPDATE: Iterating with df.iterrows() is at least 20x slower, so I surrendered and split out the function into six distinct .map(lambda ...) calls.

UPDATE 2: This question was asked back around v0.11.0, so much of the question and its answers are no longer very relevant.

Recommended answer

Building off of user1827356's answer, you can do the assignment in one pass using df.merge:

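# The lambda returns a pd.Series per row, so apply yields a DataFrame that merge joins on the index.
df.merge(df.textcol.apply(lambda s: pd.Series({'feature1': s + 1, 'feature2': s - 1})),
         left_index=True, right_index=True)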

    textcol  feature1  feature2
0  0.772692  1.772692 -0.227308
1  0.857210  1.857210 -0.142790
2  0.065639  1.065639 -0.934361
3  0.819160  1.819160 -0.180840
4  0.088212  1.088212 -0.911788
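
For reference, a self-contained version of the snippet above might look like the following sketch. The sample values in textcol and the helper name extract_features are assumptions made for illustration; they do not come from the original answer.

```python
import pandas as pd

# Hypothetical sample data; the original answer's column held random floats.
df = pd.DataFrame({"textcol": [0.5, 1.5, 2.5]})


def extract_features(s):
    # Hypothetical stand-in for extract_text_features: returns one Series
    # per row, whose index labels become the new column names.
    return pd.Series({"feature1": s + 1, "feature2": s - 1})


# Series.apply returns a DataFrame when the function returns a Series.
features = df["textcol"].apply(extract_features)

# Join the new columns back onto the original frame by index.
result = df.merge(features, left_index=True, right_index=True)
print(result)
```

Note that merge returns a new DataFrame rather than modifying df in place, so the result has to be assigned to a name to be kept.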

Please be aware that this approach has huge memory consumption and low speed; see https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/
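
One commonly suggested way to reduce the overhead of constructing a pd.Series for every row is to have the function return a plain tuple and build all new columns in a single DataFrame construction. The sketch below illustrates that idea; it is not the original answerer's method. The body of extract_text_features and the column names n_chars and n_words are hypothetical, and only two features are shown instead of the six mentioned in the question.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"textcol": ["foo bar", "hello", "a b c"]})


def extract_text_features(text):
    # Hypothetical features: character count and word count, as a tuple.
    return len(text), len(text.split())


# map yields one tuple per row; tolist() gives a plain list of tuples.
rows = df["textcol"].map(extract_text_features).tolist()

# Build all new columns at once, reusing the original index for alignment.
features = pd.DataFrame(rows, columns=["n_chars", "n_words"], index=df.index)
result = df.join(features)
print(result)
```

Passing index=df.index keeps the new columns aligned with the original rows even when df does not have a default 0..n-1 index.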
