Apply pandas function to column to create multiple new columns?
How can I do this in pandas?
I have a function, extract_text_features, that operates on a single text column and returns multiple output columns. Specifically, the function returns 6 values.
The function works, but there doesn't seem to be any proper return type (pandas DataFrame / numpy array / Python list) that lets the output be assigned correctly with df.ix[:, 10:16] = df.textcol.map(extract_text_features)
So I think I need to drop back to iterating with df.iterrows(), as per this?
UPDATE: Iterating with df.iterrows() is at least 20x slower, so I surrendered and split out the function into six distinct .map(lambda ...) calls.
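For illustration, the six-call workaround might look roughly like the sketch below. The original extract_text_features is not shown in the question, so the six features, their names, and the sample data here are all hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical sample data; the real text column is not shown in the question.
df = pd.DataFrame({"textcol": ["Hello world.", "pandas 0.11 was old", "A B C"]})

# Hypothetical features standing in for the six values returned by
# extract_text_features. Each one is computed by its own function.
feature_funcs = {
    "n_chars": len,
    "n_words": lambda s: len(s.split()),
    "n_spaces": lambda s: s.count(" "),
    "n_digits": lambda s: sum(c.isdigit() for c in s),
    "n_upper": lambda s: sum(c.isupper() for c in s),
    "n_periods": lambda s: s.count("."),
}

# One .map call per output column, each producing a single new column.
for name, func in feature_funcs.items():
    df[name] = df["textcol"].map(func)

print(df)
```

Each call walks the column once, so this makes six passes over the text, but every pass returns a plain scalar per row, which pandas can assign to a column directly.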
UPDATE 2: This question was asked back around v0.11.0, so much of the question and many of the answers are no longer very relevant.
Building off of user1827356's answer, you can do the assignment in one pass using df.merge:
df.merge(df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1})),
left_index=True, right_index=True)
textcol feature1 feature2
0 0.772692 1.772692 -0.227308
1 0.857210 1.857210 -0.142790
2 0.065639 1.065639 -0.934361
3 0.819160 1.819160 -0.180840
4 0.088212 1.088212 -0.911788
EDIT: Be aware that this approach consumes a lot of memory and is slow; see https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/ for details.
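One way to sidestep that cost, sketched here as an assumption rather than as part of the original answer, is to have the function return a plain tuple and build a single DataFrame from all the rows at once, instead of constructing one pd.Series per row. The helper name two_features is hypothetical; the toy features and input values are the ones used above. No benchmark is claimed here.

```python
import pandas as pd

def two_features(s):
    # Hypothetical helper: same toy features as above, returned as a tuple
    # rather than wrapped in a pd.Series.
    return s + 1, s - 1

df = pd.DataFrame({"textcol": [0.772692, 0.857210, 0.065639, 0.819160, 0.088212]})

# Build one DataFrame from the list of tuples, reusing df's index so the
# rows line up when the two frames are joined.
features = pd.DataFrame(
    [two_features(s) for s in df["textcol"]],
    columns=["feature1", "feature2"],
    index=df.index,
)

# Join on the shared index; this leaves df itself unchanged.
result = df.join(features)
print(result)
```

Because features shares df's index, df.join aligns the rows the same way the merge with left_index=True, right_index=True does.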