How to efficiently rearrange pandas data as follows?
Question
I need some help with a concise and, above all, efficient formulation in pandas of the following operation.
Given a data frame of the format:
id a b c d
1 0 -1 1 1
42 0 1 0 0
128 1 -1 0 1
construct a data frame of the format:
id one_entries
1 "c d"
42 "b"
128 "a d"
That is, the column "one_entries" contains the concatenated names of the columns for which the entry in the original frame is 1.
Recommended answer
Here's one way, using a boolean mask and applying a lambda function to each row.
In [58]: df
Out[58]:
id a b c d
0 1 0 -1 1 1
1 42 0 1 0 0
2 128 1 -1 0 1
In [59]: cols = list('abcd')
In [60]: (df[cols] > 0).apply(lambda x: ' '.join(x[x].index), axis=1)
Out[60]:
0 c d
1 b
2 a d
dtype: object
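You can assign the result to a new column by writing df['one_entries'] = in front of the expression above.
Details of the applied function.
Take the first row.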
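The steps above can be collected into one self-contained script. This is a minimal sketch, not the answer's exact code: the frame is rebuilt by hand from the question's sample data, and the mask uses `== 1` to match the question's wording, whereas the answer uses `> 0`. For this data, whose values are only -1, 0 and 1, the two masks are identical. The names `mask` and `result` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})

cols = list("abcd")

# Boolean frame: True where the entry equals 1 (assumption: the answer uses > 0).
mask = df[cols] == 1

# For each row, keep only the True entries and join their column names.
df["one_entries"] = mask.apply(lambda row: " ".join(row[row].index), axis=1)

# Keep just the two columns requested in the question.
result = df[["id", "one_entries"]]
print(result)
```

Because `apply` with `axis=1` calls the lambda once per row in Python, it is worth timing this on realistic data sizes before relying on it for large frames.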
In [83]: x = df[cols].iloc[0] > 0
In [84]: x
Out[84]:
a False
b False
c True
d True
Name: 0, dtype: bool
x gives you Boolean values for the row: True where the value is greater than zero. x[x] returns only the True entries, which is essentially a Series with the column names as its index.
In [85]: x[x]
Out[85]:
c True
d True
Name: 0, dtype: bool
x[x].index gives you the column names.
In [86]: x[x].index
Out[86]: Index([u'c', u'd'], dtype='object')
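The walkthrough for the first row can be reproduced as a short script. This is a sketch under the assumption of a recent pandas version, so it selects the row with `.iloc[0]`, since the older `.ix` indexer is no longer available. The variable names `row_mask`, `selected`, `names` and `joined` are illustrative and do not appear in the answer.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [1, 42, 128],
    "a": [0, 0, 1],
    "b": [-1, 1, -1],
    "c": [1, 0, 0],
    "d": [1, 0, 1],
})
cols = list("abcd")

# Step 1: Boolean Series for the first row, indexed by column name.
row_mask = df[cols].iloc[0] > 0

# Step 2: indexing the Series with itself keeps only the True entries.
selected = row_mask[row_mask]

# Step 3: the index of what remains holds the matching column names.
names = list(selected.index)

# Step 4: join the names with a space, as the lambda does for every row.
joined = " ".join(names)
print(joined)
```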