Extract attributes from pandas columns that satisfy a condition
Problem description
Let's say I have a table of frequencies of 3 different variables, M1, M2 and M3, over different instances P1, ..., P4:
import pandas as pd

tupl = [(0.7, 0.2, 0.1), (0, 0, 1), (0.2, 0.6, 0.2), (0.6, 0.4, 0)]
df_test = pd.DataFrame(tupl, columns=["M1", "M2", "M3"], index=["P1", "P2", "P3", "P4"])
Now, for each row, I want to extract as a string the variables that occur in it, so that the final output looks something like:
output = pd.DataFrame(["M1+M2+M3", "M3", "M1+M2+M3", "M1+M2"], columns=["label"], index=["P1", "P2", "P3", "P4"])
I thought about using something like np.where(df_test!=0), but then how do I paste the column names as a string into the output?
Recommended answer
You can use np.where to fill the cells with their column labels and then join the labels in each row into a string.
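import numpy as np

(
    # Keep the column name where the value is positive, None elsewhere
    df_test.gt(0).apply(lambda x: np.where(x, x.name, None))
    # Join the non-null names in each row with "+"
    .apply(lambda x: '+'.join(x.dropna()), axis=1)
    .to_frame('label')
)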
       label
P1  M1+M2+M3
P2        M3
P3  M1+M2+M3
P4     M1+M2
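For comparison, here is a minimal sketch of a variant that skips the intermediate np.where step and builds each label directly from a boolean mask. It is not part of the original answer; the names mask, labels and result are introduced only for illustration, and it assumes the same df_test as in the question.

```python
import pandas as pd

# Same sample data as in the question
tupl = [(0.7, 0.2, 0.1), (0, 0, 1), (0.2, 0.6, 0.2), (0.6, 0.4, 0)]
df_test = pd.DataFrame(tupl, columns=["M1", "M2", "M3"], index=["P1", "P2", "P3", "P4"])

# Boolean mask: True where the frequency is strictly positive
mask = df_test.gt(0)

# For each row, select the column names where the mask is True and join them
labels = mask.apply(lambda row: "+".join(row.index[row.to_numpy()]), axis=1)

# Wrap the resulting Series in a one-column DataFrame (illustrative name)
result = labels.to_frame("label")
print(result)
```

This still calls a Python function once per row, so it is unlikely to be faster on large tables; the appeal is mainly that the mask and the join are kept as two visible steps.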