Extract attributes from pandas columns that satisfy a condition

Problem description

Let's say I have a table of frequencies of 3 different variables (M1, M2 and M3) over different instances (P1, ..., P4):

import pandas as pd

tupl = [(0.7, 0.2, 0.1), (0, 0, 1), (0.2, 0.6, 0.2), (0.6, 0.4, 0)]

df_test = pd.DataFrame(tupl, columns=["M1", "M2", "M3"], index=["P1", "P2", "P3", "P4"])

Now, for each row, I want to extract as a string the names of the variables that occur (that is, have a nonzero frequency), so that the final output would look something like:

output = pd.DataFrame(["M1+M2+M3", "M3", "M1+M2+M3", "M1+M2"], columns=["label"], index=["P1", "P2", "P3", "P4"])

I thought about using something like np.where(df_test != 0), but then how do I paste the column names as a string into the output?

Recommended answer

You can use np.where to fill the nonzero cells with their column labels and then join the labels in each row into a string:

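import numpy as np

# Replace each positive cell with its column name (None elsewhere),
# then join the non-null names in each row with "+".
(
    df_test.gt(0).apply(lambda x: np.where(x, x.name, None))
    .apply(lambda x: '+'.join(x.dropna()), axis=1)
    .to_frame('label')
)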


    label
P1  M1+M2+M3
P2  M3
P3  M1+M2+M3
P4  M1+M2
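
For comparison, here is a minimal sketch of a variant that skips the intermediate frame of labels and instead indexes the column names with each row's boolean mask. It is not part of the original answer. The names mask, labels and result are only illustrative, and it assumes, as in the question, that frequencies are never negative, so "greater than 0" and "not equal to 0" select the same cells.

```python
import pandas as pd

tupl = [(0.7, 0.2, 0.1), (0, 0, 1), (0.2, 0.6, 0.2), (0.6, 0.4, 0)]
df_test = pd.DataFrame(tupl, columns=["M1", "M2", "M3"], index=["P1", "P2", "P3", "P4"])

# Boolean frame: True where the variable occurs in that row
# (assumes non-negative frequencies, as in the question).
mask = df_test.gt(0)

# For each row, pick the column names where the mask is True and join them with "+".
labels = mask.apply(lambda row: "+".join(mask.columns[row.to_numpy()]), axis=1)

result = labels.to_frame("label")
print(result)
```

With this approach, a row in which every frequency is zero gets an empty string as its label rather than being dropped.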
