Group by one column and show the availability of specific values from another column

Views: 141  Posted: 2018/5/30 14:22:00  Tags: python pandas dataframe group-by pandas-groupby


Problem description

I have this dataframe:

  df1:

  drug_id    illness
  lexapro.1  HD
  lexapro.1  MS
  lexapro.2  HDED
  lexapro.2  MS
  lexapro.2  MS
  lexapro.3  CD
  lexapro.3  Sweat
  lexapro.4  HD
  lexapro.5  WD
  lexapro.5  FN
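
For anyone who wants to reproduce the problem, the frame shown above could be built roughly like this. It is only a sketch: the column names `drug_id` and `illness` and the variable name `df1` are taken from the question, and everything is assumed to be stored as plain strings.

```python
import pandas as pd

# Sample data copied from the table in the question
df1 = pd.DataFrame({
    "drug_id": ["lexapro.1", "lexapro.1", "lexapro.2", "lexapro.2", "lexapro.2",
                "lexapro.3", "lexapro.3", "lexapro.4", "lexapro.5", "lexapro.5"],
    "illness": ["HD", "MS", "HDED", "MS", "MS",
                "CD", "Sweat", "HD", "WD", "FN"],
})

print(df1)
```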

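I want to first group the data by drug_id, then check whether HD, MS, and FN occur in the illness column for each group, and fill in a second dataframe like this:

  df2:

  drug_id    HD  MS  FN
  lexapro.1   1   1   0
  lexapro.2   0   1   0
  lexapro.3   0   0   0
  lexapro.4   1   0   0
  lexapro.5   0   0   1

This is my code for grouping:

  df1.groupby('drug_id', sort=False).isin('HD')

but I do not know how to assign 1 to df2['HD'] for each drug_id when 'HD' is present for that drug_id in df1.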

Thank you.

Solution

Option 1
crosstab

  pd.crosstab(df.drug_id, df.illness)[['HD', 'MS', 'FN']].ge(1).astype(int)

  illness    HD  MS  FN
  drug_id
  lexapro.1   1   1   0
  lexapro.2   0   1   0
  lexapro.3   0   0   0
  lexapro.4   1   0   0
  lexapro.5   0   0   1

Option 2
groupby + value_counts + unstack

  df.groupby('drug_id').illness.value_counts()\
      .unstack()[['HD', 'MS', 'FN']].ge(1).astype(int)

  illness    HD  MS  FN
  drug_id
  lexapro.1   1   1   0
  lexapro.2   0   1   0
  lexapro.3   0   0   0
  lexapro.4   1   0   0
  lexapro.5   0   0   1

Option 3
get_dummies + sum

  df.set_index('drug_id').illness.str.get_dummies()\
      .sum(level=0)[['HD', 'MS', 'FN']].ge(1).astype(int)

             HD  MS  FN
  drug_id
  lexapro.1   1   1   0
  lexapro.2   0   1   0
  lexapro.3   0   0   0
  lexapro.4   1   0   0
  lexapro.5   0   0   1

Thanks to Scott Boston for the improvement!
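
The options above can be checked end to end with a small self-contained script. This is a sketch under a few assumptions rather than the answerer's exact code: the sample frame is rebuilt by hand from the question, the list name `wanted` is made up for illustration, and two substitutions are made. `reindex(columns=..., fill_value=0)` replaces the `[['HD', 'MS', 'FN']]` selection so that a label missing from the data yields a column of zeros instead of a `KeyError`, and `groupby(level=0).sum()` replaces `sum(level=0)`, which to my knowledge was deprecated in pandas 1.3 and removed in pandas 2.0.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "drug_id": ["lexapro.1", "lexapro.1", "lexapro.2", "lexapro.2", "lexapro.2",
                "lexapro.3", "lexapro.3", "lexapro.4", "lexapro.5", "lexapro.5"],
    "illness": ["HD", "MS", "HDED", "MS", "MS",
                "CD", "Sweat", "HD", "WD", "FN"],
})

# Hypothetical name for the illness labels of interest
wanted = ["HD", "MS", "FN"]

# Option 1 variant: count occurrences per drug_id, keep only the wanted
# labels (absent labels become zero columns), then turn counts into 0/1 flags
opt1 = (
    pd.crosstab(df["drug_id"], df["illness"])
    .reindex(columns=wanted, fill_value=0)
    .ge(1)
    .astype(int)
)

# Option 3 variant: one-hot encode illness, then sum the indicators per
# drug_id with groupby(level=0) instead of sum(level=0)
opt3 = (
    df.set_index("drug_id")["illness"].str.get_dummies()
    .groupby(level=0).sum()
    .reindex(columns=wanted, fill_value=0)
    .ge(1)
    .astype(int)
)

print(opt1)
print(opt3)
```

Both variants should give the same 0/1 table as the desired df2. Calling `.reset_index()` on either result turns `drug_id` back into an ordinary column if that layout is preferred.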


                    