Dividing two columns of an unstacked dataframe

Views: 86 Published: 2020/5/24 4:28:46 Tags: python pandas


Question

I have two columns in a pandas dataframe.

Column 1 is ed and contains strings (e.g. 'a','a','b','c','c','a')

ed column = ['a','a','b','c','c','a']

Column 2 is job and also contains strings (e.g. 'aa','bb','aa','aa','bb','cc')

job column = ['aa','bb','aa','aa','bb','cc'] #these are example values from column 2 of my pandas data frame

I then generate a two-column frequency table like this:

my_counts= pdata.groupby(['ed','job']).size().unstack().fillna(0)

How do I divide the frequencies in one column of that frequency table by the frequencies in another column? I want to pass that ratio to argsort() so that I can sort the table by it, but I don't know how to reference each column of the resulting table.

Recommended answer

I initialized the data as follows:

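import pandas as pd

ed_col = ['a','a','b','c','c','a']
job_col = ['aa','bb','aa','aa','bb','cc']
pdata = pd.DataFrame({'ed': ed_col, 'job': job_col})
# Count each (ed, job) pair, pivot job into columns, fill missing pairs with 0
my_counts = pdata.groupby(['ed','job']).size().unstack().fillna(0)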

Now my_counts looks like this:

job  aa  bb  cc
ed             
a     1   1   1
b     1   0   0
c     1   1   0
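
As a variation on the table above, here is a hedged sketch: `unstack` accepts a `fill_value` argument, so pairs that never occur can be filled with 0 in the same step instead of calling `fillna(0)` afterwards. The name `counts_int` is only illustrative and does not appear in the original answer.

```python
import pandas as pd

ed_col = ['a', 'a', 'b', 'c', 'c', 'a']
job_col = ['aa', 'bb', 'aa', 'aa', 'bb', 'cc']
pdata = pd.DataFrame({'ed': ed_col, 'job': job_col})

# Count each (ed, job) pair, then pivot the job values into columns.
# fill_value=0 fills pairs that never occur instead of leaving NaN.
counts_int = pdata.groupby(['ed', 'job']).size().unstack(fill_value=0)
print(counts_int)
```

The rest of the answer works the same way on either version of the table.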

To access a column, you can use my_counts.aa or my_counts['aa']. To access a row, you can use my_counts.loc['a'].
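
A minimal sketch of both kinds of access, rebuilding the same example table so the snippet runs on its own. The names `col_aa` and `row_a` are illustrative only.

```python
import pandas as pd

pdata = pd.DataFrame({
    'ed': ['a', 'a', 'b', 'c', 'c', 'a'],
    'job': ['aa', 'bb', 'aa', 'aa', 'bb', 'cc'],
})
my_counts = pdata.groupby(['ed', 'job']).size().unstack().fillna(0)

# After unstack(), the job values become the column labels.
print(list(my_counts.columns))

col_aa = my_counts['aa']    # one column: a Series indexed by ed
row_a = my_counts.loc['a']  # one row: a Series indexed by job
print(col_aa)
print(row_a)
```

The bracket form my_counts['aa'] is the safer choice in general, since attribute access only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute.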

So the frequencies of aa divided by those of bb are my_counts['aa'] / my_counts['bb']
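
The division can be sketched as follows on the example data. One thing worth checking in your own data is zero counts: here row b has no bb entries, so its ratio comes out as infinity rather than raising an error. The name `ratio` is illustrative.

```python
import numpy as np
import pandas as pd

pdata = pd.DataFrame({
    'ed': ['a', 'a', 'b', 'c', 'c', 'a'],
    'job': ['aa', 'bb', 'aa', 'aa', 'bb', 'cc'],
})
my_counts = pdata.groupby(['ed', 'job']).size().unstack().fillna(0)

# Element-wise division, aligned on the ed index.
ratio = my_counts['aa'] / my_counts['bb']
print(ratio)

# Row b divides 1 by 0, which pandas reports as inf.
print(np.isinf(ratio['b']))
```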

Now, if you want the table sorted by that ratio, you can do:

my_counts.iloc[(my_counts['aa'] / my_counts['bb']).argsort()]
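
For comparison, here is a hedged sketch of two equivalent ways to do the sort on the example data: the positional argsort approach, and a label-based alternative using sort_values. A stable sort kind (`mergesort`) is an assumption added here so that tied ratios keep their original order. The names `order`, `sorted_by_argsort`, and `sorted_by_values` are illustrative.

```python
import pandas as pd

pdata = pd.DataFrame({
    'ed': ['a', 'a', 'b', 'c', 'c', 'a'],
    'job': ['aa', 'bb', 'aa', 'aa', 'bb', 'cc'],
})
my_counts = pdata.groupby(['ed', 'job']).size().unstack().fillna(0)
ratio = my_counts['aa'] / my_counts['bb']

# Positional approach: argsort gives the row positions that sort the ratio.
order = ratio.to_numpy().argsort(kind='mergesort')
sorted_by_argsort = my_counts.iloc[order]

# Label-based alternative: sort the ratio, then reindex the table by its labels.
sorted_by_values = my_counts.loc[ratio.sort_values(kind='mergesort').index]

print(sorted_by_argsort)
```

Both give the rows in ascending order of the ratio, with the infinite ratio of row b placed last. If the table can contain 0/0 cases, the ratio will be NaN there, and it is worth deciding how those rows should be ordered before sorting.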
