Make a dataframe with grouped questions from three columns

Views: 42 Published: 2020/5/24 2:39:23 Tags: python pandas dataframe pivot

Question

I have the following dataframe:

       A               B                  C
  I am motivated     Agree                4
  I am motivated     Strongly Agree       5
  I am motivated     Disagree             6
  I am open-minded   Agree                4
  I am open-minded   Disagree             4
  I am open-minded   Strongly Disagree    3

Column A is the question, column B is the answer, and column C is the frequency of each answer ("Strongly Agree", "Agree", "Disagree", or "Strongly Disagree") for the questions in column A.

How can I convert it into the following dataframe?

                  Strongly Agree    Agree     Disagree   Strongly Disagree
I am motivated        5               4           6             0
I am open-minded      0               4           4             3

I looked at other posts that use groupby() on columns but could not figure it out. I am using Python 3.

Solution

Because these are already frequency counts, we can assume that each Question/Opinion pair is unique. That means there is nothing to aggregate, so set_index followed by unstack is enough, and skipping the aggregation step should save some time. pivot would produce the same shape, but pivot has no fill_value option: missing pairs come back as NaN, which turns the integer counts into floats. unstack(fill_value=0) fills the gaps with 0 and preserves the integer dtype.

df.set_index(['A', 'B']).C.unstack(fill_value=0)

B                 Agree  Disagree  Strongly Agree  Strongly Disagree
A                                                                   
I am motivated        4         6               5                  0
I am open-minded      4         4               0                  3
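
The dtype point can be checked with a small self-contained sketch. It rebuilds the question's data by hand (how the original frame was created is an assumption) and compares unstack(fill_value=0) with pivot. The names wide_unstack, wide_pivot and wide_fixed are only illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    'A': ['I am motivated'] * 3 + ['I am open-minded'] * 3,
    'B': ['Agree', 'Strongly Agree', 'Disagree',
          'Agree', 'Disagree', 'Strongly Disagree'],
    'C': [4, 5, 6, 4, 4, 3],
})

# unstack fills the missing Question/Opinion pairs with 0 and keeps integers
wide_unstack = df.set_index(['A', 'B'])['C'].unstack(fill_value=0)

# pivot leaves the missing pairs as NaN, so those columns cannot stay integer
wide_pivot = df.pivot(index='A', columns='B', values='C')

print(wide_unstack.dtypes)
print(wide_pivot.dtypes)

# Filling and casting afterwards recovers the same counts, at the cost of two extra steps
wide_fixed = wide_pivot.fillna(0).astype('int64')
print(wide_fixed)
```

If pivot is preferred for readability, the fillna/astype step at the end is one way to get back to integer counts.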

Extra Credit
Turn 'B' into an ordered pd.Categorical and the columns will be sorted in the category order

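# Keyword arguments make the category order and the ordered flag explicit
df['B'] = pd.Categorical(
    df['B'], categories=['Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'],
    ordered=True)
df.set_index(['A', 'B'])['C'].unstack(fill_value=0)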

B                 Strongly Disagree  Disagree  Agree  Strongly Agree
A                                                                   
I am motivated                    0         6      4               5
I am open-minded                  3         4      4               0
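
The question's target table lists the answers from "Strongly Agree" down to "Strongly Disagree", the reverse of the category order used above. A minimal sketch of one way to get that exact order, assuming the same hand-built sample data, is to reindex the columns after unstacking. The order list and the wide variable are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration
df = pd.DataFrame({
    'A': ['I am motivated'] * 3 + ['I am open-minded'] * 3,
    'B': ['Agree', 'Strongly Agree', 'Disagree',
          'Agree', 'Disagree', 'Strongly Disagree'],
    'C': [4, 5, 6, 4, 4, 3],
})

# Column order requested in the question (assumed to be the desired layout)
order = ['Strongly Agree', 'Agree', 'Disagree', 'Strongly Disagree']

wide = (
    df.set_index(['A', 'B'])['C']
      .unstack(fill_value=0)                 # missing pairs become 0
      .reindex(columns=order, fill_value=0)  # put the columns in the requested order
)
print(wide)
```

Passing fill_value=0 to reindex also covers the case where an answer in order never occurs in the data: that column is created and filled with zeros.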
