Make a dataframe with grouped questions from three columns
Problem description
I have the following dataframe:
A                 B                  C
I am motivated    Agree              4
I am motivated    Strongly Agree     5
I am motivated    Disagree           6
I am open-minded  Agree              4
I am open-minded  Disagree           4
I am open-minded  Strongly Disagree  3
Column A is the question, column B is the answer, and column C is the frequency of each answer ("Strongly Agree", "Agree", "Disagree", or "Strongly Disagree") for the question in column A.
How can I convert it into the following dataframe?
                  Strongly Agree  Agree  Disagree  Strongly Disagree
I am motivated                 5      4         6                  0
I am open-minded               0      4         4                  3
I looked at other posts that use groupby() on columns but could not figure it out. I am using Python 3.
Because these are already frequency counts, we can assume that the Question / Opinion pairs are unique. So we can use set_index and unstack, as there is no need to aggregate, which should also save some time. We could accomplish the same goal with pivot; however, pivot doesn't have a fill_value option, and fill_value is what lets us fill the missing pairs with 0 and so preserve the integer dtype.
df.set_index(['A', 'B']).C.unstack(fill_value=0)

B                 Agree  Disagree  Strongly Agree  Strongly Disagree
A
I am motivated        4         6               5                  0
I am open-minded      4         4               0                  3
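For anyone who wants to run this end to end, here is a minimal sketch. The way `df` is built is an assumption about how the sample data from the question is stored; the names `wide`, `wide_pivot`, and `wide_pivot_filled` are only illustrative. It compares the set_index / unstack route with pivot, where the missing pairs come back as NaN and have to be filled and cast afterwards.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'A': ['I am motivated'] * 3 + ['I am open-minded'] * 3,
    'B': ['Agree', 'Strongly Agree', 'Disagree',
          'Agree', 'Disagree', 'Strongly Disagree'],
    'C': [4, 5, 6, 4, 4, 3],
})

# Route 1: set_index + unstack, filling missing pairs with 0
# so the counts stay integers.
wide = df.set_index(['A', 'B'])['C'].unstack(fill_value=0)

# Route 2: pivot has no fill_value, so missing pairs become NaN
# and must be filled and cast back to integers in a second step.
wide_pivot = df.pivot(index='A', columns='B', values='C')
wide_pivot_filled = wide_pivot.fillna(0).astype(int)

print(wide)
print(wide.dtypes)
print(wide_pivot.dtypes)
```

Both routes end up with the same numbers; the unstack version just gets there without the extra fillna and astype step.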
Extra Credit
Turn 'B' into a pd.Categorical and the columns will be sorted in the order of its categories.
df.B = pd.Categorical(
    df.B,
    categories=['Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'],
    ordered=True)
df.set_index(['A', 'B']).C.unstack(fill_value=0)

B                 Strongly Disagree  Disagree  Agree  Strongly Agree
A
I am motivated                    0         6      4               5
I am open-minded                  3         4      4               0
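The question asked for the columns in the order Strongly Agree, Agree, Disagree, Strongly Disagree. As a possible alternative to the categorical approach (this is a different technique, not the one used in the answer), the wide frame can be reindexed by an explicit list of labels. The sketch below rebuilds the sample data itself, and the names `order` and `wide_ordered` are illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'A': ['I am motivated'] * 3 + ['I am open-minded'] * 3,
    'B': ['Agree', 'Strongly Agree', 'Disagree',
          'Agree', 'Disagree', 'Strongly Disagree'],
    'C': [4, 5, 6, 4, 4, 3],
})

# Column order requested in the question.
order = ['Strongly Agree', 'Agree', 'Disagree', 'Strongly Disagree']

wide = df.set_index(['A', 'B'])['C'].unstack(fill_value=0)

# reindex puts the columns in the given order; fill_value=0 also
# covers an answer label that never occurs in the data at all.
wide_ordered = wide.reindex(columns=order, fill_value=0)

print(wide_ordered)
```

This leaves column B as plain strings, which may be preferable if nothing else in the analysis needs an ordered categorical.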