Create a pandas dataframe of counts

Problem description

I want to create a pandas dataframe with two columns: the first holding the unique values of one of my columns, and the second holding the count of each unique value.

I have seen many posts (such as here) that describe how to get the counts, but the issue I'm running into is that when I try to create a dataframe, the column values become my index.

Sample data: df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']}). I want to end up with a dataframe like:

  Color  Count
0   Red      2
1  Blue      1

I have tried the following, but in all cases the index ends up as Color and Count is the only column in the dataframe.

Attempt 1:

df2 = pd.DataFrame(data=df['Color'].value_counts())
# And resetting the index just gets rid of Color, which I want to keep
df2 = df2.reset_index(drop=True)

Attempt 2:

df3 = df['Color'].value_counts()
df3 = pd.DataFrame(data=df3, index=range(df3.shape[0]))

Attempt 3:

df4 = df.groupby('Color')
df4 = pd.DataFrame(df4['Color'].count())

Recommended answer

Another way, using value_counts:

In [10]: df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']})

In [11]: df.Color.value_counts().reset_index().rename(columns={'index': 'Color', 0: 'count'})
Out[11]:
  Color  count
0   Red      2
1  Blue      1
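
The column labels that reset_index produces after value_counts are not the same across pandas versions (newer releases name the counts column themselves), so the rename mapping above may not match what you get. Below is a minimal sketch that sets both column names explicitly and so should not rely on those defaults. The variable names counts and counts_gb are illustrative only, and the groupby variant is just one alternative, not necessarily the preferred one.

```python
import pandas as pd

df = pd.DataFrame({'Color': ['Red', 'Red', 'Blue'], 'State': ['MA', 'PA', 'PA']})

# Name the index explicitly, then move it into a regular column;
# `name=` sets the label of the column that holds the counts.
counts = df['Color'].value_counts().rename_axis('Color').reset_index(name='Count')
print(counts)

# Alternative via groupby: size() counts the rows in each group,
# and sort=False keeps groups in order of first appearance.
counts_gb = df.groupby('Color', sort=False).size().reset_index(name='Count')
print(counts_gb)
```

For the sample data, both results have a default integer index with Color and Count as ordinary columns. Note that value_counts orders rows by descending count, while the groupby variant with sort=False keeps first-seen order; the two happen to agree here.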
