How to get value counts for multiple columns at once in a Pandas DataFrame?


Question

Given a Pandas DataFrame that has multiple columns with categorical values (0 or 1), is it possible to conveniently get the value_counts for every column at the same time?

For example, suppose I generate a DataFrame as follows:

import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))

This gives a DataFrame like this:

   a  b  c  d
0  0  1  1  0
1  1  1  1  1
2  1  1  1  0
3  0  1  0  0
4  0  0  0  1
5  0  1  1  0
6  0  1  1  1
7  1  0  1  0
8  1  0  1  1
9  0  1  1  0

How can I conveniently get the value counts for every column, producing the following result?

   a  b  c  d
0  6  3  2  6
1  4  7  8  4

My current solution is:

pieces = []
for col in df.columns:
    tmp_series = df[col].value_counts()
    tmp_series.name = col
    pieces.append(tmp_series)
df_value_counts = pd.concat(pieces, axis=1)

But there must be a simpler way, such as stacking, pivoting, or groupby?

Recommended answer

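Just call apply and pass pd.Series.value_counts. The DataFrame is regenerated here without resetting the seed, so the counts differ from the table in the question: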

In [212]:
df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
df.apply(pd.Series.value_counts)
Out[212]:
   a  b  c  d
0  4  6  4  3
1  6  4  6  7
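
As a minimal sketch of the same approach, the snippet below rebuilds the question's DataFrame by hand (so it does not depend on the random generator) and applies pd.Series.value_counts to each column. The extra column e and the names counts and counts_filled are hypothetical, added only to illustrate an edge case: when a value never occurs in a column, apply leaves a NaN there, which can be replaced with 0 if integer counts are wanted.

```python
import pandas as pd

# The question's DataFrame, written out explicitly instead of using
# np.random so the result is reproducible everywhere.
df = pd.DataFrame(
    {
        "a": [0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
        "b": [1, 1, 1, 1, 0, 1, 1, 0, 0, 1],
        "c": [1, 1, 1, 0, 0, 1, 1, 1, 1, 1],
        "d": [0, 1, 0, 0, 1, 0, 1, 0, 1, 0],
    }
)

# value_counts is applied to each column; the per-column results are
# aligned on the distinct values (0 and 1), which become the row index.
counts = df.apply(pd.Series.value_counts)
# Row 0 holds the number of zeros per column: 6, 3, 2, 6
# Row 1 holds the number of ones per column:  4, 7, 8, 4

# Hypothetical edge case: column "e" contains only 1s, so it has no
# count for the value 0 and apply leaves NaN in that cell.
df_extra = df.assign(e=1)
counts_filled = (
    df_extra.apply(pd.Series.value_counts)
    .fillna(0)      # treat "value never seen" as a count of zero
    .astype(int)    # NaN forced a float column; convert back to integers
)
```

Looking up cells by label, for example counts.loc[0, "a"], avoids relying on the row order of the result.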
