How do I get the row count of a pandas DataFrame?


Question

I'm trying to get the number of rows of a DataFrame df with pandas. Here is my code.

total_rows = df.count
print total_rows +1

Method 2:

total_rows = df['First_columnn_label'].count
print total_rows +1

Both code snippets give me this error:

TypeError: unsupported operand type(s) for +: 'instancemethod' and 'int'

What am I doing wrong?

Recommended answer

You can use the .shape attribute (its first element is the row count) or just len(DataFrame.index). However, there are notable performance differences (len(DataFrame.index) is the fastest):

In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: df = pd.DataFrame(np.arange(12).reshape(4,3))

In [4]: df
Out[4]: 
   0   1   2
0  0   1   2
1  3   4   5
2  6   7   8
3  9  10  11

In [5]: df.shape
Out[5]: (4, 3)

In [6]: timeit df.shape
2.77 µs ± 644 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [7]: timeit df[0].count()
348 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [8]: len(df.index)
Out[8]: 4

In [9]: timeit len(df.index)
990 ns ± 4.97 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
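
As for the error in the question: df.count without parentheses refers to the method object itself rather than its result, so adding 1 to it fails. The following is a minimal sketch of that cause and one possible fix, assuming Python 3 (where print is a function and the message names 'method' instead of 'instancemethod') and a small example frame built here purely for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical example frame, not the asker's data.
df = pd.DataFrame(np.arange(12).reshape(4, 3))

# Without parentheses, df.count is a bound method, not a number,
# so adding an integer to it raises TypeError.
try:
    df.count + 1
except TypeError as exc:
    error_message = str(exc)

# Calling the method returns the non-null count of each column (a Series).
per_column_counts = df.count()

# For the number of rows, len(df.index) or df.shape[0] gives a plain integer.
total_rows = len(df.index)
print(total_rows)   # 4
print(df.shape[0])  # 4
```

Note that df.count() returns one value per column, so it is not a direct replacement for a single row count.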

As @Dan Allen noted in the comments, len(df.index) and df[0].count() are not interchangeable, because count excludes NaNs.
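
The difference only shows up when a column contains missing values. A small sketch, assuming a made-up frame with one NaN in a column named "a" (both the data and the column names are illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value in column "a".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4, 5, 6]})

rows = len(df.index)        # counts every row, missing values included
non_null = df["a"].count()  # counts only the non-NaN entries in column "a"

print(rows)      # 3
print(non_null)  # 2
```

If the goal is the number of rows regardless of missing data, len(df.index) or df.shape[0] is the safer choice.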
