Getting the count of records in a data frame quickly

Question

I have a dataframe with as many as 10 million records. How can I get a count quickly? df.count is taking a very long time.

Recommended answer

It is going to take a long time anyway, at least the first time.

One way is to cache the dataframe, so that you can do more with it than just count.

For example:

df.cache()
df.count()

Subsequent operations on the cached dataframe will not take nearly as long, because they reuse the cached data instead of recomputing it.
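
To see the effect described above, the two counts can be timed separately. The following is a minimal sketch, not part of the original answer. It assumes `df` is a Spark DataFrame (the question does not say so, but `df.cache()` suggests it); the helper name `cache_and_count` is hypothetical, and it only relies on `df` having `cache()` and `count()` methods.

```python
import time


def cache_and_count(df):
    """Cache df, then count it twice and time each count.

    Assumes df exposes cache() and count(), as a Spark DataFrame does.
    Returns (row_count, first_seconds, second_seconds).
    """
    df.cache()  # in Spark this only marks df for caching; nothing runs yet

    start = time.perf_counter()
    row_count = df.count()  # first action: computes df and fills the cache
    first_seconds = time.perf_counter() - start

    start = time.perf_counter()
    df.count()  # second action: reads the cached data instead of recomputing
    second_seconds = time.perf_counter() - start

    return row_count, first_seconds, second_seconds


# Hypothetical usage with PySpark (needs a working Spark installation):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   df = spark.range(10_000_000)
#   print(cache_and_count(df))
```

On a real cluster the first duration should dominate. Caching does not make the first count faster; it only pays off if the dataframe is used again afterwards.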
