.value_counts() giving truncated results


Question

I have an Excel file with a single column containing many words, and I am trying to count how often each word occurs. So if I have a list

Labels 
a
a 
b
b
c
c
c

the output should be

c : 3
b : 2
a : 2

I am using the following code snippet:

import pandas as pd
train = pd.read_csv("ani2.csv")
A = train['Labels'].value_counts()
f = open("ani3.csv",'a')
f.write(str(A))
f.close()

The dataset has about 53,000 values, and the output I obtained was truncated. It came out in this format:

z : 1700
y : 1500
x : 1000
...
c : 3
b : 2
a : 2

The values in the middle are missing for some reason; all I got in their place was three dots.

Recommended answer

The problem is that you are writing str(A) to the file.

Just call to_csv on A instead:

A = train['Labels'].value_counts()
A.to_csv("ani3.csv",mode='a')
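As a rough check of the suggestion above, the following sketch compares str() with to_csv on a made-up Series. The labels w0 to w199 and the in-memory buffer are assumptions for illustration only; they stand in for the asker's ani2.csv and ani3.csv files.

```python
import io

import pandas as pd

# Hypothetical data: 200 distinct labels, where label "w<i>" occurs i + 1 times.
labels = [f"w{i}" for i in range(200) for _ in range(i + 1)]
train = pd.DataFrame({"Labels": labels})
counts = train["Labels"].value_counts()

# str() goes through the pandas display machinery, so long output is abbreviated.
truncated_text = str(counts)

# to_csv serializes every row; an in-memory buffer stands in for a file here.
buf = io.StringIO()
counts.to_csv(buf, header=True)
csv_lines = buf.getvalue().splitlines()

print(len(truncated_text.splitlines()))  # far fewer lines than there are labels
print(len(csv_lines))                    # one header line plus one line per label
```

With a real file path in place of the buffer, mode='a' appends to an existing file as in the answer's snippet.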

When you call str(A), you convert the Series to its string representation, which is governed by the pandas display options. Long output is abbreviated in that representation, which is why you get the ... in the middle.
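If the text form is what you actually want, one option is to lift the row limit while building the string, or to use Series.to_string. This is a minimal sketch on made-up data, not part of the original answer; the Series contents are assumed for illustration.

```python
import pandas as pd

# Hypothetical data: 200 distinct string values, each occurring once.
counts = pd.Series([str(i) for i in range(200)]).value_counts()

# Default display options abbreviate long Series.
default_text = str(counts)

# Temporarily remove the row limit; the option is restored when the block exits.
with pd.option_context("display.max_rows", None):
    full_text = str(counts)

# to_string() renders every row without touching the display options.
via_to_string = counts.to_string()

print(len(default_text.splitlines()))
print(len(full_text.splitlines()))
print(len(via_to_string.splitlines()))
```

For writing data to disk, to_csv remains the more robust choice, since the result can be read back with pd.read_csv.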

You can see the effect here (note the .. row in the middle of the output):

In [34]:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(100,1), columns=['a'])
str(df['a'].value_counts())

Out[34]:
'-1.115774    1\n-0.196748    1\n-0.193616    1\n-0.197265    1\n 0.745611    1\n 0.766238    1\n-0.263205    1\n 0.542410    1\n-1.930702    1\n-0.913680    1\n 1.150879    1\n 0.213193    1\n-1.245947    1\n-2.610836    1\n 1.482863    1\n 0.430732    1\n-1.290851    1\n-0.962350    1\n-0.160461    1\n 1.895585    1\n 0.923683    1\n-1.206336    1\n 0.454317    1\n 0.293499    1\n-1.289761    1\n-0.191499    1\n 1.311149    1\n 0.380678    1\n 0.964312    1\n-0.703558    1\n            ..\n-0.384447    1\n 0.172968    1\n-0.221997    1\n 0.133441    1\n-0.343758    1\n-0.897193    1\n-0.525859    1\n-0.226437    1\n-0.552760    1\n-1.991686    1\n 0.517877    1\n 0.659020    1\n 1.680185    1\n 0.155123    1\n-0.788438    1\n-1.364535    1\n 0.034736    1\n 0.494853    1\n 1.113248    1\n-1.449296    1\n 1.123138    1\n-0.747243    1\n-0.429054    1\n-0.567881    1\n-0.476616    1\n-2.630239    1\n 0.084506    1\n 1.250732    1\n 0.071242    1\n-0.432580    1\nName: a, dtype: int64'
