Precision lost while using read_csv in pandas


Question

I have text files in the format below, which I am trying to read into a pandas DataFrame.

895|2015-4-23|19|10000|LA|0.4677978806|0.4773469340|0.4089938425|0.8224291972|0.8652525793|0.6829942860|0.5139162227|

As you can see, each float in the input file has 10 digits after the decimal point.

import pandas as pd
df = pd.read_csv('mockup.txt', header=None, delimiter='|')

When I read it into a DataFrame, the last 4 digits are missing:

df[5].head()

0    0.467798
1    0.258165
2    0.860384
3    0.803388
4    0.249820
Name: 5, dtype: float64

How can I get the complete precision present in the input file? I have some matrix operations to perform, so I cannot cast the values to strings.

I figured out that I have to do something with dtype, but I am not sure where I should use it.

Recommended answer

It is only a display problem; the values are stored with full precision. See the docs:

# temporarily set display precision
with pd.option_context('display.precision', 10):
    print(df)

     0          1   2      3   4             5            6             7   \
0  895  2015-4-23  19  10000  LA  0.4677978806  0.477346934  0.4089938425   

             8             9            10            11  12  
0  0.8224291972  0.8652525793  0.682994286  0.5139162227 NaN    
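To check that only the display is truncated, the stored value can be inspected directly. The following is a minimal sketch, assuming the sample row from the question is fed through an in-memory buffer in place of 'mockup.txt'; the variable names are illustrative only.

```python
import io

import pandas as pd

# Sample row copied from the question; the buffer stands in for 'mockup.txt'.
sample = (
    "895|2015-4-23|19|10000|LA|0.4677978806|0.4773469340|0.4089938425|"
    "0.8224291972|0.8652525793|0.6829942860|0.5139162227|\n"
)

df = pd.read_csv(io.StringIO(sample), header=None, delimiter='|')

# The default repr rounds to the 'display.precision' option (6 by default).
shown_default = df[5].to_string()
print(shown_default)

# The underlying float64 still carries all the digits from the file.
value = df.loc[0, 5]
full_digits = format(value, '.10f')
print(full_digits)
```

Formatting the value yourself bypasses the pandas display settings, so it is a quick way to confirm what is actually stored before changing any parsing options.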

(Thank you, Mark Dickinson):

Pandas uses a dedicated decimal-to-binary converter that sacrifices perfect accuracy for the sake of speed, so a parsed value can differ slightly from the nearest float to the text in the file. Passing float_precision='round_trip' to read_csv fixes this. See the documentation for more.
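As a hedged illustration of the float_precision option, the sketch below parses the same sample row with float_precision='round_trip' and compares the result with Python's own float() conversion of the same text. The in-memory buffer and variable names are assumptions made for the example; the option applies to the C parser, which is the one used here.

```python
import io

import pandas as pd

# Sample row copied from the question; the buffer stands in for 'mockup.txt'.
sample = (
    "895|2015-4-23|19|10000|LA|0.4677978806|0.4773469340|0.4089938425|"
    "0.8224291972|0.8652525793|0.6829942860|0.5139162227|\n"
)

# Ask the parser for the slower, exactly rounded string-to-float conversion.
df_rt = pd.read_csv(
    io.StringIO(sample),
    header=None,
    delimiter='|',
    float_precision='round_trip',
)

parsed = df_rt.loc[0, 5]
expected = float('0.4677978806')  # Python's correctly rounded conversion

# With 'round_trip', the parsed value should match Python's float() exactly.
print(parsed == expected)
```

This only matters when bit-exact agreement with the source text is needed; for most matrix work the default converter's error is far below the 10 digits present in the file.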
