Pandas read csv file with float values results in weird rounding and decimal digits

Question

I have a csv file containing numerical values such as 1524.449677. There are always exactly 6 decimal places.

When I import the csv file (and other columns) via pandas read_csv, the column automatically gets the datatype object. My issue is that the values are shown as 2470.6911370000003 when they should be 2470.691137. Similarly, the value 2484.30691 is shown as 2484.3069100000002.

This seems to be a datatype issue in some way. I tried to provide the data type explicitly when importing via read_csv by passing the dtype argument as {'columnname': np.float64}. The issue still did not go away.

How can I get the values imported and shown exactly as they are in the source csv file?

Recommended answer

Pandas uses a dedicated decimal-to-binary converter that sacrifices some accuracy in favor of speed.

Passing float_precision='round_trip' to read_csv fixes this.

Check out this page for more detail on this.
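To illustrate the float_precision option mentioned above, here is a minimal sketch that parses the same text twice. The column name "value" and the in-memory csv text are assumptions made for illustration, and whether the default parse actually differs from the round-trip parse may depend on the pandas version, so no output is claimed for the default case.

```python
import io

import pandas as pd

# Hypothetical sample data; "value" is an assumed column name.
csv_text = "value\n2470.691137\n2484.30691\n"

# Parse with pandas' default converter. Depending on the pandas version,
# the result may differ from the source text in the last binary digit.
df_default = pd.read_csv(io.StringIO(csv_text))

# Parse with the round-trip converter, which yields the same double
# that Python's own float() would produce for each string.
df_exact = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")

# Print both parses side by side for comparison.
for fast, exact in zip(df_default["value"], df_exact["value"]):
    print(repr(float(fast)), repr(float(exact)))
```

Comparing the two printed columns on your own installation shows whether the default converter introduces the extra digits for your data.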

After processing your data, if you want to save it back to a csv file, you can pass float_format="%.nf" to the corresponding method (to_csv in the example below), where n is the number of decimal places to write.

Full example:

import pandas as pd

df_in  = pd.read_csv(source_file, float_precision='round_trip')
df_out = ... # some processing of df_in
df_out.to_csv(target_file, float_format="%.3f") # for 3 decimal places
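As a quick check of the float_format option, the sketch below writes a small frame to a string instead of a file, using six decimal places to match the question. The frame contents and the column name "value" are made up for illustration.

```python
import pandas as pd

# Hypothetical frame; the first value carries the floating-point artifact
# described in the question.
df = pd.DataFrame({"value": [2470.6911370000003, 2484.30691]})

# With no path argument, to_csv returns the csv text as a string.
# "%.6f" writes every float with exactly six decimal places.
csv_out = df.to_csv(index=False, float_format="%.6f")
print(csv_out)
```

Note that float_format only affects the text that is written; the values held in the DataFrame are unchanged.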
