Pandas read csv file with float values results in weird rounding and decimal digits
Problem description
I have a csv file containing numerical values such as 1524.449677. There are always exactly 6 decimal places.
When I import the csv file (and other columns) via pandas read_csv, the column automatically gets the datatype object. My issue is that the values are shown as 2470.6911370000003 where the file actually contains 2470.691137. Likewise, the value 2484.30691 is shown as 2484.3069100000002.
This seems to be a datatype issue in some way. I tried to provide the data type explicitly when importing via read_csv by passing the dtype argument as {'columnname': np.float64}. Still, the issue did not go away.
How can I get the values imported and shown exactly as they are in the source csv file?
Recommended answer
Pandas uses a dedicated decimal-to-binary converter that trades some accuracy for speed.
Passing float_precision='round_trip' to read_csv fixes this.
Check out this page for more detail on this.
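To see the effect of float_precision without touching a real file, here is a minimal sketch that parses a small in-memory CSV twice. The column name "value" and the sample text are made up for illustration. Whether the default parser actually shows the extra digits may depend on your pandas version, so the sketch only checks the round_trip result against Python's own float() parsing.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for the real file;
# the column name "value" is an assumption for this sketch.
csv_text = "value\n2470.691137\n2484.30691\n1524.449677\n"

# Default parsing (result may differ between pandas versions).
df_default = pd.read_csv(io.StringIO(csv_text))

# Parsing with the round-trip converter.
df_exact = pd.read_csv(io.StringIO(csv_text), float_precision="round_trip")

# Reference values: what Python's built-in float() yields for the same text.
expected = [float(s) for s in csv_text.split()[1:]]

print(df_default["value"].tolist())
print(df_exact["value"].tolist())

# The round-trip values should match Python's own parsing exactly.
print(df_exact["value"].tolist() == expected)
```

Note that float_precision applies to the C parser engine, which is the default engine for read_csv.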
After processing your data, if you want to save it back to a csv file, you can pass float_format="%.nf" (with n replaced by the number of decimal places you want) to the corresponding method.
Full example:
import pandas as pd
# source_file and target_file are placeholders for your own csv paths
df_in = pd.read_csv(source_file, float_precision='round_trip')  # parse floats exactly
df_out = ... # some processing of df_in
df_out.to_csv(target_file, float_format="%.3f") # for 3 decimal places
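If the data already contains the long values from the question, float_format can still bring the output back to six decimal places. The following is a small sketch under that assumption. The column name "value" is hypothetical, and the frame is written to an in-memory buffer instead of a file so the result can be inspected directly.

```python
import io

import pandas as pd

# The two problematic values from the question, as they appear after a lossy parse.
# The column name "value" is an assumption for this sketch.
df = pd.DataFrame({"value": [2470.6911370000003, 2484.3069100000002]})

# Write to an in-memory buffer, formatting every float with 6 decimal places.
buf = io.StringIO()
df.to_csv(buf, index=False, float_format="%.6f")

lines = buf.getvalue().splitlines()
print(lines)  # ['value', '2470.691137', '2484.306910']
```

A fixed format pads with trailing zeros, so 2484.30691 is written as 2484.306910. This matches a source file that always has exactly 6 decimal places.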