Converting exponential notation numbers to strings - explanation

Question

I have a DataFrame from this question:

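from io import StringIO
import pandas as pd

temp=u"""Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,2.04254e+14
2,82.2,2.04188e+14
3,42.9,2.04171e+14"""
# pd.compat.StringIO is not available in recent pandas versions; io.StringIO replaces it
df = pd.read_csv(StringIO(temp))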

print (df)
   Total  Price      test_num
0      0   71.7  2.042560e+14
1      1   39.5  2.042540e+14
2      2   82.2  2.041880e+14
3      3   42.9  2.041710e+14

Converting the floats to strings gives a trailing .0:

print (df['test_num'].astype('str'))
0    204256000000000.0
1    204254000000000.0
2    204188000000000.0
3    204171000000000.0
Name: test_num, dtype: object

The solution is to convert the floats to int64 first:

print (df['test_num'].astype('int64'))
0    204256000000000
1    204254000000000
2    204188000000000
3    204171000000000
Name: test_num, dtype: int64

print (df['test_num'].astype('int64').astype(str))
0    204256000000000
1    204254000000000
2    204188000000000
3    204171000000000
Name: test_num, dtype: object


The question is: why does it convert this way?

I added this poor explanation, but I feel it should be better:

Poor explanation:

You can check the dtype of the column after reading; it is float64.

print (df['test_num'].dtype)
float64

After converting to string, the exponential notation is removed but the values are still formatted as floats, so a trailing .0 is added:

print (df['test_num'].astype('str'))
0    204256000000000.0
1    204254000000000.0
2    204188000000000.0
3    204171000000000.0
Name: test_num, dtype: object

Recommended answer

When you use pd.read_csv to import data without specifying dtypes, pandas makes an educated guess. In this case it decides that column values like "2.04256e+14" are best represented as floats.
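The inference described above can be checked directly. The following is a minimal sketch, assuming a reasonably recent pandas and a shortened copy of the sample data; the names `csv_text`, `inferred`, and `as_text` are illustrative only. It also shows that asking for `str` on import keeps the original text of the column untouched, which may or may not be what you want.

```python
from io import StringIO

import pandas as pd

# Shortened copy of the sample data (two rows are enough to show the inference)
csv_text = """Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,2.04254e+14
"""

# No dtype given: the parser infers a type for each column
inferred = pd.read_csv(StringIO(csv_text))
print(inferred.dtypes)

# Reading the column as str keeps the text exactly as it appears in the file
as_text = pd.read_csv(StringIO(csv_text), dtype={"test_num": str})
print(as_text["test_num"].tolist())
```

Note that the second variant keeps the exponent notation as text, so it does not by itself produce the plain integer strings asked for in the question.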

Converting that float back to a string adds a ".0". As you correctly write, converting to int64 first fixes this.
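The same effect can be reproduced in plain Python without pandas, which may make the cause easier to see. This is only an illustrative sketch; the variable names are made up for the example.

```python
# The float that results from parsing the text 2.04256e+14
value = float("2.04256e+14")

# str() on a float of this magnitude prints a fractional part, hence the ".0"
as_float_text = str(value)

# str() on an int has no fractional part
as_int_text = str(int(value))

# Formatting with zero decimals is an alternative to the integer cast
as_formatted = format(value, ".0f")

print(as_float_text)  # 204256000000000.0
print(as_int_text)    # 204256000000000
print(as_formatted)   # 204256000000000
```

The cast through int is exact here because the values are whole numbers well below 2**53, the limit up to which a float64 can represent every integer exactly.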

If you know before reading that the column holds only int64 values (and no empty values, which np.int64 cannot handle), you can force this type on import to avoid the unneeded conversions.

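from io import StringIO

import numpy as np
import pandas as pd

temp=u"""Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,2.04254e+14
2,82.2,2.04188e+14
3,42.9,2.04171e+14"""

# The dtype key 2 refers to the third column (test_num) by position.
# Some pandas versions may refuse to parse exponent-notation text as int64 here;
# in that case read the column as float and call astype('int64') afterwards.
df = pd.read_csv(StringIO(temp), dtype={2: np.int64})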

print(df)

Returns:

   Total  Price         test_num
0      0   71.7  204256000000000
1      1   39.5  204254000000000
2      2   82.2  204188000000000
3      3   42.9  204171000000000
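The caveat about empty values can be worked around with pandas' nullable integer dtype. The sketch below assumes a pandas version that supports "Int64" (capital I) and uses a hypothetical variant of the sample data with one empty test_num field; it is one possible approach, not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Hypothetical variant of the sample data with one empty test_num field
csv_text = """Total,Price,test_num
0,71.7,2.04256e+14
1,39.5,
2,82.2,2.04188e+14
"""

# test_num is inferred as float64 and the empty field becomes NaN
df = pd.read_csv(StringIO(csv_text))

# "Int64" is the nullable integer dtype; the gap is kept as a missing value
nullable = df["test_num"].astype("Int64")
print(nullable)

# Convert only the values that are present to plain integer strings
text = nullable.dropna().astype("int64").astype(str)
print(text.tolist())
```

Dropping the missing rows before the string conversion is a design choice for this example; keeping them and deciding how a missing value should be rendered is equally valid.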
