Python Pandas科学符号图标 [英] Python Pandas Scientific Notation Iconsistent

查看:223
本文介绍了Python Pandas科学符号图标的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在研究在64位Ubuntu 14.04上使用Pandas(自从我刚刚发现它)重写一些数据分析代码,并且遇到了一些奇怪的行为.我的数据文件如下所示:

I am looking into rewriting some data analysis code using Pandas (since I just discovered it) on Ubuntu 14.04 64-bit and I have hit upon some strange behaviour. My data files look like this:

26/09/2014  00:00:00    2.423009    -58.864655  3.312355E-7 6.257226E-8 302 305
26/09/2014  00:00:00    2.395637    -62.73302   3.321525E-7 7.065322E-8 302 305
26/09/2014  00:00:01    2.332541    -57.763269  3.285718E-7 6.873837E-8 302 305
26/09/2014  00:00:02    2.366828    -51.900812  3.262279E-7 7.397762E-8 302 305
26/09/2014  00:00:03    2.435500    -40.820161  3.241068E-7 6.777224E-8 302 305
26/09/2014  00:00:04    2.428922    -65.573049  3.212358E-7 6.761804E-8 302 305
26/09/2014  00:00:05    2.419931    -59.414711  3.185517E-7 7.243236E-8 302 305
26/09/2014  00:00:06    2.416663    -60.064279  3.209795E-7 6.242328E-8 302 305
26/09/2014  00:00:07    2.411954    -52.586242  3.184297E-7 5.825581E-8 302 304
26/09/2014  00:00:08    2.457342    -61.874388  3.151493E-7 6.327384E-8 303 304

其中的列用制表符分隔.为了将它们读入Pandas,我使用以下简单命令:

Where columns are tab-separated. In order to read these into Pandas, I am using the following simple commands:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv("path/to/file.dat", sep="\t", header=None)
print data

这将产生以下输出:

            0         1         2          3  4             5    6    7
0  26/09/2014  00:00:00  2.423009 -58.864655  0  6.257226e-08  302  305
1  26/09/2014  00:00:00  2.395637 -62.733020  0  7.065322e-08  302  305
2  26/09/2014  00:00:01  2.332541 -57.763269  0  6.873837e-08  302  305
3  26/09/2014  00:00:02  2.366828 -51.900812  0  7.397762e-08  302  305
4  26/09/2014  00:00:03  2.435500 -40.820161  0  6.777224e-08  302  305
5  26/09/2014  00:00:04  2.428922 -65.573049  0  6.761804e-08  302  305
6  26/09/2014  00:00:05  2.419931 -59.414711  0  7.243236e-08  302  305
7  26/09/2014  00:00:06  2.416663 -60.064279  0  6.242328e-08  302  305
8  26/09/2014  00:00:07  2.411954 -52.586242  0  5.825581e-08  302  304
9  26/09/2014  00:00:08  2.457342 -61.874388  0  6.327384e-08  303  304

[10 rows x 8 columns]

此处要注意的重要事项是第4列.将其与第5列以及原始数据进行比较.第5列以科学计数法表示,而第4列则没有.它不只是将列清零或将其转换为int,原因是:

The important thing to notice here is column 4. Compare it to column 5, and to the original data. Column 5 has been rendered in scientific notation, while column 4 has not. It hasn't just zeroed out the column or converted it to int because:

>>> data[4][0]*1e7
3.3123550000000002

这是我所期望的.因此,数据值相同,但表示形式已更改.如果这只是装饰性的事情,那么我可以忍受,但这让我感到不安,我想知道这里发生了什么.

Which is what I would expect. So the data values are the same but the representation has changed. If this is just a cosmetic thing, then I could put up with it, but it makes me feel uneasy and I'd like to know what's going on here.

推荐答案

是的,这很简单,您可以使用

Yes it's a cosmetic thing, you can change this using set_option:

In [21]:

pd.set_option('display.precision',20)
df[4]
Out[21]:
0    0.0000003312355
1    0.0000003321525
2    0.0000003285718
3    0.0000003262279
4    0.0000003241068
5    0.0000003212358
6    0.0000003185517
7    0.0000003209795
8    0.0000003184297
9    0.0000003151493
Name: 4, dtype: float64

基础数据将不会被截断,并且会保留下来,包括在将数据写回csv时

The underlying data will not have been truncated and will be preserved including when you write the data back out to csv

如果您使用的是iPython,则可以检查默认设置是什么,对于显示精度(有效数字),通常为7.

If you are in iPython then you can check what the default settings are, for display precision (significant digits) it is 7 normally.

这篇关于Python Pandas科学符号图标的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆