R is adding extra numbers while reading file

Question

I have been trying to read a file that has a date field and a numeric field. The data is in an Excel sheet and looks like this:

Date          X       
1/25/2008     0.0023456
12/23/2008    0.001987

When I read this into R using the readxl::read_xlsx function, the data in R looks like this:

Date          X
1/25/2008     0.0023456000000000
12/23/2008    0.0019870000000000

I have tried limiting the digits using functions like round and format (nsmall = 7), but nothing seems to work. What am I doing wrong? I also tried saving the data as a csv and as a txt file and reading it with read.csv and read.delim, but I get the same result. Any help would be really appreciated!

Recommended answer

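As noted in the comments to the OP and the other answer, this problem is due to the way floating point math is handled on the processor being used to run R, and its interaction with the digits option. A value such as 0.0023456 has no exact binary floating point representation, so R stores the nearest double, and the digits option controls how many significant digits of that stored value are printed. Reading the file does not add anything to the data.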
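The effect can be seen without any file at all. The following is a minimal sketch in base R, using one value from the OP; the variable names `x` and `shown` are illustrative and not part of the original answer. It prints the same stored number at several precisions.

```r
# One of the values from the OP, typed in directly (no file involved)
x <- 0.0023456

# Default printing: options(digits) is 7 unless it has been changed
print(x)

# Ask for 17 significant digits: the binary approximation becomes visible
print(x, digits = 17)

# sprintf() shows even more of the stored value
sprintf("%.20f", x)

# Formatting only affects the display; it returns a character string
shown <- format(x, digits = 7)
shown
```

The number held in `x` is the same in every call; only the amount of it that is displayed changes.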

To illustrate, we'll create an Excel spreadsheet with the data from the OP, and demonstrate what happens as we adjust the options(digits=) setting.

Next, we'll write a short R script to illustrate what happens when we adjust the digits option.

> # first, display the number of significant digits set in R
> getOption("digits")
[1] 7
> 
> # Next, read data file from Excel
> library(xlsx)
> 
> theData <- read.xlsx("./data/smallNumbers.xlsx",1,header=TRUE)
> 
> head(theData)
        Date         X
1 2008-01-25 0.0023456
2 2008-12-23 0.0019870
> 
> # change digits to larger number to replicate SO question
> options(digits=17)
> getOption("digits")
[1] 17
> head(theData)
        Date                     X
1 2008-01-25 0.0023456000000000002
2 2008-12-23 0.0019870000000000001
>
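For readers who do not have the spreadsheet or the xlsx package, the same comparison can be approximated with a data frame built by hand. This is a sketch under that assumption: the data frame is typed in rather than read from ./data/smallNumbers.xlsx, and the digits argument of print() is used instead of changing the global option.

```r
# Rebuild the OP's two rows by hand instead of reading the Excel file
theData <- data.frame(
  Date = as.Date(c("2008-01-25", "2008-12-23")),
  X = c(0.0023456, 0.0019870)
)

# Print with 7 significant digits (R's default setting)
print(theData, digits = 7)

# Print with 17 significant digits, without touching options()
print(theData, digits = 17)
```

Passing digits to print() affects only that call, so the session's global setting is left alone.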

However, the behavior of printing significant digits varies by processor / operating system. On a machine with an Intel i7-6500U processor running Microsoft Windows 10, setting options(digits=16) results in the following:

> # what happens when we set digits = 16?
> options(digits=16)
> getOption("digits")
[1] 16
> head(theData)
        Date         X
1 2008-01-25 0.0023456
2 2008-12-23 0.0019870
> 
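In practice this suggests leaving the stored values alone and controlling only how they are displayed or compared. The sketch below uses base R functions; the variable names are illustrative, and it assumes seven decimal places are what the OP wants to see.

```r
# The two values from the OP
x <- c(0.0023456, 0.0019870)

# Restore R's default number of significant digits for printing
options(digits = 7)
print(x)

# Fixed number of decimals for display; the result is character, not numeric
shown <- sprintf("%.7f", x)
shown

# Compare doubles with a tolerance instead of exact equality
exact <- (0.1 + 0.2) == 0.3
close_enough <- isTRUE(all.equal(0.1 + 0.2, 0.3))
c(exact = exact, close_enough = close_enough)
```

round() also leaves a nearest-double approximation behind, which may be why it did not appear to help in the question: the extra digits show up again whenever enough significant digits are requested.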
