data.table R fwrite bug on Red Hat Linux

Question

I have been using data.table (v1.10) and noticed a bug when using fwrite. Some background:

sessionInfo()
R version 3.1.3 (2015-03-09)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux Server release 6.7 (Santiago)

The machine has multiple cores.


Generate some data




library(data.table)

#Generate some data
rows <- 2500000
set.seed(Sys.time())  # time-based seed, so each run produces different data
DF <- data.frame(index = 1:rows,
             catsA = sample(letters[1:10], rows, replace = TRUE),
             catsB = sample(letters[1:10], rows, replace = TRUE),
             catsC = sample(letters[1:10], rows, replace = TRUE),
             catsD = sample(letters[1:10], rows, replace = TRUE),
             catsE = sample(letters[1:10], rows, replace = TRUE),
             valueA = round(rnorm(rows), 3),
             valueB = rpois(rows, lambda = 4))

#Convert to data.table
DT <- data.table(DF)
#Create a new column (:= adds it to DT by reference)
DT[, valueNew := valueA * valueB]

#Write the same table with both writers
write.csv(DT, file = "DT_write_csv.csv", row.names = FALSE)
fwrite(DT, file = "DT_fwrite.csv", row.names = FALSE)




Read back in and join




#Read back in and join
DT_csv <- fread("DT_write_csv.csv")
DT_fwrite <- fread("DT_fwrite.csv")

setkey(DT_csv,"index")
setkey(DT_fwrite,"index")
join_DT <- DT_csv[DT_fwrite]




Compare




nrow(join_DT[valueNew != i.valueNew])
[1] 1
join_DT[valueNew != i.valueNew,.(index,valueNew,i.valueNew)]
   index valueNew i.valueNew
1: 67097    2.855       5.71
DT[index==67097,.(valueNew)]
   valueNew
1:    2.855 

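The Compare step shows that the original DT holds 2.855 for index 67097, while the file written by fwrite holds 5.71, so fwrite corrupted the value. Sometimes more than one row is affected, and in a real-life example the corruption spread across many columns.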
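The same check can be done without the join by comparing the written file against the in-memory table directly. The following is only a minimal sketch: `roundtrip_diff` is a hypothetical helper (not part of data.table), and the tolerance `tol` and the small test table are assumptions chosen for illustration.

```r
library(data.table)

# Hypothetical helper (not part of data.table): write DT with fwrite,
# read it back with fread, and list the rows whose numeric column `col`
# differs from the in-memory original by more than `tol`.
roundtrip_diff <- function(DT, col, path = tempfile(fileext = ".csv"), tol = 1e-9) {
  fwrite(DT, file = path)
  back <- fread(path)
  stopifnot(nrow(back) == nrow(DT))
  bad <- which(abs(DT[[col]] - back[[col]]) > tol)
  data.table(row = bad,
             original = DT[[col]][bad],
             written = back[[col]][bad])
}

# Small illustrative table with the same kind of column as valueNew
set.seed(1)
small <- data.table(index = 1:1000,
                    valueNew = round(rnorm(1000), 3) * rpois(1000, lambda = 4))

mismatches <- roundtrip_diff(small, "valueNew")
nrow(mismatches)  # number of corrupted rows; none are expected when the write is correct
```

Comparing against the original table rather than against a second file makes it clear which writer produced the wrong value.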

Am I doing something wrong with fwrite?

Recommended answer

Yes, there is a bug in fwrite. It was fixed in dev last week and I'll try to get it to CRAN soon. Please check the NEWS link at the top of the homepage, bug fix item 3:


fwrite() could write floating point values incorrectly, #1968. A thread-local variable was incorrectly thread-global. This variable's usage lifetime is only a few clock cycles so it needed large data and many threads for several threads to overlap their usage of it and cause the problem. Many thanks to @mgahan and @jmosser for finding and reporting.
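Since the problem needs several threads to overlap, one possible stopgap on an affected version is to write with a single thread. This is an assumption drawn from the NEWS description, not something the answer itself recommends, and it gives up fwrite's parallel speed. A minimal sketch, using a small made-up table:

```r
library(data.table)

# Made-up example data; the values are illustrative only
DT <- data.table(index = 1:5,
                 valueNew = c(2.855, 5.71, 0, -1.234, 8.565))
out <- tempfile(fileext = ".csv")

# Assumption: with one thread there is no concurrent use of the shared
# variable, so the overlap described in the NEWS item cannot occur.
fwrite(DT, file = out, nThread = 1)

# Alternatively, cap data.table's threads for the session, then restore
old_threads <- getDTthreads()
setDTthreads(1)
fwrite(DT, file = out)
setDTthreads(old_threads)

# Read back and confirm the values survived the round trip
check <- fread(out)
isTRUE(all.equal(check$valueNew, DT$valueNew))
```

Upgrading to a version that contains the fix remains the proper solution; the single-thread write is only a way to keep working in the meantime.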

Please try the dev version by typing the command here. I know that dev is currently failing on Travis (for an unrelated reason), which is why the installation command has been set up to install the last-passing commit from dev, so it should be OK.
