Strange error from data.table::fread using sed


Question

I think this is an accurate title, but feel free to change it if it can be worded better. I am running the following commands using data.table::fread.

fread("sed 's+0/0+0+g' R.test.txt > R.test.edit.txt")
fread("sed 's+0/1+1+g' R.test.edit.txt > R.test.edit2.txt")
fread("sed 's+1/1+2+g' R.test.edit2.txt > R.test.edit3.txt")
fread("sed 's+./.+0.01+g' R.test.edit3.txt > R.test.edit.final.txt")

After each line I get the following message:

Warning messages:
1: In fread("sed 's+0/0+0+g' /R/R.test.small.txt > /R/R.test.edit.small.txt") :
  File '/path/to/tmp/RtmpwqJu82/file7e7e250b96bf' has size 0. Returning a NULL data.table.
2: In fread("sed 's+0/1+1+g' /R/R.test.edit.small.txt > /R/R.test.edit2.small.txt") :
  File '/path/to/tmp/RtmpwqJu82/file7e7e8456d82' has size 0. Returning a NULL data.table.
3: In fread("sed 's+1/1+2+g' /R/R.test.edit2.small.txt > /R/R.test.edit3.small.txt") :
  File '/path/to/tmp/RtmpwqJu82/file7e7e3f96bc35' has size 0. Returning a NULL data.table.
4: In fread("sed 's+./.+0.01+g' /R/R.test.edit3.small.txt > /R/R.test.edit.final.small.txt") :
  File '/path/to/tmp/RtmpwqJu82/file7e7e302a3cde' has size 0. Returning a NULL data.table.

So it is weird: fread creates all the files I need when I run it on my laptop, but it gives that warning for each file. When I run the script on our cluster, it crashes with the following message.

> fread("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt")
Error in fread("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt") : 
  File is empty: /dev/shm/file38d161d613c
Execution halted

I think this is related to the warnings I get when I run the script on my laptop. It is probably a user issue, but maybe it is a bug. Does anyone have any ideas? I thought of a workaround using the following:

library(peakRAM)

start_time <- Sys.time()
print(start_time)
# Run each sed step through the shell; peakRAM reports elapsed time and peak RAM per call
peakRAM(system("sed 's+0/0+0+g' /R/R.test.txt > /R/R.test.edit.txt"),
        system("sed 's+0/1+1+g' /R/R.test.edit.txt > /R/R.test.edit2.txt"),
        system("sed 's+1/1+2+g' /R/R.test.edit2.txt > /R/R.test.edit3.txt"),
        system("sed 's+./.+0.01+g' /R/R.test.edit3.txt > /R/R.test.edit.final.txt"))
end_time <- Sys.time()
print(end_time)

And this works fine, so I don't think there is a problem with sed or anything like that. I am just wondering what I am doing wrong when I use fread.

Recommended answer

The comments above are correct about what to do. I tried looking in the documentation for fread but didn't find anything helpful for you, so I filed an issue to improve it. Thanks!

When you pass a terminal command to fread, it automatically creates a temporary file for you in the background. The exact line, stylized, is:

system(paste0('(', cmd, ') > ', tmpFile <- tempfile(tmpdir = tmpdir)))

Then fread is applied to that file. Because your command already redirects sed's output to R.test.edit.txt, nothing is left on standard output, so the file produced by your command with > tmpFile appended has size 0.
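The mechanism can be reproduced without fread at all. The following is a minimal sketch in base R, assuming a POSIX shell with sed on the PATH. The scratch directory, the sample data, and the variable names are hypothetical and chosen only for illustration; the wrapping step imitates the stylized line above and is not fread's actual source.

```r
# Hypothetical scratch directory and sample input, for illustration only
work_dir <- tempfile("fread_demo_")
dir.create(work_dir)
in_file  <- file.path(work_dir, "R.test.txt")
out_file <- file.path(work_dir, "R.test.edit.txt")
writeLines(c("id\ts1", "a\t0/0"), in_file)

# The command as written in the question: it already redirects stdout to a file
cmd <- paste("sed 's+0/0+0+g'", shQuote(in_file), ">", shQuote(out_file))

# Imitate what fread does with a command: wrap it and redirect to a temp file
tmp_file <- tempfile(tmpdir = work_dir)
system(paste0("(", cmd, ") > ", shQuote(tmp_file)))

print(file.size(out_file))  # non-zero: sed's output was captured by the inner redirect
print(file.size(tmp_file))  # zero: nothing reached the outer redirect
```

The inner redirect consumes all of sed's output, so the temporary file that fread would go on to read is created but stays empty.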

If you actually want to keep those intermediate files (e.g. R.test.edit.txt), you have two options: (1) first run the command yourself, e.g. system("sed 's+0/0+0+g' R.test.txt > R.test.edit.txt"), then run fread on the output file; or (2) [available on the development version only for now; see the Installation wiki] supply the tmpdir argument to fread and omit the > R.test.edit.txt part; fread will do the outputting itself for you.
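Option (1) might look like the sketch below. It assumes data.table is installed and sed is available. The scratch directory and the three-line sample file are made up for illustration and merely stand in for the real R.test.txt.

```r
library(data.table)

# Hypothetical scratch directory and sample input, for illustration only
work_dir <- tempfile("fread_demo_")
dir.create(work_dir)
in_file   <- file.path(work_dir, "R.test.txt")
edit_file <- file.path(work_dir, "R.test.edit.txt")
writeLines(c("id\ts1\ts2", "a\t0/0\t0/1", "b\t0/0\t1/1"), in_file)

# Step 1: let the shell write the intermediate file and check the exit status
status <- system(paste("sed 's+0/0+0+g'", shQuote(in_file), ">", shQuote(edit_file)))
stopifnot(status == 0)

# Step 2: read the intermediate file as an ordinary file, not as a command
dt <- fread(file = edit_file)
print(dt)
```

Passing the path through the file argument keeps fread from guessing whether the string is a file name or a shell command.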

If you don't actually care about the intermediate files, simply omit the > R.test.edit.txt part and fread should work as you were expecting, e.g.:

fread("sed 's+0/0+0+g' R.test.txt")
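If none of the intermediate files are needed, the four substitutions from the question could in principle be combined into a single sed call and read in one step. This is a sketch rather than part of the original answer. It assumes a data.table version whose fread has the cmd argument, and the sample data and scratch path are hypothetical.

```r
library(data.table)

# Hypothetical sample input, for illustration only
in_file <- tempfile("R.test_", fileext = ".txt")
writeLines(c("id\ts1\ts2", "a\t0/0\t0/1", "b\t1/1\t./."), in_file)

# Chain the substitutions with -e; sed applies them in order to each line.
# As in the question, the dots in './.' are regex wildcards, not literal dots.
sed_cmd <- paste(
  "sed",
  "-e 's+0/0+0+g'",
  "-e 's+0/1+1+g'",
  "-e 's+1/1+2+g'",
  "-e 's+./.+0.01+g'",
  shQuote(in_file)
)

# No '>' redirect here: sed writes to stdout, which fread captures itself
dt <- fread(cmd = sed_cmd)
print(dt)
```

Because nothing is redirected inside the command, the temporary file that fread creates receives sed's full output.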
