fread and column with a trailing backslash


Problem description

I have a problem with fread() reading a column of directory paths that use "\" as the directory separator. A trailing directory separator at the end of a quoted field makes fread() throw an error.

For the below example csv file,

file,size
"windows\user",123

both fread() and read.csv() agree, and both print the \ as \\:

> fread("example.csv")
            file size
1: windows\\user  123
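
A note on the doubled backslash in the output above: at least for read.csv(), this is most likely R's print escaping rather than a change to the data. The following is a minimal base R sketch of that assumption; the temporary file and the variable names are illustrative and not part of the original question.

```r
# Write the sample file with a single literal backslash in the path.
path <- tempfile(fileext = ".csv")
writeLines(c("file,size", "\"windows\\user\",123"), path)

# read.csv() does not treat backslashes as escapes by default
# (allowEscapes = FALSE), so the field is read literally.
df <- read.csv(path, stringsAsFactors = FALSE)

print(df$file)             # print() escapes the backslash for display
cat(df$file, "\n")         # cat() writes the raw characters
n_chars <- nchar(df$file)  # the backslash counts as one character
```

If the assumption holds, nchar() counts 12 characters for the field, so the stored string contains one backslash even though print() shows two.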

However, for the following example fread() gives an error while read.csv() is fine.

file,size
"windows\user\",123

read.csv() gives

> read.csv("example.csv")
             file size
1 windows\\user\\  123

While the fread() error looks like this

> fread("example.csv",verbose=TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.000 GB
File is opened and mapped ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Using line 2 to detect sep (the last non blank line in the first 'autostart') ... sep=','
Found 2 columns
First row with 2 fields occurs on line 1 (either column names or first row of data)
All the fields on line 1 are character fields. Treating as the column names.
Count of eol after first data row: 2
Subtracted 1 for last eol and any trailing empty lines, leaving 1 data rows
Error in fread("example.csv", verbose = TRUE) : 
' ends field 1 on line 1 when detecting types: "windows\user\",123

I would really like to avoid doing

DT = data.table(read.csv("example.csv"))

if at all possible.

Solution

Now fixed in v1.9.3 on GitHub.

  • fread() now accepts trailing backslash in quoted fields. Thanks to user2970844 for highlighting.

$ cat example.csv
file,size
"windows\user\",123

> require(data.table)
> fread("example.csv")
              file size
1: windows\\user\\  123
> read.csv("example.csv")
             file size
1 windows\\user\\  123
> 
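
To try this locally, the example file can be recreated and read only when a data.table version with the fix is installed. This is a sketch under stated assumptions: the temporary path and the has_fix flag are illustrative, and the version threshold is taken from the answer above.

```r
# Recreate the example file: one quoted field ending in a backslash.
path <- tempfile(fileext = ".csv")
writeLines(c("file,size", "\"windows\\user\\\",123"), path)

raw_line <- readLines(path)[2]  # the data line exactly as stored on disk

# Only call fread() when data.table is installed and at least v1.9.3,
# the version the answer names as containing the fix.
has_fix <- requireNamespace("data.table", quietly = TRUE) &&
  utils::packageVersion("data.table") >= "1.9.3"

if (has_fix) {
  DT <- data.table::fread(path)
  print(DT)
} else {
  message("data.table >= 1.9.3 not available; skipping fread()")
}
```

On an older data.table, data.table(read.csv(path)) from the question remains the fallback.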
