How to read when delimiter is space and missing values are blank?
Question
I have a space-delimited file in which some fields are blank, so a missing value shows up as two consecutive spaces. fread fails with an error on this input, but read.table reads it fine. See the example:
library(data.table)
# R version 3.4.2 (2017-09-28)
# data.table_1.10.4-3
fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
Error in fread("A B C D\n1 2  3\n4 5 6 7") :
  Expected sep (' ') but new line, EOF (or other non printing character) ends field 2 when detecting types from point 0: 1 2  3
read.table(text = "A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
#   A B  C D
# 1 1 2 NA 3
# 2 4 5  6 7
How can this be read with fread? I tried setting sep = " " and na.strings = "", but that didn't help.
Recommended answer
In fread, strip.white is set to TRUE by default, meaning leading and trailing spaces are removed. That is useful for reading fixed-width files or files that use an irregular number of spaces as the separator. In read.table, by contrast, the default for strip.white is FALSE.
fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE, strip.white = FALSE)
#    A B  C D
# 1: 1 2 NA 3
# 2: 4 5  6 7
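To see why the blank field matters, here is a minimal base R sketch that splits the first data row on every single space. It only illustrates the idea of strict single-space separation; it is not how fread or read.table are implemented, and the variable names are made up for the example.

```r
# The first data row from the question: two spaces between 2 and 3
# stand for one empty field.
row <- "1 2  3"

# Split on every single space (fixed = TRUE: treat " " literally, not as a regex).
fields <- strsplit(row, " ", fixed = TRUE)[[1]]
print(fields)

# The empty string in position 3 becomes NA when converted to a number.
values <- as.integer(fields)
print(values)
```

Splitting this way yields four fields, matching the four header columns, with the third one empty.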
Note: I'm providing a self-answer because I couldn't find a relevant post, and this has tripped me up more than once.
Update: this no longer works with data.table_1.12.2; see the related GitHub issue.
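If strip.white = FALSE does not help in your data.table version, one possible fallback is to read the data with read.table, which the question shows handling this input, and then convert the result to a data.table. This is a sketch under that assumption, not a fix for fread itself; txt, df and dt are names chosen for the example.

```r
library(data.table)

# Same sample input as in the question; note the two spaces between 2 and 3.
txt <- "A B C D
1 2  3
4 5 6 7"

# With sep = " ", read.table treats every single space as a field boundary,
# so the blank field in the first row is read as NA.
df <- read.table(text = txt, sep = " ", header = TRUE)

# Convert the data.frame to a data.table (this makes a copy).
dt <- as.data.table(df)
print(dt)
```

For a large file, read.table will be slower than fread, so this is mainly useful when correctness matters more than speed.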