How to read when delimiter is space and missing values are blank?


Problem description

I have a space-delimited file in which some columns are blank, so a missing value shows up as two consecutive spaces. fread fails with an error on this input, but read.table works fine. See the example:

library(data.table)
# R version 3.4.2 (2017-09-28)
# data.table_1.10.4-3

fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)

Error in fread("A B C D\n1 2  3\n4 5 6 7") : 
  Expected sep (' ') but new line, EOF (or other non printing character) ends field 2 when detecting types from point 0: 1 2  3

read.table(text ="A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE)
#   A B  C D
# 1 1 2 NA 3
# 2 4 5  6 7

How can I read this with fread? I tried setting sep = " " and na.strings = "", but that didn't help.

Recommended answer

In fread, strip.white is set to TRUE by default, meaning leading and trailing spaces are removed. That is useful for reading fixed-width files or files that use an irregular number of spaces as the separator.

In read.table, strip.white defaults to FALSE.

fread("A B C D
1 2  3
4 5 6 7", sep = " ", header = TRUE, strip.white = FALSE)
#    A B  C D
# 1: 1 2 NA 3
# 2: 4 5  6 7
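To see why the doubled space matters, base R's count.fields can report how many fields each line contains under two readings of whitespace. This is only an illustrative sketch of base R behaviour, not of fread's internal logic, and the helper name fields_per_line is made up for this example.

```r
txt <- "A B C D
1 2  3
4 5 6 7"

# Hypothetical helper: count the fields on each line for a given separator.
fields_per_line <- function(text, sep) {
  con <- textConnection(text)
  on.exit(close(con))
  count.fields(con, sep = sep)
}

# sep = " ": every single space ends a field, so the double space
# in the second line encloses an empty (missing) field.
single_space <- fields_per_line(txt, sep = " ")

# sep = "": any run of whitespace counts as one separator, so the
# second line appears to be one field short.
any_whitespace <- fields_per_line(txt, sep = "")

print(single_space)
print(any_whitespace)
```

With a single space as the separator every line has four fields; when runs of whitespace are collapsed, the second line has only three, so the blank value can no longer be placed in the right column.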


Note: I'm providing a self-answer because I couldn't find a relevant post, and this has tripped me up more than once.

This no longer works with data.table_1.12.2; see the related GitHub issue.
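Where the strip.white approach fails on a given data.table version, one possible fallback is to parse the text with read.table, which the question shows handling this input, and then convert the result. This is a sketch that sidesteps fread rather than fixing it; the variable names are illustrative, and it assumes the input is small enough that read.table's speed is acceptable.

```r
library(data.table)

txt <- "A B C D
1 2  3
4 5 6 7"

# Base R treats each single space as a separator, so the doubled
# space yields an empty field that is read as NA.
df <- read.table(text = txt, sep = " ", header = TRUE)

# Convert the data.frame to a data.table by reference.
dt <- setDT(df)
print(dt)
```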
