读取R中数据集时出错 [英] Error in reading in data set in R

查看:293
本文介绍了读取R中数据集时出错的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在R中的数据集中读取如下:

When reading in my data set in R as follows:

Dataset.df <- read.table("C:\\dataset.txt", header=T)

我收到以下错误消息:

I get the following error message:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   line 1 did not have 145 elements

这是什么意思,可以有人告诉我如何修复它?

What does this mean and can somebody tell me how to fix it?

推荐答案

这个错误是非常自明的,数据文件的第一行似乎有数据丢失(或第二行,视情况而定,因为您使用 header = TRUE )。

This error is pretty self-explanatory. There seem to be data missing in the first line of your data file (or second line, as the case may be since you're using header = TRUE).

这是一个迷你示例:

## Create a small dataset to play with
cat("V1 V2\nFirst 1 2\nSecond 2\nThird 3 8\n", file="test.txt")

R自动检测到它应该期望rownames加上两列(3元素),但它没有在第2行找到3个元素,所以你得到一个错误:

R automatically detects that it should expect rownames plus two columns (3 elements), but it doesn't find 3 elements on line 2, so you get an error:

read.table("test.txt", header = TRUE)
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   line 2 did not have 3 elements

查看数据文件,看看确实有问题:

Look at the data file and see if there is indeed a problem:

cat(readLines("test.txt"), sep = "\n")
# V1 V2
# First 1 2
# Second 2
# Third 3 8

手动更正可能是或者我们可以假设第二行行中的值第一个值应在第一列中,其他值应为 NA 。如果是这种情况, fill = TRUE 足以解决您的问题。

Manual correction might be needed, or we can assume that the value first value in the "Second" row line should be in the first column, and other values should be NA. If this is the case, fill = TRUE is enough to solve your problem.

read.table("test.txt", header = TRUE, fill = TRUE)
#        V1 V2
# First   1  2
# Second  2 NA
# Third   3  8






R也很聪明,即使丢失了rownames,也需要多少元素:


R is also smart enough to figure it out how many elements it needs even if rownames are missing:

cat("V1 V2\n1\n2 5\n3 8\n", file="test2.txt")
cat(readLines("test2.txt"), sep = "\n")
# V1 V2
# 1
# 2 5
# 3 8
read.table("test2.txt", header = TRUE)
# Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#   line 1 did not have 2 elements
read.table("test2.txt", header = TRUE, fill = TRUE)
#   V1 V2
# 1  1 NA
# 2  2  5
# 3  3  8

这篇关于读取R中数据集时出错的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆