Cannot read file with "#" and space using read.table or read.csv in R
Problem description
I have a file whose first row is a header. The header can contain spaces and the # symbol (and possibly other special characters). I am trying to read this file using read.csv or read.table, but it keeps throwing errors:
undefined columns selected
more columns than column names
My tab-delimited chromFile file looks like:
Chromosome# Chr chr Size UCSC NCBI36/hg18 NCBIBuild36 NCBIBuild37
1 Chr1 chr1 247199719 247249719 247249719 249250621
2 Chr2 chr2 242751149 242951149 242951149 243199373
Command:
chromosomes <- read.csv(chromFile, sep="\t",skip =0, header = TRUE, )
I would first like to find a way to read the file as it is, without replacing the space or # with some other readable symbol.
Recommended answer
From the documentation (?read.csv):
comment.char character: a character vector of length one containing a single character or an empty string. Use "" to turn off the interpretation of comments altogether.
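For read.table the default is comment.char = "#", which is what is causing you trouble: everything after the # in the header line is discarded, so the header ends up with fewer names than the data has columns. (read.csv itself already defaults to comment.char = "", but setting it explicitly does no harm.) Following the documentation, you should use comment.char = "".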
Spaces in the header are another issue which, as mrdwab kindly pointed out, can be addressed by setting check.names = FALSE. Otherwise the names are passed through make.names, which replaces spaces, # and other invalid characters with dots.
chromosomes <- read.csv(chromFile, sep = "\t", skip = 0, header = TRUE,
comment.char = "", check.names = FALSE)
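To check the behaviour without the original file, here is a minimal, self-contained sketch. The file contents and the names tmp, broken and fixed are made up for illustration, and it uses read.table, where "#" is the default comment character. The first call is expected to fail because the header is cut off at the "#". The second reads the header unchanged.

```r
# Write a small tab-delimited file whose header contains '#', a space and '/'.
# The contents are invented for illustration; they are not the asker's file.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Chromosome#\tChr name\tNCBI36/hg18",
  "1\tChr1\t247249719",
  "2\tChr2\t242951149"
), tmp)

# With read.table's default comment.char = "#", the header is truncated after
# "Chromosome", so it no longer matches the three data columns.
# tryCatch stores the error message instead of stopping the script.
broken <- tryCatch(
  read.table(tmp, sep = "\t", header = TRUE),
  error = function(e) conditionMessage(e)
)

# Turn comment handling off and keep the column names exactly as written.
fixed <- read.table(tmp, sep = "\t", header = TRUE,
                    comment.char = "", check.names = FALSE)

print(broken)
print(names(fixed))
```

Because the names are no longer syntactically valid R names, columns have to be accessed with backticks or strings, for example fixed[["Chromosome#"]] or fixed$`Chr name`.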