How to read \" double-quote escaped values with read.table in R
Question
I am having trouble reading a file in R that contains lines like the one below.
"_:b5507F4C7x59005","Fabiana D\"atri"
Any ideas? How can I make read.table understand that \" is an escaped quote?
Cheers, Alexandre
Recommended answer
It seems to me that read.table/read.csv cannot handle backslash-escaped quotes.
...But I think I have an (ugly) workaround, inspired by @nullglob:
- First, read the file WITHOUT a quote character. (This won't handle embedded commas, as @Ben Bolker noted.)
- Then go through the string columns and remove the quotes.
The test file looks like this (I added a non-string column for good measure):
13,"foo","Fab D\"atri","bar"
21,"foo2","Fab D\"atri2","bar2"
Here is the code:
# Generate test file
writeLines(c("13,\"foo\",\"Fab D\\\"atri\",\"bar\"",
             "21,\"foo2\",\"Fab D\\\"atri2\",\"bar2\""), "foo.txt")
# Read ignoring quotes
tbl <- read.table("foo.txt", as.is=TRUE, quote='', sep=',', header=FALSE, row.names=NULL)
# Go through and cleanup
for (i in seq_len(NCOL(tbl))) {
  if (is.character(tbl[[i]])) {
    x <- tbl[[i]]
    x <- substr(x, 2, nchar(x)-1)      # Remove surrounding quotes
    tbl[[i]] <- gsub('\\\\"', '"', x)  # Unescape quotes
  }
}
The output is correct:
> tbl
  V1   V2          V3   V4
1 13  foo  Fab D"atri  bar
2 21 foo2 Fab D"atri2 bar2
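The workaround above cannot cope with commas inside quoted fields. As a possible alternative, not part of the original answer, the sketch below rewrites each \" as a CSV-style doubled quote ("") before parsing, since read.csv understands doubled quotes inside quoted fields. The helper name read_backslash_escaped and the sample lines (including the embedded comma) are made up for illustration. It also assumes the data never contains an escaped backslash directly followed by a quote (\\"), which this simple substitution would mangle.

```r
# Hypothetical helper (not from the original answer): turn backslash-escaped
# quotes (\") into CSV-style doubled quotes ("") so read.csv can parse them.
read_backslash_escaped <- function(lines, ...) {
  # fixed = TRUE matches the literal two characters: a backslash, then a quote
  doubled <- gsub('\\"', '""', lines, fixed = TRUE)
  read.csv(text = doubled, header = FALSE, stringsAsFactors = FALSE, ...)
}

# Illustrative input; for a real file, use readLines("foo.txt") instead
lines <- c('13,"foo","Fab D\\"atri","bar, with comma"',
           '21,"foo2","Fab D\\"atri2","bar2"')
tbl2 <- read_backslash_escaped(lines)
print(tbl2)
```

Because quoting stays enabled during parsing, fields containing the separator stay in one column, and no second pass is needed to strip the surrounding quotes.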