Importing "csv" file with multiple-character separator to R?

Problem description

I have a "csv" text file where each field is separated by \t&%$# which I'm now trying to import into R.

The sep= argument of read.table() insists on a single character. Is there a quick way to import this file directly?

Some of the data fields are user-submitted text containing tabs, quotes, and other messy characters, so changing the delimiter to something simpler seems likely to create other problems.

Solution

The following function splits each line on a regular expression, so it can handle several separator characters:

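# fileName   <- file name with fully qualified path
# separators <- regular expression for strsplit(); alternatives are separated
#               by '|', and regex metacharacters such as '$' must be escaped

read <- function(fileName, separators) {
    con <- file(fileName, open = "r")
    data <- readLines(con)
    close(con)
    # Split every line into its fields
    records <- strsplit(data, split = separators)
    # One row per line; assumes every line has the same number of fields
    dataFrame <- data.frame(do.call(rbind, records), stringsAsFactors = FALSE)
    rownames(dataFrame) <- seq_len(nrow(dataFrame))
    return(dataFrame)
}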
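
Because strsplit() interprets its split argument as a regular expression, characters such as $ in the asker's separator would need escaping. As a minimal sketch of an alternative (not part of the original answer), the separator can be matched literally with fixed = TRUE. The helper name read_multichar and the sample data are hypothetical, and the sketch assumes the \t in the separator stands for a tab character and that every line has the same number of fields.

```r
# Hypothetical helper (not from the original answer): split each line on one
# literal, multi-character separator instead of a regular expression.
read_multichar <- function(fileName, separator) {
    lines <- readLines(fileName)
    # fixed = TRUE makes strsplit() match the separator verbatim
    records <- strsplit(lines, split = separator, fixed = TRUE)
    # Assumes every line has the same number of fields
    data.frame(do.call(rbind, records), stringsAsFactors = FALSE)
}

# Demo with a temporary file; "\t&%$#" here is a tab followed by &%$#
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\t&%$#comment",
             "1\t&%$#has a\ttab and a \"quote\"",
             "2\t&%$#50% & more"), tmp)
result <- read_multichar(tmp, "\t&%$#")
print(result)
unlink(tmp)
```

Fields that contain only part of the separator (a lone tab, % or &) are left intact. Two caveats of this sketch: the first line is treated as data rather than as a header, and strsplit() drops a trailing empty field, so lines ending in the separator would come out one field short.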
