How to check if a .csv file has a comma or a semicolon as separator?

Question

The title says it all:

I have to read in a lot of .csv files automatically. Some use a comma as the delimiter, in which case I use read.csv().

Others use a semicolon as the delimiter, in which case I use read.csv2().

I want to write a piece of code that recognizes whether a .csv file uses a comma or a semicolon as its delimiter (before I read it), so that I don't have to change the code every time. My approach would be something like this:

try to read.csv("xyz")
if error 
read.csv2("xyz")

Is something like that possible? Has somebody done this before? How can I check whether there was an error without actually seeing it?

I hope the question is clear. Sorry for my English.

Thanks in advance.

Recommended answer

Here are a few approaches. They assume that the files differ in only one respect: either the separator is a semicolon and the decimal mark is a comma, or the separator is a comma and the decimal mark is a point.

1) fread. As mentioned in the comments, fread in the data.table package automatically detects the separator (for common separators) and then reads the file using the separator it detected. It can also handle certain other format variations, such as automatically detecting whether the file has a header.
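As a rough illustration of approach (1), the sketch below writes a small made-up semicolon-separated file and lets fread work out the separator. It assumes the data.table package is installed; the file contents and the names `tmp` and `DT` are hypothetical. The demo data deliberately contains no decimal numbers: whether a decimal comma is recognized automatically may depend on the data.table version, so passing `dec = ","` explicitly may be needed for such files.

```r
library(data.table)

# Hypothetical demo file: semicolon-separated, written to a temporary path.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;name", "1;a", "2;b"), tmp)

# With the default sep = "auto", fread inspects the file to choose the separator.
DT <- fread(tmp)
print(names(DT))
```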

2) grepl. Read only the first line, check whether it contains a semicolon, and then read the whole file with the matching function:

# Peek at the first line only, then pick the reader that matches its separator.
L <- readLines("myfile", n = 1)
if (grepl(";", L)) read.csv2("myfile") else read.csv("myfile")
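Since the goal in the question is to read many files without editing the code, the grepl check can be wrapped in a small function and applied to a whole directory. This is only a sketch: the helper name `read_delim_auto`, the temporary directory, and the demo data are all made up for illustration.

```r
# Hypothetical helper: choose read.csv2() or read.csv() from the first line.
read_delim_auto <- function(path) {
  first_line <- readLines(path, n = 1)
  if (grepl(";", first_line, fixed = TRUE)) read.csv2(path) else read.csv(path)
}

# Two small demo files with made-up data, one per format.
demo_dir <- tempfile("csvdemo")
dir.create(demo_dir)
writeLines(c("x,y", "1.5,a", "2.5,b"), file.path(demo_dir, "comma.csv"))
writeLines(c("x;y", "1,5;a", "2,5;b"), file.path(demo_dir, "semicolon.csv"))

# Read every .csv file in the directory with the same call.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read_delim_auto)
names(tables) <- basename(files)
```

Both data frames end up with two columns and a numeric `x`, because read.csv2() also switches the decimal mark to a comma.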

3) count.fields. If we can assume that every file contains more than one field, then finding exactly one field when splitting with sep = ";" tells us that the semicolon is not the separator.

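# Count the fields in the first line when it is split on ";".
L <- readLines("myfile", n = 1)
numfields <- count.fields(textConnection(L), sep = ";")
# A single field means ";" did not split the line, so the file is comma-separated.
if (numfields == 1) read.csv("myfile") else read.csv2("myfile")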
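One possible variation on (3) is to count the fields for both candidate separators and keep the one that yields more fields. Because count.fields respects quotes by default, a semicolon inside a quoted header field is not counted as a separator, which a plain grepl check would get wrong. The function name `guess_sep` and the demo files are hypothetical, and the sketch assumes the first line is a non-empty header.

```r
# Hypothetical helper: return the candidate separator that splits the
# first line into the most fields.
guess_sep <- function(path, candidates = c(",", ";")) {
  first_line <- readLines(path, n = 1)
  counts <- vapply(candidates, function(s) {
    con <- textConnection(first_line)
    on.exit(close(con))  # the connection was opened here, so close it here
    count.fields(con, sep = s)
  }, numeric(1))
  candidates[which.max(counts)]
}

# Demo: a comma-separated file whose first header field contains a quoted ";".
tmp_comma <- tempfile(fileext = ".csv")
writeLines(c('"a;b",c', "1,2"), tmp_comma)

# Demo: an ordinary semicolon-separated file.
tmp_semi <- tempfile(fileext = ".csv")
writeLines(c("a;b;c", "1;2;3"), tmp_semi)

sep <- guess_sep(tmp_comma)
dat <- if (sep == ";") read.csv2(tmp_comma) else read.csv(tmp_comma)
```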

Update: Added (3) and made improvements to all three.
