FasterCSV: several separators

Question

My Rails 3 app parses user-uploaded CSV files.
As can be expected, users upload both tab-separated and comma-separated files.
I want to support both.

My code:

input = CSV.read(uploaded_io.tempfile, { encoding: "UTF-8", :col_sep => "\t"})

QUESTION: How can I change it to support commas too?

FasterCSV's documentation describes col_sep as "The String placed between each field.", so :col_sep => ",\t" won't work.

Note: All the data are integers or identifiers, so the probability of a \t or a , appearing within the content (rather than as a delimiter) is zero. Using the two different delimiters in the same file is therefore not something I specifically want to prevent.

Recommended answer

Solution 1:

One simple way to do it is to let users pick, from a drop-down, which separator their CSV file uses, and then pass that value to the CSV.read() call. But I guess you want it to be automatic. :-)

Solution 2:

You can read the first line of the CSV file with plain file I/O (note that File.read() loads the whole file, whereas File.open(path) { |f| f.gets } returns only the first line) and analyze it by matching it against /,/ and then against /\t/. Depending on which RegExp matches, you pick the corresponding single separator and then read the file with CSV.read(..., :col_sep => single_separator).
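The first-line check described above can be sketched as follows. This is only a minimal sketch: detect_col_sep and read_delimited are hypothetical helper names, not part of CSV or FasterCSV, and preferring the tab whenever the first line contains one is an assumption made here to get a deterministic result when both characters appear.

```ruby
require "csv"

# Hypothetical helper (not part of CSV/FasterCSV): guess the column
# separator by inspecting only the first line of the file.
def detect_col_sep(path)
  first_line = File.open(path, "r:UTF-8") { |f| f.gets } || ""
  # Assumption: a tab in the first line means the file is tab-separated;
  # otherwise fall back to the comma.
  first_line.include?("\t") ? "\t" : ","
end

# Hypothetical wrapper: read the whole file with the detected separator.
def read_delimited(path)
  CSV.read(path, encoding: "UTF-8", col_sep: detect_col_sep(path))
end
```

In the question's controller this might be called as read_delimited(uploaded_io.tempfile.path), assuming uploaded_io is the uploaded file object from the question.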

But beware:

At first it looks nice and elegant to use ",\t" as the separator in the method call to allow both, but please note that this would introduce a possibly nasty bug!

If a CSV file contains both tabs and commas, by accident or by chance, what do you do then? Separate on both? How can you be sure? I think that would be a mistake, because separators don't appear "mixed" like this in regular CSV files: it's always either ',' or "\t".
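To make the ambiguity concrete, here is a small illustration. The helper split_variants and the sample row are made up for this example. It parses one row that happens to contain both characters in three different ways:

```ruby
require "csv"

# Hypothetical helper: parse a single row three ways to compare the results.
def split_variants(line)
  {
    comma: CSV.parse_line(line, col_sep: ","),
    tab:   CSV.parse_line(line, col_sep: "\t"),
    both:  line.chomp.split(/[,\t]/) # plain String#split, bypasses CSV entirely
  }
end

variants = split_variants("1,5\t2\n")
p variants[:comma] # => ["1", "5\t2"]
p variants[:tab]   # => ["1,5", "2"]
p variants[:both]  # => ["1", "5", "2"]
```

The same row yields two or three fields depending on the choice, and nothing in the file says which reading was intended.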

So I think you should not use ",\t": it could cause huge problems, and that is probably the reason why the col_sep option was not implemented to accept a RegExp.
