Uploaded file char-set conversion with Ruby

Question

I have an application where our clients upload a CSV file to our server. We then process the file and load its data into our database. We're running into character-set issues, especially when dealing with JSON: some characters that were never converted to UTF-8 are breaking JSON responses in IE.

Is there a way to convert the uploaded CSV file to UTF-8 before we start processing it? Is there a way to determine the character encoding of an uploaded file? I've played with iconv a bit, but we can't always be sure what encoding an uploaded file will have. Thanks.

Recommended answer

This solution might not be ideal, but it should do the job.

First, the ingredients:

  • chardet (sudo gem install chardet)
  • fastercsv (sudo gem install fastercsv)

Now the actual code (not tested):

require 'rubygems'
require 'UniversalDetector'
require 'fastercsv'
require 'iconv'

File.open("path/to/your.csv") do |file_to_import|
  # determine the encoding based on the first 100 bytes
  chardet = UniversalDetector::chardet(file_to_import.read(100).to_s)
  if chardet['confidence'] > 0.7
    charset = chardet['encoding']
  else
    raise 'You better check this file manually.'
  end
  # reading the sample moved the file position, so go back to the start
  file_to_import.rewind
  file_to_import.each_line do |l|
    # convert each line to UTF-8 before parsing it as CSV
    converted_line = Iconv.conv('utf-8', charset, l)
    row = FasterCSV.parse(converted_line)[0]
    # do the business here
  end
end
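
On Ruby 1.9 and later, FasterCSV became the standard csv library, and Iconv was removed from the standard library in Ruby 2.0, so the same idea can be sketched with built-in string encoding methods only. This version does not detect the encoding the way chardet does. It only tries a list of candidates in order, and both the helper names (to_utf8, each_utf8_row) and the candidate list (UTF-8, then Windows-1252) are assumptions for illustration. A file in some other encoding whose bytes happen to be valid Windows-1252 would be misread, so treat this as a fallback heuristic and not a definitive implementation.

```ruby
require 'csv'

# Hypothetical helper: re-encode raw bytes as UTF-8.
# candidates is an assumed list of encodings, tried in order.
def to_utf8(raw, candidates = ['UTF-8', 'Windows-1252'])
  candidates.each do |name|
    # Reinterpret the same bytes under the candidate encoding.
    attempt = raw.dup.force_encoding(name)
    next unless attempt.valid_encoding?
    begin
      return attempt.encode('UTF-8')
    rescue EncodingError
      # Bytes with no mapping in this candidate; try the next one.
      next
    end
  end
  raise ArgumentError, 'Could not determine the encoding; check the file manually.'
end

# Hypothetical helper: read an uploaded file as bytes, convert it,
# and yield each parsed CSV row.
def each_utf8_row(path, &block)
  CSV.parse(to_utf8(File.binread(path))).each(&block)
end
```

Calling each_utf8_row("path/to/your.csv") { |row| ... } would then hand each row to the block as an array of UTF-8 strings. Converting the whole file before parsing, instead of line by line, also keeps quoted fields that contain line breaks intact.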
