Ruby unable to parse a CSV file: CSV::MalformedCSVError (Illegal quoting in line 1.)


Problem description

Ubuntu 12.04 LTS

Ruby ruby 1.9.3dev (2011-09-23 revision 33323) [i686-linux]

Rails 3.2.9

The following is the content of the CSV file I received:

"date/time","settlement id","type","order id","sku","description","quantity","marketplace","fulfillment","order city","order state","order postal","product sales","shipping credits","gift wrap credits","promotional rebates","sales tax collected","selling fees","fba fees","other transaction fees","other","total"
"Mar 1, 2013 12:03:54 AM PST","5481545091","Order","108-0938567-7009852","ALS2GL36LED","Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor","1","amazon.com","Amazon","Pasadena","CA","91104-1056","43.00","3.25","0","-3.25","0","-6.45","-3.75","0","0","32.80"

However, when I try to parse the CSV file, I get an error:

1.9.3dev :016 > options = { col_sep: ",", quote_char:'"' }
=> {:col_sep=>",", :quote_char=>"\""} 

1.9.3dev :022 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
CSV::MalformedCSVError: Illegal quoting in line 1.
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
    from (irb):22
    from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

Then I tried simplifying the data, i.e.:

"name","age","email"
"jignesh","30","jignesh@example.com"

However, I still get the same error:

      1.9.3dev :023 > CSV.foreach("/tmp/my_data.csv", options) { |row| puts row }
  CSV::MalformedCSVError: Illegal quoting in line 1.
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1925:in `block (2 levels) in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1887:in `block in shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `loop'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1849:in `shift'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1791:in `each'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1208:in `block in foreach'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1354:in `open'
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/lib/ruby/1.9.1/csv.rb:1207:in `foreach'
      from (irb):23
      from /home/jigneshgohel/.rvm/rubies/ruby-1.9.3-rc1/bin/irb:16:in `<main>'

I simplified the data again, like this:

name,age,email
jignesh,30,jignesh@example.com

and it works. See the output below:

  1.9.3dev :024 > CSV.foreach("/tmp/my_data.csv") { |row| puts row }
  name
  age
  email
  jignesh
  30
  jignesh@example.com
   => nil 

But I will be receiving CSV files with quoted data, so removing the quotes is not the solution I am looking for. I am unable to figure out what is causing the error CSV::MalformedCSVError: Illegal quoting in line 1. in my earlier examples.

I have verified that there are no leading/trailing spaces in the CSV by enabling "Show whitespace characters" and "Show Line Endings" in my text editor. I have also verified the encoding as follows:

  1.9.3dev :026 > File.open("/tmp/my_data.csv").read.encoding
  => #<Encoding:UTF-8> 

Note: I tried using CSV.read too, but got the same error with that method.

Can anybody please help me solve this problem and understand where it is going wrong?

=====================

I just found the following post: http://www.ruby-forum.com/topic/448070 and tried the following:

  file_data = file.read
  file_data.gsub!('"', "'")
  arr_of_arrs = CSV.parse(file_data)

  arr_of_arrs.each do |arr|
    Rails.logger.debug "=======#{arr}"
  end

and got the following output:

   =======["\xEF\xBB\xBF'date/time'", "'settlement id'", "'type'", "'order id'", "'sku'", "'description'", "'quantity'", "'marketplace'", "'fulfillment'", "'order city'", "'order state'", "'order postal'", "'product sales'", "'shipping credits'", "'gift wrap credits'", "'promotional rebates'", "'sales tax collected'", "'selling fees'", "'fba fees'", "'other transaction fees'", "'other'", "'total'"]
    =======["'Mar 1", " 2013 12:03:54 AM PST'", "'5481545091'", "'Order'", "'108-0938567-7009852'", "'ALS2GL36LED'", "'Solar Two Directional 36 Bright White LED Security Flood Light with Motion Activated Sensor'", "'1'", "'amazon.com'", "'Amazon'", "'Pasadena'", "'CA'", "'91104-1056'", "'43.00'", "'3.25'", "'0'", "'-3.25'", "'0'", "'-6.45'", "'-3.75'", "'0'", "'0'", "'32.80'"]

which did not read the data properly: with the double quotes replaced by single quotes, the comma inside the date field is treated as a separator, because the default col_sep is a comma. I then tried using the quote_char option like this:

  arr_of_arrs = CSV.parse(file_data, :quote_char => "'")

but it ended up with the following error:

   CSV::MalformedCSVError (Illegal quoting in line 1.):

Thanks, Jignesh

Solution

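require "csv"

# Candidate quote characters, tried in order until one parses cleanly.
quote_chars = %w(" | ~ ^ & *)
begin
  @report = CSV.read(csv_file, headers: :first_row, quote_char: quote_chars.shift)
rescue CSV::MalformedCSVError
  # Re-raise once every candidate has failed; otherwise retry with the next one.
  quote_chars.empty? ? raise : retry
end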

It's not perfect, but it works most of the time.
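
One way this fallback can be imperfect: when a character other than the double quote ends up being used as quote_char, double quotes in the data are treated as ordinary characters, so they stay in the field values and a comma inside a quoted field splits it. A small sketch with hypothetical in-memory data (the sample line is made up for illustration):

```ruby
require "csv"

# Hypothetical one-line sample with a comma inside a quoted field.
data = %("Mar 1, 2013","Order"\n)

# With the default quote character the line parses into two fields.
default_row = CSV.parse(data).first

# With "|" as quote_char the double quotes are plain data, so the comma
# inside the date splits it and the quotes stay in the values.
fallback_row = CSV.parse(data, quote_char: "|").first

puts default_row.inspect
puts fallback_row.inspect
```

So a parse that succeeds after a retry is worth checking for stray quotes or an unexpected number of columns.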

N.B. CSV.parse takes the same parameters as CSV.read, so either a file or data from memory can be used.
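
The debug output in the question shows the first field starting with "\xEF\xBB\xBF", which is the UTF-8 byte order mark (BOM). A BOM sitting in front of the opening quote of the first field would plausibly explain why the error is reported for line 1 even though the quoting looks valid. A minimal sketch, assuming that is the cause; the sample data and the strip_bom helper are hypothetical and not part of the original answer:

```ruby
require "csv"

# Hypothetical sample mirroring the simplified file from the question,
# with a UTF-8 byte order mark (U+FEFF) in front of the first quote.
raw = "\uFEFF" + %("name","age","email"\n"jignesh","30","jignesh@example.com"\n)

# Hypothetical helper: drop a leading BOM, if there is one.
def strip_bom(text)
  text.sub(/\A\uFEFF/, "")
end

# Parse the cleaned text with the default double-quote quote_char.
rows = CSV.parse(strip_bom(raw))
rows.each { |row| puts row.inspect }
```

When reading from a file, another option is to let Ruby skip the BOM while opening it, for example with File.open(path, "r:bom|utf-8"), and pass the resulting string to CSV.parse.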
