How can I process data to avoid MySQL "incorrect string value" error?


Question

I am trying to use a Rake task to migrate some legacy data from MS Access to MySQL. I'm working on Windows XP, using Ruby 1.8.6.

I have the encoding for Rails set as "utf8" in database.yml.

Also, the default character set for MySQL is utf8.

99% of the data is coming in fine, but every now and then I'll get a column value that gives me an error something like this:

Mysql::Error: Incorrect string value: '\x92 Comm...' for column 'name' 
  at row 1: 
  INSERT INTO `organizations` ( [...] ) 
  VALUES('Lawyers’ Committee', [...] )

It looks as though the thing that's giving MySQL trouble is the apostrophe immediately after the "s" in the word "Lawyers".

Here's another one:

Mysql::Error: Incorrect string value: '\x99 aoc' for column 'department' 
  at row 1: 
  INSERT INTO `addresses` 
[...]
  'TRInfo™ aoc'
[....]

Looks like it's choking on the "TM" after "TRInfo".

Is there any Ruby or Rails method that I can run the data through to strip out any characters that MySQL will choke on?

Ideally, it would be great to replace them with more palatable characters -- replace the apostrophe with a single quote and the TM symbol with the string "(TM)".

Or, if I could somehow configure MySQL to store those characters as-is without errors, that would be great too.

Recommended answer

It looks like your input data is not in utf-8.

I did a little investigating: the curly apostrophe in "Lawyers’" is encoded as \x92 in Windows-1252, but that byte on its own is not valid UTF-8. (When I decoded it as Windows-1252 and re-encoded it as UTF-8, I got \xe2\x80\x99.)
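
That check can be reproduced with a short script. This is only a sketch: it uses `String#encode` and `String#b`, which exist in Ruby 1.9/2.0 and later but not in the 1.8.6 mentioned in the question, and the helper names (`hex_bytes`, `valid_utf8?`, `from_cp1252`) are made up for illustration.

```ruby
# Hypothetical helper: format a string's bytes as \xNN escapes for inspection.
def hex_bytes(str)
  str.bytes.map { |b| format('\\x%02x', b) }.join
end

# Hypothetical helper: is this byte sequence well-formed UTF-8?
def valid_utf8?(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?
end

# Hypothetical helper: read the bytes as Windows-1252 and transcode to UTF-8.
def from_cp1252(bytes)
  bytes.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)
end

raw = "\x92".b                     # the byte MySQL complained about
puts valid_utf8?(raw)              # false
puts hex_bytes(from_cp1252(raw))   # \xe2\x80\x99
```

A lone 0x92 is a UTF-8 continuation byte with no lead byte in front of it, which is presumably why MySQL rejects it for a utf8 column.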

Thus you will need to convert the input strings from Windows-1252 to UTF-8 (or to Unicode).
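
A minimal sketch of that conversion, assuming the values coming out of Access really are Windows-1252 bytes. It relies on `String#encode` (Ruby 1.9 and later); on Ruby 1.8.6 the standard `iconv` library's `Iconv.conv('UTF-8', 'WINDOWS-1252', str)` plays the same role. The helper names and the `ASCII_FALLBACKS` table are hypothetical, and the optional second step covers only the two substitutions the question asks for.

```ruby
# Assumed substitutions, taken from the two examples in the question.
ASCII_FALLBACKS = {
  "\u2019" => "'",     # right single quotation mark -> plain apostrophe
  "\u2122" => "(TM)"   # trade mark sign -> "(TM)"
}.freeze

# Hypothetical helper: reinterpret raw bytes as Windows-1252, transcode to UTF-8.
# Bytes that Windows-1252 leaves unassigned become U+FFFD instead of raising.
def cp1252_to_utf8(raw)
  raw.dup.force_encoding(Encoding::Windows_1252)
     .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Hypothetical helper: optionally swap the listed characters for ASCII stand-ins.
def asciify(utf8)
  utf8.gsub(Regexp.union(ASCII_FALLBACKS.keys), ASCII_FALLBACKS)
end

name = cp1252_to_utf8("Lawyers\x92 Committee".b)
puts name.valid_encoding?                            # true
puts asciify(cp1252_to_utf8("TRInfo\x99 aoc".b))     # TRInfo(TM) aoc
```

Once the strings are valid UTF-8, the utf8 columns should accept the curly apostrophe and the trade mark sign as-is, so the `asciify` step is needed only if plain ASCII is preferred.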
