Importing text to MySQL: strange format


Question

I'm importing some data from a .txt file into a MySQL database table, using mysqlimport. The import appears to succeed (no error messages), but the data looks very odd when displayed and can't be searched as expected.

Here are the details. The original text file is saved in UTF-8, with records that look (in a text editor) like this. The second field includes line breaks:

WAR-16,52 ~~~~~ Lorem ipsum dolor sit.
Lorem ipsum dolor sit.
~~~~~ ENDOFRECORD
WAR-16,53~~~~~Lorem ipsum dolor sit.
Lorem ipsum dolor sit.
Lorem ipsum dolor sit.
Lorem ipsum dolor sit.

~~~~~ ENDOFRECORD

The database table into which I am importing is very simple:

+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| id    | varchar(100)  | YES  |     | NULL    |       |
| text  | varchar(5000) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

When I import the file, here's the command I use:

$ mysqlimport -u root db textfile.txt --fields-terminated-by="~~~~~" --lines-terminated-by="ENDOFTHELINE" --default-character-set='utf8'
db.records_list: Records: 18778  Deleted: 0  Skipped: 0  Warnings: 18787

Here's what I see if I then ask MySQL to display the records:

mysql> select * from textfile;
| 
 W A R - 1 6 , 5 2 |  L o r e m  i p s u m  d o l o r  s i t .
L o r e m  i p s u m  d o l o r  s i t .
(etc)

So it looks as though spaces, or some strange extra encoding characters, are being added to the text.

And here's the problem with the database query:

mysql> select * from textfile where id like "%WAR%";

returns nothing; nor does adding spaces:

mysql> select * from textfile where id like "%W A R%";

Only this query returns anything:

mysql> select * from textfile where id like "%W%";

Can anyone guess what might be happening? I feel like it must be an encoding problem, but I can't work it out.

------ UPDATE --------

OK, I've checked the database and connection encoding.

mysql> show variables like "character_set_%";
+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | latin1                                 |
| character_set_connection | latin1                                 |
| character_set_database   | latin1                                 |
| character_set_filesystem | binary                                 |
| character_set_results    | latin1                                 |
| character_set_server     | latin1                                 |
| character_set_system     | utf8                                   |
| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.01 sec)

show table status says the table is latin1_swedish_ci.

I have re-saved the text file in "Western (Windows Latin 1)" (using TextEdit on Snow Leopard) and tried to import it using the same command as above. However, I still have the same encoding problem.

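I also tried, again with no luck:

  • Creating a new table with UTF-8 and importing the existing file
  • Copying and pasting the text into another text file that I had previously imported successfully, then trying to import that.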

Still totally baffled :(((

Recommended answer

As noted in your duplicate question, make sure that your table and connection are both using UTF-8. That is one of the common sources of this issue.

http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html
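
Before changing the table or connection, it may also be worth confirming that the file really is UTF-8 on disk. Output such as "W A R" where LIKE "%W%" matches but LIKE "%WAR%" does not is consistent with a zero byte sitting between the characters, which is what ASCII text looks like when it is stored as UTF-16. That diagnosis is an assumption on my part, not something stated in the question. The sketch below is in Python (not the asker's tooling), and the helper names guess_encoding and to_utf8 are hypothetical. It guesses the encoding from a byte-order mark or from the position of NUL bytes, then re-encodes the data as UTF-8.

```python
import codecs

# Byte-order marks for common Unicode encodings. The UTF-32 marks come first
# because the UTF-32-LE mark begins with the same two bytes as the UTF-16-LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes) -> str:
    """Guess an encoding from a BOM or from NUL bytes (a heuristic, not a proof)."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    sample = raw[:1000]
    if b"\x00" in sample:
        # ASCII text stored as UTF-16 has a NUL in every other byte:
        # at odd offsets for little-endian, at even offsets for big-endian.
        even_nuls = sample[0::2].count(0)
        odd_nuls = sample[1::2].count(0)
        return "utf-16-le" if odd_nuls >= even_nuls else "utf-16-be"
    # No BOM and no NUL bytes: assume UTF-8 (plain ASCII is valid UTF-8).
    return "utf-8"


def to_utf8(raw: bytes) -> bytes:
    """Decode with the guessed encoding and re-encode as UTF-8 without a BOM."""
    return raw.decode(guess_encoding(raw)).encode("utf-8")
```

To use it, read the file in binary mode, for example open("textfile.txt", "rb").read(), pass the bytes to to_utf8, write the result to a new file, and run mysqlimport on that file. If guess_encoding already reports utf-8, the file is probably not the culprit and the table and connection settings above are the place to look.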
