How to bypass invalid utf8 character string in MySQL
Problem description
I have a large text file containing Arabic text data. When I try to load it into a MySQL table, I get the error Error code 1300: invalid utf8 character string. This is what I have tried so far:
LOAD DATA INFILE '/var/lib/mysql-files/text_file.txt'
IGNORE INTO TABLE tblTest
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';
I tried to ignore this error, but that does not work. I have also tried LOCAL INFILE, but that did not work either. My database was created using DEFAULT CHAR SET UTF8 and DEFAULT COLLATE utf8_general_ci. The text file is utf-8 encoded.
I do not want the records that contain invalid utf8 characters. So how can I load the data while skipping the records that contain such invalid characters?
Thanks in advance!
Recommended answer
It would help to have the HEX of the naughty character.
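If you are not sure how to get that HEX, here is a minimal Python sketch (not part of the original answer) that reports the line number and the hex of the first offending bytes on each bad line. The function name find_invalid_utf8 and the in-memory sample are made up for illustration.

```python
def find_invalid_utf8(lines):
    """Yield (line_number, hex_of_bad_bytes) for each line that is not valid UTF-8.

    `lines` is any iterable of bytes objects, e.g. a file opened in "rb" mode.
    """
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start and err.end delimit the first offending byte sequence
            yield number, raw[err.start:err.end].hex()


# Hypothetical in-memory sample; the byte 0xFF can never appear in valid UTF-8.
sample = ["\u0645\u0631\u062d\u0628\u0627\n".encode("utf-8"), b"bad \xff byte\n"]
report = list(find_invalid_utf8(sample))
print(report)
```

For the real file you would pass open('/var/lib/mysql-files/text_file.txt', 'rb') instead of sample. Only the first bad sequence per line is reported, which is usually enough to see what kind of byte is causing trouble.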
A possible approach to reading all the text, then dealing with any bad characters:
- Read into a column of type VARBINARY or BLOB.
- Loop through the rows, trying to copy to a VARCHAR or TEXT column.
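The same idea (take the raw bytes first, then keep only what validates) can also be done before the file ever reaches MySQL. To be clear, this is a different technique from the VARBINARY staging column described above: a rough Python sketch that copies only the valid lines to a cleaned stream, which could then be loaded with LOAD DATA. The function name copy_valid_lines and the sample rows are hypothetical.

```python
import io


def copy_valid_lines(src, dst):
    """Copy lines that are valid UTF-8 from src to dst; return how many were skipped.

    Both arguments are binary file objects. Lines are split on b"\n",
    which matches LINES TERMINATED BY '\n' in the question.
    """
    skipped = 0
    for raw in src:
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            skipped += 1  # drop the whole record, as the question asks
            continue
        dst.write(raw)
    return skipped


# Hypothetical sample: the second record is broken on purpose
# (0xE3 opens a 3-byte sequence, but 0x28 is not a continuation byte).
good_1 = "1\t\u0645\u0631\u062d\u0628\u0627\n".encode("utf-8")
bad = b"2\tbroken \xe3\x28 text\n"
good_2 = "3\t\u0633\u0644\u0627\u0645\n".encode("utf-8")

src = io.BytesIO(good_1 + bad + good_2)
dst = io.BytesIO()
skipped = copy_valid_lines(src, dst)
print(skipped, len(dst.getvalue()))
```

With real files, src and dst would be opened in 'rb' and 'wb' mode. One caveat: Python accepts 4-byte UTF-8 characters as valid, so a line containing an Emoji passes this check even though MySQL's 3-byte utf8 would still reject it.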
Another plan is to use utf8mb4 instead of utf8. MySQL's utf8 stores at most 3 bytes per character, so it could be that the bad character is an Emoji or a Chinese character that needs 4 bytes; those work in utf8mb4, but not in utf8.
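To check whether this is what is happening, you could look for characters above U+FFFF, since those are the ones that take 4 bytes in UTF-8. A small Python sketch, with a made-up helper name four_byte_chars and a made-up sample string:

```python
def four_byte_chars(text):
    """Return the characters in text whose UTF-8 encoding takes 4 bytes."""
    # Code points above U+FFFF lie outside the Basic Multilingual Plane.
    return [ch for ch in text if ord(ch) > 0xFFFF]


# Hypothetical sample: Arabic text, an Emoji (U+1F600), a common Chinese
# character (U+6F22, 3 bytes) and a rare one (U+20000, 4 bytes).
sample = "\u0645\u0631\u062d\u0628\u0627 \U0001F600 \u6f22 \U00020000"
found = four_byte_chars(sample)
print([f"U+{ord(ch):04X}" for ch in found])
```

If this turns up anything in your data, switching the table and the load to utf8mb4 is probably the fix. If it turns up nothing, the file most likely contains bytes that are not UTF-8 at all.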
Ignoring errors
This may let you ignore errors:
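SET @save := @@sql_mode;   -- remember the current mode
SET @@sql_mode := '';      -- clear strict mode for this session
LOAD DATA ...;
SET @@sql_mode := @save;   -- restore the original mode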