Export and import UTF-8 data in MySQL: best practices


Problem description

We often need to send a client a data file containing database content that he or she needs to translate. Most of the time this export is CSV or XLS: we create a CSV dump with phpMyAdmin and get an XLS file back with the translated data.

The problem is that the data is usually UTF-8, and whenever a file comes back as XLS and we load it into MySQL again, we end up with UTF-8 problems: characters not being displayed properly, and so on.

We've already double-checked everything in MySQL, from my.cnf to the column character sets, and everything is correctly set to UTF-8.

My question is not how to fix the encoding issue, since that has been solved, but how best to handle this situation in the future. What export format should we hand over? How should we import the data: plain MySQL LOAD DATA INFILE, or our own processing scripts? What is the general consensus on how to handle this?

We would like to keep using Excel if possible, since that's the format almost everybody expects, including our clients' translation agencies. Our clients' ease of use is the most important factor here, as long as it doesn't burden us with major issues each time. The best of both worlds :)

Solution

The application I am currently working on also includes data import functionality. The data is mostly encoded in UTF-8.

My approach is to preprocess the imported CSV (or tab-delimited) file, whatever its encoding, into a correctly UTF-8 encoded temporary CSV file in a client-side script (Python), and then load the contents of that file with the LOAD DATA INFILE statement.
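As a rough illustration of that preprocessing step, here is a minimal sketch. The function name `to_utf8_csv` and the candidate encoding list are assumptions made for this example, not part of my actual script; adjust the list to whatever your clients' tools really produce.

```python
import csv
import io
import tempfile

# Hypothetical candidate list, tried in order. cp1252 accepts almost any
# byte sequence, so the stricter UTF-8 check has to come first.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252")


def to_utf8_csv(src_path, encodings=CANDIDATE_ENCODINGS, delimiter=","):
    """Re-encode a delimited file as UTF-8 and return the temporary file's path."""
    with open(src_path, "rb") as f:
        raw = f.read()

    for enc in encodings:
        try:
            text = raw.decode(enc)
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("no candidate encoding matched: %r" % (encodings,))

    # Parse and rewrite the rows so quoting and line endings are normalised
    # (output is always comma-separated with "\n" line endings).
    rows = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = tempfile.NamedTemporaryFile(
        "w", encoding="utf-8", newline="", suffix=".csv", delete=False
    )
    with out:
        csv.writer(out, lineterminator="\n").writerows(rows)
    return out.name
```

Guessing the encoding from a fixed list is only a heuristic. If you can agree on one export format with the client, pass that single encoding instead.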

The encoding of the file is controlled by the character_set_database system variable (which should be set at the server level); starting from MySQL 5.1.17, it can be overridden with the CHARACTER SET clause of LOAD DATA INFILE.
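To show where the CHARACTER SET clause goes, here is a small sketch that only builds the statement text. The helper names, the table name and the field/line options are assumptions for illustration; the quoting helper assumes MySQL's default backslash escaping is enabled.

```python
def quote_sql_string(value):
    """Quote a Python string as a MySQL string literal (backslash escapes assumed)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_load_data_sql(path, table, charset="utf8", delimiter=",", local=False):
    """Build a LOAD DATA statement with an explicit CHARACTER SET clause.

    `table` and `charset` are inserted as identifiers, so they must come
    from trusted code, never from user input.
    """
    return "\n".join([
        "LOAD DATA {}INFILE {}".format("LOCAL " if local else "", quote_sql_string(path)),
        "INTO TABLE `{}`".format(table),
        "CHARACTER SET {}".format(charset),
        "FIELDS TERMINATED BY {} OPTIONALLY ENCLOSED BY '\"'".format(
            quote_sql_string(delimiter)
        ),
        "LINES TERMINATED BY '\\n'",
        "IGNORE 1 LINES",  # skip the header row
    ])


# Example: the table name "translations" is hypothetical.
print(build_load_data_sql("/tmp/translations.csv", "translations"))
```

Pass `local=True` if the file lives on the client machine rather than on the database server; the server has to allow LOCAL loading for that to work.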

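The one thing to be aware of is that MySQL's utf8 character set stores at most 3 bytes per character rather than 4, so characters outside the Basic Multilingual Plane cannot be stored (this can be a problem for some East Asian text). The utf8mb4 character set, available from MySQL 5.5.3, covers the full 4-byte range.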
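If the target columns use the 3-byte character set, it may help to flag problem characters before loading. The sketch below is one possible check, with a hypothetical function name: a character needs 4 bytes in UTF-8 exactly when its code point is above U+FFFF.

```python
def four_byte_chars(text):
    """Return the characters in `text` that need 4 bytes in UTF-8."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


# U+6F22 fits in 3 bytes; U+2000B (a CJK Extension B ideograph) does not.
sample = "caf\u00e9 \u6f22 \U0002000B"
for ch in four_byte_chars(sample):
    print("U+%04X needs %d bytes" % (ord(ch), len(ch.encode("utf-8"))))
```

Rows that contain such characters can then be reported back or handled separately instead of being silently truncated or rejected on import.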

To export large amounts of data efficiently, you can use the SELECT ... INTO OUTFILE statement.
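For the export direction, a statement could be assembled along these lines. This is only a sketch: the function name, query and output path are made up for the example, and it assumes a server version whose INTO OUTFILE accepts a CHARACTER SET clause.

```python
def build_export_sql(select_sql, out_path, charset="utf8"):
    """Append an INTO OUTFILE clause that writes comma-separated output.

    `select_sql` and `charset` must come from trusted code, not user input.
    Backslash escaping is assumed to be enabled on the server.
    """
    quoted = "'" + out_path.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return "\n".join([
        select_sql,
        "INTO OUTFILE {}".format(quoted),
        "CHARACTER SET {}".format(charset),
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'",
        "LINES TERMINATED BY '\\n'",
    ])


# Hypothetical query and path.
print(build_export_sql("SELECT id, label FROM translations", "/tmp/export.csv"))
```

Keep in mind that the file is written on the database server host, not on the client, and that the statement fails if the file already exists.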
