When I import Gujarati data using a CSV file, the data shows up as "?"
Problem description
I am using a Db2 database, and when I import Gujarati data, the data shows up as symbols. I tried setting UTF-8, but it still shows symbols. The Db2 server platform is Windows. How can I import Gujarati data?
It is not clear from the problem description whether there is an issue with the client or the database, so I will show universal steps to troubleshoot an issue of this kind. I understand that your intention is to store the data as UTF-8 and Db2 documentation says:
The following Indic scripts are supported through Unicode: Hindi, Gujarati, Kannada, Konkani, Marathi, Punjabi, Sanskrit, Tamil and Telugu.
That is, we can use any UTF-8 database (code page 1208) for Gujarati. According to Wikipedia, Gujarati has 91 assigned code points, all within the block U+0A80 to U+0AFF. This implies that each of them needs 3 bytes of storage when encoded as UTF-8 (which also means the first byte will always be 0xE0).
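The 3-byte claim can be checked without a database. The following is a minimal Python sketch, assuming the Gujarati block spans U+0A80 through U+0AFF; the names GUJARATI_BLOCK and utf8_lengths_and_lead_bytes are made up for this illustration and are not part of Db2.

```python
# Assumption: the Gujarati block covers U+0A80 through U+0AFF.
GUJARATI_BLOCK = range(0x0A80, 0x0B00)


def utf8_lengths_and_lead_bytes(code_points):
    """Return the set of UTF-8 lengths and the set of first bytes seen."""
    lengths, lead_bytes = set(), set()
    for cp in code_points:
        encoded = chr(cp).encode("utf-8")
        lengths.add(len(encoded))
        lead_bytes.add(encoded[0])
    return lengths, lead_bytes


lengths, lead_bytes = utf8_lengths_and_lead_bytes(GUJARATI_BLOCK)
print(sorted(lengths))                      # UTF-8 lengths found in the block
print([hex(b) for b in sorted(lead_bytes)]) # first bytes found in the block
```

If the assumption holds, the only length reported is 3 and the only first byte is 0xe0, which is why a varchar column sized in bytes needs three bytes per Gujarati code point.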
Let's try to use "ગુજરાતી" (Gujarati) as an example. It consists of 7 characters:
U+0A97 GUJARATI LETTER GA utf-8 0xE0AA97
U+0AC1 GUJARATI VOWEL SIGN U utf-8 0xE0AB81
U+0A9C GUJARATI LETTER JA utf-8 0xE0AA9C
U+0AB0 GUJARATI LETTER RA utf-8 0xE0AAB0
U+0ABE GUJARATI VOWEL SIGN AA utf-8 0xE0AABE
U+0AA4 GUJARATI LETTER TA utf-8 0xE0AAA4
U+0AC0 GUJARATI VOWEL SIGN II utf-8 0xE0AB80
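The table above can be reproduced with a few lines of Python, which is a convenient way to build the same reference for any other string you plan to import. This is only a sketch; describe is a hypothetical helper name, and the character names come from Python's bundled unicodedata module.

```python
import unicodedata

word = "ગુજરાતી"


def describe(text):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for each character."""
    rows = []
    for ch in text:
        rows.append((
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            ch.encode("utf-8").hex().upper(),
        ))
    return rows


rows = describe(word)
for code_point, name, utf8_hex in rows:
    # Only ASCII is printed, so this works even on a non-UTF-8 console.
    print(code_point, name, "utf-8", "0x" + utf8_hex)
```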
Let's test:
db2 "create table gujarati_tab(c1 int, c2 varchar(10 codeunits32))"
db2 "insert into gujarati_tab values(1, 'ગુજરાતી')"
To make sure data is stored properly we can examine the binary structure of our column:
db2 "select hex(c2) from gujarati_tab"
1
-------------------------------------------
E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80
Now you can split that into seven 3-byte sequences, each matching the expected bytes for the corresponding character:
E0AA97 E0AB81 E0AA9C E0AAB0 E0AABE E0AAA4 E0AB80
This shows that the data is stored correctly in the database. If there is still an issue on the client end, it is strictly a problem of the client application not interpreting the correct UTF-8 data returned by the database.
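The same check can be scripted on the client side. The sketch below takes the hex string returned by the query, splits it into 3-byte groups, and decodes it as UTF-8. It also decodes the same bytes as Windows code page 1252 to illustrate how a client using the wrong code page ends up with symbols; that a Windows client defaults to 1252 is an assumption here, not something stated in the question.

```python
HEX_FROM_DB2 = "E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80"
EXPECTED = "ગુજરાતી"

raw = bytes.fromhex(HEX_FROM_DB2)

# Split into 3-byte groups, one per Gujarati code point.
chunks = [raw[i:i + 3].hex().upper() for i in range(0, len(raw), 3)]

# Correct interpretation: the bytes are UTF-8.
decoded = raw.decode("utf-8")

# Wrong interpretation (assumed client code page 1252); bytes that are
# undefined in cp1252 are replaced instead of raising an error.
misread = raw.decode("cp1252", errors="replace")

print(chunks)
print(decoded == EXPECTED)  # True: the stored bytes are valid UTF-8 Gujarati
print(misread == EXPECTED)  # False: the wrong code page produces symbols
```

If the first comparison is true for your own data but the application still shows symbols, the client code page or the display font is the place to look, not the database.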