When I import Gujarati data from a CSV file, the data shows up as "?"


Problem description

I am using a Db2 database, and when I import Gujarati data, the data shows up as symbols. I tried setting UTF-8, but it still shows symbols. The Db2 server platform is Windows. How do I import Gujarati data?

Solution

It is not clear from the problem description whether there is an issue with the client or the database, so I will show universal steps to troubleshoot an issue of this kind. I understand that your intention is to store the data as UTF-8 and Db2 documentation says:

The following Indic scripts are supported through Unicode: Hindi, Gujarati, Kannada, Konkani, Marathi, Punjabi, Sanskrit, Tamil and Telugu.

i.e. we can use any UTF-8 database (code page 1208) for Gujarati. According to Wikipedia, Gujarati has 91 assigned code points, all of them in the block U+0A80 to U+0AFF. This implies that each of them needs 3 bytes of storage when encoded as UTF-8 (which also means the first byte will always be 0xE0).
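The 3-byte claim can be checked by hand. The following is a minimal Python sketch, not part of Db2; the helper name utf8_three_byte is made up for this illustration. It applies the standard UTF-8 bit layout for code points in U+0800..U+FFFF and collects the lead byte for every code point in the Gujarati block U+0A80..U+0AFF.

```python
def utf8_three_byte(code_point: int) -> bytes:
    """Hand-encode a code point from U+0800..U+FFFF as 3 UTF-8 bytes.

    Hypothetical helper for illustration; Python's str.encode does the same job.
    """
    if not 0x0800 <= code_point <= 0xFFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("code point is not encodable as 3 UTF-8 bytes")
    byte1 = 0xE0 | (code_point >> 12)           # 1110xxxx: top 4 bits
    byte2 = 0x80 | ((code_point >> 6) & 0x3F)   # 10xxxxxx: middle 6 bits
    byte3 = 0x80 | (code_point & 0x3F)          # 10xxxxxx: low 6 bits
    return bytes([byte1, byte2, byte3])


# Lead bytes across the whole Gujarati block U+0A80..U+0AFF.
lead_bytes = {utf8_three_byte(cp)[0] for cp in range(0x0A80, 0x0B00)}
print(sorted(hex(b) for b in lead_bytes))        # ['0xe0']

# Cross-check one letter against Python's built-in encoder.
print(utf8_three_byte(0x0A97).hex().upper())     # E0AA97
```

Because the top four bits of every code point in this block are zero, the lead byte never varies, which makes a Gujarati column easy to recognise in a hex dump.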

Let's try to use "ગુજરાતી" (Gujarati) as an example. It consists of 7 characters:

U+0A97 GUJARATI LETTER GA       utf-8 0xE0AA97
U+0AC1 GUJARATI VOWEL SIGN U    utf-8 0xE0AB81
U+0A9C GUJARATI LETTER JA       utf-8 0xE0AA9C
U+0AB0 GUJARATI LETTER RA       utf-8 0xE0AAB0
U+0ABE GUJARATI VOWEL SIGN AA   utf-8 0xE0AABE
U+0AA4 GUJARATI LETTER TA       utf-8 0xE0AAA4
U+0AC0 GUJARATI VOWEL SIGN II   utf-8 0xE0AB80
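A listing like the one above can be reproduced outside the database. This is a small Python sketch using only the standard library (unicodedata and str.encode); the variable names are chosen for this example.

```python
import unicodedata

# "ગુજરાતી" written with escapes so the code points are explicit.
word = "\u0a97\u0ac1\u0a9c\u0ab0\u0abe\u0aa4\u0ac0"

# One row per code point: number, Unicode name, UTF-8 bytes in hex.
rows = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.encode("utf-8").hex().upper())
    for ch in word
]
for code_point, name, utf8_hex in rows:
    print(f"{code_point} {name:<24} utf-8 0x{utf8_hex}")

# The concatenated bytes are what hex() on the column should return.
expected_hex = word.encode("utf-8").hex().upper()
print(expected_hex)
```

Running the same loop over a value taken from the CSV file is a quick way to see whether the file really contains Gujarati code points before it reaches Db2.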

Let's test:

db2 "create table gujarati_tab(c1 int, c2 varchar(10 codeunits32))"
db2 "insert into gujarati_tab values(1, 'ગુજરાતી')"

To make sure data is stored properly we can examine the binary structure of our column:

db2 "select hex(c2) from gujarati_tab"

1                                          
-------------------------------------------
E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80 

Now you can split that into seven 3-byte sequences, each matching the expected bytes for the corresponding character:

E0AA97 E0AB81 E0AA9C E0AAB0 E0AABE E0AAA4 E0AB80

which implies that the data is stored correctly in the database. If there is still an issue on the client end, it is strictly a problem of the client application not interpreting the correct UTF-8 data returned by the database.
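The same check can be scripted. The Python sketch below takes the hex string returned by the query, splits it into 3-byte groups, and decodes it as UTF-8. The last step is only an illustration of one way a client could end up with question marks, under the assumption that it converts the text to a single-byte code page such as Windows-1252; the actual client configuration is not known from the question.

```python
# Hex string as returned by: select hex(c2) from gujarati_tab
hex_from_db2 = "E0AA97E0AB81E0AA9CE0AAB0E0AABEE0AAA4E0AB80"
raw = bytes.fromhex(hex_from_db2)

# Split into 3-byte groups, one per Gujarati code point.
chunks = [raw[i:i + 3].hex().upper() for i in range(0, len(raw), 3)]
print(chunks)

# Decoding as UTF-8 recovers the original word, so the stored bytes are fine.
decoded = raw.decode("utf-8")
print(decoded)

# Assumption for illustration: a client that converts to Windows-1252,
# which has no Gujarati characters, substitutes "?" for each one.
lossy = decoded.encode("cp1252", errors="replace")
print(lossy)   # b'???????'
```

If the hex output from the real table already contains 3F bytes (the code for "?") instead of E0-prefixed sequences, the data was lost before or during the import rather than on display.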
