How to retrieve the Unicode decimal representation of the chars in a string containing Hindi text?


Problem description

I am using C# in Visual Studio 2010 to convert text into Unicode values. For example, I have a string abc = "मेरा". There are 4 characters in this string, and I need the Unicode value of all four. Please help me.

Recommended answer

When you write code like string abc = "मेरा";, you already have it as Unicode (specifically, UTF-16), so you don't have to convert anything. If you want to access the individual characters, you can do that with the normal indexer: e.g. abc[1] is the character "े".

If you want to see the numeric representation of each character, cast it to int. For example,

abc.Select(c => (int)c)

gives the sequence of numbers 2350, 2375, 2352, 2366. If you want to see the hexadecimal representation of those numbers, use ToString():

abc.Select(c => ((int)c).ToString("x4"))

returns the sequence of strings "092e", "0947", "0930", "093e".
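
For reference, the two LINQ expressions above can be wrapped into a small console program. This is only a sketch: the class name CodeUnitDemo and the helper names ToDecimal and ToHex are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

static class CodeUnitDemo
{
    // Hypothetical helper: decimal value of each UTF-16 code unit.
    public static int[] ToDecimal(string text)
    {
        return text.Select(c => (int)c).ToArray();
    }

    // Hypothetical helper: four-digit lowercase hex form of each UTF-16 code unit.
    public static string[] ToHex(string text)
    {
        return text.Select(c => ((int)c).ToString("x4")).ToArray();
    }

    static void Main()
    {
        string abc = "मेरा";

        // Prints: 2350, 2375, 2352, 2366
        Console.WriteLine(string.Join(", ", ToDecimal(abc).Select(n => n.ToString())));

        // Prints: 092e, 0947, 0930, 093e
        Console.WriteLine(string.Join(", ", ToHex(abc)));
    }
}
```

Only the numbers are written to the console, so the output does not depend on whether the console font can render Devanagari.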

Note that when I said numeric representations, I actually meant their encoding in UTF-16. For characters in the Basic Multilingual Plane (BMP), this is the same as their Unicode code point. The vast majority of characters in use lie in the BMP, including the 4 Hindi characters presented here.
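
To illustrate the difference, here is a minimal sketch comparing a BMP character with one outside the BMP. The sample character U+10400 and the names SurrogateDemo and CodeUnitsHex are chosen only for this illustration. Outside the BMP, a single code point occupies two UTF-16 code units (a surrogate pair), so casting each char to int no longer yields the code point.

```csharp
using System;
using System.Linq;

static class SurrogateDemo
{
    // Hypothetical helper: the UTF-16 code units of the text as four-digit hex strings.
    public static string[] CodeUnitsHex(string text)
    {
        return text.Select(c => ((int)c).ToString("x4")).ToArray();
    }

    static void Main()
    {
        string bmp = "म";                     // U+092E, inside the BMP
        string supplementary = "\U00010400";  // U+10400, outside the BMP

        // One code unit, equal to the code point. Prints: 092e
        Console.WriteLine(string.Join(" ", CodeUnitsHex(bmp)));

        // Two code units (high and low surrogate). Prints: d801 dc00
        Console.WriteLine(string.Join(" ", CodeUnitsHex(supplementary)));

        // Length counts code units, not code points. Prints: 2
        Console.WriteLine(supplementary.Length);
    }
}
```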

If you want to handle characters in other planes too, you could use code like the following.

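// Requires: using System; using System.Text;
// Encode the string as UTF-32: every code point takes exactly 4 bytes.
byte[] bytes = Encoding.UTF32.GetBytes(abc);
int codePointCount = bytes.Length / 4;
int[] codePoints = new int[codePointCount];

// Read each 4-byte group back as one code point.
// Encoding.UTF32 is little-endian, so this assumes BitConverter.IsLittleEndian is true.
for (int i = 0; i < codePointCount; i++)
    codePoints[i] = BitConverter.ToInt32(bytes, i * 4);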

Since UTF-32 encodes all (21-bit) code points directly, this gives you the code points themselves. (Maybe there is a more straightforward solution, but I haven't found one.)
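
One possible alternative, not part of the original answer, is to walk the string with char.ConvertToUtf32(string, int) and skip the second half of each surrogate pair. The sketch below assumes the input is well-formed UTF-16 (ConvertToUtf32 throws an ArgumentException on an unpaired surrogate); the names CodePointDemo and GetCodePoints are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class CodePointDemo
{
    // Hypothetical helper: returns one Unicode code point per character,
    // combining each surrogate pair into a single value.
    public static List<int> GetCodePoints(string text)
    {
        var codePoints = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            codePoints.Add(char.ConvertToUtf32(text, i));

            // ConvertToUtf32 already consumed the low surrogate, so skip it.
            if (char.IsHighSurrogate(text[i]))
                i++;
        }
        return codePoints;
    }

    static void Main()
    {
        // The four Hindi characters plus one character outside the BMP.
        foreach (int cp in GetCodePoints("मेरा\U00010400"))
            Console.WriteLine("{0} (U+{0:X4})", cp);
    }
}
```

This avoids the intermediate byte array and does not depend on the platform's byte order, at the cost of an explicit loop.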
