How to retrieve the Unicode decimal representation of the chars in a string containing Hindi text?
Question
I am using Visual Studio 2010 with C# to convert text into Unicode values. For example, I have the string abc = "मेरा". There are 4 characters in this string, and I need the Unicode value of each of the four. Please help me.
Recommended answer
When you write code like string abc = "मेरा"; you already have the text as Unicode (specifically, UTF-16), so you don't have to convert anything. To access the individual characters, use a normal index: e.g. abc[1] is the character "े".
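To see the numeric representations of these characters, cast them to integers. For example,
abc.Select(c => (int)c)
gives the sequence of numbers 2350, 2375, 2352, 2366. If you want to see the hexadecimal representation of those numbers, use ToString():
abc.Select(c => ((int)c).ToString("x4"))
returns the sequence of strings "092e", "0947", "0930", "093e".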
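The two expressions above can be put together into a small console program. This is only a sketch: the class name CodeUnitDemo and the helper names ToDecimal and ToHex are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Linq;

public static class CodeUnitDemo
{
    // Decimal value of every UTF-16 code unit in the string.
    public static int[] ToDecimal(string text)
    {
        return text.Select(c => (int)c).ToArray();
    }

    // Four-digit lowercase hexadecimal form of every UTF-16 code unit.
    public static string[] ToHex(string text)
    {
        return text.Select(c => ((int)c).ToString("x4")).ToArray();
    }

    public static void Main()
    {
        string abc = "मेरा";
        Console.WriteLine(string.Join(", ", ToDecimal(abc))); // 2350, 2375, 2352, 2366
        Console.WriteLine(string.Join(", ", ToHex(abc)));     // 092e, 0947, 0930, 093e
    }
}
```

Both helpers work on UTF-16 code units, so for characters outside the Basic Multilingual Plane they would return the two surrogate values rather than a single code point.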
Note that when I said numeric representations, I actually meant their encoding using UTF-16. For characters in the Basic Multilingual Plane (BMP), this is the same as their Unicode code point. The vast majority of characters in use lie in the BMP, including the 4 Hindi characters presented here.
If you want to handle characters in other planes too, you could use code like the following.
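// Requires: using System; using System.Text;
// Encoding.UTF32 is little-endian, while BitConverter follows the machine's
// byte order, so this assumes a little-endian platform.
byte[] bytes = Encoding.UTF32.GetBytes(abc);
int codePointCount = bytes.Length / 4; // UTF-32 uses exactly 4 bytes per code point
int[] codePoints = new int[codePointCount];
for (int i = 0; i < codePointCount; i++)
    codePoints[i] = BitConverter.ToInt32(bytes, i * 4);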
Since UTF-32 encodes every (21-bit) code point directly, this gives you the code points. (There may be a more straightforward solution, but I haven't found one.)
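One possible alternative to the UTF-32 byte round-trip, offered as a sketch rather than as the answer author's method, is to walk the string with the framework's char.ConvertToUtf32(string, int) and char.IsHighSurrogate(char). The class name CodePointDemo and the method name GetCodePoints are hypothetical. The sketch assumes a well-formed string, because ConvertToUtf32 throws an ArgumentException on an unpaired surrogate.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointDemo
{
    // Returns the Unicode code points of a well-formed UTF-16 string.
    public static List<int> GetCodePoints(string text)
    {
        var codePoints = new List<int>();
        for (int i = 0; i < text.Length; i++)
        {
            // Reads one code unit, or two when text[i] starts a surrogate pair.
            codePoints.Add(char.ConvertToUtf32(text, i));
            if (char.IsHighSurrogate(text[i]))
                i++; // skip the low surrogate that was just consumed
        }
        return codePoints;
    }

    public static void Main()
    {
        // "\U0001F600" is a character outside the BMP, appended for illustration.
        string sample = "मेरा\U0001F600";
        Console.WriteLine(string.Join(", ", GetCodePoints(sample)));
        // 2350, 2375, 2352, 2366, 128512
    }
}
```

Compared with the byte-based version, this avoids the intermediate byte array and does not depend on the platform's byte order.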