How to output a Unicode string to RTF (using C#)


Question

I'm trying to output a Unicode string in RTF format (using C# and WinForms).

From Wikipedia:

If a Unicode escape is required, the control word \u is used, followed by a 16-bit signed decimal integer giving the Unicode codepoint number. For the benefit of programs without Unicode support, this must be followed by the nearest representation of this character in the specified code page. For example, \u1576? would give the Arabic letter beh, specifying that older programs which do not have Unicode support should render it as a question mark instead.
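To make the quoted rule concrete, here is a minimal sketch in C# of producing the escape for a single character. The names RtfEscapeDemo and EscapeChar are made up for illustration, and the '?' fallback simply follows the Wikipedia example; treat it as one possible reading of the rule rather than a reference implementation.

```csharp
using System.Globalization;

static class RtfEscapeDemo
{
    // Hypothetical helper: formats one UTF-16 code unit as an RTF \u escape,
    // followed by '?' as the fallback for readers without Unicode support.
    public static string EscapeChar(char c)
    {
        // Reinterpret the 16-bit value as signed, matching the
        // "16-bit signed decimal integer" wording in the quote above.
        short signedValue = unchecked((short)c);
        return "\\u" + signedValue.ToString(CultureInfo.InvariantCulture) + "?";
    }
}
```

For the Arabic letter beh (U+0628, decimal 1576), RtfEscapeDemo.EscapeChar('\u0628') yields \u1576?, the same escape as in the quoted example.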

I don't know how to convert a Unicode character into its Unicode codepoint escape ("\u1576"). Converting to UTF-8, UTF-16 and similar encodings is easy, but I don't know how to convert to a codepoint.

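The scenario in which I use this:

  • I read an existing RTF file into a string (I am reading a template)

  • I call string.Replace to swap #TOKEN# for MyUnicodeString (the template is filled with data)

  • I write the result into another RTF file.

The problem arises when the data contains Unicode characters.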

Recommended answer

Provided that all the characters you're catering for exist in the Basic Multilingual Plane (it's unlikely that you'll need anything more), a simple UTF-16 encoding should suffice.

From Wikipedia:

All possible code points from U+0000 through U+10FFFF, except for the surrogate code points U+D800–U+DFFF (which are not characters), are uniquely mapped by UTF-16 regardless of the code point's current or future character assignment or use.
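The Basic Multilingual Plane condition matters because a C# string is a sequence of UTF-16 code units, and a character outside the BMP occupies two of them (a surrogate pair). The following is a small sketch for checking this; Utf16Demo, IsBmpOnly and CodeUnits are hypothetical names introduced only for this illustration.

```csharp
using System.Linq;

static class Utf16Demo
{
    // True when s contains no surrogate code units, i.e. every char in s
    // is a complete BMP character on its own.
    public static bool IsBmpOnly(string s) => !s.Any(c => char.IsSurrogate(c));

    // Decimal values of the UTF-16 code units that make up s.
    public static int[] CodeUnits(string s) => s.Select(c => (int)c).ToArray();
}
```

For "ë" (U+00EB), IsBmpOnly returns true and CodeUnits returns a single value, 235. For U+1F600, written "\U0001F600" in C#, IsBmpOnly returns false and CodeUnits returns two values, 55357 and 56832, so a per-char loop would emit two \u escapes for that one character. Whether a given RTF reader recombines them is worth verifying before relying on it.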

The following sample program illustrates something along the lines of what you want:

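using System;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        // ë (U+00EB), built from its UTF-16 little-endian bytes
        char[] ca = Encoding.Unicode.GetChars(new byte[] { 0xeb, 0x00 });
        // The using block closes the file even if an exception is thrown
        using (var sw = new StreamWriter(@"c:/helloworld.rtf"))
        {
            sw.WriteLine(@"{\rtf
{\fonttbl {\f0 Times New Roman;}}
\f0\fs60 H" + GetRtfUnicodeEscapedString(new String(ca)) + @"llo, World!
}");
        }
    }

    static string GetRtfUnicodeEscapedString(string s)
    {
        var sb = new StringBuilder();
        foreach (var c in s)
        {
            if (c <= 0x7f)
                sb.Append(c); // ASCII passes through unchanged
            else
                // \u, the decimal value, then '?' as the fallback character
                sb.Append("\\u" + Convert.ToUInt32(c) + "?");
        }
        return sb.ToString();
    }
}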

The important bit is Convert.ToUInt32(c), which returns the numeric value of the character's UTF-16 code unit; for characters in the Basic Multilingual Plane this is the code point value. The RTF escape for Unicode requires that value in decimal. Because \u is defined as taking a signed 16-bit number, a value above 32767 should strictly be written as its negative equivalent (the value minus 65536). The System.Text.Encoding.Unicode encoding corresponds to UTF-16, as per the MSDN documentation.
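Applied to the template scenario from the question, the same idea could look roughly like the sketch below. It is not the answer's code: RtfTemplate, Escape and Fill are hypothetical names, and two things are added as assumptions about what a template filler probably needs, namely escaping of the RTF control characters \, { and } in the data, and writing the \u value as a signed 16-bit number. Line breaks and other control characters in the data are not handled here.

```csharp
using System.Globalization;
using System.Text;

static class RtfTemplate
{
    // Escapes a string so it can be inserted into RTF body text.
    public static string Escape(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (c == '\\' || c == '{' || c == '}')
                sb.Append('\\').Append(c);   // RTF control characters
            else if (c <= 0x7f)
                sb.Append(c);                // plain ASCII passes through
            else
                sb.Append("\\u")             // signed 16-bit decimal value
                  .Append(unchecked((short)c).ToString(CultureInfo.InvariantCulture))
                  .Append('?');              // fallback for old readers
        }
        return sb.ToString();
    }

    // Replaces every occurrence of token in the template with the escaped value.
    public static string Fill(string template, string token, string value)
        => template.Replace(token, Escape(value));
}
```

With this in place, the three steps from the question reduce to reading the template (for example with File.ReadAllText), calling RtfTemplate.Fill(template, "#TOKEN#", MyUnicodeString), and writing the result (for example with File.WriteAllText). Since the escaped output contains only ASCII characters, the encoding used for writing the file should matter little.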
