Getting char value in Delphi 7


Problem description

I am making a program in Delphi 7 that is supposed to encode a Unicode string into an HTML entity string. For example, "ABCģķī" would result in "ABC&#291;&#311;&#299;".

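Now, there are 2 basic requirements:

  1. Delphi 7 is non-Unicode, so I can't write Unicode characters directly in the code in order to encode them.
  2. A code page contains 255 entries, each holding a character specific to that code page, except for the first 127, which are the same in all code pages.

So, how do I get a char value that falls within the 1-255 range?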

I tried Ord(Integer), but it also returns values way past 255. Basically, everything is fine (A returns 65 and so on) until my string reaches non-Latin Unicode.

Is there any other method for returning a char value? Any help appreciated.

Recommended answer

In HTML 4, numeric character references are relative to the charset used by the HTML. It does not matter whether that charset is specified in the HTML itself via a <meta> tag, or out-of-band via an HTTP/MIME Content-Type header or other means. As such, "ABC&#291;&#311;&#299;" would be an accurate representation of "ABCģķī" only if the HTML were using UTF-16. If the HTML were using UTF-8, the correct representation would be either "ABC&#196;&#163;&#196;&#183;&#196;&#171;" or "ABC&#xC4;&#xA3;&#xC4;&#xB7;&#xC4;&#xAB;" instead. Most other charsets do not support those particular Unicode characters.
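The UTF-8 numbers quoted above are simply the byte values of each character's UTF-8 encoding. The following is a minimal sketch that reproduces them, assuming the System routine UTF8Encode() is used for the conversion; Utf8ByteValues is a hypothetical helper name introduced only for this illustration.

```pascal
program Utf8Bytes;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: returns the UTF-8 byte values of W as
// space-separated decimal numbers.
function Utf8ByteValues(const W: WideString): string;
var
  U: UTF8String;
  I: Integer;
begin
  Result := '';
  U := UTF8Encode(W); // UTF-16 -> UTF-8
  for I := 1 to Length(U) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + IntToStr(Ord(U[I]));
  end;
end;

var
  W: WideString;
begin
  // Build "ģķī" from code points, because Delphi 7 source files are ANSI.
  SetLength(W, 3);
  W[1] := WideChar($0123); // ģ
  W[2] := WideChar($0137); // ķ
  W[3] := WideChar($012B); // ī
  WriteLn(Utf8ByteValues(W)); // prints: 196 163 196 183 196 171
end.
```

Each of the three characters occupies two bytes in UTF-8, which is why six values come back for three characters.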

In HTML 5, numeric character references contain original Unicode codepoint values regardless of the charset used by the HTML. As such, "ABCģķī" would be represented as either "ABC&#291;&#311;&#299;" or "ABC&#x0123;&#x0137;&#x012B;".

So, to answer your question, the first thing you have to do is decide whether you need to use HTML 4 or HTML 5 semantics for numeric character references. Then, you need to assign your Unicode data to a WideString (the only Unicode string type that Delphi 7 natively supports), which uses UTF-16, then:

  1. If you need HTML 4:

A. If the HTML charset is not UTF-16, then use WideCharToMultiByte() (or equivalent) to convert the WideString to that charset, then loop through the resulting values, outputting unreserved characters as-is and character references for reserved values, using IntToStr() for decimal notation or IntToHex() for hex notation.

B. If the HTML charset is UTF-16, then simply loop through each WideChar in the WideString, outputting unreserved characters as-is and character references for reserved values, using IntToStr() for decimal notation or IntToHex() for hex notation.

  2. If you need HTML 5:

A. If the WideString does not contain any surrogate pairs, then simply loop through each WideChar in the WideString, outputting unreserved characters as-is and character references for reserved values, using IntToStr() for decimal notation or IntToHex() for hex notation.

B. Otherwise, convert the WideString to UTF-32 using WideStringToUCS4String(), then loop through the resulting values, outputting unreserved codepoints as-is and character references for reserved codepoints, using IntToStr() for decimal notation or IntToHex() for hex notation.
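The HTML 5 case can be sketched roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: EncodeHtml5 is a hypothetical name, "reserved" is assumed to mean any codepoint above 127 plus the characters & < > ", and decimal notation is used throughout. It combines surrogate pairs by hand instead of calling WideStringToUCS4String(), so the result does not depend on how a given RTL version implements that conversion.

```pascal
program Html5Entities;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: writes ASCII characters as-is and everything else
// (plus & < > ") as decimal numeric character references.
function EncodeHtml5(const W: WideString): string;
var
  I, CP, Lo: Integer;
begin
  Result := '';
  I := 1;
  while I <= Length(W) do
  begin
    CP := Ord(W[I]);
    // A high surrogate followed by a low surrogate forms one codepoint.
    if (CP >= $D800) and (CP <= $DBFF) and (I < Length(W)) then
    begin
      Lo := Ord(W[I + 1]);
      if (Lo >= $DC00) and (Lo <= $DFFF) then
      begin
        CP := $10000 + ((CP - $D800) shl 10) + (Lo - $DC00);
        Inc(I); // skip the low surrogate
      end;
    end;
    // Assumption: these are the values treated as "reserved".
    if (CP > 127) or (CP = Ord('&')) or (CP = Ord('<')) or
       (CP = Ord('>')) or (CP = Ord('"')) then
      Result := Result + '&#' + IntToStr(CP) + ';'
    else
      Result := Result + Chr(CP);
    Inc(I);
  end;
end;

var
  S: WideString;
begin
  // Build "ABCģķī" from code points, because Delphi 7 source files are ANSI.
  SetLength(S, 6);
  S[1] := 'A';
  S[2] := 'B';
  S[3] := 'C';
  S[4] := WideChar($0123); // ģ
  S[5] := WideChar($0137); // ķ
  S[6] := WideChar($012B); // ī
  WriteLn(EncodeHtml5(S)); // prints: ABC&#291;&#311;&#299;
end.
```

For hex notation, the IntToStr(CP) call could be swapped for IntToHex(CP, 4) with a '&#x' prefix, which yields the "&#x0123;" form shown earlier.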
