RFC: the state of charset support in C

Question

I've spent the last two days delving into the state of charset support
in C, and I wrote a blog post summarizing my findings.

http://blog.reverberate.org/2007/04/...ntures-with-c/

I'm new to this stuff, and I would very much appreciate gentle
corrections about any mistakes or misconceptions I've made!

I'd also appreciate hearing about the situation on Windows and more
obscure UNIXes. For example, is iconv() available to Windows
programmers?

Thanks,
Josh

Recommended answers

On 21 Apr 2007 14:25:19 -0700, Joshua Haberman <jh*******@gmail.com>
wrote in comp.lang.c:

> I've spent the last two days delving into the state of charset support
> in C, and I wrote a blog post summarizing my findings.
>
> http://blog.reverberate.org/2007/04/...ntures-with-c/
>
> I'm new to this stuff, and I would very much appreciate gentle
> corrections about any mistakes or misconceptions I've made!



To tell you the truth, the biggest misconception you have made is that
this is topical on comp.lang.c, because it is not.


> I'd also appreciate hearing about the situation on Windows and more
> obscure UNIXes. For example, is iconv() available to Windows
> programmers?



A simple check of the C standard would have told you that it contains
no header named iconv.h or function named iconv(). Since it is a
non-standard (from a C point of view) extension, it is not topical
here, and what platforms might have such an extension, and what that
extension might do, is for platform specific groups.

C guarantees 8-bit characters having numeric values in the range
of 0 to 255 inclusive. It allows, but does not require, support for
wider character types. Everything else is implementation-defined.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
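
For readers who have not met the interface being discussed: iconv() comes from POSIX, not from ISO C, which is the point made above. The following is only a minimal sketch of how it is typically called on a POSIX system. The helper name latin1_to_utf8 is made up for illustration, and the encoding names "ISO-8859-1" and "UTF-8" are assumed to be supported by the local iconv implementation, which is not guaranteed everywhere.

```c
#include <assert.h>
#include <iconv.h>   /* POSIX header, not part of ISO C */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 on any failure. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return (size_t)-1;

    /* Argument order is (to, from). */
    iconv_t cd = iconv_open("UTF-8", "ISO-8859-1");
    if (cd == (iconv_t)-1)
        return (size_t)-1;           /* conversion not available here */

    char *src = (char *)in;          /* iconv() takes char **, hence the cast */
    size_t src_left = strlen(in);
    char *dst = out;
    size_t dst_left = out_size - 1;  /* keep one byte for the terminator */

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return (size_t)-1;           /* e.g. output buffer too small */

    *dst = '\0';
    return (size_t)(dst - out);
}
```

Called with the four bytes "caf\xE9" and a 16-byte buffer, this should write the five bytes "caf\xC3\xA9". Whether an equivalent is available on a given non-POSIX platform is exactly the platform-specific question the post defers to other groups.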


Joshua Haberman <jh*******@gmail.com> wrote:

> I've spent the last two days delving into the state of charset support
> in C, and I wrote a blog post summarizing my findings.

http://blog.reverberate.org/2007/04/...ntures-with-c/


> I'm new to this stuff, and I would very much appreciate gentle
> corrections about any mistakes or misconceptions I've made!



Charset support is not comprehensive, and frankly broken IMHO given the
apparent intentions.

The wide-character interface is insufficient, because the routines available
presuppose qualities of a character set that many character sets are unable
to abide by. In many cases, you cannot make critical determinations (like
"isalpha") given solely a single wchar_t object (regardless of the width). I
suggest you spend some time over at unicode.org understanding the issues
yourself.
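
One way to read the point about a single wchar_t, offered as an illustration and not necessarily the exact case the poster had in mind: Unicode allows one user-perceived character to be spelled as a base letter plus a combining mark, so it occupies more than one wchar_t and iswalpha() only ever sees one piece at a time. The helper count_iswalpha and the array name below are made up for this sketch, and what iswalpha() returns for U+0301 depends on the implementation and locale.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: count the wchar_t units in s that iswalpha()
 * accepts in the current locale. Each unit is classified in isolation. */
size_t count_iswalpha(const wchar_t *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != L'\0'; ++s)
        if (iswalpha((wint_t)*s))
            ++n;
    return n;
}

/* 'e' followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
 * character, but two wchar_t units. */
static const wchar_t e_acute_decomposed[] = { L'e', (wchar_t)0x0301, L'\0' };
```

Here wcslen(e_acute_decomposed) is 2 even though a reader sees a single letter, and count_iswalpha() reports 1 or 2 depending on how the implementation classifies the combining mark. Neither number answers "is this one alphabetic character?", which needs sequence-aware logic that the standard interfaces do not provide.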

For comprehensive, correct and arguably portable character set manipulation,
I suggest the ICU library. But that's off-topic here. That aside, you can
muddle through using standard C interfaces if you cut back on your
requirements; i.e. increase the level of opacity such that you don't need to
make certain distinctions (like isalpha/iswalpha), and/or employ UTF-8 in
such a way that it works within the confines of the "C" locale. I said
"muddle", but maybe that's an unnecessarily derogatory characterization
since adjusting scope is often the best way to address an issue. I simply
mean to dispossess those who believe standard C really can support
comprehensive I18N text manipulation of that notion.
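
A rough sketch of what "employ UTF-8 within the confines of the C locale" can look like in practice, assuming the input is already well-formed UTF-8 and that the program only needs to count code points and split on ASCII delimiters. The function names are made up for illustration, and no validation or character classification is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count code points in a well-formed UTF-8 string by counting every byte
 * that is not a continuation byte (bit pattern 10xxxxxx). */
size_t utf8_code_points(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p)
        if ((*p & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Byte length of the first field of s, up to an ASCII delimiter.
 * Plain strchr() is safe here: every byte of a multibyte UTF-8 sequence
 * is >= 0x80, so it can never be mistaken for an ASCII delimiter. */
size_t first_field_len(const char *s, char ascii_delim)
{
    assert(s != NULL && (unsigned char)ascii_delim < 0x80);
    const char *hit = strchr(s, ascii_delim);
    return hit ? (size_t)(hit - s) : strlen(s);
}
```

For the string "na\xC3\xAFve,caf\xC3\xA9", strlen() reports 12 bytes, utf8_code_points() reports 10, and first_field_len() with ',' reports 6. The text stays opaque: nothing here asks whether a code point is a letter, which is the kind of requirement the post suggests dropping.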


Jack Klein wrote:
[...]

> C guarantees 8-bit characters having numeric values in the range
> of 0 to 255 inclusive. It allows, but does not require, support for
> wider character types. Everything else is implementation-defined.



<pedant>
Doesn't the Standard guarantee 0 through 255, _or_ -128 through 127,
as it doesn't impose a signedness on unadorned "char"?
</pedant>

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody        | www.hvcomputer.com | #include              |
| kenbrody/at\spamcop.net | www.fptech.com     |    <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:Th*************@gmail.com>
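
The signedness question can be checked on any given implementation with <limits.h>. The sketch below is illustrative only, with made-up function names. It asserts the minimum guarantees as they stand in C99: CHAR_BIT is at least 8, unsigned char reaches at least 255, signed char covers at least -127 through 127 (-128 is what two's-complement machines usually give, but C99 does not require it), and plain char has the range of one or the other.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if plain char is a signed type on this implementation. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Assert the minimum ranges C99 guarantees for the character types. */
void check_char_guarantees(void)
{
    assert(CHAR_BIT >= 8);
    assert(UCHAR_MAX >= 255);
    assert(SCHAR_MIN <= -127 && SCHAR_MAX >= 127);
    /* Plain char matches either unsigned char or signed char. */
    if (plain_char_is_signed())
        assert(CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX);
    else
        assert(CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX);
}

/* Map a plain char to 0..UCHAR_MAX, the form the <ctype.h> functions
 * expect, whichever signedness plain char has. */
int char_to_index(char c)
{
    return (unsigned char)c;
}
```

Converting through unsigned char, as char_to_index() does, is the usual way to keep code independent of the signedness choice when indexing tables or calling isalpha() and friends.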

