Welsh language - ISO-8859-1 or Unicode?


Problem description

Hello -

I'm working on a team that is planning to add Welsh language support to a
large existing IT system which is partially web-based and
English-language-only so far. I've heard that 2 characters in Welsh
(w-circumflex and y-circumflex) are not supported in our default ISO-8859-1
character set, so a partial move to Unicode for internal storage of text
might be required.

I haven't yet found a Welsh-language website that uses these 2 characters,
so are they actually used much in Welsh? Is not supporting them likely to
cause problems?

Thanks

Solution

"Simon" <ds*******@eeee.invalid.com> wrote in message
news:48***********************@news.gradwell.net...

I've just found a webpage that uses y-circumflex at the end of the third
paragraph, so it can't be that uncommon:
http://news.bbc.co.uk/welsh/hi/newsi...00/7462534.stm

This webpage uses ISO-8859-1 with entities for the y-circumflex. Using
entities would be very messy in my application, so if support for these
characters is needed, I would have to go for Unicode.

I guess my question still is: would not supporting these 2 characters be
considered bad practice for a Welsh-language business application?
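
To illustrate why keeping entities in application data gets messy, here is a minimal Python sketch. The sample word and the variable names are assumptions chosen for illustration; nothing here comes from the system under discussion.

```python
import html

# The same two-letter sample word, stored two ways (illustrative data only).
stored_with_reference = "t&#375;"  # ISO-8859-1-safe text with a character reference
stored_as_unicode = "tŷ"           # the same text as a Unicode string

# The reference form inflates the length and hides the letter from searches.
reference_length = len(stored_with_reference)  # counts '&', '#', digits and ';'
unicode_length = len(stored_as_unicode)
found_in_reference_form = "ŷ" in stored_with_reference
found_in_unicode_form = "ŷ" in stored_as_unicode

# Every consumer has to remember to decode before comparing or displaying.
decoded = html.unescape(stored_with_reference)

print(reference_length, unicode_length)
print(found_in_reference_form, found_in_unicode_form)
print(decoded == stored_as_unicode)
```

Length limits, substring searches and comparisons all behave differently on the two forms, which is presumably the kind of mess meant above.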


Scripsit Simon:

> I'm working on a team that is planning to add Welsh language support
> to a large existing IT system which is partially web-based and
> English-language-only so far.

Do you plan to add other languages later? Is this about names only or
also about prose texts? After all, ISO-8859-1 is insufficient even for
normal English prose; think about dashes and proper quotation marks.

> I've heard that 2 characters in Welsh
> (w-circumflex and y-circumflex) are not supported in our default
> ISO-8859-1 character set,

Right. They are included in ISO-8859-14 (a.k.a. ISO Latin 8, or
"Celtic"), but that's not a feasible option on the WWW (IE does not
recognize that encoding).
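
The encoding claims above can be checked with a short Python sketch. It relies only on Python's bundled codecs; the helper name `can_encode` is an assumption made for this example.

```python
# Which of the encodings discussed here can represent the circumflexed letters?
WELSH_LETTERS = "ŵŷŴŶ"
ENCODINGS = ["iso-8859-1", "iso-8859-14", "utf-8"]


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Map each encoding name to whether it covers all four letters.
support = {enc: can_encode(WELSH_LETTERS, enc) for enc in ENCODINGS}

for enc, ok in support.items():
    print(f"{enc}: {'ok' if ok else 'cannot encode'}")
```

This only says what the codecs can represent; whether browsers accept ISO-8859-14 is a separate matter, as noted above.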

> so a partial move to Unicode for internal
> storage of text might be required.

That might be easy, or it might be extremely complicated. But that's
really beyond the scope of these groups. As far as WWW authoring is
concerned, Unicode - specifically UTF-8 - is a good option, but you
could keep using ISO-8859-1 and represent those letters using character
references like &#373; for w with circumflex. But you might have to deal
with the encoding problem of the databases involved, for example, and
with data entry.
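
The "stay on ISO-8859-1 and use character references" option might look like this in Python. This is a sketch under assumptions: the function name and the sample phrase are made up for illustration, and it relies on the standard `xmlcharrefreplace` error handler.

```python
def to_latin1_with_references(text: str) -> bytes:
    """Encode text as ISO-8859-1, turning unsupported characters into &#NNN; references."""
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


sample = "dŵr a tŷ"  # illustrative sample text containing both letters
encoded = to_latin1_with_references(sample)

print(encoded)
```

The output is only meaningful to a consumer that interprets character references, such as a browser rendering HTML. Text must also be HTML-escaped before this step, or a literal `&#373;` typed by a user becomes indistinguishable from a generated one.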

> I haven't yet found a Welsh-language website that uses these 2
> characters, so are they actually used much in Welsh?

I don't know Welsh, but I expect those characters to be so rare that
using some clumsy notation like character references for them wouldn't
be a major problem.

> Is not supporting them likely to cause problems?

Some people might say that it is tolerable to omit the circumflex, but
it may be distinctive (i.e. the only difference between otherwise
identical words, though the context usually resolves the issue). And in
2008, I think it is inappropriate to add support for languages to IT
systems without supporting them properly, with all the characters needed
for their correct writing.
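
As a rough illustration of what omitting the circumflex does to data, the sketch below strips diacritics with Unicode normalization. The two sample strings are placeholders that differ only in where the circumflex sits; they are not offered as vetted Welsh vocabulary, and `strip_diacritics` is a hypothetical helper.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks, e.g. ŵ -> w."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


first = "gŵydd"   # circumflex on the w
second = "gwŷdd"  # circumflex on the y

# Distinct strings collapse to one once the circumflex is omitted.
collapsed = {strip_diacritics(first), strip_diacritics(second)}

print(first == second)
print(collapsed)
```

Once the marks are dropped, the original spelling cannot be recovered from the stored text.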

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Thanks for your reply.

Unfortunately multi-lingual support has not really been a priority in the
system design up to now, although it has always been a possible future
requirement. The system is a complex mixture of databases, Windows
applications and web applications. I believe all the databases and
programming languages we use already support Unicode, so I would aim to use
that support, rather than character references, which would be clumsy as
you say.
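
A quick way to confirm that a storage layer really round-trips these letters is a test along the following lines. This is only a sketch: the thread does not name the databases involved, so SQLite (via Python's standard `sqlite3` module) stands in as an assumption, and the table name, column name and sample strings are invented for the example.

```python
import sqlite3
from contextlib import closing


def round_trip(texts: list[str]) -> list[str]:
    """Store the strings in an in-memory SQLite table and read them back in order."""
    with closing(sqlite3.connect(":memory:")) as conn:
        conn.execute(
            "CREATE TABLE phrases (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO phrases (body) VALUES (?)", [(t,) for t in texts]
        )
        rows = conn.execute("SELECT body FROM phrases ORDER BY id").fetchall()
    return [row[0] for row in rows]


samples = ["dŵr", "tŷ", "Ŵ Ŷ"]  # illustrative strings using all four letters
restored = round_trip(samples)

print(restored == samples)
```

The same check would need repeating against each real database, driver and connection setting in the system, since any one of them can silently fall back to a single-byte encoding.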

