UTF-8 and Latin-1 characters


Question


Since I am Swedish, I write website content mostly in Swedish,
using charset iso-8859-1. I have (just for testing) tried to use
utf-8 on a test page ( http://w1.978.telia.com/~u97802964/test.html ),
but the special Swedish characters don't come out right if I don't use
entities for them.

The Swedish characters in question are:
Latin letter a with ring above = &aring; (å)
Latin letter a with diaeresis = &auml; (ä)
Latin letter o with diaeresis = &ouml; (ö)

I realize I can use the entities, but I found a page with Swedish
content ( http://w1.318.comhem.se/~u31827122/scsiguide.html ) where
utf-8 is used and the characters come out right even without the use
of entities.

So my question to anybody who can give an answer is: how come
I fail and somebody else can do it? I can't see anything different in
this other page that could make it possible :(

--
/Arne

Recommended answer

Arne wrote:
Since I am Swedish, I write website content mostly in Swedish,
using charset iso-8859-1. I have (just for testing) tried to use
utf-8 on a test page ( http://w1.978.telia.com/~u97802964/test.html ),
but the special Swedish characters don't come out right if I don't use
entities for them.
To use UTF-8, you need to tell your editor to actually save the file
as UTF-8. Simply declaring it in the <meta/> tag does not cause the file
to be encoded as UTF-8. For example, in Notepad on WinXP, choose
File > Save As… and select UTF-8 in the Encoding list. Other editors that
support UTF-8 will have the option available somewhere. This W3C I18N
document explains more.

http://www.w3.org/International/ques...lications.html
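
As a rough illustration of the point above, the following Python sketch writes the same three letters to disk explicitly as UTF-8 and compares the bytes with what an ISO-8859-1 save would produce. The file name and temporary directory are assumptions made for the example; nothing here comes from the test page itself.

```python
import os
import tempfile

text = "å ä ö"

# The same text encoded two ways; only the bytes differ.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

# Write the file explicitly as UTF-8, the programmatic counterpart of
# picking UTF-8 in an editor's "Save As" dialog.
# "test.html" in a temporary directory is a hypothetical location.
path = os.path.join(tempfile.mkdtemp(), "test.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the raw bytes back to see what actually landed on disk.
with open(path, "rb") as f:
    saved_bytes = f.read()

print(saved_bytes)   # b'\xc3\xa5 \xc3\xa4 \xc3\xb6'  (two bytes per letter)
print(latin1_bytes)  # b'\xe5 \xe4 \xf6'              (one byte per letter)
```

A charset declaration only describes the bytes; it is the save step that decides which of these two byte sequences the file contains.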

Currently, your test page is saved as ISO-8859-1, or at least as
something (maybe windows-1252) that shares the same character codes for
these special characters:

Latin letter a with ring above = &aring; (å)
Latin letter a with diaeresis = &auml; (ä)
Latin letter o with diaeresis = &ouml; (ö)
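
A small Python sketch of what goes wrong when the declared charset and the saved bytes disagree. It assumes nothing about the actual test page; it only encodes the three letters and decodes them with the wrong codec in each direction.

```python
swedish = "åäö"

# Bytes as an editor saving in ISO-8859-1 would write them.
latin1_bytes = swedish.encode("iso-8859-1")

# windows-1252 assigns the same byte values to these three letters.
same_in_cp1252 = swedish.encode("cp1252") == latin1_bytes

# A browser told "charset=UTF-8" tries to read those bytes as UTF-8.
# They are not valid UTF-8, so they come out as U+FFFD replacement characters.
misread = latin1_bytes.decode("utf-8", errors="replace")

# The opposite mistake: real UTF-8 bytes read as ISO-8859-1.
mojibake = swedish.encode("utf-8").decode("iso-8859-1")

print(same_in_cp1252)  # True
print(mojibake)        # Ã¥Ã¤Ã¶
```

The second case explains the familiar "Ã¥" style garbage: each UTF-8 letter is two bytes, and ISO-8859-1 shows each byte as its own character.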







If you set the character encoding manually with your browser to
iso-8859-1, the characters seem to display correctly. However, there
are other problems with your page that need addressing as well.

This is from the source code of your test page:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
....
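
The two encoding declarations in this excerpt can be compared mechanically. The sketch below uses regular expressions, which is an assumption of convenience rather than a proper HTML or XML parser, and the function name `declared_encodings` is made up for the example.

```python
import re

# The two relevant lines from the excerpt above.
source = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    '<META http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
)

def declared_encodings(markup):
    """Return (xml_declaration_encoding, meta_charset), lowercased, or None."""
    xml_match = re.search(
        r'<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']',
        markup,
        re.IGNORECASE,
    )
    meta_match = re.search(
        r'<meta[^>]*\bcharset\s*=\s*["\']?([\w.:-]+)',
        markup,
        re.IGNORECASE,
    )
    xml_encoding = xml_match.group(1).lower() if xml_match else None
    meta_charset = meta_match.group(1).lower() if meta_match else None
    return xml_encoding, meta_charset

xml_encoding, meta_charset = declared_encodings(source)
print(xml_encoding, meta_charset)    # iso-8859-1 utf-8
print(xml_encoding == meta_charset)  # False: the declarations disagree
```

Neither declaration says anything about how the file was actually saved, so a check like this only catches inconsistent labels.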

1. The encoding in the XML declaration (<?xml ... ?>, iso-8859-1) does
not match that in the <meta/> element (UTF-8). It's probably better to
omit the XML declaration anyway, because it puts IE into quirks mode,
especially when the document is being served as text/html.

2. The doctype is XHTML 1.1, but it's being served as text/html. The
spec explicitly says that it *must not* be, and that it should be served
as application/xhtml+xml, text/xml or application/xml. So, unless you
have the ability to set up content negotiation and serve HTML 4.01 or
XHTML 1.0 Strict to IE (and anything else that only supports text/html)
and XHTML 1.1 to others that do support application/xhtml+xml, I
recommend you write either HTML 4.01 or XHTML 1.0 Strict.
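
The content negotiation mentioned in point 2 could look roughly like the following Python sketch. It is a simplification under stated assumptions: it only checks whether the client's Accept header lists application/xhtml+xml with a non-zero q value, it ignores wildcards and relative preference, and `choose_content_type` is a hypothetical helper, not part of any server's API.

```python
def choose_content_type(accept_header):
    """Pick a media type for an XHTML page from an HTTP Accept header."""
    for item in accept_header.split(","):
        media_type, _, params = item.strip().partition(";")
        if media_type.strip().lower() != "application/xhtml+xml":
            continue
        # Default quality is 1.0 when no q parameter is given.
        quality = 1.0
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not accepted"
        if quality > 0:
            return "application/xhtml+xml"
    # Clients that never mention application/xhtml+xml get text/html.
    return "text/html"

print(choose_content_type("text/html,application/xhtml+xml,application/xml;q=0.9"))
print(choose_content_type("text/html, */*"))
```

A real setup would also have to send a different document (HTML 4.01 or XHTML 1.0 Strict) along with the text/html type, which this sketch does not attempt.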

--
Lachlan Hunt
http://www.lachy.id.au/
la**********@lachy.id.au.update.virus.scanners

Remove .update.virus.scanners to email me,
NO SPAM and NO VIRUSES!!!


Arne wrote:
Since I am Swedish, I write website content mostly in Swedish,
using charset iso-8859-1. I have (just for testing) tried to use
utf-8 on a test page ( http://w1.978.telia.com/~u97802964/test.html ),
but the special Swedish characters don't come out right if I don't use
entities for them.

So my question to anybody who can give an answer is: how come
I fail and somebody else can do it? I can't see anything different in
this other page that could make it possible :(






1) Correct <?xml version="1.0" encoding="iso-8859-1"?>
(that's not the problem, but it needs changing)

I think it's your text editor. What are you using? Check that it is set
to utf-8 mode.

The characters appear as Chinese in my text editor, and as 'missing
symbol' in my browser.
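
One way to check what an editor actually wrote, independent of any editor setting, is to test the raw bytes of the file. The following is a minimal sketch: `classify_bytes` is a made-up name, and the sample text stands in for bytes read from a real file with `open(path, "rb")`.

```python
def classify_bytes(data):
    """Classify raw file bytes as 'ascii', 'utf-8', or 'not utf-8'."""
    # Pure ASCII is identical in UTF-8, ISO-8859-1 and windows-1252,
    # so it tells us nothing about the editor's encoding setting.
    if all(byte < 0x80 for byte in data):
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    return "utf-8"

# Hypothetical sample content containing Swedish letters.
page_text = "<p>Smörgåsbord</p>"

print(classify_bytes(page_text.encode("utf-8")))       # utf-8
print(classify_bytes(page_text.encode("iso-8859-1")))  # not utf-8
print(classify_bytes(b"<p>plain</p>"))                 # ascii
```

Text saved as ISO-8859-1 rarely happens to be valid UTF-8, so a successful strict decode is a strong hint, though not proof, that the file really is UTF-8.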

--
Matt


Lachlan Hunt <la**********@lachy.id.au.update.virus.scanners> wrote:
2. The doctype is XHTML 1.1, but it's being served as text/html. The
spec explicitly says that it *must not*







Incorrect.

--
Spartanicus

