When a plain text page is treated as HTML


Problem description

This may be too far off topic, however I was looking at this page
http://www.hixie.ch/advocacy/xhtml about XHTML problems by Ian Hickson.

It is served as text/plain, according to Firefox
Response Headers - http://www.hixie.ch/advocacy/xhtml

Date: Wed, 23 Nov 2005 21:36:06 GMT
Server: Apache/1.3.33 (Unix) DAV/1.0.3 mod_fastcgi/2.4.2
mod_gzip/1.3.26.1a PHP/4.3.10 mod_ssl/2.8.22 OpenSSL/0.9.7e
Vary: Accept-Encoding,User-agent
X-Pingback: http://tracking.damowmow.com/
Content-Language: en-GB-Hixie
Last-Modified: Sat, 17 Sep 2005 12:16:19 GMT
Etag: "17063c7-4a12-432c0913"
Accept-Ranges: bytes
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/plain; charset=utf-8
Content-Encoding: gzip
Content-Length: 7452

200 OK
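
As a rough illustration of how the Content-Type line above breaks down, here is a minimal Python sketch that splits the header value into a media type and a charset parameter. The helper name `parse_content_type` is hypothetical; it leans on the standard library's `email.message.Message`, which is only one of several ways to do this.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(header_value: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type header value into (media type, charset).

    Hypothetical helper for illustration; charset is None when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() lower-cases the type/subtype pair.
    return msg.get_content_type(), msg.get_param("charset")


media_type, charset = parse_content_type("text/plain; charset=utf-8")
print(media_type, charset)
```

The media type is what tells the browser how to treat the body; the charset only says how to turn the bytes into characters.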

The page displays in Firefox, and in Opera as if the text were
surrounded by pre tags. In Safari 2, the page displays as a single long
(but word wrapped) string, as if Safari were treating it as HTML markup.

The interesting point to me is that the displayed contents are
incomplete. Safari has the contents, as looking at source confirms.
The places where the contents are not displayed are

<script type="text/javascript"><!--//--><![CDATA[//><!--
...
//--><!]]></script>

which is replaced by *

<script type="text/javascript"><!--//--><![CDATA[//><!--
...
//--><!]]></script>

which is not replaced by anything.

The document as displayed truncates on the next paragraph, when it
encounters

<script> and <style>

Given that the script element never closes, it seems reasonable to hide
the contents.

So my question is, should a browser display a file served as text/plain
the way Firefox and Opera do, or should a browser look deep inside the
file for HTML (or other tags) the way Safari does?

Or should it use some heuristic to second guess the server, given the
number of servers that do not correctly identify content-type?

If a browser pays attention only to the content-type as provided by the
server, what should it do about a file.css served as text/html instead
of text/css? Or isn't that a problem when the css file could be
considered to be included in the html file that calls it?

--
http://www.ericlindsay.com

Solution

On Thu, 24 Nov 2005, Eric Lindsay wrote:

> This may be too far off topic, however I was looking at this page
> http://www.hixie.ch/advocacy/xhtml about XHTML problems by Ian Hickson.

I've often seen plain-text documents from Hixie, but I must admit
I hadn't looked at their headers.

> It is served as text/plain, according to Firefox
> Response Headers - http://www.hixie.ch/advocacy/xhtml
> [...] Vary: Accept-Encoding,User-agent
> [...] Content-Type: text/plain; charset=utf-8

Which is at least *suggestive* that there might be other variants
available, although we don't know what they are...

But a visit to http://www.hixie.ch/advocacy/ shows a conventional
directory listing. If there's any alternative version served out to
other browsers or in other character encodings, it would have to be
done by some kind of server conversion...? *Do* note that
accept-language is *not* one of the negotiation dimensions according
to that Vary header, even though there appears to be a French
translation available in the directory listing.

> The page displays in Firefox, and in Opera as if the text were
> surrounded by pre tags.

Well no, it displays "as plain text". There are big differences
between the two assertions, when the material contains markup and
&-notations - which this does.
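
To make that difference concrete, here is a small Python sketch (the function name `plain_text_as_html` is hypothetical). Simply wrapping the body in pre tags would still let an HTML parser interpret any tags and &-notations inside it. Showing it "as plain text" corresponds to escaping those characters first, so they appear literally.

```python
import html


def plain_text_as_html(body: str) -> str:
    """Return HTML that displays `body` literally, as text/plain would.

    Hypothetical helper: '&', '<' and '>' are escaped so that no tag or
    entity reference in the body is interpreted by an HTML parser.
    """
    return "<pre>" + html.escape(body, quote=False) + "</pre>"


sample = "a &amp; <b>"
naive = "<pre>" + sample + "</pre>"    # parser would see an entity and a tag
faithful = plain_text_as_html(sample)  # every character shown as written
print(naive)
print(faithful)
```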
> In Safari 2, the page displays as a single long (but word wrapped)
> string, as if Safari were treating it as HTML markup.

Booooooh!

> The interesting point to me is that the displayed contents are
> incomplete. Safari has the contents, as looking at source confirms.
> The places where the contents are not displayed are
>
> <script type="text/javascript"><!--//--><![CDATA[//><!--
> ...
> //--><!]]></script>

This is fun stuff, but you really mustn't let yourself be so grossly
diverted from making real web pages, or you'll risk ending up like me
- posting too much about pedantic detail, and never getting around to
updating my sadly obsolescent web pages. Not good.

> So my question is, should a browser display a file served as
> text/plain the way Firefox and Opera do,

Of course.

> or should a browser look deep inside the
> file for HTML (or other tags) the way Safari does?

Sigh. I've been battering on about the mandate of RFC2616, but
somehow it doesn't seem to have sunk home. See the notes below the
table at
http://ppewww.ph.gla.ac.uk/~flavell/....html#browconf ,
which now take you directly to the relevant section of (the W3C's
HTML-ised copy of) RFC2616 -
http://www.w3.org/Protocols/rfc2616/....html#sec7.2.1

> Or should it use some heuristic to second guess the server,

Absolutely and utterly not. RFC2616 forbids it.

> given the number of servers that do not correctly identify
> content-type?

It would still be permissible for a browser to say to its user "excuse
me, this content seems to be the wrong type. At some risk to your
security, I could try to guess this, are you prepared to take that
chance?". What RFC2616 is ruling out is that a client agent should
take it upon itself to unilaterally second-guess, without informed
consent from its user. That's my best interpretation, anyway.
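
The rule described above can be sketched in a few lines of Python. This is an illustration only: the function and parameter names are hypothetical, and the `user_consented` branch encodes the interpretation given in this post rather than wording taken from RFC2616. The part that does come from section 7.2.1 is that guessing is allowed only when no Content-Type is given, with application/octet-stream as the fallback when the type stays unknown.

```python
from typing import Optional


def effective_media_type(
    declared: Optional[str],
    sniffed: Optional[str],
    user_consented: bool = False,
) -> str:
    """Pick the media type a client should act on (illustrative sketch).

    declared: media type from the Content-Type header, or None if absent.
    sniffed:  the client's own guess from content or file name, or None.
    """
    if declared is None:
        # No Content-Type at all: guessing is permitted, and an unknown
        # type falls back to application/octet-stream.
        return sniffed or "application/octet-stream"
    if user_consented and sniffed is not None:
        # Assumption from the post: overriding is acceptable only after
        # the user has been asked and agreed.
        return sniffed
    # Otherwise the server's declaration wins, even if it looks wrong.
    return declared


print(effective_media_type("text/plain", "text/html"))
```

Under this sketch, the page in question stays text/plain however much it looks like HTML.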
> If a browser pays attention only to the content-type as provided by the
> server, what should it do about a file.css served as text/html instead
> of text/css?

Per RFC2616, it's mandated to ignore it, i.e. to render the HTML
without it, and Mozilla does so[1]: that's correct behaviour.
Unfortunately, some other browsers are not so cautious. The web would
be a better place if they were.

[1] at least in its Standards mode.
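
A minimal Python sketch of that stylesheet check follows. The function name and the `standards_mode` flag are hypothetical, and the lenient branch for non-standards rendering is an assumption read into the footnote, not something this post states outright.

```python
def should_apply_stylesheet(content_type_header: str, standards_mode: bool = True) -> bool:
    """Decide whether a linked stylesheet response should be applied.

    Illustrative only: compares the declared media type with text/css
    and ignores parameters such as charset.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    if media_type == "text/css":
        return True
    # Assumption: outside standards mode the mislabelled sheet is
    # tolerated; in standards mode it is ignored.
    return not standards_mode


print(should_apply_stylesheet("text/html"))
print(should_apply_stylesheet("text/css; charset=utf-8"))
```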


On Wed, 23 Nov 2005, Alan J. Flavell wrote:

> Vary: Accept-Encoding,User-agent
>
> If there's any alternative version served out to other browsers or
> in other character encodings, it would have to be done by some kind
> of server conversion...?

Sorry, I shot my mouth off too quickly on that point. It wasn't
"accept-charset" in that header, it was "accept-encoding". That's why
his server has sent gzip-ed content, because the browser said it was
willing to accept that encoding. Nothing to do with
character-encoding ("charset"). Sorry for that - spotted my mistake
just too late!
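
The two layers that were confused here can be shown side by side in Python. In this hedged sketch (`decode_body` is a hypothetical helper), the Content-Encoding is undone first to recover the original bytes, and only then is the charset used to turn those bytes into text.

```python
import gzip
from typing import Optional


def decode_body(raw: bytes, content_encoding: Optional[str], charset: str) -> str:
    """Undo the transfer-level coding, then decode the characters.

    Illustrative helper: only the gzip content-coding is handled.
    """
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)  # Content-Encoding layer
    return raw.decode(charset)      # charset layer (from Content-Type)


original = "plain text, served gzip-ed"
on_the_wire = gzip.compress(original.encode("utf-8"))
print(decode_body(on_the_wire, "gzip", "utf-8"))
```

Accept-Encoding negotiates the first step only; the charset would be negotiated by Accept-Charset, which does not appear in that Vary header.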

--
Post in haste, repent at leisure...


Eric Lindsay <NO**********@ericlindsay.com> writes:

> The page displays in Firefox, and in Opera as if the text were
> surrounded by pre tags. In Safari 2, the page displays as a single long
> (but word wrapped) string, as if Safari were treating it as HTML markup.

Filed bug #4353871, at: <http://bugreporter.apple.com>.

sherm--

--
Cocoa programming in Perl: http://camelbones.sourceforge.net
Hire me! My resume: http://www.dot-app.org

