Classic ASP text substitution and UTF-8 encoding
Question
We have a website that uses Classic ASP.
Part of our release process substitutes values in a file, and we found a bug in it: it writes the file out as UTF-8.
This then causes our application to start spitting out garbage. Apostrophes get returned as some encoded characters.
If we then go and remove the BOM that says this file is UTF-8, the text that was previously rendered as garbage is displayed correctly.
Is there something that IIS does differently when it encounters a UTF-8 file?
Recommended answer
UTF-8 does not need a BOM; putting one there is an annoying misfeature of some Microsoft software. You need to find which step of your release process is putting a UTF-8-encoded BOM in your files and fix it. You should stop that even if you are using UTF-8, which these days really is the best choice.
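As a rough illustration of the "find it and fix it" step, here is a minimal Python sketch that could run at the end of a release script to detect and strip a leading UTF-8 BOM. The function names are hypothetical and nothing here comes from the original release process; it only assumes the files can be read and rewritten as raw bytes.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_file(path: Path) -> bool:
    """Rewrite the file without its BOM. Returns True if the file was changed."""
    original = path.read_bytes()
    cleaned = strip_utf8_bom(original)
    if cleaned != original:
        path.write_bytes(cleaned)
        return True
    return False
```

Working on bytes rather than decoded text means the rest of the file is left exactly as it was, whatever encoding it is really in.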
But I doubt it's IIS causing the display problem. More likely the browser is guessing the charset of the final displayed page, and when it sees bytes that look like they're UTF-8 encoded it guesses the whole page is UTF-8. You should be able to stop it doing that by stating a definitive charset in an HTTP header:
Content-Type: text/html;charset=iso-8859-1
and/or a meta element in the HTML:
<meta http-equiv="Content-Type" content="text/html;charset=iso-8859-1" />
Now (assuming ISO-8859-1 is actually the character set your data are in) it should display OK. However, if your file really does have a UTF-8-encoded BOM at the start, you'll now see that as ‘ï»¿’ in your page, which is what those bytes look like in ISO-8859-1. So you still need to get rid of that stray BOM.
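To see why the BOM shows up as those three characters, here is a small Python sketch that decodes the UTF-8 BOM bytes as ISO-8859-1. It is only a demonstration of the byte-to-character mapping; the variable name is illustrative.

```python
import codecs

# The UTF-8 BOM is the three bytes EF BB BF.
bom_bytes = codecs.BOM_UTF8

# Reading those bytes as ISO-8859-1 maps each byte to one character.
bom_as_latin1 = bom_bytes.decode("iso-8859-1")

print(bom_bytes.hex())   # efbbbf
print(bom_as_latin1)     # ï»¿
```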