How to identify encoding from hex values?
Question
I have text on a website that displays as o¨ instead of ö.
I extracted the text from the CMS and analysed its hex values:
- the ö's that are displayed correctly have c3 b6 (UTF-8)
- the ö's that are displayed incorrectly have 6f cc 88
I couldn't find out what encoding this is. What's a good way to identify the encoding?
Recommended answer
6F is the UTF-8 (ASCII) encoding of "o", nothing spectacular.
CC 88 is the UTF-8 encoding of U+0308, COMBINING DIAERESIS.
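One way to check this yourself is to decode the bytes as UTF-8 and look up the name of each resulting code point. The snippet below is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(hex_bytes: str) -> list[str]:
    """Decode space-separated hex bytes as UTF-8 and name each code point."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


# The correctly displayed ö: a single precomposed code point.
print(describe("c3 b6"))
# The incorrectly displayed ö: a plain "o" followed by a combining mark.
print(describe("6f cc 88"))
```

Both byte sequences decode as valid UTF-8, so the encoding is the same in both cases; only the sequence of code points differs.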
You're simply looking at the decomposed form of the o-umlaut. A combining diaeresis character should be rendered visually combined with the preceding character. If your system doesn't do that, it doesn't handle Unicode correctly, and/or the font you have chosen is somewhat broken. You may have to normalise your strings into the composed Unicode form (NFC) so that your system handles them correctly.
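As a rough illustration of that normalisation step, the following Python sketch converts the decomposed bytes into the composed form with `unicodedata.normalize`. It assumes the text is UTF-8 and that NFC is the form your system wants; the function name `to_composed_hex` is hypothetical.

```python
import unicodedata


def to_composed_hex(hex_bytes: str) -> str:
    """Decode UTF-8 hex bytes, normalise to NFC, and return the new hex bytes."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    composed = unicodedata.normalize("NFC", text)
    # bytes.hex() with a separator argument needs Python 3.8 or later.
    return composed.encode("utf-8").hex(" ")


# The decomposed "o" + COMBINING DIAERESIS becomes the single code point U+00F6.
print(to_composed_hex("6f cc 88"))  # c3 b6
# Text that is already composed passes through unchanged.
print(to_composed_hex("c3 b6"))     # c3 b6
```

In a CMS the same normalisation would typically be applied once, when the text is saved or before it is rendered, so that every ö is stored in the same form.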